Sample 6: NUMA Refined Analysis

Introduction

On systems with a non-uniform memory access (NUMA) architecture, the Kunpeng DevKit System Profiler can perform NUMA refined analysis. It collects NUMA performance data for all processes in the system and identifies the top N processes (for example, the top 10) with the poorest NUMA performance, along with their hotspot memory areas. It also generates a statistics matrix of memory accesses between NUMA nodes, identifies unbalanced memory access between nodes, and provides tuning suggestions based on the results.

Setting Up the Environment

  1. Check whether a compatible OS is installed on the server. Use the Kunpeng DevKit Compatibility Checker to view the details.
  2. Check that the system supports the Statistical Profiling Extension (SPE) and the Performance Monitoring Unit (PMU), and configure the SPE environment.
  3. Check that the Kunpeng DevKit System Profiler has been installed on the server.
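
The SPE and PMU check in step 2 can be approximated with a short script. The following is a minimal sketch, not the procedure the DevKit itself uses. It assumes a Linux kernel that lists perf event sources under /sys/bus/event_source/devices, and it assumes the device names start with arm_spe (SPE) and armv8_pmuv3 (core PMU), which is typical on arm64 but may differ on your system. The function names are hypothetical.

```python
from pathlib import Path

# Standard Linux sysfs directory that lists perf event sources (PMUs).
EVENT_SOURCE_DIR = Path("/sys/bus/event_source/devices")


def find_perf_devices(prefix, base=EVENT_SOURCE_DIR):
    """Return the sorted names of perf event sources starting with `prefix`."""
    base = Path(base)
    if not base.is_dir():
        return []
    return sorted(p.name for p in base.iterdir() if p.name.startswith(prefix))


def check_profiling_support(base=EVENT_SOURCE_DIR):
    """Report which SPE and core PMU devices the kernel exposes.

    The prefixes below are assumptions about typical arm64 device names.
    """
    return {
        "spe": find_perf_devices("arm_spe", base),
        "pmu": find_perf_devices("armv8_pmuv3", base),
    }


if __name__ == "__main__":
    for name, devices in check_profiling_support().items():
        status = ", ".join(devices) if devices else "not found"
        print(f"{name}: {status}")
```

An empty result for spe suggests that the SPE driver is not loaded or that the platform does not expose SPE, in which case the SPE environment still needs to be configured before the analysis task is created.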

NUMA Refined Analysis

  1. Create a NUMA refined analysis task and start the task.

    Click the icon next to System Profiler and select General analysis. On the task creation page that is displayed, select NUMA Refined, set the required parameters, and click OK to start the NUMA refined analysis task.

    Figure 1 Creating a NUMA refined analysis task
    Table 1 Task parameters

    Parameter                          Description
    Analysis Object                    Set it to System.
    Sampling Duration (s)              The default value is 30.
    Report Interval (s)                The default value is 10.
    Sampling Interval (Instructions)   The default value is 163840.

  2. View the analysis results.

    The Summary tab page displays tuning suggestions and scores the NUMA status. The NUMA score is used to measure the NUMA memory access status of the entire system. The score ranges from 0 to 1. If the score is 1, all memory access operations are local. A score closer to 0 indicates more cross-NUMA remote access.

    Click the icon to locate the latest report. Once all reports have been generated, this icon becomes unavailable. Click a time point on the timeline to view the data for that report interval. You can also click General Report to view all the collected data.
    Figure 2 Result overview
  3. View process details.
    Figure 3 Top 10 memory NUMA access processes
    In the Top 10 Memory NUMA Access Processes area, click a process or thread ID to view the process or thread details. On the details page, you can search for and view the function access details in the Process Function Memory Access or Thread Function Memory Access area.
    Figure 4 Process overview
    Figure 5 Process function memory access
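
The NUMA score described in step 2 can be illustrated with a small calculation. The sketch below is an assumption, not the formula the System Profiler uses: it treats the score as the fraction of sampled memory accesses that are local, which matches the stated behavior (1 when all accesses are local, closer to 0 as remote access grows). The matrix layout, the function names, and the sample counts are hypothetical.

```python
def local_access_ratio(matrix):
    """Fraction of sampled accesses served from the local node.

    matrix[i][j] is the number of accesses issued by CPUs on node i
    to memory on node j, so the diagonal holds local accesses.
    """
    total = sum(sum(row) for row in matrix)
    if total == 0:
        return 1.0  # no accesses sampled, so nothing was remote
    local = sum(matrix[i][i] for i in range(len(matrix)))
    return local / total


def remote_share_by_node(matrix):
    """For each source node, the share of its accesses that went to other nodes."""
    shares = []
    for i, row in enumerate(matrix):
        row_total = sum(row)
        shares.append(0.0 if row_total == 0 else (row_total - row[i]) / row_total)
    return shares


if __name__ == "__main__":
    # Hypothetical sample counts for a two-node system.
    samples = [
        [900, 100],
        [400, 600],
    ]
    print(local_access_ratio(samples))
    print(remote_share_by_node(samples))
```

In this hypothetical matrix, node 1 sends a much larger share of its accesses to remote memory than node 0 does, which is the kind of imbalance the inter-node statistics matrix is meant to expose.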

Tuning Suggestions

You are advised to bind processes that access large amounts of a node's memory to cores, choosing the cores based on the memory access bandwidth, to improve system performance.
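
One way to apply this suggestion is to pin a process to the cores of the NUMA node whose memory it accesses most. The following is a minimal sketch under that assumption. It reads the node's CPU list from the standard Linux sysfs path /sys/devices/system/node/nodeN/cpulist and calls os.sched_setaffinity, which is available only on Linux. The helper names are hypothetical, and the sketch does not weigh bandwidth; it only shows the binding step.

```python
import os
from pathlib import Path


def parse_cpulist(text):
    """Expand a kernel CPU list such as '0-3,8,10-11' into a sorted list."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            start, end = part.split("-")
            cpus.update(range(int(start), int(end) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def node_cpus(node, sysfs_root="/sys/devices/system/node"):
    """Read the CPUs that belong to a NUMA node from sysfs."""
    path = Path(sysfs_root) / f"node{node}" / "cpulist"
    return parse_cpulist(path.read_text())


def bind_to_node(pid, node):
    """Restrict `pid` to the CPUs of `node` and return that CPU list."""
    cpus = node_cpus(node)
    os.sched_setaffinity(pid, cpus)
    return cpus
```

For example, bind_to_node(pid, 1) would restrict the given process to the cores of node 1, where pid would be a process ID taken from the Top 10 Memory NUMA Access Processes area. This call changes CPU affinity only. Memory that is already allocated on another node stays there, so the binding is most effective when applied at process startup, for instance with numactl --cpunodebind and --membind.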