Introduction
The System Profiler is a performance analysis tool for Kunpeng-powered servers. It collects performance data from the processor hardware, operating system (OS), processes, threads, and functions, analyzes system performance metrics, locates system bottlenecks and hotspot functions, and provides tuning suggestions. The tool helps you quickly locate and resolve software performance problems.
| Task Type | Task Subtype | Description | Supported Platform |
|---|---|---|---|
| General analysis | Hotspot function analysis | The tool analyzes C/C++ program code, identifies performance bottlenecks, and displays hotspot functions. It also displays function call relationships in flame graphs and provides a tuning path. | Kunpeng |
| System component analysis | NUMA refined analysis | This analysis is based on the Arm Statistical Profiling Extension ( | Kunpeng |
| System component analysis | I/O analysis | The tool analyzes storage I/O performance. By analyzing block storage devices, it obtains performance data such as the number of I/O operations, I/O data size, I/O queue depth, and I/O operation delay, and identifies the specific I/O operations, processes, threads, call stacks, and application-layer I/O APIs involved. Based on the I/O performance data, the tool provides tuning suggestions. | Kunpeng |
| Dedicated analysis | Lock and wait analysis | The tool analyzes the lock and wait functions (including sleep, usleep, mutex, cond, spinlock, rwlock, and semaphore) of glibc and open source software such as MySQL and OpenMP, associates them with the processes and call sites they belong to, and provides tuning suggestions based on existing experience. | Kunpeng |
| Dedicated analysis | HPC application analysis | The tool collects Performance Monitor Unit (PMU) events of the system and the key metrics of MPI and MPI+OpenMP applications. This helps you accurately obtain the serial and parallel time of parallel regions and barrier-to-barrier intervals, calibrated 2-layer microarchitecture metrics, instruction distribution, L3 usage, and memory bandwidth. | Kunpeng |
| Comparison analysis | - | For analysis tasks of the same type, you can compare results from the same node or from different nodes. This helps you quickly see how the results differ, locate changes in performance metrics, and assess the effect of optimizations. | Kunpeng |

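
The comparison analysis described above amounts to lining up the same metrics from two results and looking at how they changed. The following is a minimal Python sketch of that idea. The function name `compare_metrics`, the metric names, and the sample values are all hypothetical and are not part of the System Profiler; the tool's own comparison view may compute and present differences differently.

```python
def compare_metrics(baseline, candidate):
    """Return the absolute and relative change for metrics present in both results."""
    report = {}
    # Only metrics that exist in both result sets can be compared.
    for name in sorted(baseline.keys() & candidate.keys()):
        before, after = baseline[name], candidate[name]
        delta = after - before
        # The relative change is undefined when the baseline value is zero.
        percent = delta / before * 100.0 if before else None
        report[name] = {
            "before": before,
            "after": after,
            "delta": delta,
            "percent": percent,
        }
    return report


if __name__ == "__main__":
    # Hypothetical metric values from two runs of the same task type.
    baseline = {"cpu_cycles": 2_000_000, "ipc": 1.0, "l3_miss_rate": 0.08}
    candidate = {"cpu_cycles": 1_500_000, "ipc": 1.25, "l3_miss_rate": 0.06}
    for name, row in compare_metrics(baseline, candidate).items():
        percent = "n/a" if row["percent"] is None else f"{row['percent']:+.1f}%"
        print(f"{name}: {row['before']} -> {row['after']} ({percent})")
```

Metrics that appear in only one of the two results are skipped rather than reported as a change, because there is nothing to compare them against.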
Use Restrictions
| Task Type | Task Subtype | Description |
|---|---|---|
| System component analysis | NUMA refined analysis | This function is available on openEuler and CentOS 7.6 with the Statistical Profiling Extension (SPE) feature. The supported openEuler kernel versions are 4.19 and later, and the supported CentOS 7.6 kernel versions are 4.14.0-115.el7a.0.1, 4.14.0-115.2.2.el7a, 4.14.0-115.5.1.el7a, 4.14.0-115.6.1.el7a, 4.14.0-115.7.1.el7a, 4.14.0-115.8.2.el7a, and 4.14.0-115.10.1.el7a. This function is unavailable on VMs. |
| System component analysis | I/O analysis | The system kernel must support ftrace collection. |
| Dedicated analysis | HPC application analysis | During OpenMP data collection, the tool temporarily changes the kernel parameters /proc/sys/kernel/kptr_restrict and /proc/sys/kernel/perf_event_paranoid so that call graph data and PMU events can be collected. After the collection is complete, both parameters are restored to their original values. |
| Dedicated analysis | Lock and wait analysis | The environment must support the extended Berkeley Packet Filter (eBPF) configuration. |
| Comparison analysis | - | Hotspot function analysis is supported. |
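
Some of the restrictions above can be checked by hand before you create a task. The following Python sketch shows one way to do that, under several assumptions: the helper names (`kernel_version`, `ftrace_available`, `temporarily_set`) are hypothetical, the two tracefs directories are common mount points rather than paths confirmed by this document, and the save-and-restore helper only illustrates the pattern described for HPC application analysis. It is not the tool's implementation, and the values the tool writes to the two kernel parameters are not stated here. eBPF support is not checked.

```python
import contextlib
import os
import platform
import re

KPTR_RESTRICT = "/proc/sys/kernel/kptr_restrict"
PERF_EVENT_PARANOID = "/proc/sys/kernel/perf_event_paranoid"
# Assumed tracefs mount points; seeing them may require root privileges.
TRACEFS_DIRS = ("/sys/kernel/tracing", "/sys/kernel/debug/tracing")


def kernel_version(release=None):
    """Parse the leading 'major.minor' of a kernel release string."""
    release = platform.release() if release is None else release
    match = re.match(r"(\d+)\.(\d+)", release)
    return (int(match.group(1)), int(match.group(2))) if match else None


def ftrace_available(dirs=TRACEFS_DIRS):
    """Report whether any of the given tracefs directories is visible."""
    return any(os.path.isdir(d) for d in dirs)


@contextlib.contextmanager
def temporarily_set(path, value):
    """Write value to path and restore the previous content on exit."""
    with open(path) as handle:
        original = handle.read().strip()
    with open(path, "w") as handle:
        handle.write(f"{value}\n")
    try:
        yield original
    finally:
        # Restore the saved value even if the body raised an exception.
        with open(path, "w") as handle:
            handle.write(f"{original}\n")


if __name__ == "__main__":
    print("kernel version:", kernel_version())
    print("ftrace visible:", ftrace_available())
    for path in (KPTR_RESTRICT, PERF_EVENT_PARANOID):
        if os.path.exists(path):
            with open(path) as handle:
                print(path, "=", handle.read().strip())
```

Because `kernel_version` returns a tuple, the openEuler requirement can be written as `kernel_version() >= (4, 19)`. The CentOS 7.6 kernels are listed by exact release string, so they are better matched against `platform.release()` directly. Writing to files under /proc/sys requires root privileges.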