Overview
The code samples listed in Table 1 demonstrate the functions of each Kunpeng DevKit tool. Refer to them when you analyze and optimize your own development projects with the Kunpeng DevKit.
| Tool | Scenario | Description | Sample Code |
|---|---|---|---|
| Porting Advisor | | The Kunpeng Porting Advisor scans the C/C++/Fortran/assembly source code of x86 platform software. It identifies the SO dependencies in the source code, locates the code lines that need to be modified, and provides modification suggestions. It also estimates the porting workload based on the code modification efficiency configured in the system, which supports project decision-making. This function is under the first-level menu Source Code Porting and is available in both the x86 and Kunpeng environments. NOTE: Do not rescan the assembly source code after it has been ported and modified. A rescan may produce inaccurate analysis results. | Makefile, file_lock.c, file_lock.h, ksw.c, ksw.h, interface.s |
| | Sample 2: Inline Assembly Translation (single-instruction and multi-instruction conversions) | The tool supports the inline assembly function of the assembly translation module. This sample shows how the tool scans the C/C++ source code of x86-based software, identifies the inline assembly code in it, and provides suggestions for adapting that code to the Kunpeng platform. | swap.c, gcd.c |
| | | The tool supports the full assembly function of the assembly translation module. This sample shows how the tool scans the source code of x86-based software, identifies the full assembly code in it, and provides suggestions for adapting that code to the Kunpeng platform. | test.s, Makefile, main.c |
| System Profiler | Sample 1: Matrix Analysis | This sample uses the Kunpeng DevKit System Profiler to tune a program that performs a one-dimensional matrix calculation in a for loop. Hotspot function analysis identifies multiply as the hotspot function of the matrix calculation. The program is then tuned with NEON instructions, and the results before and after tuning are compared. | multiply.c, multiply_simd.c, multiply_start.sh |
| | | This sample uses the hotspot function analysis of the Kunpeng DevKit System Profiler to compare the miss events of a loop that traverses a two-dimensional array by row and by column. The results indicate that row-wise access improves the CPU cache hit rate. | cache_hit.c, cache_miss.c, miss_start.sh, hit_start.sh |
| | Sample 3: Frequent Lock Preemption | Lock preemption and contention occur frequently in multi-threaded programs and waste CPU resources. In general, contention for shared resources can be addressed by analyzing and simplifying the service logic. This sample uses the resource scheduling analysis and lock and wait analysis functions of the Kunpeng DevKit System Profiler to analyze the service logic. Reducing the lock granularity and the number of concurrent threads reduces lock contention. | pthread_mutex.c, pthread_atomic.c |
| | Sample 4: MPI Application Analysis | The HPC application analysis function of the Kunpeng DevKit System Profiler helps you understand the communication status of each rank of the application. | ring.c |
| | Sample 5: Long Application Execution Caused by MPI Blocking Communication Functions | In an MPI/OpenMP hybrid scenario, you can use the HPC application analysis function of the Kunpeng DevKit System Profiler to understand how to tune application performance in each scenario. | send_recv.cpp |
| | Sample 6: NUMA Refined Analysis | On a non-uniform memory access (NUMA) architecture, the Kunpeng DevKit System Profiler can perform NUMA refined analysis. It collects the NUMA performance of all processes in the system and identifies the top N (for example, top 10) processes with the poorest NUMA performance. It also generates a statistics matrix of memory access between NUMA nodes and identifies unbalanced memory access between nodes, and provides tuning suggestions based on these results. | None |
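
The Matrix Analysis description above mentions tuning the hotspot function multiply with NEON instructions. The contents of multiply.c and multiply_simd.c are not shown in this document, so the following is only a minimal sketch under the assumption that the calculation is an element-wise product of two float arrays. The function names `multiply_scalar` and `multiply_simd` are hypothetical. The NEON path is compiled only when the compiler defines `__ARM_NEON`; elsewhere the function falls back to the scalar loop.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Scalar baseline (assumed shape of the hotspot loop): one multiply per iteration. */
void multiply_scalar(const float *a, const float *b, float *out, size_t n)
{
    assert(a != NULL && b != NULL && out != NULL);
    for (size_t i = 0; i < n; i++)
        out[i] = a[i] * b[i];
}

/* NEON variant: processes four floats per iteration where NEON is available. */
void multiply_simd(const float *a, const float *b, float *out, size_t n)
{
    assert(a != NULL && b != NULL && out != NULL);
    size_t i = 0;
#if defined(__ARM_NEON)
    for (; i + 4 <= n; i += 4) {
        float32x4_t va = vld1q_f32(a + i);      /* load 4 floats from a */
        float32x4_t vb = vld1q_f32(b + i);      /* load 4 floats from b */
        vst1q_f32(out + i, vmulq_f32(va, vb));  /* multiply and store   */
    }
#endif
    /* Scalar tail: leftover elements, or the whole array without NEON. */
    for (; i < n; i++)
        out[i] = a[i] * b[i];
}
```

Both functions should produce identical results for the same input, so a before-and-after comparison in a profiler reflects only the change in instruction mix, not a change in the computation.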
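
The row-wise versus column-wise comparison described for the two-dimensional array traversal sample can be illustrated with a short sketch. This is not the content of cache_hit.c or cache_miss.c, which this document does not show. The function names `sum_by_row` and `sum_by_col` are hypothetical, and the array is assumed to be stored in row-major order, as C arrays are.

```c
#include <assert.h>
#include <stddef.h>

/* Row-wise traversal: the inner loop touches adjacent elements, so each
 * cache line that is loaded is fully used before the loop moves on. */
long sum_by_row(const int *a, size_t rows, size_t cols)
{
    assert(a != NULL);
    long total = 0;
    for (size_t r = 0; r < rows; r++)
        for (size_t c = 0; c < cols; c++)
            total += a[r * cols + c];
    return total;
}

/* Column-wise traversal: the inner loop jumps cols elements at a time, so
 * for large arrays consecutive accesses tend to land on different cache lines. */
long sum_by_col(const int *a, size_t rows, size_t cols)
{
    assert(a != NULL);
    long total = 0;
    for (size_t c = 0; c < cols; c++)
        for (size_t r = 0; r < rows; r++)
            total += a[r * cols + c];
    return total;
}
```

The two functions return the same sum and differ only in access order, which makes them a reasonable pair to compare under a hotspot or miss event analysis. The difference is expected to show up only when the array is much larger than the CPU caches.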
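
The Frequent Lock Preemption description lists pthread_mutex.c and pthread_atomic.c, which suggests a comparison between a mutex-protected shared counter and an atomic one. That reading is an assumption, since the files are not shown here. The sketch below uses POSIX threads and C11 atomics; `NUM_THREADS`, `INCREMENTS`, and all function names are hypothetical. Build it with thread support enabled, for example `gcc -pthread`.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define NUM_THREADS 4
#define INCREMENTS  100000

static long mutex_counter;
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static atomic_long atomic_counter;

/* Every increment takes the lock, so threads contend on counter_lock. */
static void *mutex_worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < INCREMENTS; i++) {
        pthread_mutex_lock(&counter_lock);
        mutex_counter++;
        pthread_mutex_unlock(&counter_lock);
    }
    return NULL;
}

/* Same work without a lock: each increment is one atomic read-modify-write. */
static void *atomic_worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < INCREMENTS; i++)
        atomic_fetch_add(&atomic_counter, 1);
    return NULL;
}

/* Run the given worker on NUM_THREADS threads and wait for all of them. */
static void run_workers(void *(*worker)(void *))
{
    pthread_t threads[NUM_THREADS];
    for (int i = 0; i < NUM_THREADS; i++) {
        int rc = pthread_create(&threads[i], NULL, worker, NULL);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < NUM_THREADS; i++)
        pthread_join(threads[i], NULL);
}

long count_with_mutex(void)
{
    mutex_counter = 0;
    run_workers(mutex_worker);
    return mutex_counter;
}

long count_with_atomic(void)
{
    atomic_store(&atomic_counter, 0);
    run_workers(atomic_worker);
    return atomic_load(&atomic_counter);
}
```

Both variants should end with the same count, `NUM_THREADS * INCREMENTS`. Under a lock and wait analysis, the mutex version would be expected to show time spent waiting on the lock, while the atomic version removes the lock altogether.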