Introduction
The best practices in this document demonstrate what each tool in the Kunpeng DevKit CLI does. Refer to them when you analyze and optimize your own development projects with the Kunpeng DevKit CLI. Table 1 summarizes the practices.
| Tool | Feature | Best Practice | Description |
|---|---|---|---|
Porting Advisor |
SMARTdenovo is a de novo sequence assembler for PacBio or Oxford Nanopore sequencing data. It is open source software written in C. This practice uses the Kunpeng DevKit Porting Advisor to analyze the SMARTdenovo source package as preparation for porting the application. |
||
Netty is an NIO-based client/server programming framework. This practice uses the Kunpeng DevKit Porting Advisor to assess the Netty software package before porting it. |
|||
System Profiler |
In this practice, the Kunpeng Performance Boundary Analyzer is first used to narrow down the problem scope. Its preliminary results show a high branch misprediction rate, which places the performance bottleneck at the microarchitecture level. The System Profiler is then used for deeper microarchitecture analysis, which shows that the CPU frequently mispredicts conditional branches. Inspecting the source code reveals the cause: conditional statements test data that has not been sorted first, so the branch outcomes follow no predictable pattern. Sorting the data in the source code raises the branch prediction success rate and improves overall application performance. |
||
In this practice, the Kunpeng Performance Boundary Analyzer is first used to narrow down the problem scope. Its preliminary results show that hotspot functions are invoked frequently by the system and form the performance bottleneck. The System Profiler is then used to analyze these hotspot functions further, including inspecting their call stacks in flame graphs. The analysis shows that the high proportion of I/O system calls comes from the significant overhead of read system calls. The large-file read logic is then rewritten to use memory mapping (mmap), which reduces data copies and system calls, lowering I/O latency and improving program performance. |
|||
In this practice, the Kunpeng Performance Boundary Analyzer is used to quickly identify performance issues. Preliminary analysis indicates that |
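
The two System Profiler optimizations summarized in Table 1, sorting data before a data-dependent conditional and reading a large file through mmap, can be sketched in C as follows. This is a minimal illustration and not the code used in the best practices themselves: the function names (`sum_at_least`, `sum_at_least_sorted`, `sum_file_mmap`), the threshold comparison, and the byte-summing workload are hypothetical, and a POSIX system is assumed.

```c
#include <fcntl.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* qsort comparator for ints; avoids the overflow that `x - y` could cause. */
int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Sum every element >= threshold. The `if` is a data-dependent branch:
 * on unordered input its outcome is hard for the CPU to predict. */
long long sum_at_least(const int *data, size_t n, int threshold)
{
    long long sum = 0;
    for (size_t i = 0; i < n; i++) {
        if (data[i] >= threshold)
            sum += data[i];
    }
    return sum;
}

/* Hypothetical optimized variant: sort in place first, so the branch is
 * false for a leading run and true for the rest. The sum is unchanged
 * because addition does not depend on element order. */
long long sum_at_least_sorted(int *data, size_t n, int threshold)
{
    qsort(data, n, sizeof data[0], cmp_int);
    return sum_at_least(data, n, threshold);
}

/* Sum all bytes of a file through a single read-only mapping instead of
 * a loop of read() calls. Returns -1 on error. */
long long sum_file_mmap(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) != 0) {
        close(fd);
        return -1;
    }
    if (st.st_size == 0) {      /* mmap rejects zero-length mappings */
        close(fd);
        return 0;
    }

    size_t len = (size_t)st.st_size;
    unsigned char *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                  /* the mapping stays valid after close */
    if (p == MAP_FAILED)
        return -1;

    long long sum = 0;
    for (size_t i = 0; i < len; i++)
        sum += p[i];

    munmap(p, len);
    return sum;
}
```

Sorting has its own cost, so the first variant is likely to pay off only when the branchy loop runs many times over the same data. An optimizing compiler may also replace the branch with branch-free code, in which case sorting brings little benefit. For the file read, the gain from mmap comes from avoiding a copy into a user buffer on every read call. Both effects are best confirmed by profiling the real workload.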