Introduction
Overview
The Kunpeng DevKit command line tool is a toolset that includes the System Migration, Porting Advisor, Affinity Analyzer, System Profiler, Python/C Profiler, Java Profiler, and System Diagnosis tools. This document describes how to obtain, install, and use the toolset. The following table lists the tools and what each one does:
| Tool | Description |
|---|---|
| System Migration | Collects information about the software installed in an application system, such as software packages, middleware, and databases. |
| Porting Advisor | Ports software from x86 servers running Linux to Kunpeng servers running Linux, with necessary software scan and analysis capabilities. |
| Affinity Analyzer | Checks software code on the Kunpeng 920 platform to improve code quality and memory access performance. |
| System Profiler | Collects and analyzes performance data in multiple scenarios, and provides tuning suggestions based on the tuning system. |
| Python/C Profiler | Samples Python programs and mixed programs of Python and C/C++ and analyzes call stacks. |
| Java Profiler | Analyzes and optimizes the performance of Java applications running on Kunpeng servers. |
| System Diagnosis | Analyzes exceptions that occur in applications. |
- The System Migration and Porting Advisor tools can run on x86 servers or Kunpeng 920 servers.
- The Affinity Analyzer, Python/C Profiler, System Profiler, Java Profiler, and System Diagnosis tools must run on Kunpeng 920 servers.
System Migration
| Function | Description |
|---|---|
| Application information collection for system migration | Collects ledger and component information about the software installed in an application system, such as software packages, middleware, and databases. |
Porting Advisor
| Function | Description |
|---|---|
| Source code porting | Analyzes the portability of software written in C/C++/ASM/Fortran/Go or an interpreted language. |
| Software porting assessment | Analyzes the SO library files in the software installation path in the x86 environment and checks whether these files are compatible with the Kunpeng platform. |
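To make the idea of source code portability concrete, the following C sketch shows two well-known differences between x86 and AArch64 Linux builds: the architecture macros predefined by GCC and Clang, and the default signedness of plain `char` (signed on x86, unsigned on AArch64 Linux). The function names are hypothetical and chosen only for illustration. This is not output from Porting Advisor, and it makes no claim about which constructs the tool actually reports.

```c
#include <assert.h> /* static_assert (C11) */
#include <limits.h> /* CHAR_BIT, CHAR_MIN */

/* The arithmetic in byte_as_signed() assumes 8-bit bytes. */
static_assert(CHAR_BIT == 8, "this sketch assumes 8-bit bytes");

/* Reports the architecture this file was compiled for, using macros
 * that GCC and Clang predefine. Hypothetical helper for illustration. */
const char *target_arch(void)
{
#if defined(__aarch64__)
    return "aarch64";
#elif defined(__x86_64__)
    return "x86_64";
#else
    return "other";
#endif
}

/* Returns 1 if plain char is signed on this target, 0 if it is unsigned.
 * Code that stores negative values in plain char behaves differently
 * depending on this result. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Interprets a raw byte as a signed 8-bit value without relying on the
 * signedness of plain char, so the result is the same on both targets. */
int byte_as_signed(unsigned char raw)
{
    return raw >= 128 ? (int)raw - 256 : (int)raw;
}
```

Code that depends on `plain_char_is_signed()` returning 1, or that contains x86-only inline assembly or intrinsics outside an architecture guard, is the kind of source that typically needs attention during a port.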
Affinity Analyzer
| Function | Description |
|---|---|
| 64-bit running mode check | Identifies the 32-bit applications to be ported to the 64-bit platform and provides modification suggestions. It supports GCC 4.8.5 to GCC 10.3.0. |
| Byte alignment check | Checks the byte alignment of structure variables in the source code. |
| Memory consistency check | Checks for memory consistency problems when the source code is ported to the Kunpeng platform and provides suggestions on inserting memory barriers. |
| Vectorization check | Checks vectorizable code snippets and provides modification suggestions. |
| Matricization check | Checks matricizable code snippets and provides modification suggestions. |
| Build affinity | Analyzes the content in Makefile and CMakeLists.txt that can be replaced with content in the Kunpeng library, and provides replacement suggestions and function repair. |
| Cache line alignment check | Checks the 128-byte alignment of structure variables in the C/C++ source code to improve memory access performance. |
| BC file generation | Generates the BC files used by the memory consistency check and the vectorization check. |
| AutoFDO | Automatic feedback-directed optimization (AutoFDO) is an effective performance improvement measure. |
| Data race check | Checks for dynamic memory inconsistency problems when C/C++ source code runs on the Kunpeng platform (specifically Kunpeng 920), and provides the check result together with suggestions on inserting memory barriers. |
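The byte alignment and cache line alignment checks in the table above concern how structures are laid out in memory. The following C11 sketch illustrates both ideas under stated assumptions: `CACHE_LINE_SIZE` is set to 128 only because that is the figure quoted in the table, and all type and function names are hypothetical. It shows the layout concepts only and does not reproduce what Affinity Analyzer reports or suggests.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdalign.h> /* alignas, alignof (C11) */
#include <stdint.h>   /* uint8_t, uint64_t, uintptr_t */

/* Assumed alignment target, taken from the 128-byte figure in the table. */
#define CACHE_LINE_SIZE 128

/* Small members placed around a large one: the compiler inserts padding
 * so that `counter` stays 8-byte aligned (typically 24 bytes on LP64). */
struct padded_record {
    uint8_t  flag;
    uint64_t counter;
    uint8_t  state;
};

/* Same members, largest first: less padding (typically 16 bytes on LP64). */
struct reordered_record {
    uint64_t counter;
    uint8_t  flag;
    uint8_t  state;
};

/* A counter that occupies its own 128-byte block, so that two adjacent
 * counters never share the same block. */
struct aligned_counter {
    alignas(CACHE_LINE_SIZE) uint64_t value;
};

static_assert(alignof(struct aligned_counter) == CACHE_LINE_SIZE,
              "aligned_counter must start on a 128-byte boundary");
static_assert(sizeof(struct aligned_counter) % CACHE_LINE_SIZE == 0,
              "aligned_counter must fill whole 128-byte blocks");

/* Returns 1 if the address is a multiple of CACHE_LINE_SIZE, else 0. */
int is_cache_line_aligned(const void *ptr)
{
    return ((uintptr_t)ptr % CACHE_LINE_SIZE) == 0;
}
```

Reordering members reduces wasted space, while `alignas` trades extra space for separation between frequently updated objects. Which of the two matters more depends on how the structure is accessed.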
System Profiler
| Function | Description |
|---|---|
| Microarchitecture analysis | Obtains the running status of instructions on the CPU pipeline based on Arm performance monitor unit (PMU) events, helping quickly locate performance bottlenecks of the current application on the CPU. You can modify your application to make full use of hardware resources. |
| HPC application analysis | Collects PMU events of the system and the key metrics of OpenMP and MPI applications to help you accurately obtain the serial and parallel times of the parallel region and barrier-to-barrier, calibrated 2-layer microarchitecture metrics, instruction distribution, L3 usage, and memory bandwidth. |
| Memory access analysis | Accesses the PMU events of the cache and memory and analyzes the number of storage access times, hit rate, and bandwidth. |
| NUMA refined analysis | Obtains the refined DDR access, |
| Roofline analysis | Helps pinpoint application bottlenecks on a given hardware platform and optimize the application accordingly. |
| Hotspot function analysis | Analyzes C/C++ program code, identifies performance bottlenecks, and provides details about the top hotspot functions and call stacks. The tool also displays the function call relationship in flame graphs and provides the tuning path. |
| Miss event analysis | Uses the Statistical Profiling Extension (SPE) capability to analyze miss events such as LLC Miss, TLB Miss, Remote Access, and Long Latency Load. You can modify your program to reduce the probability of miss events and improve the program processing performance. |
| Hotspot function analysis (Python/C) | Uses ptrace to sample Python programs and Python & C/C++ hybrid programs, analyzes call stacks, obtains top 20 hotspot functions, and draws flame graphs. |
| AI tuning | Uses Huawei-developed high-performance AI tuning solutions to optimize database and big data application performance. The tool provides optimal parameter settings for automatic tuning. |
Python/C Profiler
| Function | Description |
|---|---|
| Hotspot function analysis | Uses ptrace to sample Python programs and Python & C/C++ hybrid programs, analyzes call stacks, obtains top 20 hotspot functions, and draws flame graphs. |
Java Profiler
| Function | Description |
|---|---|
| Hotspot analysis | Collects stack information about CPU, CYCLES, LOCK, CACHE_MISSES, and ALLOC events at certain points of time, collects statistics on hotspot methods in the current JVM, and displays the information in a flame graph and an inverted flame graph. |
System Diagnosis
| Function | Description |
|---|---|
| Memory usage | Collects performance data about memory allocation and release, and checks whether any allocated memory space has not been released. |
| Memory overwriting | Analyzes memory overwriting problems of applications and provides memory overwriting and access information. |
CPCA
| Function | Description |
|---|---|
| Non-SM algorithm detection | Analyzes cryptographic algorithms that do not comply with CPCA specifications in source code and provides analysis reports. |
| Sensitive information scan | Scans for sensitive file information in specified directories based on the built-in sensitive information library and user-defined rules, and provides detection reports. |
Intended Audience
This document is intended for:
- Kunpeng developers
- Kunpeng software users
- Independent software vendor (ISV) developers