Introduction

Overview

The Kunpeng DevKit command line tool is a toolset that consists of System Migration, Porting Advisor, Affinity Analyzer, System Profiler, and Java Profiler. This document describes how to obtain, install, and use the toolset. The following table lists the tools and what each one does:

Table 1 Functions supported by the Kunpeng DevKit command line tool

| Tool | Description |
| --- | --- |
| System Migration | Collects information about the software installed in an application system, such as software packages, middleware, and databases. |
| Porting Advisor | Ports software from x86 servers running Linux to Kunpeng servers running Linux, with necessary software scan and analysis capabilities. |
| Affinity Analyzer | Checks software code on the Kunpeng 920 platform to improve code quality and memory access performance. |
| System Profiler | Collects and analyzes performance data in multiple scenarios, and provides tuning suggestions based on the tuning system. |
| Java Profiler | Analyzes and optimizes the performance of Java programs running on Kunpeng servers. |

  • The System Migration and Porting Advisor tools can run on x86 servers or Kunpeng 920 servers.
  • The Affinity Analyzer, System Profiler, and Java Profiler tools must run on Kunpeng 920 servers.

The Kunpeng DevKit command line tool performs the following functions:

  • The System Migration tool collects information about the software installed in an application system, such as software packages, middleware, and databases.
    Table 2 Function description

    | Function | Description |
    | --- | --- |
    | Application information collection for system migration | Collects ledger and component information about the software installed in an application system, such as software packages, middleware, and databases. |

  • The Porting Advisor simplifies the application porting process and supports scanning, analysis, and porting of software from x86 Linux to Kunpeng Linux. The tool analyzes applications automatically and generates porting guide reports, which improves code porting efficiency.
    Table 3 Function description

    | Function | Description |
    | --- | --- |
    | Source code porting | Analyzes the portability of software written in C/C++/ASM/Fortran/Go or an interpreted language. |
    | Software porting assessment | Analyzes the SO library files in the software installation path in the x86 environment and checks whether these files are compatible with the Kunpeng platform. |

  • The Affinity Analyzer checks software code to improve code quality and memory access performance.
    Table 4 Function description

    | Function | Description |
    | --- | --- |
    | 64-bit running mode check | Identifies the 32-bit applications to be ported to the 64-bit platform and provides modification suggestions. It supports GCC 4.8.5 to GCC 10.3.0. |
    | Byte alignment check | Checks the byte alignment of structure variables in the source code. |
    | Memory consistency check | Checks for memory consistency problems that can arise when the source code is ported to the Kunpeng platform and provides suggestions on inserting memory barriers. |
    | Vectorization check | Checks vectorizable code snippets and provides modification suggestions. |
    | Matricization check | Checks matricizable code fragments and provides modification suggestions. |
    | Build affinity | Analyzes the content in Makefile and CMakeLists.txt that can be replaced with content in the Kunpeng library, and provides replacement suggestions and a repair function. |
    | Cache line alignment check | Checks the 128-byte alignment of structure variables in the C/C++ source code to improve memory access performance. |
    | BC file generation | Generates the BC file that is used for the memory consistency check and the vectorization check. |

  • The System Profiler is a performance analysis tool for Kunpeng-powered servers. It collects performance data of processor hardware, operating system (OS), processes/threads, and functions, analyzes system performance metrics, locates system bottlenecks and hotspot functions, and provides tuning suggestions.
    Table 5 Function description

    | Function | Description |
    | --- | --- |
    | Microarchitecture analysis | Obtains the running status of instructions on the CPU pipeline based on Arm performance monitoring unit (PMU) events, helping quickly locate performance bottlenecks of the current application on the CPU. You can modify your application to make full use of hardware resources. |
    | HPC application analysis | Collects PMU events of the system and the key metrics of OpenMP and MPI applications to help you accurately obtain the serial and parallel times of the parallel region and barrier-to-barrier, calibrated 2-layer microarchitecture metrics, instruction distribution, L3 usage, and memory bandwidth. |
    | Memory access analysis | Collects the PMU events of the cache and memory and analyzes the number of storage accesses, hit rate, and bandwidth. |
    | NUMA refined analysis | Obtains the refined DDR access, NUMA access bandwidth matrix, and processes' memory access information based on Arm SPE capabilities. |
    | Roofline analysis | Helps pinpoint application bottlenecks on a given hardware platform and optimize the application accordingly. |
    | Hotspot function analysis | Analyzes C/C++ program code, identifies performance bottlenecks, and provides details about the top hotspot functions and call stacks. The tool also displays the function call relationship in flame graphs and provides the tuning path. |
    | Miss event analysis | Uses the Statistical Profiling Extension (SPE) capability to analyze miss events such as LLC Miss, TLB Miss, Remote Access, and Long Latency Load. You can modify your program to reduce the probability of miss events and improve the program processing performance. |

  • The Java Profiler analyzes and optimizes the performance of Java programs running on Kunpeng servers. The tool identifies hotspot functions, locates performance bottlenecks, and provides tuning suggestions.
    Table 6 Function description

    | Function | Description |
    | --- | --- |
    | Hotspot analysis | Collects stack information about CPU, CYCLES, LOCK, CACHE_MISSES, and ALLOC events at certain points in time, collects statistics on hotspot methods in the current JVM, and displays the information in a flame graph and an inverted flame graph. |
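
To make the byte alignment check and the cache line alignment check listed for the Affinity Analyzer more concrete, the following C sketch shows the kind of structure layout such checks are concerned with. It does not show the tool's output or its internal method. The struct names, the `CACHE_LINE_SIZE` macro, and the `is_cache_line_aligned` helper are hypothetical and exist only for this illustration; the 128-byte value is taken from the check description above.

```c
#include <assert.h>    /* static_assert (C11) */
#include <stdalign.h>  /* alignas, alignof (C11) */
#include <stddef.h>
#include <stdint.h>

/* Assumption: 128 bytes, as stated in the cache line alignment check. */
#define CACHE_LINE_SIZE 128

/* Hypothetical record with poor member ordering. The compiler inserts
 * padding after `flag` so that `value` is suitably aligned, and again
 * after `id` so that arrays of this struct keep `value` aligned. */
struct padded_record {
    char   flag;
    double value;
    char   id;
};

/* Same members, largest first. Only tail padding remains, so the struct
 * is smaller (typically 16 bytes instead of 24 on 64-bit Linux ABIs). */
struct reordered_record {
    double value;
    char   flag;
    char   id;
};

/* Hypothetical per-thread counter. alignas on the first member raises the
 * alignment of the whole struct to one cache line, so adjacent array
 * elements never share a line. */
struct counter_slot {
    alignas(CACHE_LINE_SIZE) uint64_t count;
};

/* Compile-time checks of the layout properties described above. */
static_assert(sizeof(struct reordered_record) < sizeof(struct padded_record),
              "reordering members should remove interior padding");
static_assert(alignof(struct counter_slot) == CACHE_LINE_SIZE,
              "counter_slot should be cache line aligned");
static_assert(sizeof(struct counter_slot) == CACHE_LINE_SIZE,
              "counter_slot should occupy exactly one cache line");

/* Returns 1 if the address is a multiple of CACHE_LINE_SIZE, else 0.
 * The pointer is converted through uintptr_t rather than int or long,
 * which keeps the code correct on both 32-bit and 64-bit targets. */
int is_cache_line_aligned(const void *p)
{
    return ((uintptr_t)p % CACHE_LINE_SIZE) == 0;
}
```

Reordering members reduces wasted padding, while the `alignas` form deliberately spends memory to keep frequently updated data on separate cache lines. Which trade-off is appropriate depends on how the structure is accessed.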

Intended Audience

This document is intended for:

  • Kunpeng developers
  • Kunpeng software users
  • Independent software vendor (ISV) developers