Microarchitecture Analysis

Command Function

Obtains the running status of instructions in the CPU pipeline based on Arm performance monitor unit (PMU) events, helping you quickly locate the performance bottlenecks of the current application on the CPU. You can then modify the application to make full use of hardware resources.

Syntax

devkit tuner topdown [-h | --help][-d <DURATION> | --duration=DURATION][-p <PID> | --pid=PID][-r <RANGE> | --collection-range=RANGE][-c <CPU> | --cpu=CPU][-L <METRIC> | --profile-level=METRIC][-D <DELAY> | --delay=DELAY][-i <INTERVAL> | --interval=INTERVAL] COMMAND [appDir][appArgs]

To collect metrics for a specific application, append the application path and the application parameters to the end of the devkit tuner topdown command. If the -c/--cpu or -p/--pid parameter is specified, the specified CPU cores or processes are collected preferentially.
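
The ordering described above (options first, application path and arguments last) can be sketched as a small helper that assembles the argument list. This is only an illustration: build_topdown_command is a hypothetical name, not part of DevKit, and it uses only options listed in Table 1.

```python
import shlex
from typing import List, Optional, Sequence


def build_topdown_command(
    duration: Optional[int] = None,
    cpus: Optional[str] = None,
    pid: Optional[str] = None,
    profile_level: Optional[int] = None,
    output: Optional[str] = None,
    package: bool = False,
    app: Sequence[str] = (),
) -> List[str]:
    """Assemble an argument list for `devkit tuner topdown` (hypothetical helper)."""
    cmd = ["devkit", "tuner", "topdown"]
    if cpus is not None:
        cmd += ["-c", cpus]
    if pid is not None:
        cmd += ["-p", pid]
    if duration is not None:
        cmd += ["-d", str(duration)]
    if output is not None:
        cmd += ["-o", output]
    if profile_level is not None:
        cmd += ["-L", str(profile_level)]
    if package:
        cmd.append("--package")
    # The application path and its arguments always go at the end.
    cmd += list(app)
    return cmd


cmd = build_topdown_command(
    duration=10,
    output="/home/topdown_app",
    profile_level=2,
    package=True,
    app=["/opt/testdemo/falsesharinglong"],
)
print(shlex.join(cmd))
```

Building a list instead of a single string keeps application arguments that contain spaces intact when the list is later passed to subprocess.run.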

Parameter Description

Table 1 Parameter description

Parameter

Option

Description

-h/--help

-

Obtains help information.

-c/--cpu

-

IDs of the CPU cores to be collected. The value can be a single core (0), a comma-separated list (0,1,2), or a range (0-2).

-d/--duration

-

Collection duration, in seconds. By default, collection never ends. Press Ctrl+\ to cancel the task, or press Ctrl+C to stop the collection and start the analysis.

-D/--delay

-

Collection delay, which defaults to 0 seconds. The value cannot exceed the configured collection duration.

-i/--interval

-

Collection interval, which defaults to 1 second. If the collection duration -d is set, the collection interval must be less than or equal to the configured collection duration.

-l/--log-level

0/1/2/3

Log level, which defaults to 1 (info).

  • 0 (debug)
  • 1 (info)
  • 2 (warning)
  • 3 (error)

-L/--profile-level

1/2/3/4/5/6

Analysis metric level, which defaults to 1.

  • 1: Collects the top-level metrics: Back-End Bound, Bad Speculation, Front-End Bound, and Retiring.
  • 2: Collects Back-End Bound->Core Bound. The back end is the part of the processor that dispatches and executes micro-ops (uOps) out of order and returns the results. Core Bound is a subclass of Back-End Bound. It reflects the proportion of performance bottlenecks caused by insufficient CPU execution unit resources.
  • 3: Collects Back-End Bound->Memory Bound. Memory Bound is a subclass of Back-End Bound. It reflects pipeline stalls caused by waiting for data reads and writes.
  • 4: Collects Back-End Bound->Resource Bound (Kunpeng 920 processors only). Resource Bound is a subclass of Back-End Bound. It reflects pipeline stalls that occur during the dispatch of uOps to the out-of-order execution scheduler because of insufficient resources.
  • 5: Collects Bad Speculation. It reflects pipeline resources wasted on incorrectly speculated instructions.
  • 6: Collects Front-End Bound. The front end is the part of the processor that fetches instructions and decodes them into uOps for the back-end pipeline to execute. This metric reflects the proportion of processor front-end resources that are under-utilized.

-o/--output

-

Report file name. The topdown-xxxx-xxxx.tar file is generated in the current path by default. If a file with the same name exists, the file_name--xxxx-xxxx.tar file is generated.

-r/--collection-range

user/kernel/all

Process collection range. When -p/--pid is set to ALL, you can select user or kernel to collect only user-mode processes or only kernel-mode processes. The default value is all.

  • user: user mode
  • kernel: kernel mode
  • all: both modes

-p/--pid

PID/PID1, PID2/ALL

PID of the process to be collected. Separate multiple PIDs with commas (,). The default value is ALL. If both -p and -c are specified, the processes with the specified PIDs are collected preferentially.

--package

-

Imports the data to the database and generates a compressed package in the specified output path.
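
The value formats of -c/--cpu and the limits on -D/--delay and -i/--interval described in Table 1 can be checked before a command is launched. The following is a minimal sketch, not DevKit code: expand_cpu_spec and check_timing are hypothetical names. Two points are assumptions rather than documented behavior: the helper accepts mixed specifications such as 0,2-4, and it applies no limit to the delay when no duration is set.

```python
from typing import List, Optional


def expand_cpu_spec(spec: str) -> List[int]:
    """Expand a -c/--cpu value such as "0", "0,1,2" or "0-2" into core IDs."""
    cores = set()
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            first, last = (int(x) for x in part.split("-", 1))
            if first > last:
                raise ValueError(f"invalid CPU range: {part!r}")
            cores.update(range(first, last + 1))
        else:
            cores.add(int(part))
    return sorted(cores)


def check_timing(duration: Optional[int], delay: int = 0, interval: int = 1) -> None:
    """Raise ValueError if delay or interval exceeds the collection duration."""
    if duration is None:
        # No -d given: collection runs until interrupted (assumed: no limit applies).
        return
    if delay > duration:
        raise ValueError("delay cannot exceed the collection duration")
    if interval > duration:
        raise ValueError("interval must be less than or equal to the collection duration")


print(expand_cpu_spec("0-2"))
check_timing(duration=3, delay=0, interval=1)  # passes silently
```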

Example

  • Collection based on CPUs:
    devkit tuner topdown -c 0-127 -d 3 -o /home/topdown_cpu -L 2 --package

    In this command, -c 0-127 collects data on CPU cores 0 to 127, and -d 3 sets the collection duration to 3 seconds. -o /home/topdown_cpu and --package generate a report data package named topdown_cpu in the specified path. -L 2 collects the Back-End Bound->Core Bound instruction data.

    Command output:

       Topdown metrics of 'CPU(s) 0-127':
    
         Instructions                                 564372224
         Cycles                                       814631616
         IPC                                          0.69
         Backend Bound                                30.16%
         |-- Core Bound                               |-- 20.40%
             |-- Divider Stall                            |-- 0.00%
             |-- FSU Stall                                |-- 0.00%
             |-- Exe Ports Stall                          |-- 20.40%
         Bad Speculation                              12.29%
         Frontend Bound                               40.23%
         Retiring                                     17.32%
    
       3421 milliseconds time elapsed
    
    info: The report /home/topdown_cpu.tar is generated successfully.
    info: To view summary report. you can run: devkit report -i /home/topdown_cpu.tar
    info: To view detail report. you can import the report to the WebUI or IDE to view details.
  • Collection based on process IDs:
    devkit tuner topdown -p 1891015 -d 3 -o /home/topdown_pid --package

    In this command, -p 1891015 collects the process whose PID is 1891015, and -d 3 sets the collection duration to 3 seconds. -o /home/topdown_pid and --package generate a report data package named topdown_pid in the specified path. Because -L is not specified, the default Back-End Bound, Bad Speculation, Front-End Bound, and Retiring instruction data is collected.

    Command output:

       Topdown metrics of process id '1891015':
    
         Instructions                                 30194684
         Cycles                                       65768472
         IPC                                          0.46
         Backend Bound                                29.20%
         Bad Speculation                              7.12%
         Frontend Bound                               52.20%
         Retiring                                     11.48%
    
       3362 milliseconds time elapsed
    
    info: The report /home/topdown_pid.tar is generated successfully.
    info: To view summary report. you can run: devkit report -i /home/topdown_pid.tar
    info: To view detail report. you can import the report to the WebUI or IDE to view details.
  • Collection based on applications:
    devkit tuner topdown -d 10 -o /home/topdown_app -L 2 --package /opt/testdemo/falsesharinglong

    In this command, -d 10 sets the collection duration to 10 seconds, and the path /opt/testdemo/falsesharinglong at the end specifies the application to be collected. -o /home/topdown_app and --package generate a report data package named topdown_app in the specified path. -L 2 collects the Back-End Bound->Core Bound instruction data.

    Command output:

       Topdown metrics of '/opt/testdemo/falsesharinglong':
    
         Instructions                                 59373846528
         Cycles                                       52865875968
         IPC                                          1.12
         Backend Bound                                67.58%
         |-- Core Bound                               |-- 63.31%
             |-- Divider Stall                            |-- 0.00%
             |-- FSU Stall                                |-- 0.00%
             |-- Exe Ports Stall                          |-- 63.31%
         Bad Speculation                              3.56%
         Frontend Bound                               0.78%
         Retiring                                     28.08%
    
       11153 milliseconds time elapsed
    
    info: The report /home/topdown_app.tar is generated successfully.
    info: To view summary report. you can run: devkit report -i /home/topdown_app.tar
    info: To view detail report. you can import the report to the WebUI or IDE to view details.
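
If the overview needs to be processed by a script, the metric lines shown in the command outputs above can be read with a regular expression. The sketch below is an assumption about how such post-processing might look, based only on the layout of these sample outputs. parse_topdown is a hypothetical helper, and the actual output layout may differ between DevKit versions.

```python
import re
from typing import Dict

# Sample text copied from the application-based example above.
SAMPLE = """\
Topdown metrics of '/opt/testdemo/falsesharinglong':

  Instructions                                 59373846528
  Cycles                                       52865875968
  IPC                                          1.12
  Backend Bound                                67.58%
  |-- Core Bound                               |-- 63.31%
      |-- Divider Stall                            |-- 0.00%
      |-- FSU Stall                                |-- 0.00%
      |-- Exe Ports Stall                          |-- 63.31%
  Bad Speculation                              3.56%
  Frontend Bound                               0.78%
  Retiring                                     28.08%
"""

# A metric line: optional "|-- " prefix, a name, at least two spaces,
# optional "|-- " prefix, a number, and an optional "%" sign.
LINE_RE = re.compile(
    r"^\s*(?:\|--\s*)?(?P<name>\S.*?)\s{2,}(?:\|--\s*)?(?P<value>\d+(?:\.\d+)?)%?\s*$"
)


def parse_topdown(text: str) -> Dict[str, float]:
    """Map each metric name to its numeric value (percent signs dropped)."""
    metrics = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            metrics[match.group("name")] = float(match.group("value"))
    return metrics


TOP_LEVEL = ("Backend Bound", "Bad Speculation", "Frontend Bound", "Retiring")
metrics = parse_topdown(SAMPLE)
largest = max(TOP_LEVEL, key=lambda name: metrics[name])
print(largest, metrics[largest])
```

For this sample, the largest top-level share is Backend Bound, and the nested lines show that it comes almost entirely from Exe Ports Stall under Core Bound.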

The command output is an overview of the microarchitecture analysis task. To view time sequence information, use the --package parameter to generate a TAR package and import the package to the WebUI, which displays the data visually. For details, see the description of importing tasks in Task Management.
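
When reading the overview, two quick consistency checks can help confirm that the numbers were copied or parsed correctly: IPC is the instruction count divided by the cycle count, and in the three sample outputs above the four top-level categories add up to 100.00%. The following sketch applies both checks to the values of the CPU-based example. check_overview is a hypothetical helper, and the tolerances are assumptions chosen to absorb two-decimal rounding.

```python
from typing import Dict, Tuple


def check_overview(
    instructions: int,
    cycles: int,
    reported_ipc: float,
    level1: Dict[str, float],
) -> Tuple[bool, bool]:
    """Return (ipc_matches, shares_sum_to_100) for one overview."""
    # IPC is instructions per cycle; the overview prints it with two decimals.
    ipc_matches = abs(instructions / cycles - reported_ipc) < 0.01
    # Allow a small tolerance for rounding in the printed percentages.
    shares_sum_to_100 = abs(sum(level1.values()) - 100.0) < 0.1
    return ipc_matches, shares_sum_to_100


# Values copied from the CPU-based example above.
result = check_overview(
    instructions=564372224,
    cycles=814631616,
    reported_ipc=0.69,
    level1={
        "Backend Bound": 30.16,
        "Bad Speculation": 12.29,
        "Frontend Bound": 40.23,
        "Retiring": 17.32,
    },
)
print(result)
```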