
Hotspot Function Analysis

Command Function

Analyzes C/C++ program code, identifies performance bottlenecks, and provides details about the top hotspot functions and call stacks. The tool also displays the function call relationship in flame graphs and provides the tuning path.

Syntax

devkit tuner hotspot [-h] [-c {n | n,m | n-m}] [-d <sec>] [-D <sec>] [-f n] [-l {0, 1, 2, 3}] [-i <sec>] [-r {user, kernel, all}] [-o] [-s] [-p {PID1 | PID1,PID2 | ALL}] [-t n] [-g] [--package] [--long-name] [--dwarf] [workload workload...]
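
When the command is launched from a script, it can help to assemble the argument list programmatically. The following is a minimal sketch, not part of DevKit: build_hotspot_command and its keyword names are hypothetical, and it only emits options that appear in the syntax line above. It builds the argument list without running anything.

```python
from typing import List, Optional, Sequence


def build_hotspot_command(
    cpu: Optional[str] = None,
    duration: Optional[int] = None,
    interval: Optional[int] = None,
    output: Optional[str] = None,
    call_stack: bool = False,
    package: bool = False,
    long_name: bool = False,
    workload: Sequence[str] = (),
) -> List[str]:
    """Assemble an argv list for `devkit tuner hotspot` (hypothetical helper)."""
    argv = ["devkit", "tuner", "hotspot"]
    if cpu is not None:
        argv += ["-c", cpu]            # n, n,m or n-m
    if duration is not None:
        argv += ["-d", str(duration)]  # seconds
    if interval is not None:
        argv += ["-i", str(interval)]  # seconds per subreport
    if output is not None:
        argv += ["-o", output]
    if call_stack:
        argv.append("-g")
    if package:
        argv.append("--package")
    if long_name:
        argv.append("--long-name")
    # The optional workload and its arguments come last, as in the syntax line.
    argv += list(workload)
    return argv


if __name__ == "__main__":
    print(" ".join(build_hotspot_command(cpu="0-2", duration=3, interval=1)))
```

The resulting list can be passed to subprocess.run on a machine where devkit is installed. Passing a list rather than a single string avoids shell quoting issues in the workload arguments.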

Parameter Description

Table 1 Parameter description

Parameter

Option

Description

-h/--help

-

Obtains help information.

-c/--cpu

-

CPU cores to collect data from. The value can be a single core (for example, 0), a comma-separated list (for example, 0,1,2), or a range (for example, 0-2).

-d/--duration

-

Collection duration, in seconds. By default, collection continues until you stop it. Press Ctrl+C to stop the collection and start analysis, or press Ctrl+\ to cancel the task.

-D/--delay

-

Sampling delay, which defaults to 0 seconds.

-i/--interval

-

Collection interval, in seconds, that is, the time span covered by each subreport. The minimum value is 1 second and the maximum value is the collection duration. The default value is the collection duration, so no subreports are generated if this parameter is not set.

-l/--log-level

0/1/2/3

Log level, which defaults to 1 (info).

  • 0 (debug)
  • 1 (info)
  • 2 (warning)
  • 3 (error)

-f/--frequency

-

Sampling frequency, in samples per second. The default value is 200 and the minimum value is 1.

-o/--output

-

Report file name. Reports are generated in the current directory by default.

-r/--collection-range

user/kernel/all

Collection mode, which defaults to all.

  • all: collects user-mode and kernel-mode performance data.
  • user: collects user-mode performance data.
  • kernel: collects kernel-mode performance data.

-s/--src-dir

-

Source code working directory, which is used to locate source files and associate them with the collected data. You can import the task to the web client to view the associated source code more easily.

-g

-

Displays call stack information. When -g is set, a flame graph HTML file is also generated, by default in the user directory.

-p/--pid

PID1/PID1,PID2/ALL

PIDs of the processes to collect data from. Separate multiple PIDs with commas (,). By default, all processes are collected. If both -p and -c are specified, the processes with the specified PIDs take priority.

--long-name

-

Displays detailed function and module information. If this parameter is not set, function and module information is displayed in a simplified form.

--dwarf

-

Indicates whether to generate C/C++ source code or assembly code files.

-t/--top

-

Number of data records to be displayed in the report. The minimum value is 1.

--package

-

Indicates whether to import data to the database and generate compressed packages in the specified output path.
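
The value formats in Table 1 can be checked before the tool is invoked. The sketch below is illustrative only and makes two assumptions: parse_cpu_spec and check_interval are hypothetical helper names, not DevKit APIs, and mixed forms such as 0,2-4 are rejected because the table does not document them. It expands a -c value into core numbers and applies the documented bounds for -i.

```python
from typing import List, Optional


def parse_cpu_spec(spec: str) -> List[int]:
    """Expand a -c value (n, n,m or n-m) into a list of core numbers."""
    spec = spec.strip()
    if "-" in spec:
        # Range form, for example "0-2". A mixed form such as "0,2-4" is not
        # documented, so int() rejects it with ValueError.
        first, last = (int(part) for part in spec.split("-", 1))
        if first > last:
            raise ValueError(f"descending range: {spec!r}")
        return list(range(first, last + 1))
    # Single core ("0") or comma-separated list ("0,1,2").
    return [int(part) for part in spec.split(",")]


def check_interval(interval: int, duration: Optional[int] = None) -> None:
    """Apply the documented bounds for -i: at least 1, at most the duration."""
    if interval < 1:
        raise ValueError("interval must be at least 1 second")
    # Without -d, collection has no fixed end, so only the minimum applies.
    if duration is not None and interval > duration:
        raise ValueError("interval cannot exceed the collection duration")


if __name__ == "__main__":
    print(parse_cpu_spec("0-2"))
    check_interval(interval=1, duration=3)
```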

Example

devkit tuner hotspot -c 0-127 -d 3 -i 1 -o /home/hotspot_cpu -g --package --long-name

Command output:

Hotspot Summary Report-1                                Time:2024/05/22 17:10:22
================================================================================

────────────────────────────────────────────────────────────────────────
  Function                                                                     Cycles    Module                                     Cycles(%)
────────────────────────────────────────────────────────────────────────
  arch_cpu_idle                                                           109,478,705    [kernel]                                     14.22
  __do_softirq                                                             81,866,664    [kernel]                                     10.63
  UNKNOWN                                                                  30,688,513    libc.so.6                                     3.99
  rcu_report_qs_rdp                                                        28,416,897    [kernel]                                      3.69
  UNKNOWN                                                                  27,848,831    libnss_sss.so.2                               3.62
  finish_task_switch                                                       27,210,492    [kernel]                                      3.53
...
...
Hotspot Summary Report-2                                Time:2024/05/22 17:10:23
================================================================================

────────────────────────────────────────────────────────────────────────
  Function                                                                     Cycles    Module                                     Cycles(%)
────────────────────────────────────────────────────────────────────────
  arch_cpu_idle                                                            93,959,194    [kernel]                                     12.94
  std::pair<std::_Rb_tree_itera***long const, elf::sym> const&)            85,861,067    libsym.so                                    11.82
  malloc                                                                   46,479,307    libc.so.6                                     6.40
  std::_Sp_counted_base<(__gnu_***_Lock_policy)2>::_M_release()            38,285,738    libtuner.so                                   5.27
  copy_page                                                                33,935,533    [kernel]                                      4.67
  std::_Rb_tree<unsigned long, ***ned long const, elf::sym> >*)            25,090,288    libsym.so                                     3.45
  std::_Rb_tree<unsigned long, ***ned long const, elf::sym> >*)            24,942,214    libsym.so                                     3.43
  finish_task_switch                                                       23,804,264    [kernel]                                      3.28
...
...
Hotspot Summary Report-ALL                              Time:2024/05/22 17:10:22
================================================================================

────────────────────────────────────────────────────────────────────────
  Function                                                                     Cycles    Module                                     Cycles(%)
────────────────────────────────────────────────────────────────────────
  arch_cpu_idle                                                           203,437,899    [kernel]                                     13.60
  std::pair<std::_Rb_tree_itera***long const, elf::sym> const&)            85,861,067    libsym.so                                     5.74
  __do_softirq                                                             82,521,516    [kernel]                                      5.51
  malloc                                                                   56,735,095    libc.so.6                                     3.79
  finish_task_switch                                                       51,014,756    [kernel]                                      3.41
  UNKNOWN                                                                  47,753,528    libc.so.6                                     3.19
  UNKNOWN                                                                  38,325,780    libnss_sss.so.2                               2.56
  std::_Sp_counted_base<(__gnu_***_Lock_policy)2>::_M_release()            38,285,738    libtuner.so                                   2.56
  filemap_map_pages                                                        34,252,979    [kernel]                                      2.29
  copy_page                                                                33,935,533    [kernel]                                      2.27
  ...
  ...
  copyout                                                                     455,387    [kernel]                                      0.03
  el0_da                                                                      455,387    [kernel]                                      0.03
  rcu_gp_init                                                                 381,264    [kernel]                                      0.03
────────────────────────────────────────────────────────────────────────
2402 milliseconds time elapsed

The callstack log /home/callstack-20240606-160259.log is generated successfully.
The flamegraph html /home/Flamegraph-20240606-160259.html is generated successfully.
The report /home/hotspot_cpu1.tar is generated successfully.
To view summary report. you can run: devkit report -i /home/hotspot_cpu.tar
To view detail report. you can import the report to the WebUI or IDE to view details.
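
The summary tables printed to the terminal can be post-processed with a short script, for example to total the cycles per module. The sketch below is not a DevKit feature, and parse_row and cycles_by_module are hypothetical names. It assumes the column layout shown in the sample output above: function, cycles, module, and percentage, with at least two spaces between the first three columns. Header lines, separator lines, and "..." lines are skipped because they do not match that layout.

```python
import re
from collections import defaultdict
from typing import Dict, List, NamedTuple, Optional


class HotspotRow(NamedTuple):
    function: str
    cycles: int
    module: str
    percent: float


# Assumed layout: function, cycles, module, percentage. Function names may
# contain single spaces, so columns are split on runs of two or more spaces.
_ROW = re.compile(
    r"^\s*(?P<function>.+?)\s{2,}(?P<cycles>\d[\d,]*)\s{2,}"
    r"(?P<module>\S+)\s+(?P<percent>\d+\.\d+)\s*$"
)


def parse_row(line: str) -> Optional[HotspotRow]:
    """Return a parsed row, or None for headers, separators and other text."""
    match = _ROW.match(line)
    if match is None:
        return None
    return HotspotRow(
        function=match["function"],
        cycles=int(match["cycles"].replace(",", "")),
        module=match["module"],
        percent=float(match["percent"]),
    )


def cycles_by_module(lines: List[str]) -> Dict[str, int]:
    """Sum the Cycles column per module over all rows that parse."""
    totals: Dict[str, int] = defaultdict(int)
    for line in lines:
        row = parse_row(line)
        if row is not None:
            totals[row.module] += row.cycles
    return dict(totals)


if __name__ == "__main__":
    # Rows copied from Hotspot Summary Report-1 above, with whitespace shortened.
    sample = [
        "  Function    Cycles    Module    Cycles(%)",
        "  arch_cpu_idle    109,478,705    [kernel]    14.22",
        "  __do_softirq    81,866,664    [kernel]    10.63",
        "  UNKNOWN    30,688,513    libc.so.6    3.99",
        "...",
    ]
    for module, cycles in cycles_by_module(sample).items():
        print(f"{module}: {cycles:,}")
```

Because each row carries both a cycle count and a percentage, dividing cycles by percent / 100 gives a rough estimate of the total cycles in that report. The estimate is limited by the two-decimal rounding of the percentage.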

By default, the flame graph HTML file is generated in the user directory. You can view the file using your browser.

Figure 1 Flame graph HTML file