Hotspot Function Analysis
Command Function
Analyzes C/C++ programs, identifies performance bottlenecks, and reports the top hotspot functions and their call stacks. The tool also displays function call relationships in flame graphs and provides a tuning path.
Syntax
devkit tuner hotspot [-h] [-c {n | n,m | n-m}] [-d <sec>] [-D <sec>] [-f n] [-l {0, 1, 2, 3}] [-i <sec>] [-r {user, kernel, all}] [-o] [-s] [-p {PID1 | PID1,PID2 | ALL}] [-g] [--package] [--long-name] [--dwarf] [workload workload...]
Parameter Description
| Parameter | Option | Description |
|---|---|---|
| -h/--help | - | Obtains help information. |
| -c/--cpu | - | CPU cores to collect data from. Specify a single core (0), a comma-separated list (0,1,2), or a range (0-2). |
| -d/--duration | - | Collection duration, in seconds. By default, collection never ends. Press Ctrl+\ to cancel the task, or press Ctrl+C to stop the collection and start analysis. |
| -D/--delay | - | Sampling delay, in seconds. The default value is 0. |
| -i/--interval | - | Collection interval, in seconds. It specifies the time span covered by each subreport. The minimum value is 1 second and the maximum value cannot exceed the collection duration. The default value is the collection duration. If this parameter is not set, no subreports are generated. |
| -l/--log-level | 0/1/2/3 | Log level. The default value is 1 (info). |
| -f/--frequency | - | Sampling frequency, in samples per second. The default value is 200 and the minimum value is 1. |
| -o/--output | - | Report file name. Reports are generated in the current directory by default. |
| -r/--collection-range | user/kernel/all | Collection mode. The default value is all. |
| -s/--src-dir | - | Source code working directory, which is used to search for and associate source code. You can import a task to the web client to facilitate the display. |
| -g | - | Displays call stack information. If the -g option is enabled, a flame graph HTML file is generated in the user directory by default. |
| -p/--pid | PID/PID1,PID2/ALL | PID of a process to be collected. Separate multiple PIDs with commas (,). By default, all processes are collected. If both the -p and -c parameters are used, the processes with the specified PIDs are collected preferentially. |
| --long-name | - | Displays detailed function and module information. If this parameter is not set, function and module information is displayed in abbreviated form. |
| --dwarf | - | Generates C/C++ source code or assembly code files. |
| -t/--top | - | Number of data records to be displayed in the report. The minimum value is 1. |
| --package | - | Imports data to the database and generates compressed packages in the specified output path. |
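
The -i constraints in the table (at least 1 second, no longer than the -d duration) can be checked before a long collection is started. The following is a minimal sketch under that assumption, not part of the tool: build_hotspot_cmd is a hypothetical helper name, and it only prints the command line rather than running it.

```shell
# Hypothetical helper (not part of devkit): validate -i against -d as the
# parameter table describes, then print the command line instead of running it.
build_hotspot_cmd() {
    duration=$1
    interval=$2
    cpus=$3
    # -i must be at least 1 second and must not exceed the -d duration.
    if [ "$interval" -lt 1 ] || [ "$interval" -gt "$duration" ]; then
        echo "interval must be between 1 and $duration seconds" >&2
        return 1
    fi
    echo "devkit tuner hotspot -c $cpus -d $duration -i $interval"
}

# Cores 0-2, 3-second collection, 1-second subreport interval.
build_hotspot_cmd 3 1 0-2
```

Calling the helper with an interval of 5 and a duration of 3 returns a non-zero status and prints the error to stderr, so a wrapper script can stop before invoking the tool.
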
Example
devkit tuner hotspot -c 0-127 -d 3 -i 1 -o /home/hotspot_cpu -g --package --long-name
Command output:
Hotspot Summary Report-1                                  Time:2024/05/22 17:10:22
================================================================================
────────────────────────────────────────────────────────────────────────
Function                                                       Cycles       Module           Cycles(%)
────────────────────────────────────────────────────────────────────────
arch_cpu_idle                                                  109,478,705  [kernel]         14.22
__do_softirq                                                   81,866,664   [kernel]         10.63
UNKNOWN                                                        30,688,513   libc.so.6        3.99
rcu_report_qs_rdp                                              28,416,897   [kernel]         3.69
UNKNOWN                                                        27,848,831   libnss_sss.so.2  3.62
finish_task_switch                                             27,210,492   [kernel]         3.53
... ...

Hotspot Summary Report-2                                  Time:2024/05/22 17:10:23
================================================================================
────────────────────────────────────────────────────────────────────────
Function                                                       Cycles       Module           Cycles(%)
────────────────────────────────────────────────────────────────────────
arch_cpu_idle                                                  93,959,194   [kernel]         12.94
std::pair<std::_Rb_tree_itera***long const, elf::sym> const&)  85,861,067   libsym.so        11.82
malloc                                                         46,479,307   libc.so.6        6.40
std::_Sp_counted_base<(__gnu_***_Lock_policy)2>::_M_release()  38,285,738   libtuner.so      5.27
copy_page                                                      33,935,533   [kernel]         4.67
std::_Rb_tree<unsigned long, ***ned long const, elf::sym> >*)  25,090,288   libsym.so        3.45
std::_Rb_tree<unsigned long, ***ned long const, elf::sym> >*)  24,942,214   libsym.so        3.43
finish_task_switch                                             23,804,264   [kernel]         3.28
arch_cpu_idle                                                  109,478,705  [kernel]
... ...

Hotspot Summary Report-ALL                                Time:2024/05/22 17:10:22
================================================================================
────────────────────────────────────────────────────────────────────────
Function                                                       Cycles       Module           Cycles(%)
────────────────────────────────────────────────────────────────────────
arch_cpu_idle                                                  203,437,899  [kernel]         13.60
std::pair<std::_Rb_tree_itera***long const, elf::sym> const&)  85,861,067   libsym.so        5.74
__do_softirq                                                   82,521,516   [kernel]         5.51
malloc                                                         56,735,095   libc.so.6        3.79
finish_task_switch                                             51,014,756   [kernel]         3.41
UNKNOWN                                                        47,753,528   libc.so.6        3.19
UNKNOWN                                                        38,325,780   libnss_sss.so.2  2.56
std::_Sp_counted_base<(__gnu_***_Lock_policy)2>::_M_release()  38,285,738   libtuner.so      2.56
filemap_map_pages                                              34,252,979   [kernel]         2.29
copy_page                                                      33,935,533   [kernel]         2.27
... ...
copyout                                                        455,387      [kernel]         0.03
el0_da                                                         455,387      [kernel]         0.03
rcu_gp_init                                                    381,264      [kernel]         0.03
────────────────────────────────────────────────────────────────────────
2402 milliseconds time elapsed
The callstack log /home/callstack-20240606-160259.log is generated successfully.
The flamegraph html /home/Flamegraph-20240606-160259.html is generated successfully.
The report /home/hotspot_cpu1.tar is generated successfully.
To view summary report. you can run: devkit report -i /home/hotspot_cpu.tar
To view detail report. you can import the report to the WebUI or IDE to view details.
By default, the flame graph HTML file is generated in the user directory. You can view the file using your browser.
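
When the summary lists many functions, it can help to see how the Cycles(%) column adds up per module. The following is a minimal sketch, assuming the text of a single summary report (for example, only the Report-ALL section) has been saved and is fed on standard input in the column layout shown in the command output. module_share is a hypothetical helper, not a devkit command. Because C++ function names may contain spaces, the script reads the columns from the right-hand end of each row.

```shell
# Hypothetical helper (not part of devkit): sum Cycles(%) per module from the
# text of one summary report read on stdin.
module_share() {
    awk '
        # A data row ends with: <cycles> <module> <percent>.
        # Header, separator, and truncated rows do not match and are skipped.
        NF >= 4 && $NF ~ /^[0-9]+\.[0-9]+$/ && $(NF-2) ~ /^[0-9,]+$/ {
            share[$(NF-1)] += $NF
        }
        END {
            for (m in share) printf "%s %.2f\n", m, share[m]
        }
    ' | sort -k2,2nr
}

# Sample rows taken from Report-1 in the command output.
module_share <<'EOF'
arch_cpu_idle        109,478,705  [kernel]         14.22
__do_softirq         81,866,664   [kernel]         10.63
UNKNOWN              30,688,513   libc.so.6        3.99
UNKNOWN              27,848,831   libnss_sss.so.2  3.62
EOF
```

Feeding several report sections at once (Report-1, Report-2, and Report-ALL together) would count the same samples more than once, so the input should be limited to one section.
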