Hotspot Function Analysis
Command Function
Analyzes C/C++ program code, identifies performance bottlenecks, and provides details about the top hotspot functions and call stacks. The tool also displays the function call relationship in flame graphs and provides the tuning path.
Syntax
devkit tuner hotspot [-h] [-c {n | n,m | n-m}] [-d <sec>] [-D <sec>] [-f n] [-l {0, 1, 2, 3}] [-i <sec>] [-r {user, kernel, all}] [-o] [-s] [-p {PID1 | PID1,PID2 | ALL}] [-g] [--package] [--long-name] [--dwarf] [workload workload...]
[workload workload...] specifies an application to collect data from. Replace [workload workload...] in the command with the application path followed by the application's parameters.
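To make the placement of the workload concrete, the following Python sketch assembles an argument list in which the workload comes last. The helper name build_hotspot_command, the application path /opt/app/bin/server, and its argument config.yaml are hypothetical and used only for illustration; the -d and -g options are taken from the syntax above.

```python
import shlex


def build_hotspot_command(workload, duration=None, call_stack=False):
    """Assemble an argv list for `devkit tuner hotspot` (illustrative helper).

    `workload` is the application path followed by its own parameters.
    """
    argv = ["devkit", "tuner", "hotspot"]
    if duration is not None:
        argv += ["-d", str(duration)]  # -d/--duration, in seconds
    if call_stack:
        argv.append("-g")  # -g displays call stack information
    # The workload (application path and parameters) goes last.
    argv += list(workload)
    return argv


# Hypothetical application path and parameter, for illustration only.
cmd = build_hotspot_command(["/opt/app/bin/server", "config.yaml"],
                            duration=10, call_stack=True)
print(shlex.join(cmd))
```

On a machine where the tool is installed, a list like cmd could be handed to subprocess.run; the sketch only prints the command so that it can be inspected first.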
Parameter Description
| Parameter | Option | Description |
|---|---|---|
| -h/--help | - | Obtains help information. |
| -c/--cpu | - | CPU cores to collect data from. The value can be a single core (for example, 0), a comma-separated list (for example, 0,1,2), or a range (for example, 0-2). |
| -d/--duration | - | Collection duration, in seconds. The minimum value is 1 second. By default, collection never ends. Press Ctrl+\ to cancel the task, or press Ctrl+C to stop the collection and start analysis. |
| -D/--delay | - | Collection delay, in seconds. The default value is 0. The delay must be less than the collection duration. |
| -i/--interval | - | Collection interval, in seconds, which is the time span covered by each subreport. The minimum value is 1 second and the maximum value is the collection duration. The default value is the collection duration. If this parameter is not set, no subreports are generated. |
| -l/--log-level | 0/1/2/3 | Log level, which defaults to 1. |
| -f/--frequency | - | Sampling frequency, which defaults to 200 times per second. The minimum value is 1 time per second. |
| -e/--event | - | Events to be collected. Run the devkit tuner hotspot list command to see which events can be collected. |
| -o/--output | - | Report file name. Reports are generated in the current directory by default. |
| -r/--collection-range | user/kernel/all | Collection mode, which defaults to all. |
| -s/--src-dir | - | Source code working directory, which is used to search for and associate source code. You can import the task to the web client for easier viewing. |
| -g | - | Displays call stack information. If the -g option is enabled, a flame graph HTML file is generated in the user directory by default. |
| -p/--pid | PID/PID1,PID2/ALL | IDs of the processes to be collected. Separate multiple PIDs with commas (,). By default, all processes are collected. If both the -p and -c parameters are used, the processes with the specified PIDs are preferentially collected. |
| --long-name | - | Displays detailed function and module information. If this parameter is not set, function and module information is displayed in simplified form. |
| --dwarf | - | Indicates whether to generate C/C++ source code or assembly code files. |
| -t/--top | - | Number of data records to be displayed in the report. The minimum value is 1. |
| --package | - | Indicates whether to import data to the database and generate compressed packages in the specified output path. |
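
The value rules for -c, -d, -D, and -i can be checked before a collection is started. The Python sketch below is one possible pre-check, not part of the tool: the function names parse_cpu_spec and check_timing are hypothetical, only the three documented -c forms are handled, a negative delay is assumed to be invalid, and the timing check covers only the case where -d is given.

```python
def parse_cpu_spec(spec):
    """Expand a -c value ("0", "0,1,2" or "0-2") into a list of core numbers."""
    if "-" in spec:
        start, end = (int(part) for part in spec.split("-"))
        if start > end:
            raise ValueError(f"invalid CPU range: {spec}")
        return list(range(start, end + 1))
    return [int(part) for part in spec.split(",")]


def check_timing(duration, delay=0, interval=None):
    """Check -d, -D and -i against the limits stated in the table.

    Returns the effective interval, which is the duration when -i is not set.
    """
    if duration < 1:
        raise ValueError("duration must be at least 1 second")
    # Assumption: the delay defaults to 0, so negative values are rejected.
    if not 0 <= delay < duration:
        raise ValueError("delay must be less than the duration")
    if interval is None:
        return duration
    if not 1 <= interval <= duration:
        raise ValueError("interval must be between 1 and the duration")
    return interval


print(parse_cpu_spec("0-2"))
print(check_timing(duration=3, interval=1))
```

Rejecting bad values up front avoids starting a collection that the tool would refuse or that would produce no subreports.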
Example
devkit tuner hotspot -c 0-127 -d 3 -i 1 -o /home/hotspot_cpu -g --package --long-name
Command output:
Hotspot Summary Report-1 Time:2024/07/19 10:12:32
================================================================================
────────────────────────────────────────────────────────────────────
Function cycles Module cycles(%)
────────────────────────────────────────────────────────────────────
__do_softirq 108,999,839 [kernel] 57.88
arch_cpu_idle 55,335,310 [kernel] 29.38
avc_lookup 8,693,198 [kernel] 4.62
0xfd950 3,706,419 /home/devkit/libsqlite3/libsqlite3.so.0.8.6 1.97
dput 3,706,419 [kernel] 1.97
__set_current_blocked 3,041,886 [kernel] 1.62
smp_call_function_single 2,763,855 [kernel] 1.47
__clock_gettime 1,135,231 /usr/lib64/libc.so.6 0.60
0x7eab4 879,665 /usr/lib64/libc.so.6 0.47
generic_exec_single 67,298 [kernel] 0.04
────────────────────────────────────────────────────────────────────
Hotspot Summary Report-2 Time:2024/07/19 10:12:33
================================================================================
────────────────────────────────────────────────────────────────────
Function cycles Module cycles(%)
─────────────────────────────────────────────────────────────────────
std::pair<std::_Rb_tree_iterator<std::pair<unsigned long cons 81,259,412 /root/DevKit-CLI-24.0.RC3-Linux-Kunpeng/tuner/lib/libsym.so 14.11
t, elf::sym> >, bool> std::_Rb_tree<unsigned long, std::pair<
unsigned long const, elf::sym>, std::_Select1st<std::pair<uns
igned long const, elf::sym> >, std::less<unsigned long>, std:
:allocator<std::pair<unsigned long const, elf::sym> > >::_M_i
nsert_unique<std::pair<unsigned long const, elf::sym> const&>
(std::pair<unsigned long const, elf::sym> const&)
malloc 76,662,049 /usr/lib64/libc.so.6 13.32
KUNPENG_SYM::SymbolResolve::RecordElf(char const*) 38,279,588 /root/DevKit-CLI-24.0.RC3-Linux-Kunpeng/tuner/lib/libsym.so 6.65
std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release 25,561,443 /root/DevKit-CLI-24.0.RC3-Linux-Kunpeng/tuner/libtuner.so 4.44
()
...
...
...
rt6_probe 735,855 [kernel] 0.13
flush_smp_call_function_from_idle 530,160 [kernel] 0.09
G1YoungRemSetSamplingClosure::do_heap_region(HeapRegion*) 406,124 /home/bisheng-jdk17/lib/server/libjvm.so 0.07
─────────────────────────────────────────────────────────────────────
Hotspot Summary Report-3 Time:2024/07/19 10:12:34
================================================================================
────────────────────────────────────────────────────────────────────
Function cycles Module cycles(%)
─────────────────────────────────────────────────────────────────────────────────────
0x8e09c14 132,887,377 /home/bisheng-jdk17/lib/libzip.so 21.48
std::pair<std::_Rb_tree_iterator<std::pair<unsigned long cons 57,971,051 /root/DevKit-CLI-24.0.RC3-Linux-Kunpeng/tuner/lib/libsym.so 9.37
t, elf::sym> >, bool> std::_Rb_tree<unsigned long, std::pair<
unsigned long const, elf::sym>, std::_Select1st<std::pair<uns
igned long const, elf::sym> >, std::less<unsigned long>, std:
:allocator<std::pair<unsigned long const, elf::sym> > >::_M_i
nsert_unique<std::pair<unsigned long const, elf::sym> const&>
(std::pair<unsigned long const, elf::sym> const&)
0x8e09a4c 33,494,056 /home/bisheng-jdk17/lib/libzip.so 5.41
0x8e09a84 31,358,880 /home/bisheng-jdk17/lib/libzip.so 5.07
arch_cpu_idle 21,190,896 [kernel] 3.43
...
...
...
0xffff800008f78d80 781,761 [kernel] 0.13
ldsem_down_read_trylock 738,684 [kernel] 0.12
─────────────────────────────────────────────────────────────────────────────────────
Hotspot Summary Report-ALL Time:2024/07/19 10:12:32
================================================================================
─────────────────────────────────────────────────────────────────────────────────────
Function cycles Module cycles(%)
─────────────────────────────────────────────────────────────────────────────────────
std::pair<std::_Rb_tree_iterator<std::pair<unsigned long cons 139,230,463 /root/DevKit-CLI-24.0.RC3-Linux-Kunpeng/tuner/lib/libsym.so 10.07
t, elf::sym> >, bool> std::_Rb_tree<unsigned long, std::pair<
unsigned long const, elf::sym>, std::_Select1st<std::pair<uns
igned long const, elf::sym> >, std::less<unsigned long>, std:
:allocator<std::pair<unsigned long const, elf::sym> > >::_M_i
nsert_unique<std::pair<unsigned long const, elf::sym> const&>
(std::pair<unsigned long const, elf::sym> const&)
...
...
...
G1YoungRemSetSamplingClosure::do_heap_region(HeapRegion*) 406,124 /home/bisheng-jdk17/lib/server/libjvm.so 0.03
─────────────────────────────────────────────────────────────────────────────────────
3348 milliseconds time elapsed
Callstack is saved to /home/callstack-20240719-101232.log
Flamegraph is saved to /home/Flamegraph-20240719-101232.html
The report /home/hotspot_cpu1.tar is generated successfully.
To view summary report. you can run: devkit report -i /home/hotspot_cpu1.tar
To view detail report. you can import the report to the WebUI or IDE to view details.
By default, the flame graph HTML file is generated in the user directory. You can view the file using your browser.
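
The summary rows in the command output can also be post-processed with a short script. The Python sketch below assumes the row layout seen in the sample output (function, cycles, module, cycles percentage, separated by whitespace) and that module paths contain no spaces; it is not an official report format. The name parse_summary_rows is hypothetical, and separators, headers, and the wrapped fragments of long function names are skipped.

```python
import re

# One summary row: function, cycles (with thousands separators), module, percent.
ROW = re.compile(
    r"^(?P<function>.+?)\s+(?P<cycles>[\d,]+)\s+(?P<module>\S+)\s+(?P<percent>\d+\.\d+)$"
)


def parse_summary_rows(text):
    """Return one dict per summary row found in the console text."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match is None:
            continue  # separators, headers, "..." and wrapped name fragments
        rows.append({
            "function": match["function"],
            "cycles": int(match["cycles"].replace(",", "")),
            "module": match["module"],
            "percent": float(match["percent"]),
        })
    return rows


# Rows copied from Hotspot Summary Report-1 in the example output.
sample = """\
Function cycles Module cycles(%)
__do_softirq 108,999,839 [kernel] 57.88
arch_cpu_idle 55,335,310 [kernel] 29.38
0xfd950 3,706,419 /home/devkit/libsqlite3/libsqlite3.so.0.8.6 1.97
"""
rows = parse_summary_rows(sample)
kernel_share = sum(r["percent"] for r in rows if r["module"] == "[kernel]")
print(rows[0]["function"], rows[0]["cycles"])
print(round(kernel_share, 2))
```

Matching from the numeric columns keeps function names that contain spaces, such as C++ signatures, in a single field.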