Miss Event Analysis
Command Function
Uses the Statistical Profiling Extension (SPE) to analyze miss events such as LLC Miss, TLB Miss, Remote Access, and Long Latency Load. Based on the results, you can modify your program to reduce miss events and improve its performance.
Syntax
    devkit tuner miss [-h] [-c {n | n,m | n-m}] [-d <sec>] [-P n] [-D <sec>] [-l {0, 1, 2, 3}] [-m {1, 2, 3, 4}] [-L n] [-i <sec>] [-r {user, kernel, all}] [-o] [-s] [-p {PID1 | PID1,PID2 | ALL}] [--package] [--long-name] [--dwarf] [workload workload...]
The tool can also collect data for a specified application. To do so, replace [workload workload...] in the command with the application path followed by the application's arguments.
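The sketch below shows one way a script might assemble such a command line before launching it. The helper name `build_miss_command` and its keyword arguments are hypothetical and not part of the tool; only flags that appear in the syntax above are used, and nothing is executed.

```python
import shlex


def build_miss_command(duration=None, cpus=None, metric=None, pids=None,
                       package=False, output=None, workload=None):
    """Assemble a `devkit tuner miss` argument list (hypothetical helper)."""
    cmd = ["devkit", "tuner", "miss"]
    if cpus is not None:
        cmd += ["-c", cpus]
    if duration is not None:
        cmd += ["-d", str(duration)]
    if metric is not None:
        cmd += ["-m", str(metric)]
    if pids is not None:
        cmd += ["-p", pids]
    if output is not None:
        # The parameter description states that -o requires --package.
        package = True
        cmd += ["-o", output]
    if package:
        cmd.append("--package")
    if workload:
        # The application path and its arguments go last.
        cmd += list(workload)
    return cmd


cmd = build_miss_command(duration=5, package=True,
                         workload=["/opt/testdemo/cache_miss"])
print(shlex.join(cmd))
```

Building the command as a list rather than a single string keeps the workload path and its arguments intact if the list is later passed to a process launcher.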
Parameter Description
| Parameter | Option | Description |
|---|---|---|
| -h/--help | - | Obtains help information. |
| -c/--cpu | - | CPU cores to collect data from. The value can be a single core (0), a comma-separated list (0,1,2), or a range (0-2). |
| -d/--duration | - | Collection duration, in seconds. The minimum value is 1 second. By default, collection never ends. Press Ctrl+\ to cancel the task, or press Ctrl+C to stop the collection and start analysis. |
| -P/--period | - | Sampling period, in number of instructions, which defaults to 8092. The value ranges from 1024 to 4,294,967,295. |
| -D/--delay | - | Collection delay, which defaults to 0 seconds and must be less than the collection duration. |
| -i/--interval | - | Collection interval, in seconds, that is, the time span covered by each subreport. The minimum value is 1 second and the maximum value cannot exceed the collection duration. The default value is the collection duration. If this parameter is not set, no subreports are generated. |
| -l/--log-level | 0/1/2/3 | Log level, which defaults to 1. |
| -m/--metric | 1/2/3/4 | Metric (miss event type) to collect, which defaults to 1 (LLC Miss). |
| -L/--latency | - | Minimum latency, in clock cycles, which defaults to 0. This parameter can be set when collecting Long Latency Load data. |
| -r/--collection-range | user/kernel/all | Collection range, which defaults to all. |
| -o/--output | - | Report package name and output path. If you enter a name only, the report package is generated in the current directory. This option must be used together with --package. |
| -s/--src-dir | - | C/C++ source code working directory, which is used to search for and associate source code. You can import the task to the web client for easier viewing. |
| -p/--pid | PID/PID1,PID2/ALL | ID of a process to be collected. Separate multiple PIDs with commas (,). By default, all processes are collected. If both the -p and -c parameters are used, the processes with the specified PIDs are preferentially collected. |
| --package | - | Indicates whether to generate a report data package. If you do not set the package name or path, the miss-timestamp.tar package is generated in the current directory by default. |
| --long-name | - | Indicates whether to display detailed function and module information. If this parameter is not set, function and module information is displayed in abbreviated form. |
| -t/--top | - | Number of data records to display in the report, which defaults to 10. The minimum value is 1. |
| --dwarf | - | Indicates whether to generate C/C++ source code or assembly code files. |
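The value constraints in the table can be checked before a collection is launched. The following is a minimal sketch, not part of the tool: `parse_cpu_spec` and `check_timing` are hypothetical helper names, and accepting a mixed value such as `0,2-4` is an assumption, since the table lists only the three separate forms.

```python
def parse_cpu_spec(spec):
    """Expand a -c value such as "0", "0,1,2" or "0-2" into a sorted core list."""
    cores = set()
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            start, end = (int(x) for x in part.split("-", 1))
            if start > end:
                raise ValueError(f"invalid core range: {part}")
            cores.update(range(start, end + 1))
        else:
            cores.add(int(part))
    return sorted(cores)


def check_timing(duration=None, delay=0, interval=None, period=None):
    """Return a list of violations of the constraints stated in the table."""
    problems = []
    if duration is not None and duration < 1:
        problems.append("duration must be at least 1 second")
    if duration is not None and delay >= duration:
        problems.append("delay must be less than the duration")
    if interval is not None:
        if interval < 1:
            problems.append("interval must be at least 1 second")
        if duration is not None and interval > duration:
            problems.append("interval cannot exceed the duration")
    if period is not None and not 1024 <= period <= 4_294_967_295:
        problems.append("period must be between 1024 and 4,294,967,295")
    return problems


print(parse_cpu_spec("0-2"))
print(check_timing(duration=5, delay=5, interval=10))
```

Returning a list of problems instead of raising on the first one lets a caller report every invalid parameter at once.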
Example
- Collecting system data:
    devkit tuner miss -c 0-127 -d 5 -o /home/miss_report -m 1 --package
In this command, -c 0-127 collects data from CPU cores 0 to 127, and -d 5 sets the collection duration to 5 seconds. The -o /home/miss_report and --package parameters generate a report data package named miss_report in the specified path. The -m 1 parameter collects LLC Miss events.
Command output:
    Miss Summary Report-all                                 Time:2024/05/22 17:56:33
    ================================================================================
    ──────────────────────────────────────────────────────────────────
    Function                                             Module                                       LLC Miss
    ──────────────────────────────────────────────────────────────────
    UNKNOWN                                              /home/devkit/lib/libpython3.9.so.1.0         14,196,736 (11.53%)
    _PyEval_EvalFrameDefault                             /home/devkit/lib/libpython3.9.so.1.0         11,321,344 (9.20%)
    UNKNOWN                                              /usr/bin/devkit/tuner/lib/libsym.so          4,702,208 (3.82%)
    _perf_ioctl                                          [kernel]                                     4,587,520 (3.73%)
    UNKNOWN                                              /usr/lib64/libc-2.28.so                      4,046,848 (3.29%)
    std::pair<std::_Rb_tree_***const, elf::sym> const&)  /usr/bin/devkit/tuner/lib/libsym.so          3,694,592 (3.00%)
    UNKNOWN                                              /home/devkit/libsqlite3/libsqlite3.so.0.8.6  3,588,096 (2.91%)
    seq_put_hex_ll                                       [kernel]                                     3,080,192 (2.50%)
    _nohz_idle_balance                                   [kernel]                                     1,941,504 (1.58%)
    __audit_syscall_exit                                 [kernel]                                     1,933,312 (1.57%)
    ──────────────────────────────────────────────────────────────────
    5509 milliseconds time elapsed
    The report /home/miss_report.tar is generated successfully.
    To view summary report. you can run: devkit report -i /home/miss_report.tar
    To view detail report. you can import the report to the WebUI or IDE to view details.
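For scripted post-processing, the summary rows can be pulled out of the captured text. The sketch below is an illustration only: the function name `parse_miss_rows` is hypothetical, and the row pattern is inferred from the sample output above rather than from a documented format. It assumes module names contain no whitespace, while function names may.

```python
import re

# function (may contain spaces), module (no spaces), count, percentage
ROW = re.compile(
    r"^\s*(?P<function>.+?)\s+(?P<module>\S+)\s+"
    r"(?P<count>[\d,]+)\s+\((?P<percent>[\d.]+)%\)\s*$"
)


def parse_miss_rows(report_text):
    """Return one dict per data row; headers, rules and footers are skipped."""
    rows = []
    for line in report_text.splitlines():
        m = ROW.match(line)
        if m:
            rows.append({
                "function": m["function"],
                "module": m["module"],
                "count": int(m["count"].replace(",", "")),
                "percent": float(m["percent"]),
            })
    return rows


sample = """\
Function                   Module                                  LLC Miss
_PyEval_EvalFrameDefault   /home/devkit/lib/libpython3.9.so.1.0    11,321,344 (9.20%)
_perf_ioctl                [kernel]                                4,587,520 (3.73%)
5509 milliseconds time elapsed
"""
for row in parse_miss_rows(sample):
    print(row["function"], row["module"], row["count"], row["percent"])
```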
- Collecting application data:
    devkit tuner miss -d 5 --package /opt/testdemo/cache_miss
The preceding command collects data for the /opt/testdemo/cache_miss application. The -d 5 parameter sets the collection duration to 5 seconds. The --package parameter generates a report data package in the tool directory. By default, the package name consists of miss followed by a timestamp.
Command output:
    Miss Summary Report-all                                 Time:2024/06/11 11:16:15
    ================================================================================
    ──────────────────────────────────────────────────────────────────
    Function                   Module                    LLC Miss
    ──────────────────────────────────────────────────────────────────
    main                       /opt/testdemo/cache_miss  74,964,992 (60.74%)
    copy_page                  [kernel]                  33,554,432 (27.19%)
    change_protection_range    [kernel]                  6,815,744 (5.52%)
    UNKNOWN                    [kernel]                  3,784,704 (3.07%)
    handle_percpu_devid_irq    [kernel]                  2,490,368 (2.02%)
    propagate_protected_usage  [kernel]                  917,504 (0.74%)
    page_counter_charge        [kernel]                  720,896 (0.58%)
    queued_spin_lock_slowpath  [kernel]                  32,768 (0.03%)
    account_system_index_time  [kernel]                  24,576 (0.02%)
    trigger_load_balance       [kernel]                  24,576 (0.02%)
    ──────────────────────────────────────────────────────────────────
    6222 milliseconds time elapsed
    If *** is displayed in Function or Module, use --long-name to show full name.
    The report /usr/bin/devkit/miss-20240611-111608.tar is generated successfully.
    To view summary report. you can run: devkit report -i /usr/bin/devkit/miss-20240611-111608.tar
    To view detail report. you can import the report to the WebUI or IDE to view details.
- Collecting based on PIDs:
    devkit tuner miss -d 5 --package -p 414192
The -p 414192 parameter collects information about the process whose ID is 414192.
Command output:
    Miss Summary Report-all                                 Time:2024/06/11 11:18:28
    ================================================================================
    ──────────────────────────────────────────────────────────────────
    Function                   Module                            LLC Miss
    ──────────────────────────────────────────────────────────────────
    UNKNOWN                    /usr/lib64/libpthread-2.28.so     32,505,856 (99.42%)
    queued_spin_lock_slowpath  [kernel]                          57,344 (0.18%)
    available_idle_cpu         [kernel]                          16,384 (0.05%)
    cpu_load_update_active     [kernel]                          16,384 (0.05%)
    futex_wait                 [kernel]                          16,384 (0.05%)
    get_futex_value_locked     [kernel]                          16,384 (0.05%)
    trigger_load_balance       [kernel]                          16,384 (0.05%)
    __list_del_entry_valid     [kernel]                          8,192 (0.03%)
    fun2                       /opt/testdemo/pthread_mutex_long  8,192 (0.03%)
    futex_wake                 [kernel]                          8,192 (0.03%)
    ──────────────────────────────────────────────────────────────────
    5976 milliseconds time elapsed
    If *** is displayed in Function or Module, use --long-name to show full name.
    The report /usr/bin/devkit/miss-20240611-111822.tar is generated successfully.
    To view summary report. you can run: devkit report -i /usr/bin/devkit/miss-20240611-111822.tar
    To view detail report. you can import the report to the WebUI or IDE to view details.
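When many rows share a module, grouping the counts by module can make it easier to see where misses concentrate. The sketch below does this for the rows shown in the output above. The helper name `misses_by_module` is hypothetical, and because only the displayed top records are included, the sums are a partial view rather than totals for the whole collection.

```python
from collections import defaultdict

# (function, module, LLC miss count) rows copied from the report above.
rows = [
    ("UNKNOWN", "/usr/lib64/libpthread-2.28.so", 32_505_856),
    ("queued_spin_lock_slowpath", "[kernel]", 57_344),
    ("available_idle_cpu", "[kernel]", 16_384),
    ("cpu_load_update_active", "[kernel]", 16_384),
    ("futex_wait", "[kernel]", 16_384),
    ("get_futex_value_locked", "[kernel]", 16_384),
    ("trigger_load_balance", "[kernel]", 16_384),
    ("__list_del_entry_valid", "[kernel]", 8_192),
    ("fun2", "/opt/testdemo/pthread_mutex_long", 8_192),
    ("futex_wake", "[kernel]", 8_192),
]


def misses_by_module(rows):
    """Sum miss counts per module, largest first."""
    totals = defaultdict(int)
    for _function, module, count in rows:
        totals[module] += count
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


for module, count in misses_by_module(rows):
    print(f"{module:40} {count:>12,}")
```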
- Viewing the report:
    devkit report -i /usr/bin/devkit/miss-20240611-111822.tar
Command output:
    ──────────────────────────────────────────────────────────────────
    Function                   Module                            LLC Miss
    ──────────────────────────────────────────────────────────────────
    UNKNOWN                    /usr/lib64/libpthread-2.28.so     32,505,856 (99.42%)
    queued_spin_lock_slowpath  [kernel]                          57,344 (0.18%)
    available_idle_cpu         [kernel]                          16,384 (0.05%)
    cpu_load_update_active     [kernel]                          16,384 (0.05%)
    futex_wait                 [kernel]                          16,384 (0.05%)
    get_futex_value_locked     [kernel]                          16,384 (0.05%)
    trigger_load_balance       [kernel]                          16,384 (0.05%)
    __list_del_entry_valid     [kernel]                          8,192 (0.03%)
    fun2                       /opt/testdemo/pthread_mutex_long  8,192 (0.03%)
    futex_wake                 [kernel]                          8,192 (0.03%)
    ──────────────────────────────────────────────────────────────────
    5976 milliseconds time elapsed