Examples of Using the System Methodology Profiler

The following example assumes that the tool was obtained as a compressed package and that you have switched to the tool directory. It shows how to use the tool on a Kunpeng 920 server to collect and analyze system performance data and to detect metric differences.

Figure 1 Overall process
  1. Run the collection command.
    ./ksys collect -d 10 -i 1 -o /home/test/
    
    • -d 10 sets the collection duration to 10 seconds, -i 1 sets the collection interval to 1 second, and -o /home/test/ sets the directory in which the JSON file is saved.
    • After the collection is complete, the summary data is printed but not saved to the JSON file.

    Command output:

    Hotspot data collection is disabled. Refer to /home/ksys-xxx-Linux-aarch64/config.yaml for details
    Starting to collect data. You can press Ctrl+C to stop the task.
    Starting to parse data. You can press Ctrl+C stop the task.
    Progress: 2/2 | Sub-progress(spe data): 10/10.
    Starting to parse the data. This may take some time. You can press Ctrl+C to forcibly stop the task.
    ======================================================================CPU Metrics======================================================================
    Common Microarchitecture Metrics Summary Data (System wide)
    +------+--------------+------+------+----------+----------+----------+----------+-----------+-----------+---------+
    | IPC  | PATH LENGTH  | MPKI | BPKI | L1D MPKI | L1I MPKI | L2D MPKI | L2I MPKI | DTLB MPKI | ITLB MPKI | CPU-NUM |
    +------+--------------+------+------+----------+----------+----------+----------+-----------+-----------+---------+
    | 0.33 | 101152422747 | 3.62 | 1.22 |     3.66 |     5.88 |     2.59 |     0.43 |       4.7 |      0.27 |     256 |
    +------+--------------+------+------+----------+----------+----------+----------+-----------+-----------+---------+
    
    Topdown Summary Data (System wide)
    +----------------------------+-------+
    |           Metric           | Value |
    +----------------------------+-------+
    | Retiring(%)                |  5.83 |
    | Frontend Bound(%)          | 11.24 |
    |   Fetch Latency Bound(%)   | 10.86 |
    |   Fetch Bandwidth Bound(%) |  0.38 |
    | Bad Speculation(%)         |  0.73 |
    |   Branch Mispredicts(%)    |  0.58 |
    |   Machine Clears(%)        |  0.15 |
    | Backend Bound(%)           |  82.2 |
    |   Core Bound(%)            | 39.48 |
    |   Memory Bound(%)          | 42.72 |
    | CPU-NUM                    |   256 |
    +----------------------------+-------+
    
    OS Metrics Summary Data (System wide)
    +------------------+------------+-------------+---------+
    | context-switches | migrations | page-faults | CPU-NUM |
    +------------------+------------+-------------+---------+
    |           309603 |       2277 |      200448 |     256 |
    +------------------+------------+-------------+---------+
    
    INSTRUCTION Summary Data (System wide)
    +----------------------------------+-------+
    |              Metric              | Value |
    +----------------------------------+-------+
    | Memory(%)                        | 26.94 |
    |   Load(%)                        | 23.41 |
    |   Store(%)                       |  3.53 |
    | Integer(%)                       | 49.53 |
    | Floating Point(%)                |  0.02 |
    | Advanced SIMD(%)                 |  0.13 |
    | Crypto(%)                        |   0.0 |
    | Branches(%)                      | 23.24 |
    |   Immediate(%)                   | 21.27 |
    |   Return(%)                      |  0.84 |
    |   Indirect(%)                    |  1.14 |
    | Barriers(%)                      |  0.08 |
    |   Instruction Synchronization(%) |  0.01 |
    |   Data Synchronization(%)        |   0.0 |
    |   Data Memory(%)                 |  0.06 |
    | Not Retired(%)                   |  0.06 |
    +----------------------------------+-------+
    
    Load_avg Summary Data (System wide)
    +--------------+--------------+---------------+
    | recent 1 min | recent 5 min | recent 15 min |
    +--------------+--------------+---------------+
    |         0.56 |         0.44 |          0.35 |
    +--------------+--------------+---------------+
    
    Softirqs Summary Data (System wide)
    +----------+----------+---------+---------+---------+
    | NET_TX/s | NET_RX/s | BLOCK/s | SCHED/s | CPU-NUM |
    +----------+----------+---------+---------+---------+
    |        0 |        0 |       0 |       4 |     256 |
    +----------+----------+---------+---------+---------+
    
    CPU_stat Summary Data (System wide)
    +----------------+--------------+-------------------+
    | ctx_switches/s | interrupts/s | soft_interrupts/s |
    +----------------+--------------+-------------------+
    |        24710.0 |      58114.0 |            2207.0 |
    +----------------+--------------+-------------------+
    
    CPU_freq Summary Data (System wide)
    +--------------+----------+----------+
    | current(MHz) | min(MHz) | max(MHz) |
    +--------------+----------+----------+
    |      2899.65 |    400.0 |   2900.0 |
    +--------------+----------+----------+
    
    CPU_percent Summary Data (System wide)
    +-------+-------+---------+-------+---------+------+----------+--------+--------+-------------+
    | %user | %nice | %system | %idle | %iowait | %irq | %softirq | %steal | %guest | %guest_nice |
    +-------+-------+---------+-------+---------+------+----------+--------+--------+-------------+
    |  0.12 |   0.0 |    0.16 | 98.74 |    0.01 | 0.13 |     0.01 |    0.0 |    0.0 |         0.0 |
    +-------+-------+---------+-------+---------+------+----------+--------+--------+-------------+
    
    =================================================================Memory Access Metrics=================================================================
    DDRC Summary Data (System wide)
    +---------------------------------+--------+-----------------+-----------------+
    |              DEVICE             |  NUMA  | ddrc_rd_bw MB/s | ddrc_wr_bw MB/s |
    +---------------------------------+--------+-----------------+-----------------+
    | DDRC DEVICE hisi_sccl3_ddrc0_0  | Node 0 |           22.28 |            9.12 |
    | DDRC DEVICE hisi_sccl3_ddrc0_1  | Node 0 |           22.67 |            9.46 |
    | DDRC DEVICE hisi_sccl3_ddrc2_0  | Node 0 |           22.17 |             8.7 |
    | DDRC DEVICE hisi_sccl3_ddrc2_1  | Node 0 |           22.59 |            9.55 |
    | DDRC DEVICE hisi_sccl3_ddrc3_0  | Node 0 |           22.67 |            9.22 |
    | DDRC DEVICE hisi_sccl3_ddrc3_1  | Node 0 |           22.53 |            9.48 |
    | DDRC DEVICE hisi_sccl3_ddrc5_0  | Node 0 |           22.44 |            8.76 |
    | DDRC DEVICE hisi_sccl3_ddrc5_1  | Node 0 |            22.3 |            9.09 |
    | DDRC DEVICE hisi_sccl1_ddrc0_0  | Node 1 |           20.33 |           12.86 |
    | DDRC DEVICE hisi_sccl1_ddrc0_1  | Node 1 |           20.02 |           13.66 |
    | DDRC DEVICE hisi_sccl1_ddrc2_0  | Node 1 |           20.93 |           13.06 |
    | DDRC DEVICE hisi_sccl1_ddrc2_1  | Node 1 |           22.33 |           20.36 |
    | DDRC DEVICE hisi_sccl1_ddrc3_0  | Node 1 |           19.64 |           12.99 |
    | DDRC DEVICE hisi_sccl1_ddrc3_1  | Node 1 |           19.87 |           13.06 |
    | DDRC DEVICE hisi_sccl1_ddrc5_0  | Node 1 |           20.06 |           13.41 |
    | DDRC DEVICE hisi_sccl1_ddrc5_1  | Node 1 |           20.19 |           14.26 |
    | DDRC DEVICE hisi_sccl11_ddrc0_0 | Node 2 |           26.54 |           12.26 |
    | DDRC DEVICE hisi_sccl11_ddrc0_1 | Node 2 |           26.55 |           11.85 |
    | DDRC DEVICE hisi_sccl11_ddrc2_0 | Node 2 |           27.08 |           16.91 |
    | DDRC DEVICE hisi_sccl11_ddrc2_1 | Node 2 |           27.01 |           12.28 |
    | DDRC DEVICE hisi_sccl11_ddrc3_0 | Node 2 |           27.49 |           20.73 |
    | DDRC DEVICE hisi_sccl11_ddrc3_1 | Node 2 |           26.63 |           12.12 |
    | DDRC DEVICE hisi_sccl11_ddrc5_0 | Node 2 |           26.78 |           12.31 |
    | DDRC DEVICE hisi_sccl11_ddrc5_1 | Node 2 |           26.88 |           11.87 |
    | DDRC DEVICE hisi_sccl9_ddrc0_0  | Node 3 |           11.87 |            5.77 |
    | DDRC DEVICE hisi_sccl9_ddrc0_1  | Node 3 |           11.57 |             6.0 |
    | DDRC DEVICE hisi_sccl9_ddrc2_0  | Node 3 |           11.86 |            5.88 |
    | DDRC DEVICE hisi_sccl9_ddrc2_1  | Node 3 |           11.45 |            5.78 |
    | DDRC DEVICE hisi_sccl9_ddrc3_0  | Node 3 |           11.64 |            5.89 |
    | DDRC DEVICE hisi_sccl9_ddrc3_1  | Node 3 |           11.91 |            6.17 |
    | DDRC DEVICE hisi_sccl9_ddrc5_0  | Node 3 |           11.56 |            5.25 |
    | DDRC DEVICE hisi_sccl9_ddrc5_1  | Node 3 |           11.85 |            5.96 |
    +---------------------------------+--------+-----------------+-----------------+
    ...
    
    =======================================================================IO Metrics======================================================================
    PCIE Summary Data (System wide)
    ------------------------------------------------------------------------------------------------
    Note:
        The bandwidth on the PCIe device side differ from the commonly understood bandwidth.
        For more detailed descriptions, please refer to the README.md.
    
    +--------------------------------------------------------------+---------------+---------------+
    |                         PCIE DEVICE                          | rx_rd_bw MB/s | rx_wr_bw MB/s |
    +--------------------------------------------------------------+---------------+---------------+
    | PCIE DEVICE 03:00.0 Signal processing controller: Huawei     |           0.0 |           0.0 |
    | Technologies Co., Ltd. iBMA Virtual Network Adapter (rev 01) |               |               |
    +--------------------------------------------------------------+---------------+---------------+
    | PCIE DEVICE 02:00.0 VGA compatible controller: Huawei        |           0.0 |           0.0 |
    | Technologies Co., Ltd. Hi171x Series [iBMC Intelligent       |               |               |
    | Management system chip w/VGA support] (rev 01)               |               |               |
    +--------------------------------------------------------------+---------------+---------------+
    
    PA Summary Data (System wide)
    -----------------------------------------------------------------
    Note:
        PA (Protocol Adapter) can be used to collect CPU-CPU and CPU-GPU bandwidth.
        For more detailed descriptions, please refer to the README.md.
    
    +---------------------------+-----------------+-----------------+
    |         PA DEVICE         | PA2Ring_bw MB/s | Ring2PA_bw MB/s |
    +---------------------------+-----------------+-----------------+
    | PA DEVICE hisi_sicl8_pa0  |             0.0 |             0.0 |
    | PA DEVICE hisi_sicl0_pa0  |             0.0 |             0.0 |
    | PA DEVICE hisi_sicl10_pa0 |          234.47 |          102.51 |
    | PA DEVICE hisi_sicl2_pa0  |           194.2 |          125.32 |
    +---------------------------+-----------------+-----------------+
    
    ...
    ...
    ...
    
    Data saved successfully at /home/test/2025_11_18_15_17_25_report.json
    

    After the collection is complete, a terminal report and a JSON performance data file (/home/test/2025_11_18_15_17_25_report.json) are generated. The terminal report displays multi-dimensional metrics such as CPU and memory access metrics. In this example, the context switch frequency of the server is high (ctx_switches/s is 24710.0) and the bandwidth of each DDRC device is low (between 0 MB/s and 30 MB/s), which indicates that compute-intensive services are running in the current environment.
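
To judge memory bandwidth per NUMA node rather than per DDRC device, the rows of the DDRC table can be summed by node. The following Python sketch is only an illustration: it parses table text copied from the terminal report, and the function name sum_ddrc_by_node and the sample constant are hypothetical helpers, not part of the tool.

```python
from collections import defaultdict

# A few rows copied from the DDRC table in the terminal report above.
SAMPLE = """\
| DDRC DEVICE hisi_sccl3_ddrc0_0  | Node 0 |           22.28 |            9.12 |
| DDRC DEVICE hisi_sccl3_ddrc0_1  | Node 0 |           22.67 |            9.46 |
| DDRC DEVICE hisi_sccl1_ddrc0_0  | Node 1 |           20.33 |           12.86 |
| DDRC DEVICE hisi_sccl1_ddrc0_1  | Node 1 |           20.02 |           13.66 |
"""


def sum_ddrc_by_node(table_text):
    """Return {node: (read_MBps, write_MBps)} summed over the DDRC data rows."""
    totals = defaultdict(lambda: [0.0, 0.0])
    for line in table_text.splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        # Data rows have four cells and start with "DDRC DEVICE";
        # border lines and the header row are skipped.
        if len(cells) != 4 or not cells[0].startswith("DDRC DEVICE"):
            continue
        node = cells[1]
        totals[node][0] += float(cells[2])
        totals[node][1] += float(cells[3])
    return {node: (round(rd, 2), round(wr, 2)) for node, (rd, wr) in totals.items()}


if __name__ == "__main__":
    for node, (rd, wr) in sorted(sum_ddrc_by_node(SAMPLE).items()):
        print(f"{node}: read {rd} MB/s, write {wr} MB/s")
```

Summing the read column over all 32 DDRC rows shown in the report gives 651.69 MB/s, and the write column gives 344.07 MB/s. These match the Total ddrc_rd_bw and Total ddrc_wr_bw values that the comparison step reports as Before for this collection.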

  2. Analyze the generated performance data file and generate an Excel report.
    ./ksys report -i /home/test/2025_11_18_15_17_25_report.json -o /home/test/
    
    • 2025_11_18_15_17_25_report.json is the JSON file generated by running the ksys collect command.
    • After the analysis is complete, the summary data is printed and saved together with the time series data to an Excel file.
    • The time series data is displayed in a line chart or area chart. The time lines in each chart are aligned.

    Command output:

    Analyzing system data... Please wait.
    ======================================================================CPU Metrics======================================================================
    Common Microarchitecture Metrics Summary Data (System wide)
    +------+--------------+------+------+----------+----------+----------+----------+-----------+-----------+---------+
    | IPC  | PATH LENGTH  | MPKI | BPKI | L1D MPKI | L1I MPKI | L2D MPKI | L2I MPKI | DTLB MPKI | ITLB MPKI | CPU-NUM |
    +------+--------------+------+------+----------+----------+----------+----------+-----------+-----------+---------+
    | 0.33 | 101152422747 | 3.62 | 1.22 |     3.66 |     5.88 |     2.59 |     0.43 |       4.7 |      0.27 |     256 |
    +------+--------------+------+------+----------+----------+----------+----------+-----------+-----------+---------+
    
    Topdown Summary Data (System wide)
    +----------------------------+-------+
    |           Metric           | Value |
    +----------------------------+-------+
    | Retiring(%)                |  5.83 |
    | Frontend Bound(%)          | 11.24 |
    |   Fetch Latency Bound(%)   | 10.86 |
    |   Fetch Bandwidth Bound(%) |  0.38 |
    | Bad Speculation(%)         |  0.73 |
    |   Branch Mispredicts(%)    |  0.58 |
    |   Machine Clears(%)        |  0.15 |
    | Backend Bound(%)           |  82.2 |
    |   Core Bound(%)            | 39.48 |
    |   Memory Bound(%)          | 42.72 |
    | CPU-NUM                    |   256 |
    +----------------------------+-------+
    
    OS Metrics Summary Data (System wide)
    +------------------+------------+-------------+---------+
    | context-switches | migrations | page-faults | CPU-NUM |
    +------------------+------------+-------------+---------+
    |           309603 |       2277 |      200448 |     256 |
    +------------------+------------+-------------+---------+
    
    INSTRUCTION Summary Data (System wide)
    +----------------------------------+-------+
    |              Metric              | Value |
    +----------------------------------+-------+
    | Memory(%)                        | 26.94 |
    |   Load(%)                        | 23.41 |
    |   Store(%)                       |  3.53 |
    | Integer(%)                       | 49.53 |
    | Floating Point(%)                |  0.02 |
    | Advanced SIMD(%)                 |  0.13 |
    | Crypto(%)                        |   0.0 |
    | Branches(%)                      | 23.24 |
    |   Immediate(%)                   | 21.27 |
    |   Return(%)                      |  0.84 |
    |   Indirect(%)                    |  1.14 |
    | Barriers(%)                      |  0.08 |
    |   Instruction Synchronization(%) |  0.01 |
    |   Data Synchronization(%)        |   0.0 |
    |   Data Memory(%)                 |  0.06 |
    | Not Retired(%)                   |  0.06 |
    +----------------------------------+-------+
    
    Load_avg Summary Data (System wide)
    +--------------+--------------+---------------+
    | recent 1 min | recent 5 min | recent 15 min |
    +--------------+--------------+---------------+
    |         0.56 |         0.44 |          0.35 |
    +--------------+--------------+---------------+
    
    Softirqs Summary Data (System wide)
    +----------+----------+---------+---------+---------+
    | NET_TX/s | NET_RX/s | BLOCK/s | SCHED/s | CPU-NUM |
    +----------+----------+---------+---------+---------+
    |        0 |        0 |       0 |       4 |     256 |
    +----------+----------+---------+---------+---------+
    
    CPU_stat Summary Data (System wide)
    +----------------+--------------+-------------------+
    | ctx_switches/s | interrupts/s | soft_interrupts/s |
    +----------------+--------------+-------------------+
    |        24710.0 |      58114.0 |            2207.0 |
    +----------------+--------------+-------------------+
    
    CPU_freq Summary Data (System wide)
    +--------------+----------+----------+
    | current(MHz) | min(MHz) | max(MHz) |
    +--------------+----------+----------+
    |      2899.65 |    400.0 |   2900.0 |
    +--------------+----------+----------+
    
    CPU_percent Summary Data (System wide)
    +-------+-------+---------+-------+---------+------+----------+--------+--------+-------------+
    | %user | %nice | %system | %idle | %iowait | %irq | %softirq | %steal | %guest | %guest_nice |
    +-------+-------+---------+-------+---------+------+----------+--------+--------+-------------+
    |  0.12 |   0.0 |    0.16 | 98.74 |    0.01 | 0.13 |     0.01 |    0.0 |    0.0 |         0.0 |
    +-------+-------+---------+-------+---------+------+----------+--------+--------+-------------+
    
    ...
    ...
    ...
    
    Save statistics and time series data to an Excel file. Please wait.
    The report has been saved to /home/test/2025_11_18_15_58_18_report.xlsx
    

    After the analysis task is complete, a terminal report and an Excel file (/home/test/2025_11_18_15_58_18_report.xlsx) are generated. The Excel file contains multi-dimensional time series data (such as CPU and device data) and visualized time series charts.
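
Because the JSON file name contains the collection timestamp, a script that chains collection and analysis needs to locate the newest file before it calls ksys report. The following Python sketch shows one way to do this. It assumes that file names start with a sortable timestamp, as in the example output, and it uses only the -i and -o options shown above. The helper names latest_report_json and build_report_cmd are hypothetical.

```python
import tempfile
from pathlib import Path


def latest_report_json(out_dir):
    """Return the newest *_report.json in out_dir, or None if there is none.

    Assumes names begin with a sortable timestamp such as 2025_11_18_15_17_25,
    so the lexicographically last name is the most recent collection.
    """
    candidates = sorted(Path(out_dir).glob("*_report.json"), key=lambda p: p.name)
    return candidates[-1] if candidates else None


def build_report_cmd(json_path, out_dir, ksys="./ksys"):
    """Build the argument list for the ksys report command shown above."""
    return [ksys, "report", "-i", str(json_path), "-o", str(out_dir)]


if __name__ == "__main__":
    # Demonstrate with empty placeholder files instead of real collections.
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("2025_11_18_15_17_25_report.json",
                     "2025_11_18_16_11_28_report.json"):
            (Path(tmp) / name).write_text("{}")
        newest = latest_report_json(tmp)
        print(build_report_cmd(newest, tmp))
```

The returned list can be passed to subprocess.run from the tool directory to start the analysis. Building the command as a list avoids shell quoting problems if the output directory contains spaces.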

  3. Run the comparison command to compare the performance differences and generate a comparison report.
    ./ksys diff -i /home/test/2025_11_18_15_17_25_report.json /home/test/2025_11_18_16_11_28_report.json -o /home/test
    

    2025_11_18_15_17_25_report.json and 2025_11_18_16_11_28_report.json are JSON files generated by running the ksys collect command twice. The comparison results are saved to an Excel file in the /home/test/ directory.

    Command output:

    ======================================================================System Info======================================================================
    System Architecture diff:
    +--------------+-------------------------------+-------------------------------+------+
    |    Metric    |             Before            |             After             | Diff |
    +--------------+-------------------------------+-------------------------------+------+
    | Cpu Type     | Kunpeng920 high-performance   | Kunpeng920 high-performance   | N/A  |
    | Model Name   | HUAWEI Kunpeng 920 V200 7270Z | HUAWEI Kunpeng 920 V200 7270Z | N/A  |
    | Vendor ID    | HiSilicon                     | HiSilicon                     | N/A  |
    | Hyper Thread | True                          | True                          | N/A  |
    | CPU-NUM      |                           256 |                           256 | N/A  |
    +--------------+-------------------------------+-------------------------------+------+
    
    ======================================================================CPU Metrics======================================================================
    Common Microarchitecture Metrics diff:
    +-------------+--------------+--------------+---------+
    |    Metric   |    Before    |    After     |   Diff  |
    +-------------+--------------+--------------+---------+
    | IPC         |         0.33 |         0.33 | +0.00%  |
    | PATH LENGTH | 101152422747 | 126264627429 | +24.83% |
    | MPKI        |         3.62 |         3.18 | -12.15% |
    | BPKI        |         1.22 |         0.91 | -25.41% |
    | L1D MPKI    |         3.66 |         3.46 | -5.46%  |
    | L1I MPKI    |         5.88 |         4.49 | -23.64% |
    | L2D MPKI    |         2.59 |         2.13 | -17.76% |
    | L2I MPKI    |         0.43 |         0.25 | -41.86% |
    | DTLB MPKI   |          4.7 |         3.37 | -28.30% |
    | ITLB MPKI   |         0.27 |         0.16 | -40.74% |
    +-------------+--------------+--------------+---------+
    
    Topdown diff:
    +----------------------------+--------+-------+---------+
    |           Metric           | Before | After |   Diff  |
    +----------------------------+--------+-------+---------+
    | Retiring(%)                |   5.83 |  5.06 | -13.21% |
    | Frontend Bound(%)          |  11.24 |  8.31 | -26.07% |
    |   Fetch Bandwidth Bound(%) |   0.38 |  0.49 | +28.95% |
    |   Fetch Latency Bound(%)   |  10.86 |  7.82 | -27.99% |
    | Bad Speculation(%)         |   0.73 |  0.94 | +28.77% |
    |   Branch Mispredicts(%)    |   0.58 |  0.72 | +24.14% |
    |   Machine Clears(%)        |   0.15 |  0.22 | +46.67% |
    | Backend Bound(%)           |   82.2 | 85.69 | +4.25%  |
    |   Core Bound(%)            |  39.48 | 38.03 | -3.67%  |
    |   Memory Bound(%)          |  42.72 | 47.66 | +11.56% |
    +----------------------------+--------+-------+---------+
    
    OS Metrics diff:
    +------------------+--------+--------+---------+
    |      Metric      | Before | After  |   Diff  |
    +------------------+--------+--------+---------+
    | context-switches | 309603 | 313974 | +1.41%  |
    | migrations       |   2277 |   2652 | +16.47% |
    | page-faults      | 200448 | 120183 | -40.04% |
    +------------------+--------+--------+---------+
    
    INSTRUCTION diff:
    +----------------------------------+--------+-------+---------+
    |              Metric              | Before | After |   Diff  |
    +----------------------------------+--------+-------+---------+
    | Memory(%)                        |  26.94 | 26.86 | -0.30%  |
    |   Load(%)                        |  23.41 | 24.02 | +2.61%  |
    |   Store(%)                       |   3.53 |  2.83 | -19.83% |
    | Integer(%)                       |  49.53 |  50.1 | +1.15%  |
    | Floating Point(%)                |   0.02 |  0.03 | +50.00% |
    | Advanced SIMD(%)                 |   0.13 |  0.13 | +0.00%  |
    | Crypto(%)                        |    0.0 |   0.0 | +0.00%  |
    | Branches(%)                      |  23.24 | 22.74 | -2.15%  |
    |   Immediate(%)                   |  21.27 | 21.24 | -0.14%  |
    |   Return(%)                      |   0.84 |  0.66 | -21.43% |
    |   Indirect(%)                    |   1.14 |  0.83 | -27.19% |
    | Barriers(%)                      |   0.08 |  0.07 | -12.50% |
    |   Instruction Synchronization(%) |   0.01 |  0.01 | +0.00%  |
    |   Data Synchronization(%)        |    0.0 |   0.0 | +0.00%  |
    |   Data Memory(%)                 |   0.06 |  0.06 | +0.00%  |
    | Not Retired(%)                   |   0.06 |  0.06 | +0.00%  |
    +----------------------------------+--------+-------+---------+
    
    Load_avg diff:
    +---------------+--------+-------+---------+
    |     Metric    | Before | After |   Diff  |
    +---------------+--------+-------+---------+
    | recent 1 min  |   0.56 |   0.4 | -28.57% |
    | recent 5 min  |   0.44 |  0.48 | +9.09%  |
    | recent 15 min |   0.35 |  0.45 | +28.57% |
    +---------------+--------+-------+---------+
    
    Softirqs diff:
    +----------+--------+-------+---------+
    |  Metric  | Before | After |   Diff  |
    +----------+--------+-------+---------+
    | NET_TX/s |      0 |     0 | +0.00%  |
    | NET_RX/s |      0 |     0 | +0.00%  |
    | BLOCK/s  |      0 |     0 | +0.00%  |
    | SCHED/s  |      4 |     3 | -25.00% |
    +----------+--------+-------+---------+
    
    CPU_stat diff:
    +-------------------+---------+---------+---------+
    |       Metric      |  Before |  After  |   Diff  |
    +-------------------+---------+---------+---------+
    | ctx_switches/s    | 24710.0 | 23522.0 | -4.81%  |
    | interrupts/s      | 58114.0 | 55823.0 | -3.94%  |
    | soft_interrupts/s |  2207.0 |  1702.0 | -22.88% |
    +-------------------+---------+---------+---------+
    
    CPU_freq diff:
    +--------------+---------+---------+--------+
    |    Metric    |  Before |  After  |  Diff  |
    +--------------+---------+---------+--------+
    | current(MHz) | 2899.65 | 2899.65 | +0.00% |
    | min(MHz)     |   400.0 |   400.0 | +0.00% |
    | max(MHz)     |  2900.0 |  2900.0 | +0.00% |
    +--------------+---------+---------+--------+
    
    CPU_percent diff:
    +-------------+--------+-------+----------+
    |    Metric   | Before | After |   Diff   |
    +-------------+--------+-------+----------+
    | %user       |   0.12 |  0.08 | -33.33%  |
    | %nice       |    0.0 |   0.0 | +0.00%   |
    | %system     |   0.16 |  0.17 | +6.25%   |
    | %idle       |  98.74 | 98.51 | -0.23%   |
    | %iowait     |   0.01 |   0.0 | -100.00% |
    | %irq        |   0.13 |  0.14 | +7.69%   |
    | %softirq    |   0.01 |  0.01 | +0.00%   |
    | %steal      |    0.0 |   0.0 | +0.00%   |
    | %guest      |    0.0 |   0.0 | +0.00%   |
    | %guest_nice |    0.0 |   0.0 | +0.00%   |
    +-------------+--------+-------+----------+
    
    =================================================================Memory Access Metrics=================================================================
    DDRC summary diff:
    +-----------------------+--------+--------+---------+
    |         Metric        | Before | After  |   Diff  |
    +-----------------------+--------+--------+---------+
    | Total ddrc_rd_bw MB/s | 651.69 | 540.23 | -17.10% |
    | Total ddrc_wr_bw MB/s | 344.07 | 283.02 | -17.74% |
    +-----------------------+--------+--------+---------+
    
    NUMA NODE0 diff:
    +----------+-----------+-----------+---------+
    |  Metric  |   Before  |   After   |   Diff  |
    +----------+-----------+-----------+---------+
    | rx_outer |  740052.0 |  389902.5 | -47.31% |
    | rx_sccl  | 1359007.2 | 1281564.6 | -5.70%  |
    +----------+-----------+-----------+---------+
    
    
    ...
    ...
    ...
    
    ========================================================================Top diff=======================================================================
    Top diff:
    -----------------------------------------------------------------------------------------------------------------
    Note:
        At most 20 Top diffs are listed, please check the generated xlsx file for the rest of report.
    
    +-------------+------------------------------+-------------------+----------+----------+----------+-------------+
    | Table Group |  Metric Type/Metric Device   |       Metric      |  Before  |  After   |   Diff   | Diff(value) |
    +-------------+------------------------------+-------------------+----------+----------+----------+-------------+
    | NUMA        | NUMA NODE2                   | rx_sccl           | 695016.9 | 209223.9 | -69.90%  |    485793.0 |
    | NUMA        | NUMA NODE3                   | rx_outer          | 266775.6 | 690445.5 | +158.81% |    423669.9 |
    | Miss        | Miss Latency L2 Miss Latency | cycles_max        |     2507 |    11460 | +357.12% |        8953 |
    | IO_info     | IO_info Summary              | Total rkB/s       |  1288.25 |  2723.15 | +111.38% |      1434.9 |
    | IO_info     | IO_info Summary              | Total wkB/s       |  1395.23 |   681.08 | -51.19%  |      714.15 |
    | IO_info     | IO_info IO Device sda3       | rkB/s             |   429.15 |   907.45 | +111.45% |       478.3 |
    | IO_info     | IO_info IO Device dm-0       | rkB/s             |   429.15 |   907.45 | +111.45% |       478.3 |
    | IO_info     | IO_info IO Device sda        | rkB/s             |   429.55 |   907.85 | +111.35% |       478.3 |
    | Net_info    | Net_info Summary             | Total txpck/s     |     33.4 |      3.4 | -89.82%  |        30.0 |
    | Net_info    | Net_info Network Device eno1 | txpck/s           |     33.4 |      3.4 | -89.82%  |        30.0 |
    | Net_info    | Net_info Summary             | Total txkB/s      |    26.31 |     0.25 | -99.05%  |       26.06 |
    | Net_info    | Net_info Network Device eno1 | txkB/s            |    26.31 |     0.25 | -99.05%  |       26.06 |
    | Net_info    | Net_info Summary             | Total rxpck/s     |     30.6 |     5.71 | -81.34%  |       24.89 |
    | CPU_percent | CPU_percent                  | %iowait           |     0.01 |      0.0 | -100.00% |        0.01 |
    | INSTRUCTION | INSTRUCTION                  | Floating Point(%) |     0.02 |     0.03 | +50.00%  |        0.01 |
    +-------------+------------------------------+-------------------+----------+----------+----------+-------------+
    
    Data has been saved to /home/test/2025_08_14_16_12_42_diff.xlsx
    

    The comparison results are saved to the /home/test/2025_08_14_16_12_42_diff.xlsx file. The tool analyzes the summary data of the two collections and generates a Top Diff report that lists the metrics with the largest differences. In this report, the L2 miss latency differs significantly between the two collections: cycles_max increases from 2507 to 11460 (+357.12%).
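
The values in the Diff column are consistent with the relative change (After - Before) / Before, and the Diff(value) column with the absolute difference. The following Python sketch reproduces a few rows under that assumption. It is not the tool's implementation, and the handling of a nonzero change from a Before value of 0 (returned as N/A) is a guess because the output above contains no such row.

```python
def relative_diff(before, after):
    """Format the relative change from before to after as a signed percentage."""
    if before == 0:
        # The output above shows +0.00% for 0.0 -> 0.0. A change from zero to
        # a nonzero value has no defined percentage; N/A is an assumption.
        return "+0.00%" if after == 0 else "N/A"
    return f"{(after - before) / before * 100:+.2f}%"


def absolute_diff(before, after):
    """Magnitude of the change, as in the Diff(value) column."""
    return round(abs(after - before), 2)


if __name__ == "__main__":
    # (metric, before, after) taken from the comparison output above.
    rows = [
        ("PATH LENGTH", 101152422747, 126264627429),
        ("page-faults", 200448, 120183),
        ("L2 Miss Latency cycles_max", 2507, 11460),
    ]
    for name, before, after in rows:
        print(name, relative_diff(before, after), absolute_diff(before, after))
```

Note that small Before values exaggerate the percentage: Floating Point(%) changes by only 0.01 but is reported as +50.00%. For this reason, it helps to read the Diff and Diff(value) columns together when ranking differences.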