Collection Command collect

Command Function

Collects multidimensional performance statistics with a single command, covering cache misses, memory access, NUMA, microarchitecture, miss latencies, hotspot functions, CPU usage, NIC bandwidth, I/O, memory usage, softirqs, PCIe, PA2Ring, and Ring2PA.

Syntax

./ksys collect [-h] [-o OUTPUT] [-d <sec>] [-i <sec>] [-p PID] [-c CONFIG] [-l {0,1,2,3}] ...

To collect data for a specific application, run ./ksys collect [workload...], replacing [workload...] with the application path and its arguments. Any collection parameters must be placed before [workload...].
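
As a minimal sketch of this ordering rule, the helper below prints (but does not run) a collection command with the options placed before the workload. The helper name ksys_collect_cmd and the workload /opt/demo/app with its --threads 4 arguments are hypothetical; only -d, -o, and the trailing workload position come from the syntax above.

```shell
# Print a ksys command line with the collection options placed before the
# workload. ksys_collect_cmd is a hypothetical helper, not part of the tool.
ksys_collect_cmd() {
    duration=$1
    outdir=$2
    shift 2
    printf './ksys collect -d %s -o %s' "$duration" "$outdir"
    # Whatever remains in "$@" is the workload: application path, then its arguments.
    if [ "$#" -gt 0 ]; then
        printf ' %s' "$@"
    fi
    printf '\n'
}
```

For example, ksys_collect_cmd 30 /home/test/ /opt/demo/app --threads 4 prints ./ksys collect -d 30 -o /home/test/ /opt/demo/app --threads 4. The output is meant for display only, because arguments containing spaces are not re-quoted.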

Parameter Description

Table 1 Parameter description

| Parameter | Option | Description |
| --- | --- | --- |
| -h/--help | - | Optional. Displays help information. |
| -o/--output | - | Optional. Directory in which the JSON report file is saved. If you do not set this parameter, a JSON file named in the Y_M_D_H_M_S_report format is generated in the current directory. |
| -d/--duration | - | Optional. Collection duration, in seconds. The minimum value is 1 second and the default value is 30 seconds. |
| -i/--interval | - | Optional. Collection interval, in seconds. The minimum value and default value are both 1 second. NOTE: The interval specified by -i should be at most one tenth of the duration specified by -d; otherwise, an alarm is generated. For example, if the collection duration is 10 seconds, the collection interval should not exceed 1 second. |
| -p/--pid | - | Optional. ID of the process to collect data from. If you do not set this parameter, the tool collects system-wide data. |
| -c/--config | - | Optional. Path to the YAML configuration file. In this file, you can customize the parameters (such as whether to collect data and the collection frequency) of hotspot function collection and the SPE collection module. For details about the format requirements, see the .yaml file in the tool installation directory. NOTE: If -c is not specified, the config.yaml file in the tool installation directory is read by default during collection. |
| -l/--log-level | 0/1/2/3 | Optional. Log level, which defaults to 1. 0: DEBUG, 1: INFO, 2: WARNING, 3: ERROR. |
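
The recommended ratio between -i and -d can be checked before a collection is started. The following is a minimal sketch, assuming both values are whole seconds; check_interval is a hypothetical helper name and is not provided by the tool.

```shell
# Return 0 when the interval is at most one tenth of the duration, which is
# the ratio recommended for -i; return 1 otherwise.
# check_interval is a hypothetical helper, not part of the tool.
check_interval() {
    duration=$1
    interval=$2
    # Integer form of "interval <= duration / 10", avoiding division rounding.
    [ $((interval * 10)) -le "$duration" ]
}
```

For example, check_interval 10 1 succeeds, while check_interval 10 2 fails because 2 seconds is more than one tenth of 10 seconds.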

Example

Display the help information for collect:

./ksys collect -h

Command output:

USAGE
    ksys collect [-h] [-o OUTPUT] [-d <sec>] [-i <sec>] [-p PID] [-c CONFIG] [-l {0,1,2,3}] ...
    Using config.yaml can carry out detailed parameter configuration. Refer to the
    comments in config.yaml for details.

DESCRIPTION
    Create a collection command line task.

POSITIONAL ARGUMENTS
    workload
    Specify workload parameters: contains the application and application parameters.

options:
    -h, --help
    show this help message and exit

    -o OUTPUT, --output OUTPUT
    Output the full path.

    -d <sec>, --duration <sec>
    Duration in seconds for the task collect. The minimum value is 1, If not explicitly specified, the task will collect data continuously. The user can use Ctrl+ \ to cancel the
    task or Ctrl+ C to stop the task collection and enter the analysis.

    -i <sec>, --interval <sec>
    Interval in seconds for sampling. The minimum value is 1 and the default value is 1. The maximum value cannot exceed the collection duration. It is advisable to set the
    interval to no more than one-tenth of the total collection duration. The time for collecting hotspot data in each subreport depends on the interval parameter.

    -p PID, --pid PID
    Analyze existing process. When pid does not exist, the collection will stop halfway. When this option is enabled, the hotspot collection can provide support for on-cpu/off-
    cpu check.

    -c CONFIG, --config CONFIG
    Input the path of the config yaml file. The file format must be consistent with the sample.

    -l {0,1,2,3}, --log-level {0,1,2,3}
    Set the log level (0=DEBUG, 1=INFO, 2=WARNING, 3=ERROR), which defaults to 1(INFO).
  • Collect system performance data.
    ./ksys collect -d 30 -o /home/test/
    

    The collection duration is set to 30 seconds, and a JSON file is generated in /home/test/.

    Command output:

    Hotspot data collection is disabled. Refer to /home/ksys-xxx-Linux-aarch64/config.yaml for details
    Starting to collect data. You can press Ctrl+C to stop the task.
    Starting to parse data. You can press Ctrl+C stop the task.
    Progress: 1/1 | Sub-progress(micro data): 30/30.
    Starting processing and printing data. You can press Ctrl+C to stop and all result will not be saved.
    =======================================================================CPU Metrics========================================================================
    Common Microarchitecture Metrics Summary Data (System wide)
    +------+--------------+------+------+----------+----------+----------+----------+-----------+-----------+---------+
    | IPC  | PATH LENGTH  | MPKI | BPKI | L1D MPKI | L1I MPKI | L2D MPKI | L2I MPKI | DTLB MPKI | ITLB MPKI | CPU-NUM |
    +------+--------------+------+------+----------+----------+----------+----------+-----------+-----------+---------+
    | 0.51 | 563132378640 | 0.93 | 0.63 |     0.93 |     1.01 |     0.77 |     0.16 |     16.86 |      0.14 |     256 |
    +------+--------------+------+------+----------+----------+----------+----------+-----------+-----------+---------+
    
    Topdown Summary Data (System wide)
    +----------------------------+-------+
    |           Metric           | Value |
    +----------------------------+-------+
    | Retiring(%)                |  8.42 |
    | Frontend Bound(%)          |  4.58 |
    |   Fetch Latency Bound(%)   |  3.83 |
    |   Fetch Bandwidth Bound(%) |  0.75 |
    | Bad Speculation(%)         |  0.37 |
    |   Branch Mispredicts(%)    |  0.33 |
    |   Machine Clears(%)        |  0.04 |
    | Backend Bound(%)           | 86.63 |
    |   Core Bound(%)            | 43.12 |
    |   Memory Bound(%)          | 43.51 |
    | CPU-NUM                    |   256 |
    +----------------------------+-------+
    
    OS Metrics Summary Data (System wide)
    +------------------+------------+-------------+---------+
    | context-switches | migrations | page-faults | CPU-NUM |
    +------------------+------------+-------------+---------+
    |            89143 |        279 |       97832 |     256 |
    +------------------+------------+-------------+---------+
    
    INSTRUCTION Summary Data (System wide)
    +----------------------------------+-------+
    |              Metric              | Value |
    +----------------------------------+-------+
    | Memory(%)                        | 30.19 |
    |   Load(%)                        | 24.74 |
    |   Store(%)                       |  5.45 |
    | Integer(%)                       | 41.33 |
    | Floating Point(%)                |  0.51 |
    | Advanced SIMD(%)                 |  0.02 |
    | Crypto(%)                        |   0.0 |
    | Branches(%)                      | 23.63 |
    |   Immediate(%)                   | 19.53 |
    |   Return(%)                      |  1.22 |
    |   Indirect(%)                    |  2.88 |
    | Barriers(%)                      |  0.01 |
    |   Instruction Synchronization(%) |   0.0 |
    |   Data Synchronization(%)        |   0.0 |
    |   Data Memory(%)                 |  0.01 |
    | Not Retired(%)                   |  4.32 |
    +----------------------------------+-------+
    
    Load_avg Summary Data (System wide)
    +--------------+--------------+---------------+
    | recent 1 min | recent 5 min | recent 15 min |
    +--------------+--------------+---------------+
    |         1.02 |         0.88 |          0.83 |
    +--------------+--------------+---------------+
    
    Softirqs Summary Data (System wide)
    +----------+----------+---------+---------+---------+
    | NET_TX/s | NET_RX/s | BLOCK/s | SCHED/s | CPU-NUM |
    +----------+----------+---------+---------+---------+
    |        0 |        0 |       0 |       4 |     256 |
    +----------+----------+---------+---------+---------+
    ...
    ...
    ...
    
    =======================================================================Net Metrics========================================================================
    Net_info Summary Data (System wide)
    +------------------------+---------+---------+--------+--------+---------+---------+----------+---------+
    |         IFACE          | rxpck/s | txpck/s | rxkB/s | txkB/s | rxcmp/s | txcmp/s | rxmcst/s | %ifutil |
    +------------------------+---------+---------+--------+--------+---------+---------+----------+---------+
    | Network Device eno1    |   42.08 |   51.56 |   5.26 |  23.71 |     0.0 |     0.0 |      0.0 |    0.02 |
    | Network Device eno2    |     0.0 |     0.0 |    0.0 |    0.0 |     0.0 |     0.0 |      0.0 |     0.0 |
    | Network Device eno3    |     0.0 |     0.0 |    0.0 |    0.0 |     0.0 |     0.0 |      0.0 |     0.0 |
    | Network Device eno4    |     0.0 |     0.0 |    0.0 |    0.0 |     0.0 |     0.0 |      0.0 |     0.0 |
    | Network Device docker0 |     0.0 |     0.0 |    0.0 |    0.0 |     0.0 |     0.0 |      0.0 |     0.0 |
    +------------------------+---------+---------+--------+--------+---------+---------+----------+---------+
    
    Starting to save data. You can press Ctrl+C to stop and all result will not be saved.
    Data saved successfully at /home/test/2025_11_17_11_08_55_report.json
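
Because each run writes a new timestamped file, a script that consumes reports may need to find the newest one. The sketch below assumes report names follow the zero-padded pattern seen in the output above (for example, 2025_11_17_11_08_55_report.json), so that lexical order matches chronological order; latest_report is a hypothetical helper name.

```shell
# Print the path of the most recent report in a directory (default: current).
# Assumes zero-padded Y_M_D_H_M_S_report.json names, so sorted glob order is
# also chronological order.
# latest_report is a hypothetical helper, not part of the tool.
latest_report() {
    dir=${1:-.}
    dir=${dir%/}        # drop one trailing slash to avoid "//" in the result
    latest=
    for f in "$dir"/*_report.json; do
        # An unmatched glob stays literal, so skip anything that is not a file.
        [ -f "$f" ] || continue
        latest=$f       # globs expand in sorted order; the last match is the newest
    done
    [ -n "$latest" ] && printf '%s\n' "$latest"
}
```

Calling latest_report /home/test/ prints the newest matching path, or returns a non-zero status if the directory contains no report.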
    
  • Collect application performance data.
    ./ksys collect -d 30 -o /home/test/ /home/test/demo/cpu_branch_prediction_before
    

    Data of the /home/test/demo/cpu_branch_prediction_before application is collected for 30 seconds, and a JSON file is generated in the /home/test/ directory.

    Command output:

    Hotspot data collection is disabled. Refer to /home/ksys-xxx-Linux-aarch64/config.yaml for details
    Starting to collect data. You can press Ctrl+C to stop the task.
    Starting to parse data. You can press Ctrl+C stop the task.
    Progress: 1/1 | Sub-progress(micro data): 24/24.
    Starting processing and printing data. You can press Ctrl+C to stop and all result will not be saved.
    =======================================================================CPU Metrics========================================================================
    Common Microarchitecture Metrics Summary Data (Application level)
    +------+-------------+------+-------+----------+----------+----------+----------+-----------+-----------+
    | IPC  | PATH LENGTH | MPKI |  BPKI | L1D MPKI | L1I MPKI | L2D MPKI | L2I MPKI | DTLB MPKI | ITLB MPKI |
    +------+-------------+------+-------+----------+----------+----------+----------+-----------+-----------+
    | 1.08 | 76692609654 | 0.04 | 17.89 |     0.04 |     0.01 |      0.0 |      0.0 |      0.03 |       0.0 |
    +------+-------------+------+-------+----------+----------+----------+----------+-----------+-----------+
    
    Topdown Summary Data (Application level)
    +----------------------------+-------+
    |           Metric           | Value |
    +----------------------------+-------+
    | Retiring(%)                | 17.94 |
    | Frontend Bound(%)          |  9.68 |
    |   Fetch Latency Bound(%)   |  8.46 |
    |   Fetch Bandwidth Bound(%) |  1.22 |
    | Bad Speculation(%)         | 52.14 |
    |   Branch Mispredicts(%)    | 52.12 |
    |   Machine Clears(%)        |  0.02 |
    | Backend Bound(%)           | 20.24 |
    |   Core Bound(%)            | 14.48 |
    |   Memory Bound(%)          |  5.76 |
    +----------------------------+-------+
    
    OS Metrics Summary Data (Application level)
    +------------------+------------+-------------+
    | context-switches | migrations | page-faults |
    +------------------+------------+-------------+
    |                8 |          2 |          26 |
    +------------------+------------+-------------+
    
    INSTRUCTION Summary Data (Application level)
    +----------------------------------+-------+
    |              Metric              | Value |
    +----------------------------------+-------+
    | Memory(%)                        |  17.0 |
    |   Load(%)                        | 13.32 |
    |   Store(%)                       |  3.68 |
    | Integer(%)                       | 32.77 |
    | Floating Point(%)                |   0.0 |
    | Advanced SIMD(%)                 |   0.0 |
    | Crypto(%)                        |   0.0 |
    | Branches(%)                      |   7.1 |
    |   Immediate(%)                   |   7.1 |
    |   Return(%)                      |   0.0 |
    |   Indirect(%)                    |   0.0 |
    | Barriers(%)                      |   0.0 |
    |   Instruction Synchronization(%) |   0.0 |
    |   Data Synchronization(%)        |   0.0 |
    |   Data Memory(%)                 |   0.0 |
    | Not Retired(%)                   | 43.13 |
    +----------------------------------+-------+
    
    Load_avg Summary Data (System wide)
    +--------------+--------------+---------------+
    | recent 1 min | recent 5 min | recent 15 min |
    +--------------+--------------+---------------+
    |         1.11 |         0.99 |          0.92 |
    +--------------+--------------+---------------+
    
    Softirqs Summary Data (System wide)
    +----------+----------+---------+---------+
    | NET_TX/s | NET_RX/s | BLOCK/s | SCHED/s |
    +----------+----------+---------+---------+
    |        0 |       14 |       0 |     756 |
    +----------+----------+---------+---------+
    ...
    ...
    ...
    IO_info Summary Data (Application level)
    +---------+---------+-----------+
    | kB_rd/s | kB_wr/s | kB_ccwr/s |
    +---------+---------+-----------+
    |     0.0 |     0.0 |       0.0 |
    +---------+---------+-----------+
    
    =======================================================================Net Metrics========================================================================
    Net_info Summary Data (System wide)
    +------------------------+---------+---------+--------+--------+---------+---------+----------+---------+
    |         IFACE          | rxpck/s | txpck/s | rxkB/s | txkB/s | rxcmp/s | txcmp/s | rxmcst/s | %ifutil |
    +------------------------+---------+---------+--------+--------+---------+---------+----------+---------+
    | Network Device eno1    |    6.48 |     6.9 |   0.51 |   0.92 |     0.0 |     0.0 |     0.04 |     0.0 |
    | Network Device eno2    |     0.0 |     0.0 |    0.0 |    0.0 |     0.0 |     0.0 |      0.0 |     0.0 |
    | Network Device eno3    |     0.0 |     0.0 |    0.0 |    0.0 |     0.0 |     0.0 |      0.0 |     0.0 |
    | Network Device eno4    |     0.0 |     0.0 |    0.0 |    0.0 |     0.0 |     0.0 |      0.0 |     0.0 |
    | Network Device docker0 |     0.0 |     0.0 |    0.0 |    0.0 |     0.0 |     0.0 |      0.0 |     0.0 |
    +------------------------+---------+---------+--------+--------+---------+---------+----------+---------+
    
    Starting to save data. You can press Ctrl+C to stop and all result will not be saved.
    Data saved successfully at /home/test/2025_11_17_11_22_26_report.json
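
In the Topdown table above, Bad Speculation (52.14%) is the largest top-level category. When the console output has been saved to a file, that reading can be automated. The following sketch assumes the table layout shown above, where top-level rows have exactly one space after the leading "|" and sub-metric rows are indented further; topdown_dominant is a hypothetical helper name.

```shell
# Print the top-level Topdown category with the largest share, reading saved
# ksys console output from stdin.
# topdown_dominant is a hypothetical helper, not part of the tool.
topdown_dominant() {
    awk -F'|' '
        /Topdown Summary Data/ { in_table = 1; next }
        # A blank line ends the Topdown table.
        in_table && /^[ \t]*$/ { in_table = 0 }
        # Top-level rows: one space after the pipe, name ending in "(%)".
        in_table && $2 ~ /^ [^ ].*\(%\)/ {
            name = $2; value = $3
            gsub(/^ +| +$/, "", name)
            gsub(/ /, "", value)
            if (value + 0 > max) { max = value + 0; best = name; bestval = value }
        }
        END { if (best != "") print best, bestval }
    '
}
```

For the application-level output above, topdown_dominant < saved_output.txt prints Bad Speculation(%) 52.14, where saved_output.txt is a hypothetical file holding the console output.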
    
  • Collect process performance data.
    ./ksys collect -d 30 -o /home/test/ -p 1202458
    

    The data of the process whose ID is 1202458 is collected for 30 seconds, and a JSON file is generated in the /home/test/ directory.

    Command output:

    Hotspot data collection is disabled. Refer to /home/ksys-xxx-Linux-aarch64/config.yaml for details
    Starting to collect data. You can press Ctrl+C to stop the task.
    Starting to parse data. You can press Ctrl+C stop the task.
    Progress: 1/1 | Sub-progress(micro data): 30/30.
    Starting processing and printing data. You can press Ctrl+C to stop and all result will not be saved.
    =======================================================================CPU Metrics========================================================================
    Common Microarchitecture Metrics Summary Data (Process level)
    +------+--------------+------+------+----------+----------+----------+----------+-----------+-----------+
    | IPC  | PATH LENGTH  | MPKI | BPKI | L1D MPKI | L1I MPKI | L2D MPKI | L2I MPKI | DTLB MPKI | ITLB MPKI |
    +------+--------------+------+------+----------+----------+----------+----------+-----------+-----------+
    | 3.96 | 256626202834 | 0.28 | 0.32 |     0.28 |     0.14 |     0.63 |     0.03 |     17.34 |      0.01 |
    +------+--------------+------+------+----------+----------+----------+----------+-----------+-----------+
    
    Topdown Summary Data (Process level)
    +----------------------------+-------+
    |           Metric           | Value |
    +----------------------------+-------+
    | Retiring(%)                | 65.22 |
    | Frontend Bound(%)          |  19.3 |
    |   Fetch Latency Bound(%)   |  9.89 |
    |   Fetch Bandwidth Bound(%) |  9.41 |
    | Bad Speculation(%)         |  1.46 |
    |   Branch Mispredicts(%)    |  1.34 |
    |   Machine Clears(%)        |  0.12 |
    | Backend Bound(%)           | 14.02 |
    |   Core Bound(%)            | 13.58 |
    |   Memory Bound(%)          |  0.44 |
    +----------------------------+-------+
    
    OS Metrics Summary Data (Process level)
    +------------------+------------+-------------+
    | context-switches | migrations | page-faults |
    +------------------+------------+-------------+
    |            19514 |         20 |           0 |
    +------------------+------------+-------------+
    
    INSTRUCTION Summary Data (Process level)
    +----------------------------------+-------+
    |              Metric              | Value |
    +----------------------------------+-------+
    | Memory(%)                        | 38.08 |
    |   Load(%)                        |  27.5 |
    |   Store(%)                       | 10.58 |
    | Integer(%)                       | 37.46 |
    | Floating Point(%)                |  1.14 |
    | Advanced SIMD(%)                 |  0.02 |
    | Crypto(%)                        |   0.0 |
    | Branches(%)                      |  21.2 |
    |   Immediate(%)                   | 14.03 |
    |   Return(%)                      |  1.99 |
    |   Indirect(%)                    |  5.18 |
    | Barriers(%)                      |   0.0 |
    |   Instruction Synchronization(%) |   0.0 |
    |   Data Synchronization(%)        |   0.0 |
    |   Data Memory(%)                 |   0.0 |
    | Not Retired(%)                   |   2.1 |
    +----------------------------------+-------+
    
    Load_avg Summary Data (System wide)
    +--------------+--------------+---------------+
    | recent 1 min | recent 5 min | recent 15 min |
    +--------------+--------------+---------------+
    |         1.06 |         0.97 |          0.93 |
    +--------------+--------------+---------------+
    
    Softirqs Summary Data (System wide)
    +----------+----------+---------+---------+
    | NET_TX/s | NET_RX/s | BLOCK/s | SCHED/s |
    +----------+----------+---------+---------+
    |        0 |       16 |       0 |     724 |
    +----------+----------+---------+---------+
    ...
    ...
    ...
    IO_info Summary Data (Process level)
    +---------+---------+-----------+
    | kB_rd/s | kB_wr/s | kB_ccwr/s |
    +---------+---------+-----------+
    |     0.0 | 5751.24 |       0.0 |
    +---------+---------+-----------+
    
    =======================================================================Net Metrics========================================================================
    Net_info Summary Data (System wide)
    +------------------------+---------+---------+--------+--------+---------+---------+----------+---------+
    |         IFACE          | rxpck/s | txpck/s | rxkB/s | txkB/s | rxcmp/s | txcmp/s | rxmcst/s | %ifutil |
    +------------------------+---------+---------+--------+--------+---------+---------+----------+---------+
    | Network Device eno1    |    6.78 |    8.58 |   0.55 |   1.12 |     0.0 |     0.0 |     0.03 |     0.0 |
    | Network Device eno2    |     0.0 |     0.0 |    0.0 |    0.0 |     0.0 |     0.0 |      0.0 |     0.0 |
    | Network Device eno3    |     0.0 |     0.0 |    0.0 |    0.0 |     0.0 |     0.0 |      0.0 |     0.0 |
    | Network Device eno4    |     0.0 |     0.0 |    0.0 |    0.0 |     0.0 |     0.0 |      0.0 |     0.0 |
    | Network Device docker0 |     0.0 |     0.0 |    0.0 |    0.0 |     0.0 |     0.0 |      0.0 |     0.0 |
    +------------------------+---------+---------+--------+--------+---------+---------+----------+---------+
    
    Starting to save data. You can press Ctrl+C to stop and all result will not be saved.
    Data saved successfully at /home/test/2025_11_17_15_08_02_report.json
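
The help text notes that collection stops partway if the PID does not exist, so a script may want to verify the process first. The following is a minimal sketch for Linux; pid_exists is a hypothetical helper name and is not part of the tool.

```shell
# Return 0 if a process with the given PID currently exists.
# pid_exists is a hypothetical helper, not part of the tool.
pid_exists() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;   # reject empty or non-numeric input
    esac
    # kill -0 sends no signal; it only checks that the PID can be signalled.
    # It fails for processes owned by another user, so fall back to /proc (Linux).
    kill -0 "$1" 2>/dev/null || [ -d "/proc/$1" ]
}
```

For example, pid_exists 1202458 && ./ksys collect -d 30 -o /home/test/ -p 1202458 starts the collection only when the process is present.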
    

    Depending on the collection target, the command output contains system-wide, application-level, or process-level metrics:

    • If no target is specified, the tool collects system-wide performance data only.
    • If a workload is specified, the tool collects system-wide and application-level performance data.
    • If a process ID is specified with -p, the tool collects system-wide and process-level performance data.
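
Each table title in the output carries its scope in parentheses, for example "Topdown Summary Data (Process level)". The sketch below lists the scope of every table in saved console output. It assumes titles always follow the "<name> Summary Data (<scope>)" form seen in the examples above; list_table_scopes is a hypothetical helper name.

```shell
# Print "<scope>: <name>" for every "<name> Summary Data (<scope>)" title,
# reading saved ksys console output from stdin.
# list_table_scopes is a hypothetical helper, not part of the tool.
list_table_scopes() {
    sed -n 's/^[[:space:]]*\(.*\) Summary Data (\(.*\))[[:space:]]*$/\2: \1/p'
}
```

Run against the process example above, the output would include lines such as "Process level: Topdown" and "System wide: Load_avg".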

Metric Description

  • PCIe
    Table 2 PCIe metrics

    | Metric | Description |
    | --- | --- |
    | rx_rd_bw | RX read bandwidth, which is CPU-to-device bandwidth. Tests show that this bandwidth is proportional to CPU write bandwidth. For example, 1 MB/s RX read bandwidth may correspond to 30 MB/s CPU write bandwidth. |
    | rx_wr_bw | RX write bandwidth, which is device-to-CPU bandwidth. Tests show that this bandwidth is the same as CPU read bandwidth. |

  • Protocol adapter (PA)
    Table 3 PA metrics

    | Metric | Description |
    | --- | --- |
    | PA2Ring_bw | Bandwidth (MB/s) for data transfer from the PA bus to the Ring bus, reflecting the unidirectional transfer capability from PA to Ring. PA2Ring can be considered as device-to-host traffic or inter-chip traffic (inbound). |
    | Ring2PA_bw | Bandwidth (MB/s) for data transfer from the Ring bus to the PA bus, reflecting the unidirectional transfer capability from Ring to PA. Ring2PA can be considered as host-to-device traffic or inter-chip traffic (outbound). |