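collect Command

Function

Collects performance data across multiple dimensions, including Miss, memory access statistics, NUMA, microarchitecture, Miss Latency, hotspot functions, CPU usage, NIC bandwidth, I/O, memory usage, Softirq, PCIe, PA2Ring, and Ring2PA data.

Syntax

./ksys collect [-h] [-o OUTPUT] [-d <sec>] [-i <sec>] [-p PID] [-c CONFIG] [-l {0,1,2,3}] ...

The tool can also collect data for a specific application, for example: ./ksys collect [workload...], where [workload...] is replaced with the application path followed by the application's arguments. Any collection options for the tool itself must be placed before [workload...].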
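The ordering rule above (collection options first, then the workload and its own arguments) can be sketched as a small shell helper. The function name build_collect_cmd and the workload argument "--threads 4" are hypothetical and used only for illustration; only -d and -o are taken from the syntax above. The helper prints the command line rather than running it.

```shell
# Hypothetical helper: prints a ksys collect command line with the
# collection options placed before the workload and its arguments.
build_collect_cmd() {
    duration=$1    # value for -d, in seconds
    outdir=$2      # value for -o
    shift 2        # the remaining arguments are the workload path and its arguments
    echo "./ksys collect -d $duration -o $outdir $*"
}

# The workload path is the one used on this page; "--threads 4" is an assumed argument.
build_collect_cmd 30 /home/test/ /home/test/demo/cpu_branch_prediction_before --threads 4
```

Printing the command first makes it easy to confirm that every collection option appears ahead of the workload path before the command is actually run.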

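Parameters

Table 1 Parameter description

| Parameter | Option | Description |
|---|---|---|
| -h/--help | - | Optional. Displays help information. |
| -o/--output | - | Optional. Specifies the directory in which the JSON file is generated. If not specified, a JSON file named in the format "Y_M_D_H_M_S_report" is generated in the current directory. |
| -d/--duration | - | Optional. Specifies the collection duration in seconds. The minimum value is 1 second and the default value is 30 seconds. |
| -i/--interval | - | Optional. Specifies the sampling interval in seconds. The minimum value is 1 second and the default value is 1 second. Note: The sampling interval specified by -i should not exceed one-tenth of the collection duration specified by -d; otherwise, a warning is reported. For example, if the collection duration is 10 seconds, the sampling interval should not exceed 1 second. |
| -p/--pid | - | Optional. Specifies the process to collect data for. If not specified, data is collected for the whole system. |
| -c/--config | - | Optional. Specifies the path of the .yaml configuration file. This file can be used to customize the run parameters of modules such as hotspot function collection and SPE collection (for example, whether to collect and the collection frequency). For the format requirements, refer to the yaml file in the tool installation directory. Note: If -c is not specified, the config.yaml file in the tool installation directory is read by default. |
| -l/--log-level | 0/1/2/3 | Optional. Sets the log level. The default value is 1. 0: DEBUG; 1: INFO; 2: WARNING; 3: ERROR. |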
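The recommendation for -i (no more than one-tenth of the -d value) can be checked before starting a collection. The following is a minimal sketch; the function name check_interval is hypothetical, and the tool's own warning logic is not documented here beyond the note in the table, so this only mirrors the stated rule.

```shell
# Sketch: report whether an -i value stays within one-tenth of the -d value.
# Both arguments are whole seconds; only integer arithmetic is used.
check_interval() {
    duration=$1
    interval=$2
    if [ $((interval * 10)) -le "$duration" ]; then
        echo "ok"
    else
        echo "warning"
    fi
}

check_interval 10 1    # prints "ok": 1 second is exactly one-tenth of 10 seconds
check_interval 30 5    # prints "warning": 5 seconds exceeds one-tenth of 30 seconds
```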

Examples

Run the following command to view the functions supported by the collect command:

./ksys collect -h

The following information is returned:

USAGE
    ksys collect [-h] [-o OUTPUT] [-d <sec>] [-i <sec>] [-p PID] [-c CONFIG] [-l {0,1,2,3}] ...
    Using config.yaml can carry out detailed parameter configuration. Refer to the
    comments in config.yaml for details.

DESCRIPTION
    Create a collection command line task.

POSITIONAL ARGUMENTS
    workload
    Specify workload parameters: contains the application and application parameters.

options:
    -h, --help
    show this help message and exit

    -o OUTPUT, --output OUTPUT
    Output the full path.

    -d <sec>, --duration <sec>
    Duration in seconds for the task collect. The minimum value is 1, If not explicitly specified, the task will collect data continuously. The user can use Ctrl+ \ to cancel the
    task or Ctrl+ C to stop the task collection and enter the analysis.

    -i <sec>, --interval <sec>
    Interval in seconds for sampling. The minimum value is 1 and the default value is 1. The maximum value cannot exceed the collection duration. It is advisable to set the
    interval to no more than one-tenth of the total collection duration. The time for collecting hotspot data in each subreport depends on the interval parameter.

    -p PID, --pid PID
    Analyze existing process. When pid does not exist, the collection will stop halfway. When this option is enabled, the hotspot collection can provide support for on-cpu/off-
    cpu check.

    -c CONFIG, --config CONFIG
    Input the path of the config yaml file. The file format must be consistent with the sample.

    -l {0,1,2,3}, --log-level {0,1,2,3}
    Set the log level (0=DEBUG, 1=INFO, 2=WARNING, 3=ERROR), which defaults to 1(INFO).
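  • Collect system performance data.
    ./ksys collect -d 30 -o /home/test/

    The collection duration is set to 30 seconds, and the JSON file is generated in "/home/test/".

    A snippet of the returned information is as follows: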

    Hotspot data collection is disabled. Refer to /home/ksys-xxx-Linux-aarch64/config.yaml for details
    Starting to collect data. You can press Ctrl+C to stop the task.
    Starting to parse data. You can press Ctrl+C stop the task.
    Progress: 1/1 | Sub-progress(micro data): 30/30.
    Starting processing and printing data. You can press Ctrl+C to stop and all result will not be saved.
    =======================================================================CPU Metrics========================================================================
    Common Microarchitecture Metrics Summary Data (System wide)
    +------+--------------+------+------+----------+----------+----------+----------+-----------+-----------+---------+
    | IPC  | PATH LENGTH  | MPKI | BPKI | L1D MPKI | L1I MPKI | L2D MPKI | L2I MPKI | DTLB MPKI | ITLB MPKI | CPU-NUM |
    +------+--------------+------+------+----------+----------+----------+----------+-----------+-----------+---------+
    | 0.51 | 563132378640 | 0.93 | 0.63 |     0.93 |     1.01 |     0.77 |     0.16 |     16.86 |      0.14 |     256 |
    +------+--------------+------+------+----------+----------+----------+----------+-----------+-----------+---------+
    
    Topdown Summary Data (System wide)
    +----------------------------+-------+
    |           Metric           | Value |
    +----------------------------+-------+
    | Retiring(%)                |  8.42 |
    | Frontend Bound(%)          |  4.58 |
    |   Fetch Latency Bound(%)   |  3.83 |
    |   Fetch Bandwidth Bound(%) |  0.75 |
    | Bad Speculation(%)         |  0.37 |
    |   Branch Mispredicts(%)    |  0.33 |
    |   Machine Clears(%)        |  0.04 |
    | Backend Bound(%)           | 86.63 |
    |   Core Bound(%)            | 43.12 |
    |   Memory Bound(%)          | 43.51 |
    | CPU-NUM                    |   256 |
    +----------------------------+-------+
    
    OS Metrics Summary Data (System wide)
    +------------------+------------+-------------+---------+
    | context-switches | migrations | page-faults | CPU-NUM |
    +------------------+------------+-------------+---------+
    |            89143 |        279 |       97832 |     256 |
    +------------------+------------+-------------+---------+
    
    INSTRUCTION Summary Data (System wide)
    +----------------------------------+-------+
    |              Metric              | Value |
    +----------------------------------+-------+
    | Memory(%)                        | 30.19 |
    |   Load(%)                        | 24.74 |
    |   Store(%)                       |  5.45 |
    | Integer(%)                       | 41.33 |
    | Floating Point(%)                |  0.51 |
    | Advanced SIMD(%)                 |  0.02 |
    | Crypto(%)                        |   0.0 |
    | Branches(%)                      | 23.63 |
    |   Immediate(%)                   | 19.53 |
    |   Return(%)                      |  1.22 |
    |   Indirect(%)                    |  2.88 |
    | Barriers(%)                      |  0.01 |
    |   Instruction Synchronization(%) |   0.0 |
    |   Data Synchronization(%)        |   0.0 |
    |   Data Memory(%)                 |  0.01 |
    | Not Retired(%)                   |  4.32 |
    +----------------------------------+-------+
    
    Load_avg Summary Data (System wide)
    +--------------+--------------+---------------+
    | recent 1 min | recent 5 min | recent 15 min |
    +--------------+--------------+---------------+
    |         1.02 |         0.88 |          0.83 |
    +--------------+--------------+---------------+
    
    Softirqs Summary Data (System wide)
    +----------+----------+---------+---------+---------+
    | NET_TX/s | NET_RX/s | BLOCK/s | SCHED/s | CPU-NUM |
    +----------+----------+---------+---------+---------+
    |        0 |        0 |       0 |       4 |     256 |
    +----------+----------+---------+---------+---------+
    ...
    ...
    ...
    
    =======================================================================Net Metrics========================================================================
    Net_info Summary Data (System wide)
    +------------------------+---------+---------+--------+--------+---------+---------+----------+---------+
    |         IFACE          | rxpck/s | txpck/s | rxkB/s | txkB/s | rxcmp/s | txcmp/s | rxmcst/s | %ifutil |
    +------------------------+---------+---------+--------+--------+---------+---------+----------+---------+
    | Network Device eno1    |   42.08 |   51.56 |   5.26 |  23.71 |     0.0 |     0.0 |      0.0 |    0.02 |
    | Network Device eno2    |     0.0 |     0.0 |    0.0 |    0.0 |     0.0 |     0.0 |      0.0 |     0.0 |
    | Network Device eno3    |     0.0 |     0.0 |    0.0 |    0.0 |     0.0 |     0.0 |      0.0 |     0.0 |
    | Network Device eno4    |     0.0 |     0.0 |    0.0 |    0.0 |     0.0 |     0.0 |      0.0 |     0.0 |
    | Network Device docker0 |     0.0 |     0.0 |    0.0 |    0.0 |     0.0 |     0.0 |      0.0 |     0.0 |
    +------------------------+---------+---------+--------+--------+---------+---------+----------+---------+
    
    Starting to save data. You can press Ctrl+C to stop and all result will not be saved.
    Data saved successfully at /home/test/2025_11_17_11_08_55_report.json
    
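  • Collect application performance data.
    ./ksys collect -d 30 -o /home/test/ /home/test/demo/cpu_branch_prediction_before

    Data is collected for the application "/home/test/demo/cpu_branch_prediction_before". The collection duration is set to 30 seconds, and the JSON file is generated in "/home/test/".

    A snippet of the returned information is as follows: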

    Hotspot data collection is disabled. Refer to /home/ksys-xxx-Linux-aarch64/config.yaml for details
    Starting to collect data. You can press Ctrl+C to stop the task.
    Starting to parse data. You can press Ctrl+C stop the task.
    Progress: 1/1 | Sub-progress(micro data): 24/24.
    Starting processing and printing data. You can press Ctrl+C to stop and all result will not be saved.
    =======================================================================CPU Metrics========================================================================
    Common Microarchitecture Metrics Summary Data (Application level)
    +------+-------------+------+-------+----------+----------+----------+----------+-----------+-----------+
    | IPC  | PATH LENGTH | MPKI |  BPKI | L1D MPKI | L1I MPKI | L2D MPKI | L2I MPKI | DTLB MPKI | ITLB MPKI |
    +------+-------------+------+-------+----------+----------+----------+----------+-----------+-----------+
    | 1.08 | 76692609654 | 0.04 | 17.89 |     0.04 |     0.01 |      0.0 |      0.0 |      0.03 |       0.0 |
    +------+-------------+------+-------+----------+----------+----------+----------+-----------+-----------+
    
    Topdown Summary Data (Application level)
    +----------------------------+-------+
    |           Metric           | Value |
    +----------------------------+-------+
    | Retiring(%)                | 17.94 |
    | Frontend Bound(%)          |  9.68 |
    |   Fetch Latency Bound(%)   |  8.46 |
    |   Fetch Bandwidth Bound(%) |  1.22 |
    | Bad Speculation(%)         | 52.14 |
    |   Branch Mispredicts(%)    | 52.12 |
    |   Machine Clears(%)        |  0.02 |
    | Backend Bound(%)           | 20.24 |
    |   Core Bound(%)            | 14.48 |
    |   Memory Bound(%)          |  5.76 |
    +----------------------------+-------+
    
    OS Metrics Summary Data (Application level)
    +------------------+------------+-------------+
    | context-switches | migrations | page-faults |
    +------------------+------------+-------------+
    |                8 |          2 |          26 |
    +------------------+------------+-------------+
    
    INSTRUCTION Summary Data (Application level)
    +----------------------------------+-------+
    |              Metric              | Value |
    +----------------------------------+-------+
    | Memory(%)                        |  17.0 |
    |   Load(%)                        | 13.32 |
    |   Store(%)                       |  3.68 |
    | Integer(%)                       | 32.77 |
    | Floating Point(%)                |   0.0 |
    | Advanced SIMD(%)                 |   0.0 |
    | Crypto(%)                        |   0.0 |
    | Branches(%)                      |   7.1 |
    |   Immediate(%)                   |   7.1 |
    |   Return(%)                      |   0.0 |
    |   Indirect(%)                    |   0.0 |
    | Barriers(%)                      |   0.0 |
    |   Instruction Synchronization(%) |   0.0 |
    |   Data Synchronization(%)        |   0.0 |
    |   Data Memory(%)                 |   0.0 |
    | Not Retired(%)                   | 43.13 |
    +----------------------------------+-------+
    
    Load_avg Summary Data (System wide)
    +--------------+--------------+---------------+
    | recent 1 min | recent 5 min | recent 15 min |
    +--------------+--------------+---------------+
    |         1.11 |         0.99 |          0.92 |
    +--------------+--------------+---------------+
    
    Softirqs Summary Data (System wide)
    +----------+----------+---------+---------+
    | NET_TX/s | NET_RX/s | BLOCK/s | SCHED/s |
    +----------+----------+---------+---------+
    |        0 |       14 |       0 |     756 |
    +----------+----------+---------+---------+
    ...
    ...
    ...
    IO_info Summary Data (Application level)
    +---------+---------+-----------+
    | kB_rd/s | kB_wr/s | kB_ccwr/s |
    +---------+---------+-----------+
    |     0.0 |     0.0 |       0.0 |
    +---------+---------+-----------+
    
    =======================================================================Net Metrics========================================================================
    Net_info Summary Data (System wide)
    +------------------------+---------+---------+--------+--------+---------+---------+----------+---------+
    |         IFACE          | rxpck/s | txpck/s | rxkB/s | txkB/s | rxcmp/s | txcmp/s | rxmcst/s | %ifutil |
    +------------------------+---------+---------+--------+--------+---------+---------+----------+---------+
    | Network Device eno1    |    6.48 |     6.9 |   0.51 |   0.92 |     0.0 |     0.0 |     0.04 |     0.0 |
    | Network Device eno2    |     0.0 |     0.0 |    0.0 |    0.0 |     0.0 |     0.0 |      0.0 |     0.0 |
    | Network Device eno3    |     0.0 |     0.0 |    0.0 |    0.0 |     0.0 |     0.0 |      0.0 |     0.0 |
    | Network Device eno4    |     0.0 |     0.0 |    0.0 |    0.0 |     0.0 |     0.0 |      0.0 |     0.0 |
    | Network Device docker0 |     0.0 |     0.0 |    0.0 |    0.0 |     0.0 |     0.0 |      0.0 |     0.0 |
    +------------------------+---------+---------+--------+--------+---------+---------+----------+---------+
    
    Starting to save data. You can press Ctrl+C to stop and all result will not be saved.
    Data saved successfully at /home/test/2025_11_17_11_22_26_report.json
    
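  • Collect process performance data.
    ./ksys collect -d 30 -o /home/test/ -p 1202458

    The collection duration is set to 30 seconds, the JSON file is generated in "/home/test/", and data is collected for the process with PID 1202458.

    A snippet of the returned information is as follows: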

    Hotspot data collection is disabled. Refer to /home/ksys-xxx-Linux-aarch64/config.yaml for details
    Starting to collect data. You can press Ctrl+C to stop the task.
    Starting to parse data. You can press Ctrl+C stop the task.
    Progress: 1/1 | Sub-progress(micro data): 30/30.
    Starting processing and printing data. You can press Ctrl+C to stop and all result will not be saved.
    =======================================================================CPU Metrics========================================================================
    Common Microarchitecture Metrics Summary Data (Process level)
    +------+--------------+------+------+----------+----------+----------+----------+-----------+-----------+
    | IPC  | PATH LENGTH  | MPKI | BPKI | L1D MPKI | L1I MPKI | L2D MPKI | L2I MPKI | DTLB MPKI | ITLB MPKI |
    +------+--------------+------+------+----------+----------+----------+----------+-----------+-----------+
    | 3.96 | 256626202834 | 0.28 | 0.32 |     0.28 |     0.14 |     0.63 |     0.03 |     17.34 |      0.01 |
    +------+--------------+------+------+----------+----------+----------+----------+-----------+-----------+
    
    Topdown Summary Data (Process level)
    +----------------------------+-------+
    |           Metric           | Value |
    +----------------------------+-------+
    | Retiring(%)                | 65.22 |
    | Frontend Bound(%)          |  19.3 |
    |   Fetch Latency Bound(%)   |  9.89 |
    |   Fetch Bandwidth Bound(%) |  9.41 |
    | Bad Speculation(%)         |  1.46 |
    |   Branch Mispredicts(%)    |  1.34 |
    |   Machine Clears(%)        |  0.12 |
    | Backend Bound(%)           | 14.02 |
    |   Core Bound(%)            | 13.58 |
    |   Memory Bound(%)          |  0.44 |
    +----------------------------+-------+
    
    OS Metrics Summary Data (Process level)
    +------------------+------------+-------------+
    | context-switches | migrations | page-faults |
    +------------------+------------+-------------+
    |            19514 |         20 |           0 |
    +------------------+------------+-------------+
    
    INSTRUCTION Summary Data (Process level)
    +----------------------------------+-------+
    |              Metric              | Value |
    +----------------------------------+-------+
    | Memory(%)                        | 38.08 |
    |   Load(%)                        |  27.5 |
    |   Store(%)                       | 10.58 |
    | Integer(%)                       | 37.46 |
    | Floating Point(%)                |  1.14 |
    | Advanced SIMD(%)                 |  0.02 |
    | Crypto(%)                        |   0.0 |
    | Branches(%)                      |  21.2 |
    |   Immediate(%)                   | 14.03 |
    |   Return(%)                      |  1.99 |
    |   Indirect(%)                    |  5.18 |
    | Barriers(%)                      |   0.0 |
    |   Instruction Synchronization(%) |   0.0 |
    |   Data Synchronization(%)        |   0.0 |
    |   Data Memory(%)                 |   0.0 |
    | Not Retired(%)                   |   2.1 |
    +----------------------------------+-------+
    
    Load_avg Summary Data (System wide)
    +--------------+--------------+---------------+
    | recent 1 min | recent 5 min | recent 15 min |
    +--------------+--------------+---------------+
    |         1.06 |         0.97 |          0.93 |
    +--------------+--------------+---------------+
    
    Softirqs Summary Data (System wide)
    +----------+----------+---------+---------+
    | NET_TX/s | NET_RX/s | BLOCK/s | SCHED/s |
    +----------+----------+---------+---------+
    |        0 |       16 |       0 |     724 |
    +----------+----------+---------+---------+
    ...
    ...
    ...
    IO_info Summary Data (Process level)
    +---------+---------+-----------+
    | kB_rd/s | kB_wr/s | kB_ccwr/s |
    +---------+---------+-----------+
    |     0.0 | 5751.24 |       0.0 |
    +---------+---------+-----------+
    
    =======================================================================Net Metrics========================================================================
    Net_info Summary Data (System wide)
    +------------------------+---------+---------+--------+--------+---------+---------+----------+---------+
    |         IFACE          | rxpck/s | txpck/s | rxkB/s | txkB/s | rxcmp/s | txcmp/s | rxmcst/s | %ifutil |
    +------------------------+---------+---------+--------+--------+---------+---------+----------+---------+
    | Network Device eno1    |    6.78 |    8.58 |   0.55 |   1.12 |     0.0 |     0.0 |     0.03 |     0.0 |
    | Network Device eno2    |     0.0 |     0.0 |    0.0 |    0.0 |     0.0 |     0.0 |      0.0 |     0.0 |
    | Network Device eno3    |     0.0 |     0.0 |    0.0 |    0.0 |     0.0 |     0.0 |      0.0 |     0.0 |
    | Network Device eno4    |     0.0 |     0.0 |    0.0 |    0.0 |     0.0 |     0.0 |      0.0 |     0.0 |
    | Network Device docker0 |     0.0 |     0.0 |    0.0 |    0.0 |     0.0 |     0.0 |      0.0 |     0.0 |
    +------------------------+---------+---------+--------+--------+---------+---------+----------+---------+
    
    Starting to save data. You can press Ctrl+C to stop and all result will not be saved.
    Data saved successfully at /home/test/2025_11_17_15_08_02_report.json
    

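    Metrics in the returned information are distinguished by dimension: system (System wide), application (Application level), and PID (Process level).

    • When data is collected for the system, the performance data is at the system dimension.
    • When data is collected for an application, the performance data is at the system and application dimensions.
    • When data is collected for a process, the performance data is at the system and PID dimensions.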
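To see which dimension each table in a saved copy of the returned information belongs to, the section titles can be filtered with sed. This is a sketch that assumes the output keeps the "<name> Summary Data (<dimension>)" and "Data saved successfully at <path>" line shapes shown in the examples above; the function names list_dimensions and report_path are hypothetical.

```shell
# Sketch: print "<table name>: <dimension>" for each summary title on stdin.
list_dimensions() {
    sed -n 's/^ *\(.*\) Summary Data (\(.*\))$/\1: \2/p'
}

# Sketch: print the report path from the final "Data saved successfully" line.
report_path() {
    sed -n 's/^ *Data saved successfully at \(.*\)$/\1/p'
}

# Sample lines copied from the process-level example on this page.
printf '%s\n' \
    '    Topdown Summary Data (Process level)' \
    '    Load_avg Summary Data (System wide)' | list_dimensions

echo '    Data saved successfully at /home/test/2025_11_17_15_08_02_report.json' | report_path
```

Both helpers read standard input, so they can be applied to a file that holds captured output; whether the tool writes these lines to standard output or standard error is not stated on this page.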

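Metric Description

  • PCIe
    Table 2 PCIe metrics

    | Metric | Description |
    |---|---|
    | rx_rd_bw (RX read bandwidth) | CPU-to-Device bandwidth. From the CPU side, it appears as write bandwidth. In actual tests, it is proportional to the CPU-side write bandwidth (for example, 1 MB/s of RX read bandwidth may correspond to 30 MB/s of CPU write bandwidth). |
    | rx_wr_bw (RX write bandwidth) | Device-to-CPU bandwidth. From the CPU side, it appears as read bandwidth. In actual tests, it is consistent with the CPU-side read bandwidth. |

  • PA (Protocol Adapter)
    Table 3 PA metrics

    | Metric | Description |
    |---|---|
    | PA2Ring_bw (PA-to-Ring bandwidth) | Data transfer bandwidth from the PA to the Ring. |
    | Ring2PA_bw (Ring-to-PA bandwidth) | Data transfer bandwidth from the Ring to the PA. |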