Memory Access Statistics Analysis

Command Function

Collects PMU events of the cache and memory, and analyzes the number of memory accesses, the hit rate, and the bandwidth.

Syntax

devkit tuner memory [-h] [-d <sec>] [-l {0, 1, 2, 3}] [-i <sec>] [-o] [-m {1, 2, 3}] [-p {100, 1000}] [-c {n,m | n-m}] [--package]

Parameter Description

Table 1 Parameter description

Parameter

Option

Description

-h/--help

-

Obtains help information.

-d/--duration

-

Collection duration, which defaults to 30 seconds. The value ranges from 1 to 300 seconds. You can press Ctrl+\ to cancel the task or press Ctrl+C to stop the collection and start analysis.

-l/--log-level

0/1/2/3

Log level, which defaults to 1 (info).

  • 0 (debug)
  • 1 (info)
  • 2 (warning)
  • 3 (error)

-i/--interval

-

Collection interval, in seconds, that is, the time span of the data covered by each subreport. The value ranges from 1 to the collection duration and defaults to the collection duration. If this parameter is not set, no subreports are generated.

-m/--metric

1/2/3

Sampling type, which defaults to 1 (ALL).

  • 1 (ALL)
  • 2 (Cache)
  • 3 (DDR)

-o/--output

-

Report file name. Reports are generated in the current directory by default.

-c/--cpu

-

CPU cores to collect data from. The value can be a single core (for example, 0), a comma-separated list (for example, 0,1,2), or a range (for example, 0-2). By default, data is collected from all CPU cores.

-p/--period

100/1000

Data collection period, which defaults to 1000 ms. The options are 1000 ms and 100 ms. When the collection duration (-d) is set to 1 second, the default value automatically changes to 100 ms.

--package

-

Indicates whether to import data to the database and generate compressed packages in the specified output path.
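
The value rules in Table 1 can be sketched in Python as follows. This is only an illustration of the documented ranges and defaults, not the tool's own implementation. The function names parse_cpu_list and check_options are hypothetical, and accepting a mixed value such as 0,2-4 is an assumption that the table does not confirm.

```python
from typing import Dict, List, Optional


def parse_cpu_list(spec: str) -> List[int]:
    """Expand a -c/--cpu value such as "0", "0,1,2" or "0-2" into core IDs."""
    cores = set()
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            # Range form n-m, inclusive on both ends.
            start, end = (int(x) for x in part.split("-", 1))
            if start > end:
                raise ValueError(f"invalid CPU range: {part}")
            cores.update(range(start, end + 1))
        else:
            cores.add(int(part))
    return sorted(cores)


def check_options(duration: int = 30,
                  interval: Optional[int] = None,
                  period: Optional[int] = None) -> Dict[str, int]:
    """Apply the documented defaults and ranges for -d, -i and -p."""
    if not 1 <= duration <= 300:
        raise ValueError("duration must be between 1 and 300 seconds")
    if interval is None:
        # Default: equal to the collection duration.
        interval = duration
    if not 1 <= interval <= duration:
        raise ValueError("interval must be between 1 and the duration")
    if period is None:
        # Default is 1000 ms, or 100 ms when the duration is 1 second.
        period = 100 if duration == 1 else 1000
    if period not in (100, 1000):
        raise ValueError("period must be 100 or 1000 ms")
    return {"duration": duration, "interval": interval, "period": period}


print(parse_cpu_list("0-2"))
print(check_options(duration=1))
```

Checking values this way before calling the tool from a script can catch an out-of-range interval or period early. The tool itself remains the authority on what it accepts.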

Example

devkit tuner memory -d 2 -o /home/memory_result -m 1 --package

In the command, -d 2 indicates that the collection duration is 2 seconds, -o /home/memory_result and --package indicate that a report data package named memory_result is generated in the specified path, and -m 1 indicates that all the cache and DDR access data is collected.
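
When the command is launched from a script, the argument list can be assembled from the same options. The following is a minimal sketch: the helper build_memory_command is hypothetical, it only uses flags listed in Table 1, and it builds the command line without running it.

```python
import shlex
from typing import List, Optional


def build_memory_command(duration: Optional[int] = None,
                         output: Optional[str] = None,
                         metric: Optional[int] = None,
                         package: bool = False) -> List[str]:
    """Assemble the argument list for `devkit tuner memory` (not executed)."""
    argv = ["devkit", "tuner", "memory"]
    if duration is not None:
        argv += ["-d", str(duration)]   # collection duration in seconds
    if output is not None:
        argv += ["-o", output]          # report file name
    if metric is not None:
        argv += ["-m", str(metric)]     # 1 (ALL), 2 (Cache), 3 (DDR)
    if package:
        argv.append("--package")        # generate the report data package
    return argv


argv = build_memory_command(duration=2, output="/home/memory_result",
                            metric=1, package=True)
print(shlex.join(argv))
```

The resulting list can be passed to subprocess.run on a host where the devkit command is installed.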

Command output:

Memory Summary Report-ALL                               Time:2024/06/12 09:32:27
================================================================================
System Information
────────────────────────────────────────────────────────────────────
Linux Kernel Version        5.10.0-153.46.0.124.oe2203sp2.aarch64
Cpu Type                    920
NUMA NODE                   0         1         2         3
cpus                        0-31      32-63     64-95     96-127
Percentage of core Cache miss
────────────────────────────────────────────────────────────────────
L1D         1.33%
L1I         1.79%
L2D        28.63%
L2I        32.84%

DDR Bandwidth
────────────────────────────────────────────────────────────────────
ddrc_write        25.92MB/s
ddrc_read         53.76MB/s

Memory metrics of the Cache
────────────────────────────────────────────────────────────────────
1. L1/L2/TLB Access Bandwidth and Hit Rate
Value Format: X|Y = Bandwidth | Hit Rate
────────────────────────────────────────────────────────────────────
  CPU                   L1D                   L1I                  L2D                 L2I       L2D_TLB       L2I_TLB
────────────────────────────────────────────────────────────────────
  all    1660.18MB/s|98.67%    2800.42MB/s|98.21%    375.66MB/s|71.37%    74.38MB/s|67.16%    N/A|99.38%    N/A|98.85%
────────────────────────────────────────────────────────────────────
2. L3 Read Bandwidth and Hit Rate
─────────────────────────────────────────────────────────────────
  NODE    Read Hit Bandwidth    Read Bandwidth    Read Hit Rate
─────────────────────────────────────────────────────────────────
  0                 2.50MB/s         17.70MB/s           14.13%
  1                 5.82MB/s         36.51MB/s           15.95%
  2                 1.80MB/s         14.00MB/s           12.84%
  3                14.40MB/s         48.97MB/s           29.41%
─────────────────────────────────────────────────────────────────

Memory metrics of the DDRC
────────────────────────────────────────────────────────────────────
1. DDRC_ACCESS_BANDWIDTH
Value Format: X|Y = DDR read | DDR write
────────────────────────────────────────────────────────────────────
  NUMANODE               DDRC_0                 DDRC_1               DDRC_2                DDRC_3                   Total
────────────────────────────────────────────────────────────────────
  0           0.00MB/s|0.00MB/s    38.25MB/s|14.04MB/s    0.00MB/s|0.00MB/s     0.00MB/s| 0.00MB/s    38.25MB/s|14.04MB/s
  1           0.00MB/s|0.00MB/s     0.00MB/s| 0.00MB/s    0.00MB/s|0.00MB/s    29.39MB/s| 8.99MB/s    29.39MB/s| 8.99MB/s
  2           0.00MB/s|0.00MB/s     0.00MB/s| 0.00MB/s    0.00MB/s|0.00MB/s    47.16MB/s|17.61MB/s    47.16MB/s|17.61MB/s
  3           0.00MB/s|0.00MB/s    20.47MB/s| 6.70MB/s    0.00MB/s|0.00MB/s     0.00MB/s| 0.00MB/s    20.47MB/s| 6.70MB/s
────────────────────────────────────────────────────────────────────
2. DDRC_ACCESS_COUNT
Value Format: X|Y = read | write
────────────────────────────────────────────────────────────────────
  NUMANODE                Local            Cross-die          Cross-chip                 Total
────────────────────────────────────────────────────────────────────
  0          100,618/s|43,789/s     51,320/s| 5,751/s     5,655/s|1,790/s    157,593/s|51,330/s
  1          117,402/s|50,205/s    151,277/s|15,257/s    19,498/s|2,036/s    288,178/s|67,499/s
  2          235,676/s|74,314/s     64,373/s| 5,972/s    31,126/s|3,295/s    331,176/s|83,582/s
  3           27,753/s|18,038/s     19,332/s| 4,174/s     4,161/s|1,579/s     51,247/s|23,791/s
────────────────────────────────────────────────────────────────────
The report /home/memory_result.tar is generated successfully.
To view summary report. you can run: devkit report -i /home/memory_result.tar
To view detail report. you can import the report to the WebUI or IDE to view details.
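
Many cells in the report use the X|Y value format. The following Python sketch shows one way to split such cells and cross-check the sample output above. The helper split_pair is hypothetical. The two relationships it checks (the Total column equals the sum of Local, Cross-die, and Cross-chip, and the hit rate equals 100% minus the miss rate) are observations from this sample output, not documented guarantees.

```python
from typing import Tuple


def split_pair(cell: str) -> Tuple[float, float]:
    """Split an "X|Y" cell such as "100,618/s|43,789/s" into two numbers."""
    def to_number(text: str) -> float:
        text = text.strip().replace(",", "")
        if text == "N/A":
            return float("nan")  # the TLB columns report N/A for bandwidth
        for unit in ("MB/s", "/s", "%"):
            if text.endswith(unit):
                text = text[: -len(unit)]
                break
        return float(text)

    left, right = cell.split("|")
    return to_number(left), to_number(right)


# NUMA node 0 row of DDRC_ACCESS_COUNT (read | write), copied from the sample.
local = split_pair("100,618/s|43,789/s")
cross_die = split_pair("51,320/s| 5,751/s")
cross_chip = split_pair("5,655/s|1,790/s")
total = split_pair("157,593/s|51,330/s")

reads = local[0] + cross_die[0] + cross_chip[0]
writes = local[1] + cross_die[1] + cross_chip[1]
print(reads == total[0], writes == total[1])

# L1D cell (bandwidth | hit rate) against the 1.33% L1D miss percentage.
l1d_bandwidth, l1d_hit_rate = split_pair("1660.18MB/s|98.67%")
print(abs(l1d_hit_rate + 1.33 - 100.0) < 0.01)
```

The same helper can be applied to the DDRC_ACCESS_BANDWIDTH rows, where each cell holds the DDR read and DDR write bandwidth.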