Roofline分析可帮助用户在给定硬件平台上,分析出应用程序的瓶颈点位置,从而有针对性的进行优化。
devkit tuner roofline [-h] [-l {0,1,2,3}] [-m {total,region}] [-o <file>] workload ...
参数 |
参数选项 |
说明 |
---|---|---|
-h/--help |
- |
获取帮助信息。 |
-l/--log-level |
0/1/2/3 |
设置日志级别,默认为2。
|
-m/--mode |
total/region |
配置分析范围为整个二进制应用或用户选择的regions;默认为total。
|
-o/--outpath |
- |
配置生成数据包名称。默认为当前所在目录生成,默认文件名为roofline-YYYYMMDD-HMS。 |
采集已划分好region的应用:
devkit tuner roofline -m region /mysharedir/devkit/SystemProfilerBackend/tuner_cli/docs/matrix_multiply_c
返回信息如下:
Note: 1. Roofline task is currently only supported on the 920 platform. 2. The application must be a binary file in ELF format. 3. Roofline task collection needs to ensure the application has finished running. 4. The estimated time of roofline collection is about 3 * application estimated time. RFCOLLECT: Start collection for /mysharedir/devkit/SystemProfilerBackend/tuner_cli/docs/matrix_multiply_c RFCOLLECT: Launch application to collect performance metrics of /mysharedir/devkit/SystemProfilerBackend/tuner_cli/docs/matrix_multiply_c ROOFLINE_EVENTS are initialized. Initialization time: 0.070167 seconds Calculation time: 0.206211 seconds The dimension of the matrices is too large to print. RFCOLLECT: Launch application to do binary instrumentation of /mysharedir/devkit/SystemProfilerBackend/tuner_cli/docs/matrix_multiply_c Initialization time: 0.168616 seconds Calculation time: 2.243492 seconds The dimension of the matrices is too large to print. RFCOLLECT: Launch benchmarks for measuring roofs RFCOLLECT: Processing all collected data RFCOLLECT: Result is captured at /mysharedir/devkit/SystemProfilerBackend/sys_perf/components/sys_tools/rfcollect-20240424-143840.json RFCOLLECT: Run "rfreport /mysharedir/devkit/SystemProfilerBackend/sys_perf/components/sys_tools/rfcollect-20240424-143840.json" to get report. Get roofline report ... The roofline json report: /mysharedir/devkit/SystemProfilerBackend/sys_perf/components/sys_tools/roofline-20240424-143840.json The roofline html report: /mysharedir/devkit/SystemProfilerBackend/sys_perf/components/sys_tools/roofline-20240424-143840.html
任务将生成JSON文件和HTML文件,如需进行数据分析可直接使用JSON文件;查看性能数据可在浏览器中打开HTML文件,也可通过Know-how查看如何使用roofline分析进行调优。