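The Roofline performance model is a throughput-oriented performance model that is widely used in HPC. The "roofline" in the model expresses that an application's performance cannot exceed the hardware capability of the server: every function and every loop in the program is bounded by the server's hardware limits. From the Roofline results you can quickly identify the performance bottleneck points of the analyzed application and the paths on which those bottlenecks occur.
On a given hardware platform, the analysis locates the application's bottlenecks so that optimization can be targeted.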
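The idea of a hardware "roof" can be illustrated with the classic Roofline formulation, in which attainable throughput is the smaller of the compute peak and the product of arithmetic intensity and memory bandwidth. The sketch below is only an illustration of that general model: the source does not state the exact formula the tool applies, and the peak and bandwidth figures are hypothetical placeholders rather than measured values for any platform.

```python
def attainable_gflops(arithmetic_intensity, peak_gflops, bandwidth_gbs):
    """Upper bound on throughput (GFLOP/s) under the classic Roofline model.

    arithmetic_intensity: FLOPs performed per byte moved (FLOP/byte).
    peak_gflops:          compute roof of the machine (GFLOP/s).
    bandwidth_gbs:        memory roof of the machine (GB/s).
    """
    return min(peak_gflops, arithmetic_intensity * bandwidth_gbs)


def ridge_point(peak_gflops, bandwidth_gbs):
    """Arithmetic intensity at which the memory roof meets the compute roof."""
    return peak_gflops / bandwidth_gbs


def classify(arithmetic_intensity, peak_gflops, bandwidth_gbs):
    """Label a kernel by the roof that limits it."""
    if arithmetic_intensity < ridge_point(peak_gflops, bandwidth_gbs):
        return "memory-bound"
    return "compute-bound"


# Hypothetical machine parameters, chosen only to keep the arithmetic simple.
PEAK_GFLOPS = 1000.0
BANDWIDTH_GBS = 100.0

for ai in (0.5, 10.0, 50.0):
    bound = attainable_gflops(ai, PEAK_GFLOPS, BANDWIDTH_GBS)
    kind = classify(ai, PEAK_GFLOPS, BANDWIDTH_GBS)
    print(f"AI={ai:5.1f} FLOP/byte -> at most {bound:7.1f} GFLOP/s ({kind})")
```

A loop that falls well below its roof at a given arithmetic intensity is a candidate for optimization; one that sits on the memory roof would typically benefit more from reducing data movement than from adding compute.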

    devkit tuner roofline [-h] [-l {0,1,2,3}] [-m {total,region}] [-o <file>] [--hbm-mode {cache,flat}] workload ...

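
Parameter | Option | Description |
---|---|---|
-h/--help | - | Displays help information. |
-l/--log-level | 0/1/2/3 | Sets the log level. The default is 2. Note: new features use a more reasonable design, and the default level has been adjusted to 2 (WARNING). |
-m/--mode | total/region | Sets the analysis scope to the entire binary application or to user-selected regions. The default is total. |
-o/--outpath | - | Sets the name of the generated data package. By default the package is generated in the current directory, with the default file name roofline-YYYYMMDD-HMS. |
--hbm-mode | cache/flat | Sets the mode used to collect HBM data. If the environment does not support HBM, no HBM data is collected even when this parameter is specified. |
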
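When the command is launched from a script, the options in the table can be assembled and validated before the tool is invoked. The following is a minimal sketch under stated assumptions: `build_roofline_command` is a hypothetical helper (not part of DevKit), it uses only the flags and values documented above, and it merely builds the argument list without running anything.

```python
import shlex

# Allowed values, taken from the parameter table above.
LOG_LEVELS = {0, 1, 2, 3}
MODES = {"total", "region"}
HBM_MODES = {"cache", "flat"}


def build_roofline_command(workload, workload_args=(), mode=None,
                           log_level=None, outpath=None, hbm_mode=None):
    """Return the argv list for `devkit tuner roofline` (hypothetical helper).

    Options left as None are omitted so the tool's own defaults apply.
    """
    cmd = ["devkit", "tuner", "roofline"]
    if log_level is not None:
        if log_level not in LOG_LEVELS:
            raise ValueError(f"log level must be one of {sorted(LOG_LEVELS)}")
        cmd += ["-l", str(log_level)]
    if mode is not None:
        if mode not in MODES:
            raise ValueError(f"mode must be one of {sorted(MODES)}")
        cmd += ["-m", mode]
    if outpath is not None:
        cmd += ["-o", outpath]
    if hbm_mode is not None:
        if hbm_mode not in HBM_MODES:
            raise ValueError(f"hbm mode must be one of {sorted(HBM_MODES)}")
        cmd += ["--hbm-mode", hbm_mode]
    # The workload and its own arguments come last, as in the synopsis.
    cmd.append(workload)
    cmd.extend(workload_args)
    return cmd


cmd = build_roofline_command("/devkit/testdemo/matrix_multiply_c",
                             mode="total", hbm_mode="cache")
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a machine where DevKit is installed; building a list rather than a single string avoids shell-quoting problems in workload arguments.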
Collect Roofline data for the entire application:

    devkit tuner roofline -m total --hbm-mode cache /devkit/testdemo/matrix_multiply_c

The following information is returned:

    Note:
        1. Roofline task is currently only supported on the 920 platform.
        2. The application must be a binary file in ELF format, and read permissions are required to detect the format of the application.
        3. Roofline task collection needs to ensure the application has finished running.
        4. The estimated time of roofline collection is about 3 * application estimated time.
        5. Roofline analysis is available only on physical machines.
        6. You can learn about the roofline profiling method by looking at document /devkit/testdemo/DevKit-CLI-xx.xx.xx-Linux-Kunpeng/tuner/docs/ROOFLINE_KNOW_HOW.MD
    RFCOLLECT: Start collection for /devkit/testdemo/matrix_multiply_c
    RFCOLLECT: Launch application to collect performance metrics of /devkit/testdemo/matrix_multiply_c
    Initialization time: 0.085910 seconds
    Calculation time: 0.371136 seconds
    The dimension of the matrices is too large to print.
    RFCOLLECT: Launch application to do binary instrumentation of /devkit/testdemo/matrix_multiply_c
    Initialization time: 0.153196 seconds
    Calculation time: 22.620041 seconds
    The dimension of the matrices is too large to print.
    RFCOLLECT: Launch benchmarks for measuring roofs
    RFCOLLECT: Processing all collected data
    RFCOLLECT: Result is captured at /devkit/testdemo/DevKit-CLI-xx.xx.xx-Linux-Kunpeng/rfcollect-20241213-222445.json
    RFCOLLECT: Run "rfreport /devkit/testdemo/DevKit-CLI-xx.xx.xx-Linux-Kunpeng/rfcollect-20241213-222445.json" to get report.
    Get roofline report ...
    The roofline json report: /devkit/testdemo/DevKit-CLI-xx.xx.xx-Linux-Kunpeng/roofline-20241213-222445.json
    The roofline html report: /devkit/testdemo/DevKit-CLI-xx.xx.xx-Linux-Kunpeng/roofline-20241213-222445.html

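The task generates a JSON file and an HTML file. Use the JSON file directly for data analysis, and open the HTML file in a browser to view the performance data. The Know-how document describes how to use Roofline analysis for tuning. When viewing the HTML file, click the button in the upper-right corner to configure which data is displayed, click the language toggle to switch the page between Chinese and English, and use the "Region" drop-down list at the top of the page to select a single region or all content.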
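As a starting point for working with the JSON file, the sketch below loads a report and lists its top-level fields. The schema of the report is not described here, so the code makes no assumption about field names; `summarize_report` is a hypothetical helper name, and the path in the usage comment is the one from the sample output above.

```python
import json


def summarize_report(path):
    """Map each top-level key of a JSON report to the type name of its value.

    No particular schema is assumed; this only reveals the file's outer
    structure so that a real analysis can be written against it.
    """
    with open(path, encoding="utf-8") as fh:
        data = json.load(fh)
    if isinstance(data, dict):
        return {key: type(value).__name__ for key, value in data.items()}
    # A report whose root is a list or scalar is summarized as one entry.
    return {"<root>": type(data).__name__}


# Usage (path taken from the sample output; adjust to your own report):
# for key, kind in summarize_report("roofline-20241213-222445.json").items():
#     print(f"{key}: {kind}")
```

Once the actual field names are known from this listing, the same `json.load` result can be passed to whatever analysis or plotting code is needed.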