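Using Roofline Analysis
Command Function
Roofline analysis helps you locate the performance bottlenecks of an application on a given hardware platform so that you can target your optimization work.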
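As background, the general roofline model bounds attainable performance by the lower of two limits: the peak compute rate, and the memory bandwidth multiplied by the kernel's arithmetic intensity. The sketch below illustrates that textbook model only. It is not the tool's implementation, and the function names and the machine numbers (100 GFLOP/s, 50 GB/s) are hypothetical values chosen for illustration.

```python
def attainable_gflops(peak_gflops, bandwidth_gbs, intensity_flop_per_byte):
    """Roofline bound: the lower of the compute roof and the memory roof."""
    memory_roof = bandwidth_gbs * intensity_flop_per_byte
    return min(peak_gflops, memory_roof)


def bottleneck(peak_gflops, bandwidth_gbs, intensity_flop_per_byte):
    """Classify a kernel by comparing its intensity with the ridge point."""
    ridge_point = peak_gflops / bandwidth_gbs  # intensity where the two roofs meet
    if intensity_flop_per_byte < ridge_point:
        return "memory-bound"
    return "compute-bound"


# Hypothetical machine: 100 GFLOP/s peak compute, 50 GB/s memory bandwidth.
for intensity in (0.5, 4.0):
    print(intensity,
          attainable_gflops(100.0, 50.0, intensity),
          bottleneck(100.0, 50.0, intensity))
```

A kernel that sits under the sloped (memory) roof gains most from reducing data movement, while one under the flat (compute) roof gains most from better use of the compute units.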
 
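- Only physical machines on the Kunpeng platform are supported.
- Roofline uses dynamic binary instrumentation (DBI), so the application to be analyzed must be a binary file in ELF format.
- Make sure the application runs to completion during Roofline collection. Collection takes approximately three times as long as the application's own run time.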
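Two of these constraints can be checked before starting a collection. The following is a minimal sketch, not part of the DevKit tooling: `is_elf` and `estimated_collection_seconds` are hypothetical helper names, and the factor of 3 is simply the rough estimate quoted in the note.

```python
ELF_MAGIC = b"\x7fELF"  # the first four bytes of every ELF file


def is_elf(path):
    """Return True if the file at `path` starts with the ELF magic number."""
    with open(path, "rb") as f:
        return f.read(4) == ELF_MAGIC


def estimated_collection_seconds(app_seconds, factor=3.0):
    """Rough collection time: about `factor` times the application's run time."""
    return app_seconds * factor
```

For example, an application that normally runs for 60 seconds would be expected to take about 180 seconds to collect under this estimate.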
 
Command Format
devkit tuner roofline [-h] [-l {0,1,2,3}] [-m {total,region}] [-o <file>] workload ...
    
Parameter Description
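| Parameter | Option | Description |
|---|---|---|
| -h/--help | - | Displays help information. |
| -l/--log-level | 0/1/2/3 | Sets the log level. The default is 2. |
| -m/--mode | total/region | Sets the analysis scope to the entire binary application or to user-selected regions. The default is total. |
| -o/--outpath | - | Sets the name of the generated data package. By default, the package is created in the current directory with the file name roofline-YYYYMMDD-HMS. |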
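When the command is launched from a script, the documented options can be validated before the command line is assembled. The sketch below is illustrative only: `build_roofline_command` is a hypothetical helper, it uses only the flags listed in the table, and it assumes that the trailing `...` in the command format means extra arguments passed through to the workload.

```python
import shlex

VALID_MODES = ("total", "region")
VALID_LOG_LEVELS = (0, 1, 2, 3)


def build_roofline_command(workload, mode="total", log_level=2,
                           outpath=None, workload_args=()):
    """Assemble an argument list for `devkit tuner roofline`."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}")
    if log_level not in VALID_LOG_LEVELS:
        raise ValueError(f"log_level must be one of {VALID_LOG_LEVELS}")
    cmd = ["devkit", "tuner", "roofline", "-l", str(log_level), "-m", mode]
    if outpath is not None:
        cmd += ["-o", outpath]
    cmd.append(workload)
    cmd.extend(workload_args)  # assumption: forwarded to the workload as-is
    return cmd


# Print the command instead of running it; nothing is executed here.
print(shlex.join(build_roofline_command("./matrix_multiply_c", mode="region")))
```

The returned list can be passed to `subprocess.run` on a machine where DevKit is installed.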
       
Example
Collect data for an application whose regions have already been defined:
devkit tuner roofline -m region /mysharedir/devkit/SystemProfilerBackend/tuner_cli/docs/matrix_multiply_c
The command returns the following output:
Note:
1. Roofline task is currently only supported on the 920 platform.
2. The application must be a binary file in ELF format.
3. Roofline task collection needs to ensure the application has finished running.
4. The estimated time of roofline collection is about 3 * application estimated time.
RFCOLLECT: Start collection for /mysharedir/devkit/SystemProfilerBackend/tuner_cli/docs/matrix_multiply_c
RFCOLLECT: Launch application to collect performance metrics of /mysharedir/devkit/SystemProfilerBackend/tuner_cli/docs/matrix_multiply_c
ROOFLINE_EVENTS are initialized.
Initialization time: 0.070167 seconds
Calculation time: 0.206211 seconds
The dimension of the matrices is too large to print.
RFCOLLECT: Launch application to do binary instrumentation of /mysharedir/devkit/SystemProfilerBackend/tuner_cli/docs/matrix_multiply_c
Initialization time: 0.168616 seconds
Calculation time: 2.243492 seconds
The dimension of the matrices is too large to print.
RFCOLLECT: Launch benchmarks for measuring roofs
RFCOLLECT: Processing all collected data
RFCOLLECT: Result is captured at /mysharedir/devkit/SystemProfilerBackend/sys_perf/components/sys_tools/rfcollect-20240424-143840.json
RFCOLLECT: Run "rfreport /mysharedir/devkit/SystemProfilerBackend/sys_perf/components/sys_tools/rfcollect-20240424-143840.json" to get report.
Get roofline report ...
The roofline json report: /mysharedir/devkit/SystemProfilerBackend/sys_perf/components/sys_tools/roofline-20240424-143840.json
The roofline html report: /mysharedir/devkit/SystemProfilerBackend/sys_perf/components/sys_tools/roofline-20240424-143840.html
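The task generates a JSON file and an HTML file. Use the JSON file directly for data analysis, or open the HTML file in a browser to view the performance data.
Figure 1 Roofline HTML file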
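The layout of the JSON report is not described in this section, so a sensible first step for data analysis is to inspect its top-level structure. The sketch below assumes nothing about the schema: `summarize_report` is a hypothetical helper that only lists each top-level key with the type of its value.

```python
import json


def summarize_report(path):
    """Map each top-level key of a JSON report to the type name of its value."""
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    if isinstance(data, dict):
        return {key: type(value).__name__ for key, value in data.items()}
    # Not an object at the top level: report the root type only.
    return {"<root>": type(data).__name__}
```

Call it with the JSON report path printed at the end of the run (a `roofline-*.json` file such as the one in the example above), then drill into the keys that look relevant.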
     
    
   
    