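Build and Configuration
All examples use matmul as the demonstration case. The sample code already includes the instrumentation code added according to the instrumentation guide.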
- Download and install KML.
    unzip -o BoostKit-kml_2.2.0.zip
    rpm -ivh boostkit-kml-2.2.0-1.aarch64.rpm
    export KML_INCLUDE=/usr/local/kml/include/
    export KML_LIB=/usr/local/kml/lib/kblas/omp/
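- Download the demo and build matmul.
    git clone xxx
    cd xxx
    export RFEVENTS_INCLUDE=/usr/local/devkit/tuner/include/
    export RFEVENTS_LIB=/usr/local/devkit/tuner/lib/
    make matmul
  /usr/local/devkit/tuner is the path created when the Kunpeng DevKit command-line tool is installed from its RPM package. Replace it with the actual installation path in your environment.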
  The matmul build command in the Makefile is as follows:
    gcc -g -march=armv8.2-a+fp16fml+simd -O3 -fopenmp $(SOURCES) -o matmul -DROOFLINE_EVENTS -I$(RFEVENTS_INCLUDE) -L$(RFEVENTS_LIB) -lrfevents -DENABLE_KML -I$(KML_INCLUDE) -L$(KML_LIB) -lkblas
- Set the LD_LIBRARY_PATH environment variable.
    export LD_LIBRARY_PATH=$KML_INCLUDE:$KML_LIB:/usr/local/devkit/tuner/lib:$LD_LIBRARY_PATH
- Explicitly set OMP_NUM_THREADS to the actual number of physical cores in your environment.
    export OMP_NUM_THREADS=128
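
The last two steps can be checked before running matmul. The following is a minimal sketch rather than part of the DevKit tooling: the helper names `count_physical_cores` and `report_missing_lib_dirs` are hypothetical, and the core count assumes that `lscpu` is available (it falls back to the logical CPU count otherwise). It derives a value for OMP_NUM_THREADS from the unique (core, socket) pairs and lists any LD_LIBRARY_PATH entry that is not an existing directory.

```shell
#!/bin/sh
# Hypothetical helper: count physical cores as unique (core, socket) pairs.
# Falls back to the logical CPU count when lscpu is missing or fails.
count_physical_cores() {
    n=""
    if command -v lscpu >/dev/null 2>&1; then
        n=$(lscpu -p=CORE,SOCKET 2>/dev/null | grep -v '^#' | sort -u | wc -l | tr -d ' ')
    fi
    if [ -z "$n" ] || [ "$n" -lt 1 ]; then
        n=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN)
    fi
    echo "$n"
}

# Hypothetical helper: print each entry of a colon-separated path list
# that is not an existing directory.
report_missing_lib_dirs() {
    old_ifs=$IFS
    IFS=:
    for dir in $1; do
        if [ -n "$dir" ] && [ ! -d "$dir" ]; then
            echo "missing: $dir"
        fi
    done
    IFS=$old_ifs
}

# Use the detected core count instead of a hard-coded value.
OMP_NUM_THREADS=$(count_physical_cores)
export OMP_NUM_THREADS
echo "OMP_NUM_THREADS=$OMP_NUM_THREADS"

# Report library directories that do not exist on this machine.
report_missing_lib_dirs "${LD_LIBRARY_PATH:-}"
```

If the script prints a `missing:` line for the KML or DevKit library directory, correct the corresponding path before running matmul. Source the script (for example with `. ./check_env.sh`, a file name chosen here for illustration) so that the exported OMP_NUM_THREADS stays in the current shell.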