Build and Configuration

This demo uses the matmul binary as an example. The demo source has already been instrumented as described in the Instrumentation Guide.
  1. Download and install the Kunpeng Math Library (KML).
    unzip -o BoostKit-kml_2.2.0.zip 
    rpm -ivh boostkit-kml-2.2.0-1.aarch64.rpm 
    export KML_INCLUDE=/usr/local/kml/include/ 
    export KML_LIB=/usr/local/kml/lib/kblas/omp/
    
  2. Download the demo and compile matmul.
    git clone xxx 
    cd xxx 
    export RFEVENTS_INCLUDE=/usr/local/devkit/tuner/include/ 
    export RFEVENTS_LIB=/usr/local/devkit/tuner/lib/ 
    make matmul
    

    /usr/local/devkit/tuner is the example path created when the RPM package of the Kunpeng DevKit command-line tool is installed. Replace it with the actual installation path.

    The Makefile builds matmul with the following command:

    gcc -g -march=armv8.2-a+fp16fml+simd -O3 -fopenmp $(SOURCES) -o matmul -DROOFLINE_EVENTS -I$(RFEVENTS_INCLUDE) -L$(RFEVENTS_LIB) -lrfevents -DENABLE_KML -I$(KML_INCLUDE) -L$(KML_LIB) -lkblas
    
  3. Set the LD_LIBRARY_PATH environment variable.
    export LD_LIBRARY_PATH=$KML_LIB:/usr/local/devkit/tuner/lib:$LD_LIBRARY_PATH
    
  4. Set OMP_NUM_THREADS explicitly to the number of physical cores on the machine (128 in this example).
    export OMP_NUM_THREADS=128
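
Steps 3 and 4 can be sanity-checked before matmul is run. The following is a minimal sketch, not part of the DevKit: the helper names count_physical_cores and list_missing_libs are hypothetical, and it assumes that lscpu (util-linux) and ldd are available and that matmul was built in the current directory. It derives the thread count from the number of unique core/socket pairs rather than hard-coding 128, and reports any shared library that the dynamic loader cannot find.

```shell
#!/bin/sh
# Sketch only: the helper names below are hypothetical, not part of the DevKit.

# Count unique (core, socket) pairs so SMT siblings are not counted twice.
count_physical_cores() {
    n=0
    if command -v lscpu >/dev/null 2>&1; then
        n=$(lscpu -p=Core,Socket 2>/dev/null | grep -v '^#' | sort -u | wc -l | tr -d ' ')
    fi
    # Fall back to the logical CPU count if lscpu is missing or reports nothing.
    if [ "${n:-0}" -lt 1 ]; then
        n=$(getconf _NPROCESSORS_ONLN)
    fi
    echo "$n"
}

# Print the shared libraries that the dynamic loader cannot resolve for a binary.
# Prints nothing (and returns non-zero) when every dependency is found.
list_missing_libs() {
    ldd "$1" 2>/dev/null | grep 'not found'
}

OMP_NUM_THREADS=$(count_physical_cores)
export OMP_NUM_THREADS
echo "OMP_NUM_THREADS=$OMP_NUM_THREADS"

# Assumes matmul was built in the current directory (step 2).
if [ -x ./matmul ]; then
    if missing=$(list_missing_libs ./matmul); then
        echo "Unresolved libraries; check LD_LIBRARY_PATH:"
        echo "$missing"
    else
        echo "All shared libraries resolved."
    fi
fi
```

If the script lists libkblas or librfevents as unresolved, the directories exported in step 3 probably do not match the actual installation paths.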