Using llvm-autotune

You can write tuning scripts to suit your needs. The following uses CoreMark as an example to describe automatic tuning. The BiSheng compiler release package does not include CoreMark; obtain it from the community. The following example script tunes CoreMark over 20 iterations:

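# Directory for tuning data; it must be empty before tuning starts.
export AUTOTUNE_DATADIR=/tmp/autotuner_data/
CompileCommand="clang -O2 -o coremark core_list_join.c core_main.c core_matrix.c core_state.c core_util.c posix/core_portme.c -DPERFORMANCE_RUN=1 -DITERATIONS=300000 -I. -Iposix -g -DFLAGS_STR=\"\""

# Generate tuning opportunities, then initialize the tuning task.
$CompileCommand -fautotune-generate
llvm-autotune minimize
for i in $(seq 20)
do
  # Recompile with the configuration proposed for this iteration.
  $CompileCommand -fautotune
  # time -p reports on stderr, so merge stderr into the pipe and keep the "real" value.
  time=$( { /usr/bin/time -p ./coremark 0x0 0x0 0x66 300000; } 2>&1 | grep "real" | awk '{print $2}' )
  echo "iteration: $i cost time: $time"
  llvm-autotune feedback $time
done
# Stop tuning and save the optimal configuration.
llvm-autotune finalize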
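
The script assumes that AUTOTUNE_DATADIR points to an empty directory. A minimal sketch of a pre-check that could run before the first compilation is shown here; the helper name ensure_empty_dir is hypothetical and is not part of the BiSheng compiler or llvm-autotune.

```shell
# Hypothetical helper: make sure the tuning data directory exists and is empty.
# Returns 0 when the directory is ready, 1 when it cannot be created or
# already contains files.
ensure_empty_dir() {
  dir="$1"
  mkdir -p "$dir" || return 1
  # ls -A lists every entry except . and .., so any output means "not empty".
  if [ -n "$(ls -A "$dir")" ]; then
    echo "error: $dir is not empty; remove stale tuning data first" >&2
    return 1
  fi
  return 0
}

# Usage sketch, with the path taken from the example script:
#   export AUTOTUNE_DATADIR=/tmp/autotuner_data/
#   ensure_empty_dir "$AUTOTUNE_DATADIR" || exit 1
```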

The complete tuning procedure is as follows:

  1. Configure the environment variable.

    Set the environment variable AUTOTUNE_DATADIR to specify the storage location of tuning-related data. The specified directory must be empty.

    export AUTOTUNE_DATADIR=/tmp/autotuner_data/
  2. Initialize compilation.

    Add the -fautotune-generate option to the BiSheng compiler to generate tuning opportunities.

    cd  examples/coremark/
    clang -O2 -o coremark core_list_join.c core_main.c core_matrix.c core_state.c core_util.c posix/core_portme.c -DPERFORMANCE_RUN=1 -DITERATIONS=300000 -I. -Iposix -g -DFLAGS_STR=\"\" -fautotune-generate

    Apply this option only to the hotspot source files that need tuning. If it is applied to too many source files (more than 500), a large number of tuning opportunity files are generated, and the initialization in step 3 may take several minutes. The enlarged search space also degrades the tuning result and lengthens convergence.

  3. Initialize tuning.

    Run the llvm-autotune command to initialize the tuning task and generate the initial compilation configuration for the next compilation.

    llvm-autotune minimize

    minimize tells the tuner to minimize a metric such as program running time. Use maximize instead to maximize a metric such as program throughput.

  4. Start tuning and compilation.

    Add the -fautotune option to the BiSheng compiler command to read the current configuration from AUTOTUNE_DATADIR and compile with it.

    clang -O2 -o coremark core_list_join.c core_main.c core_matrix.c core_state.c core_util.c posix/core_portme.c -DPERFORMANCE_RUN=1 -DITERATIONS=300000 -I. -Iposix -g -DFLAGS_STR=\"\" -fautotune
  5. Obtain performance feedback.

    Run the program and collect performance data in whatever way suits your requirements, then run the llvm-autotune feedback command to report the result to the tuner. For example, to tune for CoreMark running time, run the following commands:

    /usr/bin/time -p ./coremark 0x0 0x0 0x66 300000 2>&1 1>/dev/null | grep real | awk '{print $2}'

    llvm-autotune feedback 31.09

    Before running the llvm-autotune feedback command, check that the compilation in step 4 succeeded and that the compiled program runs correctly. If the compilation or the run fails, report the worst value for the tuning goal. For example, if the goal is minimize, enter llvm-autotune feedback 9999. If the goal is maximize, enter 0 or -9999.

    If the input performance feedback is incorrect, the final tuning result may be affected.

  6. Iterate tuning.

    Repeat steps 4 and 5 for the specified number of iterations.

  7. Stop tuning.

    After multiple iterations, you can stop the tuning and save the optimal configuration file. The configuration file is saved in the directory specified by the environment variable AUTOTUNE_DATADIR.

    llvm-autotune finalize
  8. Perform the final compilation.

    Use the optimal configuration file obtained in step 7 to perform the final compilation. If the environment variable is not changed, you can directly use the -fautotune option.

    clang -O2 -o coremark core_list_join.c core_main.c core_matrix.c core_state.c core_util.c posix/core_portme.c -DPERFORMANCE_RUN=1 -DITERATIONS=300000 -I. -Iposix -g -DFLAGS_STR=\"\" -fautotune

    Alternatively, you can use the -mllvm -auto-tuning-input= option to point directly to the configuration file.

    clang -O2 -o coremark core_list_join.c core_main.c core_matrix.c core_state.c core_util.c posix/core_portme.c -DPERFORMANCE_RUN=1 -DITERATIONS=300000 -I. -Iposix -g -DFLAGS_STR=\"\" -mllvm -auto-tuning-input=/tmp/autotuner_data/config.yaml
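
Step 5 advises reporting the worst value when the compilation or the run fails. A minimal sketch of how a script might choose the value to report is shown here. The helper name feedback_value is hypothetical and is not part of llvm-autotune, and the worst value passed in (9999 for minimize, 0 for maximize) follows the examples in step 5.

```shell
# Hypothetical helper: print the value to pass to "llvm-autotune feedback".
#   $1 = exit status of the build-and-run step (0 means success)
#   $2 = measured metric, for example the "real" seconds from time -p
#   $3 = worst value for the tuning goal (9999 for minimize, 0 for maximize)
feedback_value() {
  rc="$1"; metric="$2"; worst="$3"
  # Treat empty or non-numeric measurements as failures
  # (only digits and at most one decimal point are accepted).
  case "$metric" in
    ''|.|*[!0-9.]*|*.*.*) metric="" ;;
  esac
  if [ "$rc" -eq 0 ] && [ -n "$metric" ]; then
    echo "$metric"
  else
    echo "$worst"
  fi
}

# Usage sketch inside the tuning loop (commands taken from the steps above):
#   out=$( { /usr/bin/time -p ./coremark 0x0 0x0 0x66 300000 >/dev/null; } 2>&1 )
#   rc=$?
#   t=$(echo "$out" | awk '/^real/ {print $2}')
#   llvm-autotune feedback "$(feedback_value "$rc" "$t" 9999)"
```

The usage sketch captures the output of time -p in a variable. This keeps the exit status of the program available in rc, which a pipeline ending in awk would otherwise hide.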