Tuning Performance Bottlenecks
Procedure
- Modify the source file as shown in Figure 1, save it as cpu_branch_prediction_after.cpp, and upload it to the /home/demo directory.
- Compile the source file.
g++ -o /home/demo/cpu_branch_prediction_after /home/demo/cpu_branch_prediction_after.cpp
- Switch to the installation directory of the Kunpeng Performance Boundary Analyzer. Replace x.x.x in the command with the actual version.
cd /home/ksys-x.x.x-Linux-aarch64
- Collect the application performance data after optimization.
./ksys collect /home/demo/cpu_branch_prediction_after
Figure 2 Microarchitecture statistics
According to the microarchitecture statistics, the proportion of Branch Mispredicts(%) under Bad Speculation(%) drops from 54% to less than 1%. Fewer mispredicted branches mean fewer pipeline flushes, which improves application performance.
- Switch to the demo directory and check the runtime of the application after the optimization.
  - Switch to the demo directory.
  cd /home/demo
  - Check the application runtime after the optimization.
  time /home/demo/cpu_branch_prediction_after
The command output shows that the application runtime decreases from 61 seconds to 21 seconds, roughly a 2.9x speedup. The optimization therefore improves the computation performance of the application.
Figure 3 Runtime
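The exact source change is shown only in Figure 1. As a minimal sketch of the kind of change that can remove branch mispredictions, the following C++ code contrasts a data-dependent branch with a branchless equivalent. The function names (sum_above_branchy, sum_above_branchless, make_random_data), the value range, and the threshold are hypothetical and are not taken from cpu_branch_prediction_after.cpp.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical "before" form: the if statement depends on the data itself.
// On random input the outcome is hard to predict, so the branch predictor
// is likely to guess wrong often.
std::int64_t sum_above_branchy(const std::vector<int>& data, int threshold) {
    std::int64_t sum = 0;
    for (int v : data) {
        if (v >= threshold) {
            sum += v;
        }
    }
    return sum;
}

// Hypothetical "after" form: the condition becomes an arithmetic mask,
// so the loop body contains no data-dependent branch.
std::int64_t sum_above_branchless(const std::vector<int>& data, int threshold) {
    std::int64_t sum = 0;
    for (int v : data) {
        // mask is all ones (-1) when v >= threshold, otherwise all zeros.
        const std::int64_t mask = -static_cast<std::int64_t>(v >= threshold);
        sum += mask & static_cast<std::int64_t>(v);
    }
    return sum;
}

// Helper (assumption): fills a vector with pseudo-random values in [0, 255]
// so that the comparison against a mid-range threshold is unpredictable.
std::vector<int> make_random_data(std::size_t n, unsigned seed) {
    std::mt19937 rng(seed);
    std::uniform_int_distribution<int> dist(0, 255);
    std::vector<int> data(n);
    for (int& v : data) {
        v = dist(rng);
    }
    return data;
}
```

Both functions return the same sum for the same input, so the second can replace the first without changing results. Note that g++ without an -O option does not optimize, and at higher optimization levels the compiler may already turn the first form into a conditional select, so the measured benefit depends on the build flags.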
