Tuning Performance Bottlenecks

Procedure

  1. Modify the source file, as shown in Figure 1. After modification, rename the file to hotspot_io_after.cc and upload it to the /home/demo directory.
    The modified source code reads the file through mmap, which reduces the number of system calls and improves performance.
    Figure 1 Modified source file
  2. Compile the source file.
    g++ -g -O2 -std=c++17 /home/demo/hotspot_io_after.cc -o /home/demo/hotspot_io_after
  3. Switch to the installation directory of the Kunpeng Performance Boundary Analyzer. Replace x.x.x in the command with the actual version.
    cd /home/ksys-x.x.x-Linux-aarch64
  4. Collect the application performance data after optimization.
    ./ksys collect -d 10 /home/demo/hotspot_io_after /home/demo/tmp.txt
    Figure 2 Hotspot statistics

    The hotspot statistics show that the call proportion of the process_buffer function increased from 51% to 72%, while the call proportion of the kernel-mode function copy_page_mc decreased. Overall, the kernel-mode call proportion dropped from approximately 49% to 28%, so a larger share of the runtime is now spent on the application's own computation.

  5. Switch to the demo directory and check the runtime of the application after the optimization.
    1. Switch to the demo directory.
      cd /home/demo
    2. Check the application runtime after the optimization.
      time ./hotspot_io_after tmp.txt

      The command output shows that the time the application spends reading data decreases from 4.1 seconds to 2 seconds, roughly half of the original time. The optimization therefore improves application performance.

      Figure 3 Runtime
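
Because Figure 1 is not reproduced here, the following is a minimal sketch of what an mmap-based read path as described in step 1 might look like. It is not the actual content of hotspot_io_after.cc. The function name process_file_mmap and the byte-summing body of process_buffer are assumptions made for illustration; only the name process_buffer appears in the hotspot statistics above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical stand-in for the application's process_buffer function.
// The real body is not shown in this document; this one just sums the bytes.
static std::uint64_t process_buffer(const unsigned char* data, std::size_t len) {
    std::uint64_t sum = 0;
    for (std::size_t i = 0; i < len; ++i) {
        sum += data[i];
    }
    return sum;
}

// Maps the whole file read-only and processes it in place, instead of
// copying it into a user buffer with repeated read() calls.
// Returns false on error; on success stores the result in *out.
bool process_file_mmap(const char* path, std::uint64_t* out) {
    assert(path != nullptr && out != nullptr);

    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        std::perror("open");
        return false;
    }

    struct stat st;
    if (fstat(fd, &st) != 0) {
        std::perror("fstat");
        close(fd);
        return false;
    }

    // mmap rejects a zero-length mapping, so handle empty files separately.
    if (st.st_size == 0) {
        close(fd);
        *out = 0;
        return true;
    }

    const std::size_t len = static_cast<std::size_t>(st.st_size);
    void* addr = mmap(nullptr, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);  // The mapping remains valid after the descriptor is closed.
    if (addr == MAP_FAILED) {
        std::perror("mmap");
        return false;
    }

    // Optional hint that the pages will be accessed sequentially.
    madvise(addr, len, MADV_SEQUENTIAL);

    *out = process_buffer(static_cast<const unsigned char*>(addr), len);
    munmap(addr, len);
    return true;
}
```

In this sketch the file is opened, mapped, and unmapped once each, so the number of system calls does not grow with the file size the way a read() loop does. The file contents are accessed directly through the mapping rather than being copied into a separate user-space buffer.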