Kunpeng System Profiler Lite
The tool collects multi-dimensional performance data with one click, including misses, memory access statistics, NUMA nodes, microarchitecture, miss latency, hotspot functions, CPU usage, NIC bandwidth, I/O, memory usage, and softirq. It aligns the data on a common timeline and graphically displays resource usage from the service layer down to the chip layer.
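To illustrate what aligning data on a timeline can mean, the following is a minimal Python sketch. It is not the tool's implementation. The function `align_to_timeline`, the metric names, the one-second bucket size, and the sample values are all hypothetical and chosen only for illustration. The sketch groups samples from several sources into shared time buckets so that metrics recorded at different instants can be read side by side.

```python
from collections import defaultdict


def align_to_timeline(series, bucket_ms=1000):
    """Group samples from several metric sources into shared time buckets.

    series: dict mapping a metric name to a list of (timestamp_ms, value).
    Returns rows sorted by bucket start time. Each row holds the bucket
    start and the mean value of every metric in that bucket, or None if
    the metric has no sample there.
    """
    buckets = defaultdict(lambda: defaultdict(list))
    for name, samples in series.items():
        for timestamp_ms, value in samples:
            # Round the timestamp down to the start of its bucket.
            start = (timestamp_ms // bucket_ms) * bucket_ms
            buckets[start][name].append(value)

    rows = []
    for start in sorted(buckets):
        row = {"time_ms": start}
        for name in series:
            values = buckets[start].get(name)
            row[name] = sum(values) / len(values) if values else None
        rows.append(row)
    return rows


# Hypothetical samples taken at different instants by different collectors.
samples = {
    "cpu_usage_pct": [(1000, 40.0), (1500, 60.0), (2100, 80.0)],
    "nic_rx_mbps": [(1200, 100.0), (2300, 300.0)],
    "softirq_per_s": [(2050, 500.0)],
}
timeline = align_to_timeline(samples)
for row in timeline:
    print(row)
```

Each printed row covers one time bucket, so a CPU usage spike can be compared directly with NIC bandwidth or softirq activity in the same interval. A bucket where a metric has no sample shows None for that metric rather than a guessed value.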
For details about how to use the tool, see the README.md file in the tool package.