Hotspot Function Analysis
Command Function
Uses ptrace to sample Python programs and hybrid Python and C/C++ programs, analyzes the sampled call stacks, reports the top 20 hotspot functions, and generates a flame graph.
- The supported Python version is 3.9.X.
- The supported OS is openEuler.
- Only the PID mode is supported for collection and analysis.
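Because collection attaches to a running process by PID, it helps to have a target that prints its own PID and stays busy long enough to be sampled. The following is a minimal sketch of such a target, not part of the tool itself; the names `is_supported_python`, `busy_work`, and `run_for` are hypothetical, and the version check simply mirrors the 3.9.X constraint listed above.

```python
import os
import sys
import time


def is_supported_python(version_info=sys.version_info) -> bool:
    """Return True for Python 3.9.x, the version listed as supported above."""
    return tuple(version_info[:2]) == (3, 9)


def busy_work(n: int) -> int:
    """CPU-bound loop so that a sampler has a visible hotspot to report."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def run_for(seconds: float) -> int:
    """Repeat busy_work until `seconds` have elapsed; return the iteration count."""
    deadline = time.monotonic() + seconds
    iterations = 0
    while time.monotonic() < deadline:
        busy_work(50_000)
        iterations += 1
    return iterations


# Runs only when a duration in seconds is given on the command line
# (hypothetical file name: `python3 target.py 60`).
if __name__ == "__main__" and len(sys.argv) > 1:
    if not is_supported_python():
        print("Warning: this interpreter is not Python 3.9.x", file=sys.stderr)
    # Print the PID so that it can be passed to the -p parameter.
    print(f"PID: {os.getpid()}", flush=True)
    run_for(float(sys.argv[1]))
```

Start the script in one terminal, then pass the printed PID to `-p` from another terminal while the script is still running.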
Syntax
devkit py-perf hotspot [-h] [-l {0,1,2,3}] [-d <sec>] [-i <msec>] [-p {PID}] [-o <file>] [--native]
Parameter Description
| Parameter | Option | Description |
|---|---|---|
| -h/--help | - | Obtains help information. |
| -l/--log-level | 0/1/2/3 | Log level, which defaults to 1. |
| -d/--duration | - | Collection duration, in seconds. The minimum value is 1 second. By default collection never ends. You can press Ctrl+\ to cancel the task or press Ctrl+C to stop the collection and start analysis. |
| -i/--interval | - | Collection interval, in milliseconds. The value ranges from 1 to 1000 and the default value is 10. |
| -p/--pid | PID | ID of the process whose information is collected. |
| -o/--output | - | Report file name. By default, the report file is generated in the current directory. The default file name format is FlameGraph-YMD-HMS. |
| --native | - | Indicates whether to collect Python and C call stacks. If this parameter is used, Python and C call stacks are collected. Otherwise, only Python call stacks are collected. |
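The value ranges in the table can be checked before the command is launched. The sketch below assembles the argument list from the documented flags and rejects out-of-range values; the function name `build_hotspot_command` is hypothetical, and the checks reflect only the limits stated in the table, not the tool's own validation logic.

```python
from typing import List, Optional


def build_hotspot_command(
    pid: int,
    duration: Optional[int] = None,
    interval: Optional[int] = None,
    output: Optional[str] = None,
    native: bool = False,
    log_level: Optional[int] = None,
) -> List[str]:
    """Build an argument list for `devkit py-perf hotspot` (illustrative only)."""
    if pid <= 0:
        raise ValueError("pid must be a positive integer")
    if log_level is not None and log_level not in (0, 1, 2, 3):
        raise ValueError("log level must be 0, 1, 2, or 3")
    if duration is not None and duration < 1:
        raise ValueError("duration must be at least 1 second")
    if interval is not None and not 1 <= interval <= 1000:
        raise ValueError("interval must be between 1 and 1000 milliseconds")

    cmd = ["devkit", "py-perf", "hotspot", "-p", str(pid)]
    if log_level is not None:
        cmd += ["-l", str(log_level)]
    if duration is not None:
        cmd += ["-d", str(duration)]
    if interval is not None:
        cmd += ["-i", str(interval)]
    if output is not None:
        cmd += ["-o", output]
    if native:
        cmd.append("--native")
    return cmd
```

Omitted parameters are left off the command line so that the tool's own defaults apply (log level 1, interval 10 ms, no duration limit).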
Example
devkit py-perf hotspot -p 2082596 -d 5 -i 100 -o /home/demo/flamegraph --native
- The -p 2082596 parameter specifies that information is collected from the process whose ID is 2082596. The -d 5 parameter sets the collection duration to 5 seconds. The -i 100 parameter sets the collection interval to 100 milliseconds. The -o /home/demo/flamegraph parameter indicates that the flamegraph.html file is generated in /home/demo/. The --native parameter indicates that both Python and C call stacks are collected.
- The full command output contains a large amount of stack information, most of which is omitted here for readability. For details, see the actual command output.
Command output:
Python Hotspot Top20 Summary Report Time:2024/08/13 15:07:16
================================================================================
─────────────────────────────────────────────────────────────────────
Function Runtime(ms) Runtime(%)
─────────────────────────────────────────────────────────────────────
pthread_condattr_setpshared 4722 82.04
pthread_condattr_setpshared
clone
blas_thread_server 447 7.77
blas_thread_server
pthread_condattr_setpshared
clone
...
...
...
scipy_dgebal_64_ 9 0.16
scipy_dgebal_64_
scipy_dgeev_64_
void eig_wrapper<f2c_doublecomplex, double>(char, char, char**, long const*, long const*) [clone .constprop.0]
generic_wrapped_legacy_loop
ufunc_generic_fastcall
_PyEval_EvalFrameDefault
_PyEval_EvalFrameDefault
dispatcher_vectorcall
_PyEval_EvalFrameDefault
_PyEval_EvalFrameDefault
_PyEval_EvalFrameDefault
_PyEval_EvalFrameDefault
_PyEval_EvalFrameDefault
eig (/usr/local/lib64/python3.9/site-packages/numpy/linalg/_linalg.py:1702)
compute_eigenvalues (/home/jjl/workspace/pyhotstpot_demo/ai-demo/numpy_test_1.py:10)
run (/usr/lib64/python3.9/concurrent/futures/thread.py:58)
_worker (/usr/lib64/python3.9/concurrent/futures/thread.py:83)
run (/usr/lib64/python3.9/threading.py:910)
_bootstrap_inner (/usr/lib64/python3.9/threading.py:973)
_bootstrap (/usr/lib64/python3.9/threading.py:930)
PyCodec_StreamReader
_PyGC_CollectNoFail
_Py_hashtable_destroy
pthread_condattr_setpshared
clone
dgemm_nn 9 0.16
dgemm_nn
─────────────────────────────────────────────────────────────────────
The flamegraph html report: /home/demo/flamegraph.html
You can open the flame graph HTML file in a browser. Stack names are often too long to display in full, so move the mouse pointer over a stack block to view its information, or click a stack block to view its details.
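If the summary printed to the terminal needs further processing, the hotspot rows can be extracted from the text. The sketch below assumes, based only on the sample output above, that each hotspot row ends with an integer runtime in milliseconds followed by a percentage, while call stack lines carry no trailing numbers. The names `Hotspot` and `parse_hotspots` are hypothetical, and the real output format may differ.

```python
import re
from typing import List, NamedTuple


class Hotspot(NamedTuple):
    function: str
    runtime_ms: int
    runtime_pct: float


# A summary row looks like: "<function name> <runtime ms> <runtime %>".
_SUMMARY_ROW = re.compile(r"^(?P<name>.+?)\s+(?P<ms>\d+)\s+(?P<pct>\d+(?:\.\d+)?)$")


def parse_hotspots(report: str) -> List[Hotspot]:
    """Return the hotspot rows found in the report text, in order of appearance."""
    rows = []
    for raw in report.splitlines():
        # Drop surrounding whitespace and any leading separator characters.
        line = raw.strip().lstrip("─").strip()
        match = _SUMMARY_ROW.match(line)
        if match:
            rows.append(
                Hotspot(match["name"], int(match["ms"]), float(match["pct"]))
            )
    return rows


# A few lines copied from the sample output above.
SAMPLE = """\
Function Runtime(ms) Runtime(%)
pthread_condattr_setpshared 4722 82.04
pthread_condattr_setpshared
clone
blas_thread_server 447 7.77
blas_thread_server
scipy_dgebal_64_ 9 0.16
eig (/usr/local/lib64/python3.9/site-packages/numpy/linalg/_linalg.py:1702)
"""

for row in parse_hotspots(SAMPLE):
    print(f"{row.function}: {row.runtime_ms} ms ({row.runtime_pct}%)")
```

Stack frames and the header line are skipped because they do not end in two numeric fields; a function name that itself ends in such fields would be misread, which is a known limit of this simple pattern.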