Hotspot Function Analysis

Command Function

Uses ptrace to sample Python programs and hybrid Python and C/C++ programs, analyzes their call stacks, identifies the top 20 hotspot functions, and generates a flame graph.

  • The supported Python version is 3.9.X.
  • The supported OS is openEuler.
  • Only the PID mode is supported for collection and analysis.
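Because only PID mode is supported, the target program must already be running and its process ID must be known before collection starts. The following is a minimal sketch of a target program that prints its own PID and keeps a few threads busy long enough to be sampled. The names busy_sum and run_workload are hypothetical and used only for illustration; they are not part of the tool.

```python
import os
import time
from concurrent.futures import ThreadPoolExecutor


def busy_sum(n: int) -> int:
    """CPU-bound pure-Python loop, intended to appear as a Python-level hotspot."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def run_workload(duration_s: float, workers: int = 2, chunk: int = 200_000) -> int:
    """Keep `workers` threads busy for roughly `duration_s` seconds.

    Returns the number of busy_sum calls that completed.
    """
    # Print the PID so it can be passed to the -p parameter.
    print(f"PID: {os.getpid()}", flush=True)
    deadline = time.monotonic() + duration_s
    completed = 0
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while time.monotonic() < deadline:
            futures = [pool.submit(busy_sum, chunk) for _ in range(workers)]
            completed += sum(1 for f in futures if f.result() >= 0)
    return completed
```

For example, calling run_workload(60) in one terminal keeps the process alive for about a minute, which leaves time to start the collection against the printed PID from another terminal.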

Syntax

devkit py-perf hotspot [-h] [-l {0,1,2,3}] [-d <sec>] [-i <msec>] [-p {PID}] [-o <file>] [--native]

Parameter Description

Table 1 Parameter description

| Parameter | Option | Description |
| --- | --- | --- |
| -h/--help | - | Obtains help information. |
| -l/--log-level | 0/1/2/3 | Log level. The default value is 1. 0: DEBUG; 1: INFO; 2: WARNING; 3: ERROR. |
| -d/--duration | - | Collection duration, in seconds. The minimum value is 1. By default, collection continues until it is stopped manually. Press Ctrl+\ to cancel the task, or press Ctrl+C to stop the collection and start the analysis. |
| -i/--interval | - | Collection interval, in milliseconds. The value ranges from 1 to 1000. The default value is 10. |
| -p/--pid | PID | ID of the process whose information is collected. |
| -o/--output | - | Report file name. By default, the report file is generated in the current directory and its name uses the format FlameGraph-YMD-HMS. |
| --native | - | Indicates whether to collect C call stacks in addition to Python call stacks. If this parameter is specified, both Python and C call stacks are collected. Otherwise, only Python call stacks are collected. |
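The value ranges in Table 1 can be checked before the command is launched, for example when the collection is started from a script. The following is a minimal sketch that validates the parameters against the documented limits and assembles the argument list. The function build_hotspot_command is a hypothetical helper, not part of the tool; it uses only the flags listed above.

```python
from typing import List, Optional


def build_hotspot_command(
    pid: int,
    duration: Optional[int] = None,
    interval: int = 10,
    log_level: int = 1,
    output: Optional[str] = None,
    native: bool = False,
) -> List[str]:
    """Validate parameters against Table 1 and return the argument list."""
    if pid <= 0:
        raise ValueError("pid must be a positive integer")
    if log_level not in (0, 1, 2, 3):
        raise ValueError("log level must be 0, 1, 2, or 3")
    if duration is not None and duration < 1:
        raise ValueError("duration must be at least 1 second")
    if not 1 <= interval <= 1000:
        raise ValueError("interval must be between 1 and 1000 milliseconds")

    cmd = ["devkit", "py-perf", "hotspot",
           "-p", str(pid), "-l", str(log_level), "-i", str(interval)]
    if duration is not None:
        # Omitting -d leaves collection running until it is stopped manually.
        cmd += ["-d", str(duration)]
    if output is not None:
        cmd += ["-o", output]
    if native:
        cmd.append("--native")
    return cmd
```

On a machine where the tool is installed, the returned list can be passed to subprocess.run(cmd, check=True). Building a list rather than a single string avoids shell quoting issues in the output path.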

Example

devkit py-perf hotspot -p 2082596 -d 5 -i 100 -o /home/demo/flamegraph --native
  • -p 2082596 collects information about the process whose ID is 2082596. -d 5 sets the collection duration to 5 seconds. -i 100 sets the collection interval to 100 milliseconds. -o /home/demo/flamegraph generates the flamegraph.html report file in /home/demo/. --native collects both Python and C call stacks.
  • The full command output contains a large amount of stack information, most of which is omitted here for readability. For details, see the actual command output.

Command output:

Python Hotspot Top20 Summary Report                     Time:2024/08/13 15:07:16

================================================================================

─────────────────────────────────────────────────────────────────────
  Function                                                                                                     Runtime(ms)     Runtime(%)
─────────────────────────────────────────────────────────────────────
  pthread_condattr_setpshared                                                                                        4722          82.04
      pthread_condattr_setpshared
      clone

  blas_thread_server                                                                                                447           7.77 
      blas_thread_server
      pthread_condattr_setpshared
      clone
...
...
...

  scipy_dgebal_64_                                                                                                    9           0.16
      scipy_dgebal_64_
      scipy_dgeev_64_
      void eig_wrapper<f2c_doublecomplex, double>(char, char, char**, long const*, long const*) [clone .constprop.0]
      generic_wrapped_legacy_loop
      ufunc_generic_fastcall
      _PyEval_EvalFrameDefault
      _PyEval_EvalFrameDefault
      dispatcher_vectorcall
      _PyEval_EvalFrameDefault
      _PyEval_EvalFrameDefault
      _PyEval_EvalFrameDefault
      _PyEval_EvalFrameDefault
      _PyEval_EvalFrameDefault
      eig (/usr/local/lib64/python3.9/site-packages/numpy/linalg/_linalg.py:1702)
      compute_eigenvalues (/home/jjl/workspace/pyhotstpot_demo/ai-demo/numpy_test_1.py:10)
      run (/usr/lib64/python3.9/concurrent/futures/thread.py:58)
      _worker (/usr/lib64/python3.9/concurrent/futures/thread.py:83)
      run (/usr/lib64/python3.9/threading.py:910)
      _bootstrap_inner (/usr/lib64/python3.9/threading.py:973)
      _bootstrap (/usr/lib64/python3.9/threading.py:930)
      PyCodec_StreamReader
      _PyGC_CollectNoFail
      _Py_hashtable_destroy
      pthread_condattr_setpshared
      clone

  dgemm_nn                                                                                                           9            0.16
      dgemm_nn

─────────────────────────────────────────────────────────────────────
The flamegraph html report: /home/demo/flamegraph.html
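To post-process the summary report in a script, the hotspot rows can be separated from the stack frames listed under them. The following is a minimal sketch that assumes the layout shown in the sample output above: a hotspot row ends with two numeric columns (runtime in milliseconds and percentage), whereas stack frame lines do not. The report layout is inferred from this sample only and may differ between versions. Hotspot and parse_hotspots are hypothetical names.

```python
import re
from typing import List, NamedTuple


class Hotspot(NamedTuple):
    function: str
    runtime_ms: int
    runtime_pct: float


# A hotspot row: a name, at least two spaces, an integer, then a number.
_ROW = re.compile(
    r"^\s*(?P<name>\S.*?)\s{2,}(?P<ms>\d+)\s+(?P<pct>\d+(?:\.\d+)?)\s*$"
)


def parse_hotspots(report: str) -> List[Hotspot]:
    """Extract (function, runtime ms, runtime %) rows from the report text."""
    rows = []
    for line in report.splitlines():
        # Drop any horizontal-rule characters that precede a row on the same line.
        line = line.lstrip("\u2500 ")
        match = _ROW.match(line)
        if match:
            rows.append(
                Hotspot(match["name"], int(match["ms"]), float(match["pct"]))
            )
    return rows
```

Applied to the sample output above, this sketch would return one entry per hotspot row, such as pthread_condattr_setpshared with 4722 ms and 82.04%, and skip the title, header, and stack frame lines.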

Open the flame graph HTML file in a browser to view it. Stack names are often too long to display in full, so move the mouse pointer over a stack block to view its information. You can also click a stack block to view its details.

Figure 1 Flame graph