Failed to Exit a Parallel HPC Application Debugging Task

Symptom

A parallel HPC application debugging task cannot be exited: clearing the debugging environment fails or takes too long, or MPI debugging does not exit.

Figure 1 Failure to clear the environment
Figure 2 Excessive wait time when clearing the debugging environment
Figure 3 Failure to exit MPI debugging

Possible Causes

  • Too many ranks are started.
  • The network connectivity is unstable.

Procedure

  1. (Optional) Manually delete the files in the displayed path (shown as xxx below).
    rm -f xxx
    
  2. Release process resources.
    1. Check for running mpirun processes. The output also lists the grep command itself, which can be ignored.
      ps -ef | grep mpirun
      
    2. Stop each mpirun process. Replace {pid} with the process ID found in the previous step.
      kill -15 {pid}
      
  3. Restart the services.
    systemctl restart gunicorn_framework.service
    systemctl restart gunicorn_plugin.service
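
Step 2 can also be scripted. The following is a minimal sketch, not part of the original procedure. The function names mpirun_pids and stop_mpirun are hypothetical. It assumes that the leftover processes are named exactly mpirun and that ps supports the POSIX -o option. It matches on the process name instead of searching the full ps -ef output, so the grep command itself is never selected.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of the product.

# Read "PID COMMAND" lines on stdin and print the PIDs of
# processes whose command name is exactly "mpirun".
mpirun_pids() {
    awk '$2 == "mpirun" { print $1 }'
}

# Send SIGTERM (signal 15) to every mpirun process currently running.
# "pid=" and "comm=" suppress the header line of ps.
stop_mpirun() {
    for pid in $(ps -eo pid=,comm= | mpirun_pids); do
        kill -15 "$pid"
    done
}
```

After defining these functions, run stop_mpirun as a user who is allowed to signal the processes, and then restart the two services as described in step 3.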