Troubleshooting Process

If an exception or error occurs when Hyper MPI submits a job, locate the fault by rolling back one step at a time, first to the default algorithm and then to the non-coll mode, and rectify it as follows:

Procedure

  1. Switch to the default algorithm by resubmitting the job without specifying an algorithm ID.
    • If the fault is rectified, the specified algorithm does not support the current scenario. In this case, contact Huawei technical support.
    • If the fault persists, go to step 2.
  2. Switch to the non-coll mode.

    Add the --mca coll ^ucx parameter to the mpirun command and resubmit the job. In the MCA parameter syntax, the ^ prefix excludes the named component, so the job runs without the ucx collective component.

    If no fault occurs, Open MPI supports this scenario but Hyper MPI does not.

    If the fault persists, the possible causes are as follows:

    • Open MPI does not support this scenario.
    • MPI is not used properly. Check the environment variable settings and the MPI installation by referring to "FAQs".
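
The two rollback steps and their outcomes can be sketched as a small Python helper. This is a minimal illustration, not part of Hyper MPI: only the --mca coll ^ucx parameter comes from the procedure above, while the function names, the -np 4 option, and the ./app program are hypothetical placeholders. The script only builds and prints the command lines; it does not launch any job.

```python
import shlex


def rollback_commands(common_args, program, launcher="mpirun"):
    """Build the command lines for the two rollback steps.

    common_args: launcher options of the original job, with any
                 algorithm ID option already removed (assumption: the
                 caller knows which option selected the algorithm).
    program:     the application and its arguments, as a list.
    """
    # Step 1: default algorithm, that is, no algorithm ID is specified.
    step1 = [launcher, *common_args, *program]
    # Step 2: non-coll mode, adding the parameter named in the procedure.
    step2 = [launcher, "--mca", "coll", "^ucx", *common_args, *program]
    return step1, step2


def diagnose(default_algorithm_ok, non_coll_ok=False):
    """Map the outcome of each step to the conclusion in the procedure."""
    if default_algorithm_ok:
        return ("The specified algorithm does not support the current "
                "scenario. Contact Huawei technical support.")
    if non_coll_ok:
        return "Open MPI supports this scenario but Hyper MPI does not."
    return ("Open MPI does not support this scenario, or MPI is not used "
            "properly. Check the environment variables and the MPI "
            "installation by referring to \"FAQs\".")


# Hypothetical job: 4 processes running ./app.
step1, step2 = rollback_commands(["-np", "4"], ["./app"])
# shlex.join quotes arguments as needed so the line can be pasted into a shell.
print("Step 1:", shlex.join(step1))
print("Step 2:", shlex.join(step2))
# Example outcome: step 1 still fails, step 2 runs without the fault.
print(diagnose(default_algorithm_ok=False, non_coll_ok=True))
```

Keeping the arguments as a list and quoting them only when printing avoids depending on how a particular shell treats the ^ character.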