A File Cannot Be Found or Opened When mpirun Is Running on Multiple Nodes

Symptom

  • When you run the mpirun command on multiple nodes, an error message is displayed indicating that an .so file cannot be found or opened.
    $ mpirun --allow-run-as-root -np 4 -N 2 --hostfile /hf2  /AllReduce
    mpirun: error while loading shared libraries: libopen-rte.so.40: cannot open shared object file: No such file or directory
  • When you run the mpirun command on multiple nodes, an error message is displayed indicating that a file cannot be found or opened.
    $ mpirun --allow-run-as-root -np 4 -N 2 --hostfile /hf2  /AllReduce
    bash: /Hyper-MPI_x.x.x_aarch64_CentOS7.6_GCC9.3_MLNX-OFED5.0/ompi/bin/orted: No such file or directory

Possible Causes

  • An .so file cannot be found or opened.

    The LD_LIBRARY_PATH environment variable is not configured at the end of the bashrc file, so the dynamic loader cannot locate the Hyper MPI shared libraries, such as libopen-rte.so.40.

  • A file cannot be found or opened.

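    Hyper MPI is installed in different paths on the nodes. As the error message shows, the remote node is asked to run orted from the same path as on the node where mpirun is started, so that path must exist on every node.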
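
A quick way to test for the first cause is to check whether the Hyper MPI library directory appears in LD_LIBRARY_PATH on the node. The following is a minimal sketch, not an official tool. HMPI_PREFIX and the helper path_list_contains are hypothetical names, and the ompi/lib subdirectory is an assumption inferred from the ompi/bin/orted path in the error message; replace both with the values of the actual installation.

```shell
#!/bin/bash
# Hypothetical installation prefix; replace with the real Hyper MPI path.
HMPI_PREFIX="${HMPI_PREFIX:-/path/to/hmpi}"

# Return 0 if directory $2 is an entry of the colon-separated list $1.
path_list_contains() {
    case ":$1:" in
        *":$2:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Assumption: the shared libraries are in $HMPI_PREFIX/ompi/lib.
if path_list_contains "${LD_LIBRARY_PATH:-}" "$HMPI_PREFIX/ompi/lib"; then
    echo "LD_LIBRARY_PATH contains $HMPI_PREFIX/ompi/lib"
else
    echo "LD_LIBRARY_PATH is missing $HMPI_PREFIX/ompi/lib"
fi
```

Run the check in the same kind of shell that mpirun uses on the remote node (a non-interactive SSH session), because a variable exported only for interactive logins may not be visible there.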

Procedure

  • An .so file cannot be found or opened.
    1. Use PuTTY to log in to a job execution node as a common Hyper MPI user, for example, hmpi_user.
    2. Check whether environment variables are correctly configured. For details, see Configuring Environment Variables.
  • A file cannot be found or opened.
    1. Use PuTTY to log in to a job execution node as a common Hyper MPI user, for example, hmpi_user.
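    2. Ensure that Hyper MPI is installed in the same path on all nodes. You are advised to install Hyper MPI in a mounted shared directory so that every node resolves the same path.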
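
To confirm that the installation path is identical on every node, you can check each node for the orted binary. The sketch below assumes passwordless SSH between the nodes. HMPI_PREFIX, REMOTE_SH, and check_orted are hypothetical names introduced for illustration, and the node names are passed as script arguments.

```shell
#!/bin/bash
# Hypothetical installation prefix; replace with the real Hyper MPI path.
HMPI_PREFIX="${HMPI_PREFIX:-/path/to/hmpi}"

# Command used to reach a node. Assumption: passwordless ssh is configured.
REMOTE_SH="${REMOTE_SH:-ssh}"

# Print OK or MISSING for the orted binary on one node.
check_orted() {
    node="$1"
    if $REMOTE_SH "$node" test -x "$HMPI_PREFIX/ompi/bin/orted"; then
        echo "$node: OK"
    else
        echo "$node: MISSING $HMPI_PREFIX/ompi/bin/orted"
    fi
}

# Check every node given on the command line.
for node in "$@"; do
    check_orted "$node"
done
```

For example, saving the sketch as check_orted.sh and running `HMPI_PREFIX=/your/install/path bash check_orted.sh node1 node2` prints one line per node. A node reported as MISSING either has Hyper MPI installed in a different path or could not be reached over SSH.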