Running and Verification
Procedure
- Use PuTTY to log in to the server as the root user.
- Create a working directory.
mkdir -p /path/to/CASE
- Go to the working directory, and copy the test cases and binary files to the working directory.
cd /path/to/CASE
cp /path/to/LAMMPS/lammps-5Jun19/bench/in.lj ./
cp /path/to/LAMMPS/lammps-5Jun19/src/lmp_mpi ./
- Run the test.
- Run single-node commands on CentOS.
mpirun --allow-run-as-root -np 96 --mca btl ^openib ./lmp_mpi -in in.lj >>test_OneNode.log
- Run single-node commands on openEuler.
mpirun --allow-run-as-root -np 96 -mca pml ucx -mca btl ^vader,tcp,openib,uct -x UCX_TLS=self,sm --bind-to core --map-by socket --rank-by core -x UCX_BUILTIN_BCAST_ALGORITHM=3 -x UCX_BUILTIN_BARRIER_ALGORITHM=5 -x UCX_BUILTIN_ALLREDUCE_ALGORITHM=10 ./lmp_mpi -in in.lj >> ./test_OneNode.log
The result is appended to the test_OneNode.log file in the current directory. Check the Performance value (unit: timesteps/s). A larger value indicates higher performance.
The following is an example of the test result.
Performance: 1134386.210 tau/day, 2625.894 timesteps/s
99.7% CPU use with 96 MPI tasks x no OpenMP threads

MPI task timing breakdown:
Section |  min time  |  avg time  |  max time  |%varavg| %total
---------------------------------------------------------------
Pair    | 0.021651   | 0.023624   | 0.025878   |   0.6 | 62.03
Neigh   | 0.002734   | 0.0029155  | 0.0031491  |   0.2 |  7.66
Comm    | 0.007533   | 0.010058   | 0.012192   |   1.1 | 26.41
Output  | 5.6281e-05 | 0.00050681 | 0.00062442 |   0.0 |  1.33
Modify  | 0.00056003 | 0.00061947 | 0.00070919 |   0.0 |  1.63
Other   |            | 0.0003584  |            |       |  0.94
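Because the commands append to the log with >>, a log file can hold the results of several runs. The following is a minimal sketch of extracting the timesteps/s value of the most recent run. The function name last_timesteps_per_sec is hypothetical, and the sketch assumes that each Performance line has the layout shown in the example result.

```shell
#!/bin/sh
# Print the timesteps/s value from the last "Performance:" line of a LAMMPS log.
# Earlier runs appended to the same file are ignored.
last_timesteps_per_sec() {
    awk '/^Performance:/ {
             # The value is the field directly before the "timesteps/s" unit.
             for (i = 2; i <= NF; i++)
                 if ($i == "timesteps/s") value = $(i - 1)
         }
         END { if (value != "") print value }' "$1"
}

# Usage: last_timesteps_per_sec test_OneNode.log
```

For the example result above, the function would print 2625.894.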
- Run dual-node commands on CentOS.
mpirun --allow-run-as-root -np 192 -N 96 -x PATH=$PATH -x LD_LIBRARY_PATH=$LD_LIBRARY_PATH -machinefile machinefile --mca btl ^openib ./lmp_mpi -in in.lj >> ./test_TwoNodes.log
Add the host names (for example, n1 and n2) of the two specified compute nodes to the machinefile file, as shown in the following figure:
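A minimal sketch of creating such a file from the command line, assuming the plain Open MPI hostfile layout of one host name per line. n1 and n2 are the example host names used in this guide. Replace them with the actual host names of the compute nodes.

```shell
#!/bin/sh
# Write one host name per line into the machinefile in the working directory.
# n1 and n2 are example names; substitute the real compute-node host names.
printf '%s\n' n1 n2 > machinefile

# Display the file to confirm its content.
cat machinefile
```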

- Run dual-node commands on openEuler.
mpirun --allow-run-as-root -np 192 -N 96 -machinefile machinefile -x PATH=$PATH -x LD_LIBRARY_PATH=$LD_LIBRARY_PATH -mca pml ucx -mca btl ^vader,tcp,openib,uct --bind-to core --rank-by core ./lmp_mpi -in in.lj >> ./test_TwoNodes.log
Add the host names (for example, n1 and n2) of the two specified compute nodes to the machinefile file, as shown in the following figure:

The result is appended to the test_TwoNodes.log file in the current directory. Check the Performance value (unit: timesteps/s). A larger value indicates higher performance.
The following is an example of the test result.
Performance: 1605508.300 tau/day, 3716.454 timesteps/s
91.0% CPU use with 192 MPI tasks x no OpenMP threads

MPI task timing breakdown:
Section |  min time  |  avg time  |  max time  |%varavg| %total
---------------------------------------------------------------
Pair    | 0.01035    | 0.01174    | 0.013347   |   0.6 | 43.63
Neigh   | 0.0013662  | 0.0014931  | 0.0016205  |   0.1 |  5.55
Comm    | 0.01112    | 0.012935   | 0.014469   |   0.6 | 48.07
Output  | 7.3471e-05 | 0.00010074 | 0.00017455 |   0.0 |  0.37
Modify  | 0.00023517 | 0.00032949 | 0.00040742 |   0.0 |  1.22
Other   |            | 0.0003095  |            |       |  1.15
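To compare the single-node and dual-node results, the two timesteps/s values can be divided. The following is a small sketch using the values from the two example results in this guide. The function name speedup is hypothetical, and the ratio is only illustrative for these sample numbers.

```shell
#!/bin/sh
# speedup BASE NEW: print NEW / BASE rounded to two decimal places.
speedup() {
    awk -v base="$1" -v new="$2" 'BEGIN { printf "%.2f\n", new / base }'
}

# Single-node example: 2625.894 timesteps/s; dual-node example: 3716.454 timesteps/s.
speedup 2625.894 3716.454   # prints 1.42
```

In these examples, doubling the number of MPI processes yields about 1.42 times the throughput. This is consistent with the Comm share of the total time rising from 26.41% to 48.07%.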
Table 1 Parameter description
Parameter      Description
-np            Total number of MPI processes to run.
-N             Number of processes to run on each server.
-machinefile   File that lists the nodes to be used.
- The dual-node test cases are run in the shared directory. If the PATH and LD_LIBRARY_PATH environment variables have been configured in Configuring the Compilation Environment, you do not need to configure them again.
- If hyper-threading is not enabled, the -np value must be less than or equal to the number of nodes multiplied by the number of CPU cores on each node.
- n1 and n2 are the host names.
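The -np limit in the notes above can be checked before launching a run. The following is a minimal sketch. The function names max_np and check_np are hypothetical, and the value of 96 cores per node is taken from the -N 96 setting in the example commands.

```shell
#!/bin/sh
# max_np NODES CORES_PER_NODE: upper bound for -np when hyper-threading is off.
max_np() {
    echo $(( $1 * $2 ))
}

# check_np NP NODES CORES_PER_NODE: report whether NP is within the bound.
check_np() {
    if [ "$1" -le "$(max_np "$2" "$3")" ]; then
        echo ok
    else
        echo too-large
    fi
}

# Dual-node example from this guide: 192 processes on 2 nodes with 96 cores each.
check_np 192 2 96   # prints ok
```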