
Running and Verification

Procedure

  1. Use PuTTY to log in to the server as the root user.
  2. Create a working directory.
    mkdir -p /path/to/CASE
  3. Go to the working directory and copy the test case and binary file into it.
    cd /path/to/CASE
    cp /path/to/LAMMPS/lammps-5Jun19/bench/in.lj  ./
    cp /path/to/LAMMPS/lammps-5Jun19/src/lmp_mpi  ./
  4. Start the run.
    • Run single-node commands on CentOS.
      mpirun --allow-run-as-root -np 96 --mca btl ^openib  ./lmp_mpi -in in.lj >> ./test_OneNode.log
    • Run single-node commands on openEuler.
      mpirun --allow-run-as-root -np 96  -mca pml ucx -mca btl ^vader,tcp,openib,uct -x UCX_TLS=self,sm --bind-to core --map-by socket --rank-by core -x UCX_BUILTIN_BCAST_ALGORITHM=3 -x UCX_BUILTIN_BARRIER_ALGORITHM=5 -x UCX_BUILTIN_ALLREDUCE_ALGORITHM=10  ./lmp_mpi -in in.lj >> ./test_OneNode.log

      The output is appended to the test_OneNode.log file in the current directory. Check the value of Performance (unit: timesteps/s); a larger value indicates higher performance.

      The following is an example of the test result.

      Performance: 1134386.210 tau/day, 2625.894 timesteps/s
      99.7% CPU use with 96 MPI tasks x no OpenMP threads
      MPI task timing breakdown:
      Section |  min time  |  avg time  |  max time  |%varavg| %total
      ---------------------------------------------------------------
      Pair    | 0.021651   | 0.023624   | 0.025878   |   0.6 | 62.03
      Neigh   | 0.002734   | 0.0029155  | 0.0031491  |   0.2 |  7.66
      Comm    | 0.007533   | 0.010058   | 0.012192   |   1.1 | 26.41
      Output  | 5.6281e-05 | 0.00050681 | 0.00062442 |   0.0 |  1.33
      Modify  | 0.00056003 | 0.00061947 | 0.00070919 |   0.0 |  1.63
      Other   |            | 0.0003584  |            |       |  0.94
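      Rather than scanning the log manually, the Performance value can be pulled out with a one-line filter. In the sketch below, sample_perf.log is a stand-in file created only for illustration; in practice, point the filter at the test_OneNode.log (or test_TwoNodes.log) produced by an actual run.

      ```shell
      # sample_perf.log is a stand-in created here for illustration; replace it
      # with the log file written by a real run.
      printf 'Performance: 1134386.210 tau/day, 2625.894 timesteps/s\n' > sample_perf.log

      # The timesteps/s figure is the 4th whitespace-separated field
      # of the Performance line.
      perf=$(grep '^Performance:' sample_perf.log | awk '{print $4}')
      echo "$perf"
      ```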
    • Run dual-node commands on CentOS.
      mpirun --allow-run-as-root -np 192 -N 96 -x PATH=$PATH -x LD_LIBRARY_PATH=$LD_LIBRARY_PATH  -machinefile machinefile --mca btl ^openib ./lmp_mpi -in in.lj >> ./test_TwoNodes.log

      Add the host names (for example, n1 and n2) of the two specified compute nodes to the machinefile file, one host name per line.
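      A minimal machinefile for the two example hosts can be created as follows; n1 and n2 are placeholders and must be replaced with the actual compute-node host names.

      ```shell
      # One host name per line; replace n1/n2 with the real node names.
      cat > machinefile <<'EOF'
      n1
      n2
      EOF
      ```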

    • Run dual-node commands on openEuler.
      mpirun --allow-run-as-root -np 192 -N 96 -machinefile machinefile  -x PATH=$PATH -x LD_LIBRARY_PATH=$LD_LIBRARY_PATH -mca pml ucx -mca btl ^vader,tcp,openib,uct  --bind-to core  --rank-by core ./lmp_mpi -in in.lj >> ./test_TwoNodes.log

      Add the host names (for example, n1 and n2) of the two specified compute nodes to the machinefile file, one host name per line.

      The output is appended to the test_TwoNodes.log file in the current directory. Check the value of Performance (unit: timesteps/s); a larger value indicates higher performance.

      The following is an example of the test result.

      Performance: 1605508.300 tau/day, 3716.454 timesteps/s
      91.0% CPU use with 192 MPI tasks x no OpenMP threads
      MPI task timing breakdown:
      Section |  min time  |  avg time  |  max time  |%varavg| %total
      ---------------------------------------------------------------
      Pair    | 0.01035    | 0.01174    | 0.013347   |   0.6 | 43.63
      Neigh   | 0.0013662  | 0.0014931  | 0.0016205  |   0.1 |  5.55
      Comm    | 0.01112    | 0.012935   | 0.014469   |   0.6 | 48.07
      Output  | 7.3471e-05 | 0.00010074 | 0.00017455 |   0.0 |  0.37
      Modify  | 0.00023517 | 0.00032949 | 0.00040742 |   0.0 |  1.22
      Other   |            | 0.0003095  |            |       |  1.15
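      As a rough sanity check, the two sample Performance figures above (2625.894 timesteps/s on one node, 3716.454 timesteps/s on two) imply a two-node scaling efficiency of about 71%. The arithmetic can be reproduced with awk:

      ```shell
      # Parallel efficiency on 2 nodes = (dual-node rate / single-node rate) / 2.
      # The rates are the sample timesteps/s values shown above.
      awk 'BEGIN { printf "%.2f\n", (3716.454 / 2625.894) / 2 }'
      ```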
      Table 1 Parameter description

      Parameter    | Description
      ------------ | ---------------------------------------------------
      -np          | Total number of MPI processes to run.
      -N           | Number of processes to run on each server (node).
      -machinefile | File that lists the host names of the nodes to use.

      • The dual-node test cases must be run from a shared directory. If the PATH and LD_LIBRARY_PATH environment variables were already configured in Configuring the Compilation Environment, you do not need to configure them again.
      • If hyper-threading is not enabled, the -np value must be less than or equal to the number of nodes multiplied by the number of CPU cores on each node.
      • n1 and n2 are the host names of the compute nodes.
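      The hyper-threading constraint above can be verified with simple shell arithmetic. The values below match the dual-node commands in this procedure (2 nodes with 96 physical cores each):

      ```shell
      # Without hyper-threading, -np must not exceed nodes * cores per node.
      nodes=2
      cores_per_node=96
      max_np=$(( nodes * cores_per_node ))
      echo "$max_np"
      ```

      This yields the -np 192 used in the dual-node commands; a larger -np would oversubscribe the cores.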