
Coll Mode

When using the UCG function of Hyper MPI in coll mode, you do not need to add the --mca coll ^ucg parameter. Currently, coll mode supports the following MPI collective operations: Allreduce, Bcast, Barrier, Allgatherv, and Scatterv.

In coll mode, the default algorithm or a specified algorithm can be used.

  • The default algorithm mode automatically selects an algorithm based on the packet length, the number of processes per node (PPN), and the number of nodes.

  • To specify an algorithm, you need to set the relevant command parameters.

Default Algorithm Mode

In default algorithm mode, Hyper MPI selects algorithms based on the packet lengths, PPNs, and node quantities. You can directly run the mpirun command without adding extra parameters.

  • The default algorithm for Allreduce, Bcast, Allgatherv, and Scatterv collective operations is selected based on the packet length, PPN, and node quantity.
  • The default algorithm for Barrier collective operations is selected based on the PPN and node quantity.

The following command is an example:

mpirun -np 16 -N 2 --hostfile hf test_case
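The selection criteria above can be sketched as a decision table. The thresholds and choices below are invented purely for illustration; Hyper MPI's actual tuning data is internal to the library, and the algorithm names are taken from Table 1:

```python
def pick_allreduce_algorithm(msg_bytes: int, ppn: int, nnodes: int) -> str:
    """Hypothetical default-mode selection for Allreduce.

    The thresholds below are invented for this sketch; Hyper MPI's real
    selection logic and tuning values are not documented here.
    """
    if nnodes == 1:
        # Single node: no inter-node communication is needed.
        return "Recursive"
    if msg_bytes <= 4096:
        # Small packets are latency-bound; hierarchical trees help.
        return "Node-aware Recursive+Binomial"
    if ppn >= 32:
        # Many processes per node: socket-aware variants reduce contention.
        return "Socket-aware Rabenseifner"
    # Large packets are bandwidth-bound; Ring pipelines well.
    return "Ring"
```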

In some scenarios, algorithms might be switched. For details, see Feature Support and Rollback.

Specified Algorithm Mode

  • Method 1: Specify an algorithm using CLI parameters.
    1. Use PuTTY to log in to a job execution node as a common Hyper MPI user, for example, hmpi_user.
    2. Add the following parameters to the mpirun command:

      -x UCG_PLANC_UCX_ALLREDUCE_ATTR=I:nS:200R:0-

  • Method 2: Select an algorithm by setting environment variables.
    1. Use PuTTY to log in to a job execution node as a common Hyper MPI user, for example, hmpi_user.
    2. Write the following command to the ~/.bashrc file on all nodes:

      export UCG_PLANC_UCX_ALLREDUCE_ATTR=I:nS:200R:0-

    3. Run the following command to make the environment variables take effect:

      source ~/.bashrc

    4. If the environment variables are not required, delete the preceding command from ~/.bashrc and run the following command to delete the environment variables that have taken effect:

      unset UCG_PLANC_UCX_ALLREDUCE_ATTR
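As a concrete illustration of Method 1, the attribute can be appended to the launch command from the earlier example. The hostfile hf and binary test_case are placeholders, and algorithm 7 is chosen only for illustration:

```shell
# Sketch: select Allreduce algorithm 7 with score 200 for all message sizes.
mpirun -np 16 -N 2 --hostfile hf \
    -x UCG_PLANC_UCX_ALLREDUCE_ATTR=I:7S:200R:0- \
    test_case
```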

  • UCG_PLANC_UCX_ALLREDUCE_ATTR indicates the Allreduce algorithm parameter. For details about other optional parameters, see Table 6.
  • I introduces the algorithm ID; n is the sequence number of the algorithm to use. If the specified sequence number is not within the valid range, the default algorithm is executed.
  • S indicates the algorithm score. This parameter is optional. The default value is INT_MAX.
  • R indicates the message size range in which the algorithm applies. This parameter is optional. The default value is [0, max).
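To make the field layout concrete, here is a small parser sketch for this attribute syntax. It is illustrative only; the real Hyper MPI parser may accept additional forms:

```python
import re

def parse_attr(attr: str) -> dict:
    """Parse an attribute string such as 'I:7S:200R:0-1024'.

    Illustrative only: the I/S/R field names follow the convention
    described above; Hyper MPI's actual parser may differ.
    """
    m = re.fullmatch(
        r"I:(?P<id>\d+)"                     # algorithm sequence number (required)
        r"(?:S:(?P<score>\d+))?"             # optional algorithm score
        r"(?:R:(?P<lo>\d+)-(?P<hi>\d*))?",   # optional message-size range; open upper bound allowed
        attr,
    )
    if m is None:
        raise ValueError(f"malformed attribute: {attr!r}")
    return {
        "id": int(m["id"]),
        "score": int(m["score"]) if m["score"] else None,
        # An empty upper bound ('0-') means [lo, max), represented as None here.
        "range": (int(m["lo"]), int(m["hi"]) if m["hi"] else None) if m["lo"] else None,
    }
```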

Table 1, Table 2, Table 3, Table 4, and Table 5 show the mappings between algorithm sequence numbers and algorithms.

Table 1 Allreduce algorithms

No. | Algorithm | Remarks
1 | Recursive | Usable after Hyper MPI is installed using source code.
2 | Node-aware Recursive+Binomial | Usable after Hyper MPI is installed using source code.
3 | Socket-aware Recursive+Binomial | Usable after Hyper MPI is installed using source code.
4 | Ring | Usable after Hyper MPI is installed using source code.
5 | Node-aware Recursive+K-nomial | Usable after Hyper MPI is installed using source code.
6 | Socket-aware Recursive+K-nomial | Usable after Hyper MPI is installed using source code.
7 | Node-aware K-nomial+K-nomial | Usable after Hyper MPI is installed using source code.
8 | Socket-aware K-nomial+K-nomial | Usable after Hyper MPI is installed using source code.
12 | Rabenseifner | Usable after Hyper MPI is installed using source code.
13 | Node-aware Rabenseifner | Usable after Hyper MPI is installed using source code.
14 | Socket-aware Rabenseifner | Usable after Hyper MPI is installed using source code.

Table 2 Bcast algorithms

No. | Algorithm | Remarks
2 | Node-aware Binomial+Binomial | Usable after Hyper MPI is installed using source code.
3 | Node-aware K-nomial+Binomial | Usable after Hyper MPI is installed using source code.
4 | Node-aware K-nomial+K-nomial | Usable after Hyper MPI is installed using source code.
6 | Ring | Usable after Hyper MPI is installed using source code.

Table 3 Barrier algorithms

No. | Algorithm | Remarks
1 | Recursive | Usable after Hyper MPI is installed using source code.
2 | Node-aware Recursive+Binomial | Usable after Hyper MPI is installed using source code.
3 | Socket-aware Recursive+Binomial | Usable after Hyper MPI is installed using source code.
4 | Node-aware Recursive+K-nomial | Usable after Hyper MPI is installed using source code.
5 | Socket-aware Recursive+K-nomial | Usable after Hyper MPI is installed using source code.
6 | Node-aware K-nomial+K-nomial | Usable after Hyper MPI is installed using source code.
7 | Socket-aware K-nomial+K-nomial | Usable after Hyper MPI is installed using source code.

Table 4 Scatterv algorithms

No. | Algorithm | Remarks
1 | Linear | Usable after Hyper MPI is installed using source code.
2 | K-nomial tree | Usable after Hyper MPI is installed using source code.

Table 5 Allgatherv algorithms

No. | Algorithm | Remarks
1 | Neighbor exchange | Usable after Hyper MPI is installed using source code.
2 | Ring | Usable after Hyper MPI is installed using source code.
3 | Ring-HPL | Usable after Hyper MPI is installed using source code.
4 | Linear | Usable after Hyper MPI is installed using source code.
5 | Bruck | Usable after Hyper MPI is installed using source code.

Hyper MPI is compatible with command parameters of Open MPI. For details, see https://www.open-mpi.org/doc/current/man1/mpirun.1.php. For details about new command parameters of Hyper MPI, see Table 6.

Table 6 New command parameters of Hyper MPI

Parameter | Description
-x UCG_PLANC_UCX_ALLREDUCE_FANOUT_INTER_DEGREE= | Fan-out value between K-tree nodes of Allreduce.
-x UCG_PLANC_UCX_ALLREDUCE_FANIN_INTER_DEGREE= | Fan-in value between K-tree nodes of Allreduce.
-x UCG_PLANC_UCX_ALLREDUCE_FANOUT_INTRA_DEGREE= | Fan-out value of the Allreduce K-tree node.
-x UCG_PLANC_UCX_ALLREDUCE_FANIN_INTRA_DEGREE= | Fan-in value of the Allreduce K-tree node.
-x UCG_PLANC_UCX_BARRIER_FANOUT_INTER_DEGREE= | Fan-out value between K-tree nodes of Barrier.
-x UCG_PLANC_UCX_BARRIER_FANIN_INTER_DEGREE= | Fan-in value between K-tree nodes of Barrier.
-x UCG_PLANC_UCX_BARRIER_FANOUT_INTRA_DEGREE= | Fan-out value of the Barrier K-tree node.
-x UCG_PLANC_UCX_BARRIER_FANIN_INTRA_DEGREE= | Fan-in value of the Barrier K-tree node.
-x UCG_PLANC_UCX_BCAST_ATTR= | Bcast algorithm parameter. The number following the equal sign (=) is the algorithm sequence number. For example, -x UCG_PLANC_UCX_BCAST_ATTR=I:2 selects Bcast algorithm 2.
-x UCG_PLANC_UCX_ALLREDUCE_ATTR= | Allreduce algorithm parameter. For example, -x UCG_PLANC_UCX_ALLREDUCE_ATTR=I:7 selects Allreduce algorithm 7.
-x UCG_PLANC_UCX_BARRIER_ATTR= | Barrier algorithm parameter. For example, -x UCG_PLANC_UCX_BARRIER_ATTR=I:4 selects Barrier algorithm 4.
-x UCG_PLANC_UCX_ALLGATHERV_ATTR= | Allgatherv algorithm parameter. For example, -x UCG_PLANC_UCX_ALLGATHERV_ATTR=I:3 selects Allgatherv algorithm 3.
-x UCG_PLANC_UCX_SCATTERV_ATTR= | Scatterv algorithm parameter. For example, -x UCG_PLANC_UCX_SCATTERV_ATTR=I:1 selects Scatterv algorithm 1.
-x UCG_PLANC_UCX_BCAST_NA_KNTREE_INTER_DEGREE= | K value between Bcast node-aware Kn-tree algorithm nodes. The default value is 8.
-x UCG_PLANC_UCX_BCAST_NA_KNTREE_INTRA_DEGREE= | K value of the Bcast node-aware Kn-tree algorithm node. The default value is 2.
-x UCG_PLANC_UCX_REDUCE_KNTREE_DEGREE= | K value of the Reduce Kn-tree algorithm. The default value is 2.
-x UCG_PLANC_UCX_SCATTERV_KNTREE_DEGREE= | K value of the Scatterv Kn-tree algorithm. The default value is 2.
-x UCG_PLANC_UCX_SCATTERV_MIN_SEND_BATCH= | Minimum boundary value of the TX batch processing mode of the Scatterv linear algorithm. The default value is auto.
-x UCG_PLANC_UCX_SCATTERV_MAX_SEND_BATCH= | Maximum boundary value of the TX batch processing mode of the Scatterv linear algorithm. The default value is auto.
-x UCG_PLANC_UCX_NPOLLS= | Number of UCP progress polling cycles for the P2P request test. The default value is 3.
-x UCG_PLANC_UCX_USE_OOB= | Whether to reuse the OMPI PML UCX link. try attempts to enable reuse, yes forcibly enables it, and no disables it. The default value is try.
-x UCG_USE_MT_MUTEX= | Whether UCG uses a mutex to support multiple threads. y uses a mutex; n uses the default spinlock. The default value is n.
-x UCG_LOG_LEVEL= | UCG log level. The value can be fatal, error, warn, info, debug, or trace. The default value is warn.
--mca coll_ucg_priority | Priority for Hyper MPI to call the UCG module. A larger value indicates a higher priority. The default value is 90. For example, --mca coll_ucg_priority 100 sets the UCG priority to 100.
--mca coll_ucg_verbosity | Log verbosity of the UCG component. The default value is 2.
--mca coll_ucg_max_rcache_size | Request cache size of the COLL UCG component. The default value is 0, which disables the cache. For example, the value 1024 enables a cache of 1,024 requests.
--mca coll_ucg_disable_coll | Disables some collective operations at run time, for example, --mca coll_ucg_disable_coll barrier,bcast. Use commas (,) to separate multiple collective operations.
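Several of these MCA parameters can be combined in one launch command. The following sketch reuses the placeholder hostfile hf and binary test_case from the earlier example:

```shell
# Sketch: raise the UCG priority and disable UCG for Barrier and Bcast.
mpirun -np 16 -N 2 --hostfile hf \
    --mca coll_ucg_priority 100 \
    --mca coll_ucg_disable_coll barrier,bcast \
    test_case
```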