Product Overview
Concepts
Hyper MPI is an implementation of the Message Passing Interface (MPI), a parallel computing communication interface that supports multi-language programming. It is developed based on Open MPI and the Open UCX point-to-point (P2P) communication framework, integrates the UCX COLL and UCG frameworks for collective communication, and implements collective operation acceleration algorithms in these frameworks. Hyper MPI features outstanding performance, massive processing capability, and portability. It applies to manufacturing, meteorology, and government HPC scenarios. With Hyper MPI, it is promising to build a high-performance computing ecosystem based on Huawei-developed Kunpeng servers in the long term.
Benefits
MPI defines a large number of collective communication functions; the MPI 3.1 standard alone defines more than 30. Among them, MPI_Allreduce, MPI_Bcast, and MPI_Barrier play an important role and are the most frequently invoked. In many applications, most MPI collective operations use small-packet communication.
- Optimum cost effectiveness
Hyper MPI has optimized the algorithms and topology awareness for the preceding three collective communication functions. Compared with peer offerings, these optimizations give Hyper MPI higher MPI_Bcast performance and comparable MPI_Allreduce performance for small packets.
- Open ecosystem
Currently, the industry-leading MPI libraries adopt closed-source architectures and do not support the Kunpeng ecosystem. Hyper MPI, however, runs not only on x86 servers but also on servers and clusters powered by Kunpeng processors. Compared with x86-based servers, Kunpeng-based servers have more cores per node, a simpler instruction set, lower power consumption, and lower costs. This helps build a computing ecosystem based on Kunpeng chips.
Functions
Hyper MPI, which is based on Open MPI, integrates the Open UCX P2P communication framework and UCX COLL collective communication optimization framework, and implements the optimization algorithm acceleration library in the UCX COLL framework. It helps Huawei gain the competitive edge of MPI collective communication. Hyper MPI supports the following collective communication operations:
- MPI_Allreduce collective operations
- Prototype
int MPI_Allreduce(const void *sendbuf, void *recvbuf, int count, MPI_Datatype datatype, MPI_Op op, MPI_Comm comm)
- Function description
MPI_Allreduce is an MPI group reduction function. MPI_Allreduce performs a mathematical operation (for example, addition or multiplication) or logical operation (for example, AND or OR) on the send buffer of each independent process, and then synchronizes the result to the receive buffer of all processes in the communication domain.
- MPI_Bcast collective operations
- Prototype
int MPI_Bcast(void *buffer, int count, MPI_Datatype datatype, int root, MPI_Comm comm)
- Function description
MPI_Bcast is an MPI broadcast handling function. The root process sends data in the buffer to all the other processes in the communication domain so that all processes obtain the same data.
- MPI_Barrier collective operations
- Prototype
int MPI_Barrier(MPI_Comm comm)
- Function description
MPI_Barrier is an MPI synchronization function. Each process that calls MPI_Barrier blocks until all processes in the communication domain have entered the call, after which all processes continue.
The Hyper MPI collective communication algorithms support a maximum data packet size. If a packet exceeds this limit, an error message is displayed and the algorithm exits. In this case, you need to manually switch to the native Open MPI algorithm and retry.