OmniShuffle

OmniShuffle is the shuffle acceleration feature.

OmniShuffle runs in big data clusters in the customer's data center as a performance acceleration component of the Spark big data engine. It uses unified memory pool addressing, data exchange based on memory semantics, and converged shuffle to reduce drive I/O overhead, speed up data analysis, and improve cluster resource utilization. OmniShuffle uses the plugin mechanism provided by Spark to implement the Shuffle Manager and Broadcast Manager plugin interfaces, replacing Spark's native shuffle and broadcast in a non-intrusive manner.
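As a rough illustration of the plugin mechanism: Spark chooses its shuffle implementation through the `spark.shuffle.manager` property, which accepts a fully qualified class name, and the plugin JAR is typically made visible through `spark.driver.extraClassPath` and `spark.executor.extraClassPath`. The sketch below assembles a `spark-submit` command line that sets these properties. The class name, the JAR path, and the helper `build_submit_args` are placeholders chosen for this example, not the actual OmniShuffle values; take the real ones from the OmniShuffle installation guide.

```python
from typing import List

# Placeholder values: NOT the real OmniShuffle class name or JAR location.
PLUGIN_CLASS = "com.example.shuffle.PlaceholderShuffleManager"
PLUGIN_JAR = "/opt/example/placeholder-shuffle.jar"


def build_submit_args(app_jar: str,
                      plugin_class: str = PLUGIN_CLASS,
                      plugin_jar: str = PLUGIN_JAR) -> List[str]:
    """Build spark-submit arguments that swap in a shuffle manager plugin."""
    return [
        "spark-submit",
        # Make the plugin classes visible to the driver and to every executor.
        "--conf", f"spark.driver.extraClassPath={plugin_jar}",
        "--conf", f"spark.executor.extraClassPath={plugin_jar}",
        # Tell Spark which ShuffleManager implementation to instantiate.
        "--conf", f"spark.shuffle.manager={plugin_class}",
        app_jar,
    ]


if __name__ == "__main__":
    print(" ".join(build_submit_args("my-app.jar")))
```

Because the plugin is selected purely through configuration, the application JAR itself does not change, which is what makes the replacement non-intrusive.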

OmniShuffle enables in-memory shuffle by implementing the Shuffle Manager plugin interface. That is, the shuffle process is completed in the memory pool based on memory semantics, which reduces the amount of shuffle data flushed to disks. This lessens the time and computing overhead of flushing and reading data, serialization and deserialization, and compression and decompression. OmniShuffle also implements the Broadcast Manager interface to broadcast variables through memory pool sharing, which improves the transmission efficiency of broadcast variables among executors. In addition, OmniShuffle supports two network modes: Remote Direct Memory Access (RDMA) and TCP. Compared with TCP, RDMA transfers data between nodes more efficiently and consumes less computing power.
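To make the idea of an in-memory shuffle concrete, here is a toy, single-process model. It is a conceptual sketch only and does not reflect how OmniShuffle is implemented: the class `InMemoryShufflePool`, its methods, and the CRC32-based partitioning are all assumptions made for this example. Map tasks write partitioned records into a shared in-memory structure, and reduce tasks read their partitions back from it, with no intermediate files.

```python
import zlib
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Record = Tuple[str, int]


class InMemoryShufflePool:
    """Toy stand-in for a shared memory pool; blocks are keyed by (map_id, reduce_id)."""

    def __init__(self, num_reducers: int) -> None:
        self.num_reducers = num_reducers
        self._blocks: Dict[Tuple[int, int], List[Record]] = defaultdict(list)

    def partition(self, key: str) -> int:
        # Deterministic hash partitioning (Python's built-in hash() is salted per process).
        return zlib.crc32(key.encode("utf-8")) % self.num_reducers

    def write(self, map_id: int, records: Iterable[Record]) -> None:
        # Map side: route each record to its reducer's block; nothing touches disk.
        for key, value in records:
            self._blocks[(map_id, self.partition(key))].append((key, value))

    def read(self, reduce_id: int) -> List[Record]:
        # Reduce side: collect this reducer's blocks from every map task.
        out: List[Record] = []
        for (_, rid), block in sorted(self._blocks.items()):
            if rid == reduce_id:
                out.extend(block)
        return out


def word_count(pool: InMemoryShufflePool, splits: List[List[str]]) -> Dict[str, int]:
    """Run a word count whose shuffle stage goes through the in-memory pool."""
    for map_id, words in enumerate(splits):
        pool.write(map_id, ((w, 1) for w in words))
    totals: Dict[str, int] = {}
    for reduce_id in range(pool.num_reducers):
        for key, value in pool.read(reduce_id):
            totals[key] = totals.get(key, 0) + value
    return totals


if __name__ == "__main__":
    print(word_count(InMemoryShufflePool(2), [["a", "b", "a"], ["b", "c"]]))
```

In a real cluster the pool would span nodes and be reached over RDMA or TCP; the toy keeps everything in one process to show only the write-then-read data flow.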

OmniShuffle automatically adjusts the degree of parallelism of Spark SQL jobs in real time based on historical data. This eliminates the need to tune parallelism manually and reduces spills in the shuffle-reduce process by 90%. As a result, OmniShuffle speeds up big data cluster jobs and increases job throughput.
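The following sketch shows one simple way that historical data could drive a parallelism setting. It is not OmniShuffle's actual algorithm, which this document does not describe: the function name, the median-based estimate, and the 128 MiB target are assumptions for illustration. The idea is that if each reduce partition is sized to fit comfortably in memory, fewer partitions need to spill. In stock Spark SQL, the resulting number would correspond to the `spark.sql.shuffle.partitions` property.

```python
import math
from statistics import median
from typing import Sequence

MIB = 1024 * 1024


def suggest_shuffle_partitions(historical_shuffle_bytes: Sequence[int],
                               target_partition_bytes: int = 128 * MIB,
                               min_partitions: int = 1,
                               max_partitions: int = 10_000) -> int:
    """Suggest a partition count from shuffle sizes observed in past runs.

    Hypothetical heuristic: take the median historical shuffle size and
    divide it by a target per-partition size, then clamp the result.
    """
    if not historical_shuffle_bytes:
        raise ValueError("at least one historical observation is required")
    typical = median(historical_shuffle_bytes)
    wanted = math.ceil(typical / target_partition_bytes)
    return max(min_partitions, min(max_partitions, wanted))


if __name__ == "__main__":
    gib = 1024 * MIB
    history = [10 * gib, 12 * gib, 11 * gib]  # shuffle bytes from three earlier runs
    print(suggest_shuffle_partitions(history))
```

A production system would likely weigh more signals, such as skew, executor memory, and spill metrics, but the sketch captures the basic feedback loop from past runs to the next job's parallelism.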

Overall Solution

OmniShuffle implements Spark's Shuffle Manager and Broadcast Manager plugin interfaces on top of a shared memory pool. Figure 1 shows the overall solution architecture of OmniShuffle. Table 1 describes the subsystems.

Figure 1 Logical architecture of OmniShuffle

**Table 1** Subsystems
Subsystem	Description
Memory pool kit	Provides the distributed shared memory infrastructure and basic memory semantics.
Metadata	Stores basic information about shuffle data and nodes.
Shuffle semantics	Provides APIs for implementing Spark shuffle semantics.
Broadcast semantics	Provides APIs for implementing Spark broadcast semantics.

Service Process

OmniShuffle implements the Spark Shuffle Manager interface to achieve in-memory shuffle. Figure 2 shows the OmniShuffle service process.

Figure 2 Service process

After connecting OmniShuffle to Spark, you can use the Spark CLI to view the running status of the cluster. Administrators and O&M personnel can also view the cluster status on the OmniShuffle CLI.
