Related Concepts
Understand the following concepts before using OmniShuffle.
- In-memory shuffle: During shuffle, data is cached in memory instead of being written directly to drives. This reduces drive I/O overhead and improves data processing efficiency.
- OCKD process: After OmniShuffle is installed, you can use the OCKD process to start or stop OmniShuffle.
- Remote Shuffle Service (RSS): The shuffle service is deployed on a node outside the Spark cluster, and data is shuffled on that remote node.
- External Shuffle Service (ESS): The shuffle service is deployed on the compute nodes of the Spark cluster, and data is shuffled on those compute nodes.
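To make the in-memory shuffle idea concrete, the following is a minimal conceptual sketch in Python. It is not OmniShuffle's implementation: the function names `partition_for` and `shuffle_in_memory`, the CRC32-based partitioning, and the sample records are all assumptions chosen for illustration. It only shows the general pattern of grouping records into per-partition buffers held in memory rather than writing them to drives.

```python
import zlib
from collections import defaultdict


def partition_for(key: str, num_partitions: int) -> int:
    # Hypothetical helper: a stable hash so that the same key
    # always maps to the same partition across runs.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def shuffle_in_memory(records, num_partitions):
    # Hypothetical helper: group (key, value) records into
    # per-partition buffers kept in memory; nothing is written to drives.
    buffers = defaultdict(list)
    for key, value in records:
        buffers[partition_for(key, num_partitions)].append((key, value))
    return buffers


# Illustrative map-side output (made-up sample data).
records = [("a", 1), ("b", 2), ("a", 3), ("c", 4)]
buffers = shuffle_in_memory(records, num_partitions=2)

# A reduce-side task would then read only its own partition's buffer.
for partition_id in sorted(buffers):
    print(partition_id, buffers[partition_id])
```

In a real shuffle service, the buffers would be bounded and shared across processes or nodes; the sketch leaves that out to keep the partitioning step visible.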