Change Description
This release adds a Remote Shuffle Service (RSS) mode alongside the existing External Shuffle Service (ESS) mode. RSS is designed for architectures in which storage and compute are decoupled. BoostRSS optimizes the Spark shuffle write process by saving the data generated in the Map stage to RSS nodes, so that the many small files and small I/O operations of the original process are aggregated into large files and large sequential I/O operations. This significantly improves overall disk read and write efficiency and reduces the I/O load on compute nodes, which in turn improves MapReduce task execution performance.
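The aggregation effect described above can be illustrated with a small, self-contained sketch. This is not BoostRSS code: the function names, the block labels, and the task counts (4 map tasks, 3 reduce partitions) are hypothetical and chosen only to show how grouping shuffle output by reduce partition turns many small blocks into a few large files.

```python
from collections import defaultdict


def shuffle_write_local(num_mappers, num_reducers):
    """Model of local shuffle output: one small block per
    (map task, reduce partition) pair. Names are illustrative only."""
    return {
        (m, r): f"m{m}-r{r}"
        for m in range(num_mappers)
        for r in range(num_reducers)
    }


def shuffle_write_remote(blocks):
    """Model of remote aggregation: blocks bound for the same reduce
    partition are appended to a single file for that partition."""
    aggregated = defaultdict(list)
    for (_mapper_id, reducer_id), block in sorted(blocks.items()):
        aggregated[reducer_id].append(block)
    return dict(aggregated)


blocks = shuffle_write_local(num_mappers=4, num_reducers=3)
files = shuffle_write_remote(blocks)

print(len(blocks))  # number of small blocks: mappers x reducers
print(len(files))   # number of aggregated files: one per reduce partition
```

In this model the number of units a reducer must read drops from one block per map task to a single aggregated file, which is the reason small random I/O can be replaced by large sequential I/O.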
New Features
- Added support for the RSS mode, which aggregates shuffle data on RSS nodes for more efficient I/O.
- Added support for replicas, which improve system reliability when faults or errors occur.
- Added support for traffic control and load balancing.
Modified Features
None
Removed Features
None
Parent topic: V1.5.0