Application Scenarios
Data volumes are soaring as Internet big data applications, cloud-native services, and AI applications grow rapidly. The traditional storage-compute coupled architecture is difficult to scale out and hampers data sharing in the cloud era. To address this pain point, the storage-compute decoupled architecture has emerged as a feasible alternative. In the decoupled architecture, however, applications on the compute side must access data on the storage side over the network. This cross-network access reduces service I/O performance. Worse still, large-scale deployment of compute nodes lowers resource utilization on the compute side.
The BoostIO acceleration kit leverages the Huawei computing platform to build high-performance distributed read and write caches on the compute side. Building on the extensive application ecosystem and broad northbound compatibility of JuiceFS, an open-source distributed file system, BoostIO effectively mitigates the performance loss inherent in the storage-compute decoupled architecture for big data and AI applications.
In the storage-compute decoupled architecture, the Spark big data engine suffers from slow dataset loading, and large language model (LLM) workloads in AI converged computing suffer from both slow dataset loading and slow checkpoint writing. BoostIO removes these bottlenecks in the following ways:
- BoostIO builds a multi-tier distributed write cache from memory media and high-speed drives on the compute side. It uses a high-speed remote direct memory access (RDMA) network and a multi-copy redundancy mechanism to ensure high data reliability. By retaining application I/Os on the compute side, BoostIO reduces data write latency.
- BoostIO sets up a read cache and a write cache that are independent of each other. This design allows each cache to have its own configuration, eviction policy, and resource allocation.
- BoostIO combines the distributed read cache with intelligent data prefetching and hot/cold data identification, so that hot and warm data is cached in memory and on high-speed drives on the compute side, while cold data remains in the back-end large-capacity storage cluster. This mechanism increases the cache hit ratio and shortens data read latency.
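The read-path behavior described above (hot data in memory, warm data on high-speed drives, cold data in back-end storage, with promotion driven by access frequency) can be sketched as a minimal two-tier cache. This is an illustrative simplification only; the class, tier names, capacities, and promotion threshold below are hypothetical and do not reflect BoostIO's actual implementation or API:

```python
from collections import OrderedDict

class TieredReadCache:
    """Illustrative two-tier read cache: a memory tier for hot data,
    a 'drive' tier for warm data, and a backing store for cold data.
    All names and policies here are hypothetical."""

    def __init__(self, fetch_from_storage, mem_capacity=2,
                 drive_capacity=4, hot_threshold=2):
        self.fetch = fetch_from_storage   # reads from back-end storage
        self.mem = OrderedDict()          # hot tier, kept in LRU order
        self.drive = OrderedDict()        # warm tier, kept in LRU order
        self.mem_cap = mem_capacity
        self.drive_cap = drive_capacity
        self.hits = {}                    # per-key access counts
        self.hot_threshold = hot_threshold

    def get(self, key):
        self.hits[key] = self.hits.get(key, 0) + 1
        if key in self.mem:               # hot hit: refresh LRU position
            self.mem.move_to_end(key)
            return self.mem[key]
        if key in self.drive:             # warm hit: promote if now hot
            value = self.drive[key]
            if self.hits[key] >= self.hot_threshold:
                del self.drive[key]
                self._put_mem(key, value)
            else:
                self.drive.move_to_end(key)
            return value
        value = self.fetch(key)           # cold miss: go to storage
        self._put_drive(key, value)
        return value

    def _put_mem(self, key, value):
        self.mem[key] = value
        if len(self.mem) > self.mem_cap:  # evict LRU entry to warm tier
            old_key, old_val = self.mem.popitem(last=False)
            self._put_drive(old_key, old_val)

    def _put_drive(self, key, value):
        self.drive[key] = value
        self.drive.move_to_end(key)
        if len(self.drive) > self.drive_cap:
            self.drive.popitem(last=False)  # cold data falls back to storage
```

A repeatedly accessed key is first cached on the warm tier after a storage fetch, then promoted to the memory tier once its access count crosses the threshold, so subsequent reads never cross the network; a production cache would add prefetching, TTLs, and byte-sized (rather than entry-count) capacities.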
In conclusion, BoostIO alleviates performance bottlenecks in big data and AI converged computing and improves end-to-end application performance.