Introduction

In storage I/O-intensive scenarios, such as distributed storage and big data components like Spark and HBase, the performance of accessing I/O storage devices has a significant impact on overall service performance. Users are also concerned about the cost per gigabyte of storage devices. The trade-off between storage capacity and I/O performance is expected to persist for a long time. A common practice is to use small-capacity, high-speed storage media as cache drives. Cache drives store data that is predicted to be accessed again, so that later requests can be served directly from the high-speed cache, which improves overall storage I/O performance.

Figure 1 and Figure 2 illustrate the smart prefetch software architecture for distributed storage and for big data, respectively.

Figure 1 Smart prefetch software architecture for distributed storage
Figure 2 Smart prefetch software architecture for big data
  • I/O storage devices include hard disk drives (HDDs) and solid-state drives (SSDs).
  • Performance here refers to the bandwidth, latency, and number of operations per unit time when accessing I/O storage devices.
  • Small-capacity, high-speed storage media can be random access memory (RAM) drives or Non-Volatile Memory Express (NVMe) SSDs.

The smart prefetch function uses high-speed cache drives and efficient prefetch algorithms to improve the storage I/O performance, thus improving the overall system performance in I/O-intensive scenarios.
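
The general idea of combining a small fast cache with prefetching can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions, not the smart prefetch algorithm itself: the class name `ReadAheadCache`, its parameters, and the simple "sequential access triggers read-ahead" rule are all hypothetical and chosen for brevity. The `backing_read` callable stands in for a slow device such as an HDD.

```python
from collections import OrderedDict


class ReadAheadCache:
    """Toy LRU block cache with sequential read-ahead (illustrative only)."""

    def __init__(self, backing_read, capacity=8, readahead=2):
        self.backing_read = backing_read  # callable: block number -> data (slow device)
        self.capacity = capacity          # maximum number of cached blocks
        self.readahead = readahead        # blocks to prefetch on sequential access
        self.blocks = OrderedDict()       # block number -> data, oldest entry first
        self.last_block = None            # previously requested block number
        self.hits = 0
        self.misses = 0

    def _insert(self, block, data):
        # Store the block as most recently used, then evict the oldest entries.
        self.blocks[block] = data
        self.blocks.move_to_end(block)
        while len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)

    def read(self, block):
        if block in self.blocks:
            # Cache hit: serve from the fast medium and refresh recency.
            self.hits += 1
            self.blocks.move_to_end(block)
            data = self.blocks[block]
        else:
            # Cache miss: fetch from the slow device and cache the result.
            self.misses += 1
            data = self.backing_read(block)
            self._insert(block, data)

        # Hypothetical prediction rule: if this request directly follows the
        # previous one, assume a sequential scan and prefetch the next blocks.
        if self.last_block is not None and block == self.last_block + 1:
            for nxt in range(block + 1, block + 1 + self.readahead):
                if nxt not in self.blocks:
                    self._insert(nxt, self.backing_read(nxt))

        self.last_block = block
        return data


if __name__ == "__main__":
    cache = ReadAheadCache(lambda b: f"data-{b}", capacity=8, readahead=2)
    for b in range(6):
        cache.read(b)
    print(cache.hits, cache.misses)  # prints: 4 2
```

In this sketch, a sequential scan of blocks 0 to 5 misses only on the first two reads. Once the sequential pattern is detected, the remaining blocks are already in the cache when they are requested. A real prefetch engine would use far more sophisticated prediction and would operate on actual block devices rather than an in-memory dictionary.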

The smart prefetch function consists of the following modules:

  1. Huawei smart prefetch driver in kernel mode: bcache
  2. Huawei smart prefetch engine framework in user mode: acache_client
  3. Huawei smart prefetch engine algorithm in user mode: hcache
  4. bcache configuration tool: bcache-tools