Introduction

A distributed storage system generally consists of multiple nodes, each containing multiple drives. The drives jointly store data through logical data division. Because sector conditions and the external environment differ from drive to drive, the time needed to process an I/O request can vary across drives. A drive that responds much more slowly than its peers delays I/O responses and can even interrupt services, which degrades the performance of the whole cluster. If slow drives are detected early, while services are running, they can be isolated from services to reduce long-tail latency and improve cluster stability. The slow drive detection feature collects the w_await metric (the average time, in milliseconds, for write requests to be served, as reported by iostat) of the drives in the system, identifies and processes abnormal drive data, and confirms the drive status.

Figure 1 Working principle of slow HDD/SSD detection
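
This introduction does not describe how abnormal w_await data is identified. As a rough illustration only, the following Python sketch compares each drive's average w_await with the median of its peers and flags clear outliers. The function name find_slow_drives, the parameters threshold and min_gap_ms, and the median-based rule itself are assumptions made for this example, not the feature's actual algorithm or interface.

```python
from statistics import median


def find_slow_drives(w_await_ms, threshold=3.0, min_gap_ms=5.0):
    """Return the drives whose average w_await is an outlier among peers.

    w_await_ms maps a drive name to a list of w_await samples (ms).
    All names and default values here are hypothetical.
    """
    # Average each drive's samples over the collection window.
    averages = {d: sum(s) / len(s) for d, s in w_await_ms.items() if s}
    if len(averages) < 3:
        return []  # too few peers for a meaningful comparison

    center = median(averages.values())
    # Median absolute deviation: a spread estimate that one slow
    # drive barely influences.
    mad = median(abs(v - center) for v in averages.values())
    # Require a minimum absolute gap so that tiny differences between
    # healthy drives are not reported.
    limit = center + max(threshold * mad, min_gap_ms)
    return sorted(d for d, v in averages.items() if v > limit)


samples = {
    "sda": [4.0, 5.0, 6.0],
    "sdb": [5.0, 5.0, 5.0],
    "sdc": [6.0, 4.0, 5.0],
    "sdd": [40.0, 55.0, 60.0],
}
print(find_slow_drives(samples))  # ['sdd']
```

A median-based limit is used in this sketch because a single very slow drive would inflate a plain mean and could hide itself. A real deployment would also need repeated confirmation over several windows before isolating a drive.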
