Adjusting the Dirty Data Refresh Policy to Reduce the Disk I/O Pressure

Principle

Dirty data is data in the page cache that has been modified but not yet written back to disks. When an application saves data, it can either write the data directly to a disk (O_DIRECT mode) or write the data to the page cache (non-O_DIRECT mode). In non-O_DIRECT mode, operations on cached data are performed in memory and the data is written back to disks later, which reduces the number of disk operations.
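
To see how much dirty data the page cache currently holds, you can check the Dirty and Writeback fields of /proc/meminfo. The following is a minimal sketch; the show_dirty function name and its optional file argument are illustrative and are not part of the original procedure.

```shell
#!/bin/sh
# Print the Dirty and Writeback counters from a meminfo-style file.
# Dirty:     memory waiting to be written back to disk.
# Writeback: memory that is actively being written back to disk.
# The function name and the optional file argument are illustrative.
show_dirty() {
    meminfo="${1:-/proc/meminfo}"
    awk '$1 == "Dirty:" || $1 == "Writeback:" { print $1, $2, $3 }' "$meminfo"
}

# Report the live counters when run on a Linux host.
if [ -r /proc/meminfo ]; then
    show_dirty
fi
```

Running the function repeatedly under the service load (for example, once per second) gives a rough picture of how quickly dirty data accumulates and is flushed.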

Procedure

The system provides the following parameters to adjust the policy:

  1. /proc/sys/vm/dirty_expire_centisecs: This parameter specifies how long dirty data can stay in the cache. When this period expires, the dirty data needs to be written to disks. The unit is 0.01s and the default value is 3000 (30s). If service data is written continuously, set this parameter to a smaller value so that dirty data is written back in smaller batches, which prevents the I/O wait spikes caused by bursts of concentrated I/O. You can run the echo command to change the value.

    echo 2000 > /proc/sys/vm/dirty_expire_centisecs

  2. /proc/sys/vm/dirty_background_ratio: This parameter specifies the percentage of memory (based on MemFree + Cached - Mapped) that dirty pages can occupy before the kernel starts writing them to disks in the background. This is done by the pdflush process on older kernels and by flusher threads on later kernels. Increasing the value allocates more memory to the write buffer, which can improve disk write performance. However, for services that mainly involve disk write operations, set this parameter to a smaller value to prevent dirty data from piling up and becoming the bottleneck. To identify the bottleneck, observe how widely the await value fluctuates under the service load. The default value is 10. You can run the echo command to change the value.

    echo 8 > /proc/sys/vm/dirty_background_ratio

  3. /proc/sys/vm/dirty_ratio: This parameter specifies the maximum percentage of memory that dirty pages can occupy. If the percentage exceeds this value, the system stops accepting new dirty pages and file write operations become synchronous: the writing process is blocked until dirty data is written to disks. The application then spends longer blocked on file operations, which slows down the system. The default value is 20. For write-intensive services, you can increase this value to prevent writes from becoming synchronous too early. You can run the echo command to change the value.

    echo 40 > /proc/sys/vm/dirty_ratio

If you increase the amount of dirty data that can be cached or the time it stays in the cache, the risk of data loss in case of an unexpected power failure also increases. Therefore, for data that must be written to disks immediately, the application should use the O_DIRECT mode to prevent the loss of key data.
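
Before changing any of the three parameters, it can help to record their current values. The following is a minimal sketch that prints them and converts the expiry time to seconds; the show_writeback_policy function name and its optional directory argument are illustrative and are not part of the original procedure.

```shell
#!/bin/sh
# Print the three dirty-data parameters discussed above.
# The function name and the optional directory argument are illustrative;
# by default the values are read from /proc/sys/vm.
show_writeback_policy() {
    vm_dir="${1:-/proc/sys/vm}"
    expire=$(cat "$vm_dir/dirty_expire_centisecs")
    background=$(cat "$vm_dir/dirty_background_ratio")
    ratio=$(cat "$vm_dir/dirty_ratio")

    # One centisecond is 0.01s, so divide by 100 to get whole seconds.
    echo "dirty_expire_centisecs=$expire ($((expire / 100)) s)"
    echo "dirty_background_ratio=$background %"
    echo "dirty_ratio=$ratio %"
}

# Report the live values when run on a Linux host.
if [ -r /proc/sys/vm/dirty_ratio ]; then
    show_writeback_policy
fi
```

Values written with the echo command take effect immediately but are lost after a reboot. To keep them, one common approach is to add the corresponding keys (vm.dirty_expire_centisecs, vm.dirty_background_ratio, and vm.dirty_ratio) to /etc/sysctl.conf and load the file with sysctl -p. Whether this fits your system depends on how its sysctl configuration is managed.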