Optimizing the Disk I/O Scheduling Mode
Principle
When the file system reads from or writes to a disk through the driver, the requests are not sent to the driver immediately. Instead, they are briefly queued so that the I/O scheduler of the Linux kernel can merge multiple requests into one or sort them (to reduce seek operations on mechanical disks) before sending them to the driver, which improves performance. The merge statistics mentioned earlier in the introduction to the iostat tool are obtained from this process.
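The sort-and-merge idea can be illustrated with a small sketch. It is a simplified model and not the kernel's implementation: the function name merge_requests and the input format (one request per line, as a start sector followed by a sector count) are assumptions made for this example.

```shell
# merge_requests: read "START LENGTH" pairs from standard input, sort them by
# start sector, and merge requests that are exactly contiguous on disk.
# Hypothetical helper for illustration only; the kernel's logic is more involved.
merge_requests() {
    sort -n -k1,1 | awk '
        NR == 1           { start = $1; len = $2; next }
        # The request begins where the pending one ends: extend the pending one.
        $1 == start + len { len += $2; next }
        # Otherwise emit the pending request and start a new one.
                          { print start, len; start = $1; len = $2 }
        END               { if (NR > 0) print start, len }
    '
}
```

For example, feeding the four requests "100 8", "0 8", "8 8", and "200 4" to merge_requests yields three requests, because the two requests starting at sectors 0 and 8 are contiguous and are combined into "0 16".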
Current Linux versions support three scheduling mechanisms:
- Completely Fair Queuing (CFQ) scheduling
This was the default scheduling algorithm in early Linux kernels. It allocates a scheduling queue to each process and, by default, allocates I/O resources based on time slices and a limit on the number of requests, so that each process gets a fair share of the I/O resources. Its performance is relatively poor when the I/O pressure is high and the I/O is concentrated in a few processes.
- Deadline scheduling
This scheduling algorithm maintains four queues: the read queue, the write queue, the read timeout queue, and the write timeout queue. When the kernel receives a new request, it first tries to merge the request with an existing one. If the request cannot be merged, the kernel tries to insert it at its sorted position. If neither merging nor sorted insertion is possible, the request is appended to the end of the read or write queue. After a period of time, the I/O scheduler moves the requests in the read or write queue to the read timeout or write timeout queue. This algorithm does not limit the I/O resources used by each process. It suits scenarios where the I/O pressure is high and the I/O is concentrated in a few processes, for example, big data and database workloads on HDDs.
- NOOP (or NONE) scheduling, a simple first in first out (FIFO) scheduling policy
SSDs handle random reads and writes efficiently, so sorting requests brings little benefit. You can therefore select this simplest scheduling policy for SSDs, which usually gives the best performance.
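The guidance above can be sketched as a small shell helper that picks a scheduler from the ones a device offers. This is only an illustration: the function name suggest_scheduler is hypothetical, the rotational flag is assumed to come from /sys/block/<device>/queue/rotational (1 for HDDs, 0 for SSDs), and the name mq-deadline is included on the assumption that newer kernels list it in place of deadline.

```shell
# suggest_scheduler ROTATIONAL AVAILABLE
#   ROTATIONAL: contents of /sys/block/<device>/queue/rotational (1 = HDD, 0 = SSD)
#   AVAILABLE:  contents of /sys/block/<device>/queue/scheduler
# Prints the first preferred scheduler that the device offers.
# Returns 1 if none of the preferred schedulers is available.
suggest_scheduler() {
    rotational=$1
    # Remove the square brackets that mark the active scheduler.
    available=$(printf '%s' "$2" | sed 's/[][]//g')
    if [ "$rotational" = "1" ]; then
        preferred="deadline mq-deadline"   # HDD: deadline-style scheduling
    else
        preferred="noop none"              # SSD: simplest policy
    fi
    for want in $preferred; do
        for have in $available; do
            if [ "$want" = "$have" ]; then
                printf '%s\n' "$want"
                return 0
            fi
        done
    done
    return 1
}
```

For example, suggest_scheduler 1 "noop deadline [cfq]" prints deadline, and suggest_scheduler 0 "noop deadline [cfq]" prints noop. The helper only makes a suggestion. Whether that scheduler is the best choice still depends on the workload.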
Procedure
Run the following command to view the current scheduling mode, where DEVICE_NAME is set to the name of the disk (for example, sda):
cat /sys/block/${DEVICE_NAME}/queue/scheduler
noop deadline [cfq]
The scheduler shown in square brackets [ ] is the current disk I/O scheduling mode.
The available schedulers and the default one may vary depending on the operating system you are using.
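If a script needs the current scheduling mode, the bracketed entry can be extracted from the output. The following is a minimal sketch; the function name current_scheduler is hypothetical.

```shell
# current_scheduler LINE
#   LINE: the text read from the scheduler file, e.g. "noop deadline [cfq]"
# Prints the name enclosed in square brackets, that is, the active scheduler.
current_scheduler() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)].*/\1/p'
}
```

For example, current_scheduler "$(cat /sys/block/sda/queue/scheduler)" prints cfq when the file contains noop deadline [cfq].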
You can run the echo command as the root user to change the scheduling mode. The change takes effect immediately but is not retained after a reboot.
For example, run the following command to change the scheduling mode of sda to deadline:
echo deadline > /sys/block/sda/queue/scheduler
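Writing a scheduler name that the kernel does not list for the device fails, so a script may want to check the name first. The following sketch wraps the echo command in such a check. The function name set_scheduler and its optional third argument (an alternative sysfs root, useful for dry runs) are assumptions made for this example.

```shell
# set_scheduler DEVICE SCHEDULER [SYSFS_ROOT]
#   DEVICE:     disk name, e.g. sda
#   SCHEDULER:  scheduler to select, e.g. deadline
#   SYSFS_ROOT: optional base directory, defaults to /sys/block
# Writes SCHEDULER to the scheduler file only if the device lists it.
set_scheduler() {
    dev=$1
    sched=$2
    root=${3:-/sys/block}
    file="$root/$dev/queue/scheduler"
    if [ ! -w "$file" ]; then
        echo "cannot write $file (are you root?)" >&2
        return 1
    fi
    # Compare against the listed names with the active-marker brackets removed.
    for have in $(sed 's/[][]//g' "$file"); do
        if [ "$have" = "$sched" ]; then
            echo "$sched" > "$file"
            return
        fi
    done
    echo "scheduler '$sched' is not available for $dev" >&2
    return 1
}
```

For example, set_scheduler sda deadline has the same effect as the echo command above, but returns an error without touching the file if deadline is not offered for sda.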
If the HPC program does not read from or write to the local disk, skip this step.