FUSE Kernel Space Tuning
Overview
In AI scenarios, Filesystem in Userspace (FUSE) is widely used, and these workloads place stricter requirements on I/O latency. The open source Linux FUSE implementation has high latency because of two overheads: frequent thread creation and destruction, and data copies between user space and kernel space.
Technical Principles
- Thread-to-core binding modification
A fixed number of threads are created for I/O transmission between the FUSE kernel module and libfuse, and each thread is bound to a CPU core. In the FUSE kernel module, the read/write request structures are bound to these threads one to one. This reduces the overhead of threads migrating between CPU cores and improves overall I/O performance.
- Copy-free metadata transfer through mmap
The metadata of each read/write request is passed through a memory-mapped (mmap) region instead of being copied, which reduces the number of memory copies and improves I/O performance.
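The two ideas above can be illustrated with a minimal user-space sketch. It is only an analogy, not the kernel or libfuse implementation: the function names, the pool size, and the metadata record layout (opcode, length, offset) are assumptions made for the example. The sketch starts a fixed pool of threads, pins each one to a single CPU core with the Linux-only `os.sched_setaffinity` call, and lets each thread write its metadata record in place into its own slot of a shared `mmap` region.

```python
import mmap
import os
import struct
import threading

# Hypothetical metadata record: opcode (u32), length (u32), offset (u64).
META = struct.Struct("<IIQ")


def allowed_cpus():
    """Return the CPUs this process may run on (Linux-only API)."""
    return sorted(os.sched_getaffinity(0))


def run_pinned_workers(num_workers):
    """Start a fixed pool of workers, each pinned to one CPU core.

    Each worker owns one slot in a shared anonymous mapping and writes its
    metadata record there in place instead of handing over a copy.
    """
    cpus = allowed_cpus()
    shared = mmap.mmap(-1, META.size * num_workers)  # anonymous mapping
    seen = [None] * num_workers

    def worker(index):
        cpu = cpus[index % len(cpus)]
        # On Linux, pid 0 applies the mask to the calling thread only.
        os.sched_setaffinity(0, {cpu})
        seen[index] = os.sched_getaffinity(0)
        # Write the record directly into this worker's slot.
        META.pack_into(shared, index * META.size, 1, 4096, index * 4096)

    threads = [
        threading.Thread(target=worker, args=(i,)) for i in range(num_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # Read the records back from the mapping before releasing it.
    records = [
        META.unpack_from(shared, i * META.size) for i in range(num_workers)
    ]
    shared.close()
    return seen, records
```

Calling `run_pinned_workers(2)` returns the affinity set observed by each worker and the records read back from the mapping. In the real feature the mapping is shared between the kernel and libfuse, so the benefit comes from avoiding a user/kernel copy. Threads in a single Python process already share memory, so the sketch shows only the slot-per-thread layout.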
Expected Results
Performance improves by 20% for both concurrent 4 KB random reads/writes and concurrent 1 MB sequential reads/writes.