Data Compaction

Overview

In Ceph, data blocks are flushed to drives in 4 KB-aligned units. An input data block whose size is not a multiple of 4 KB is padded with zeros up to the next 4 KB boundary, which wastes storage space. The waste is especially noticeable in HDD/SSD hybrid storage, where compression is aligned to 64 KB: after compression, the unaligned parts of the data blocks waste even more space. The data compaction feature packs different data blocks together at byte granularity and flushes the result to drives in 4 KB units, reducing the wasted space.
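The padding cost described above can be illustrated with a little arithmetic. The following is a minimal sketch, not Ceph code: the function names and the 21,000-byte compressed size are hypothetical values chosen for illustration, and only the 4 KB alignment unit comes from the text.

```python
ALIGN = 4 * 1024  # 4 KB alignment unit, as stated in the overview


def padded_size(length: int, align: int = ALIGN) -> int:
    """Round a block length up to the next multiple of the alignment unit."""
    return -(-length // align) * align  # ceiling division, then scale back


def padding_waste(length: int, align: int = ALIGN) -> int:
    """Number of zero bytes appended to reach the alignment boundary."""
    return padded_size(length, align) - length


# Hypothetical case: a 64 KB block that compresses to 21,000 bytes.
compressed = 21_000
print(padded_size(compressed))    # 24576 bytes occupied on the drive
print(padding_waste(compressed))  # 3576 bytes of zero padding
```

In this assumed case, roughly 15% of the space occupied by the compressed block is padding, which is the kind of loss that byte-level compaction targets.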

Technical Principles

The data compaction feature packs the unaligned parts of data blocks together to achieve higher storage density. In compression scenarios, compressed data blocks generally no longer end on an alignment boundary. Compacting them saves storage space and improves the effective compression ratio.
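The difference between padding every block and compacting blocks by byte can be sketched as follows. This is a simplified model under stated assumptions, not the actual Ceph implementation: `round_up`, `padded_total`, and `compact` are hypothetical helper names, the block sizes are invented, and the sketch ignores whatever metadata a real system keeps in order to locate each block inside the packed region.

```python
from typing import List, Tuple

ALIGN = 4 * 1024  # 4 KB flush unit, as stated in the overview


def round_up(length: int, align: int = ALIGN) -> int:
    """Round a length up to the next multiple of the alignment unit."""
    return -(-length // align) * align


def padded_total(lengths: List[int], align: int = ALIGN) -> int:
    """Space used when every block is zero-padded to the alignment unit."""
    return sum(round_up(n, align) for n in lengths)


def compact(lengths: List[int], align: int = ALIGN) -> Tuple[List[int], int]:
    """Pack blocks back to back at byte granularity.

    Returns the byte offset of each block inside the packed region and the
    total space used after padding only the tail of the region.
    """
    offsets = []
    cursor = 0
    for n in lengths:
        offsets.append(cursor)  # block starts where the previous one ended
        cursor += n
    return offsets, round_up(cursor, align)


# Hypothetical compressed block sizes in bytes.
blocks = [21_000, 9_000, 13_500]
offsets, compacted = compact(blocks)
print(padded_total(blocks))  # 53248 bytes with per-block padding
print(compacted)             # 45056 bytes after compaction
print(offsets)               # [0, 21000, 30000]
```

In this assumed example, compaction pads only once at the end of the packed region instead of once per block, saving 8,192 bytes. The actual saving depends on how block sizes are distributed relative to the alignment unit.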

Figure 1 Data compaction process

Expected Results

The compression ratio is improved by more than 20%, and the performance of flushing data to drives does not deteriorate.