Open-Source RocksDB-like Database Reference Architecture

Scenarios

RocksDB is a key-value (KV) storage engine built on the log-structured merge-tree (LSM-Tree). It offers high write throughput, good read performance, and a high compression ratio, and it suits write-intensive, high-concurrency workloads that need local persistence. Because the LSM-Tree model turns incoming writes into large sequential writes to drives, it works well on hard disk drives (HDDs).

Architectures and Principles

Figure 1 RocksDB architecture
Table 1 Module information

| Module Name | Description |
| --- | --- |
| Write-ahead log (WAL) | Records each write before it is applied to a memtable, so that successfully written data is not lost if the process or system crashes. |
| Manifest Log | Stores database metadata: the Sorted String Table (SSTable) files, the level (L0 to Ln) and key range of each file, newly generated files, and files to be deleted. It also supports multi-version and multi-column-family management. |
| Memtable | An in-memory SkipList. All writes go to a memtable first, which is fast because the insert happens in memory. When a memtable is full, it becomes an immutable memtable and is flushed to drives in the background. |
| Immutable Memtable | A memtable that has reached its size threshold. It no longer accepts writes and waits to be flushed to L0, the first persistent level. |
| SSTable | Sorted, immutable KV files on drives. Entries are ordered by key, which allows binary search. SSTables are organized into levels: L0, L1, L2, and so on. |
| Compaction | Merges SSTables across levels, discards duplicate, expired, and deleted entries along with old versions, and keeps data sorted to reduce read amplification. |
| Block Cache | An in-memory read cache that holds data blocks from SSTable files on drives to reduce drive I/O, improve read performance, and lower latency. |
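
To make the interaction between the modules in Table 1 more concrete, the following is a minimal Python sketch of an LSM-style store. It is not RocksDB code and does not use the RocksDB API. The class name `TinyLSM`, its parameters, and its methods are hypothetical. Several simplifications are assumed: the WAL and SSTables are kept in Python lists rather than files, a dict stands in for the SkipList, there are only two levels (L0 and L1), the manifest is omitted, and the cache holds individual key lookups instead of data blocks.

```python
from bisect import bisect_left
from collections import OrderedDict

TOMBSTONE = None  # a delete is stored as a key whose value is None


class TinyLSM:
    """Toy LSM store (hypothetical): WAL -> memtable -> L0 SSTables -> L1."""

    def __init__(self, memtable_limit=4, cache_size=8):
        self.memtable_limit = memtable_limit
        self.cache_size = cache_size
        self.wal = []               # stand-in for an append-only log file
        self.memtable = {}          # stand-in for the in-memory SkipList
        self.l0 = []                # L0 tables, newest first; key ranges may overlap
        self.l1 = ([], [])          # single sorted run: (keys, values)
        self.cache = OrderedDict()  # LRU read cache for values found in SSTables

    def put(self, key, value):
        self.wal.append((key, value))  # 1. log the write first
        self.memtable[key] = value     # 2. then apply it in memory
        self.cache.pop(key, None)      # drop any stale cached value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def delete(self, key):
        self.put(key, TOMBSTONE)

    def flush(self):
        # Freeze the full memtable and write it out as a sorted, immutable L0 table.
        immutable, self.memtable = self.memtable, {}
        keys = sorted(immutable)
        self.l0.insert(0, (keys, [immutable[k] for k in keys]))
        self.wal.clear()  # flushed entries no longer need the log

    def recover(self):
        # After a crash the memtable is lost; replay the WAL to rebuild it.
        self.memtable = dict(self.wal)

    def get(self, key):
        if key in self.memtable:  # newest data lives in the memtable
            return self.memtable[key]
        if key in self.cache:     # cache hit avoids searching SSTables
            self.cache.move_to_end(key)
            return self.cache[key]
        for keys, values in self.l0 + [self.l1]:  # newest table first
            i = bisect_left(keys, key)            # binary search in a sorted table
            if i < len(keys) and keys[i] == key:
                self.cache[key] = values[i]
                if len(self.cache) > self.cache_size:
                    self.cache.popitem(last=False)  # evict least recently used
                return values[i]
        return None

    def compact(self):
        # Merge L1 and all L0 tables, applying the oldest data first so that
        # newer versions overwrite older ones, then drop deleted keys.
        merged = {}
        for keys, values in [self.l1] + self.l0[::-1]:
            merged.update(zip(keys, values))
        live = sorted(k for k, v in merged.items() if v is not TOMBSTONE)
        self.l1 = (live, [merged[k] for k in live])
        self.l0 = []


db = TinyLSM(memtable_limit=2)
db.put("a", 1)
db.put("b", 2)   # memtable is full: flushed to L0
db.put("a", 3)
db.delete("b")   # second flush: L0 now holds two overlapping tables
db.compact()     # one sorted run remains, without old versions or tombstones
print(db.l1)
```

In this sketch a read may have to check the memtable and then every L0 table before reaching L1, which is the read amplification that compaction reduces: after `compact()`, only one sorted run has to be searched.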