OmniStateStore
Apache Flink is an open source framework for processing and analyzing data streams in real time. It supports both unbounded and bounded data streams and offers a rich set of APIs for a wide range of stream processing scenarios.
State storage is a core Flink feature and is implemented mainly by the state backend. As the volume of state data grows, state storage performance comes under pressure. OmniStateStore is a Flink state backend plugin that accelerates state storage and improves overall Flink performance.
Architecture Design
Figure 1 shows the overall OmniStateStore architecture, which consists of BSS-Cache and BSS-Store.
- BSS-Cache offers hot data access with hash-like performance and efficient data downgrading mechanisms.
- BSS-Store delivers large-capacity warm data access, leveraging a log-structured merge-tree (LSM tree) organized on drives.
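The split between a hot tier and a warm tier can be illustrated with a toy model. The sketch below is not OmniStateStore's implementation: the class name `TwoTierStore`, the `hot_capacity` parameter, and the least-recently-used demotion policy are all assumptions made for illustration, and a plain dictionary stands in for the LSM tree that BSS-Store keeps on a drive.

```python
from collections import OrderedDict


class TwoTierStore:
    """Toy hot/warm key-value store (illustrative only, hypothetical names)."""

    def __init__(self, hot_capacity):
        self.hot_capacity = hot_capacity
        # Stands in for BSS-Cache: in-memory, hash-based lookups.
        self.hot = OrderedDict()
        # Stands in for BSS-Store: the real tier is an LSM tree on a drive.
        self.warm = {}

    def put(self, key, value):
        # New and updated entries always land in the hot tier.
        self.hot[key] = value
        self.hot.move_to_end(key)
        # Assumed policy: demote least recently used entries when full.
        while len(self.hot) > self.hot_capacity:
            old_key, old_value = self.hot.popitem(last=False)
            self.warm[old_key] = old_value

    def get(self, key):
        # Check the hot tier first, then fall back to the warm tier.
        if key in self.hot:
            self.hot.move_to_end(key)
            return self.hot[key]
        return self.warm.get(key)


store = TwoTierStore(hot_capacity=2)
store.put("a", 1)
store.put("b", 2)
store.put("c", 3)  # "a" is demoted to the warm tier
```

After the third `put`, the oldest key is served from the warm tier while the two most recent keys stay in the hot tier, which mirrors the hot and warm access paths described above.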
Typical Deployments
As a Flink plugin, OmniStateStore is deployed in the same way as Flink. Flink supports multiple deployment modes, including YARN, standalone, and containerized deployments.
In a typical deployment scenario, OmniStateStore is deployed across three Docker containers, each allocated 8 cores and 32 GB of memory. One container runs the Job Manager, and each of the other two containers hosts four Task Managers. The Job Manager is allocated 8 GB of memory. Each Task Manager is allocated two task slots and 8 GB of memory.
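The resource figures in this scenario can be checked with a little arithmetic. The following sketch only restates the numbers given above; the `Deployment` class and its field names are hypothetical helpers, not part of Flink or OmniStateStore.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deployment:
    """Figures from the typical deployment scenario (hypothetical helper)."""

    container_memory_gb: int = 32
    task_manager_containers: int = 2
    task_managers_per_container: int = 4
    task_manager_memory_gb: int = 8
    slots_per_task_manager: int = 2
    job_manager_memory_gb: int = 8

    def total_task_managers(self):
        return self.task_manager_containers * self.task_managers_per_container

    def total_task_slots(self):
        return self.total_task_managers() * self.slots_per_task_manager

    def task_manager_memory_per_container_gb(self):
        return self.task_managers_per_container * self.task_manager_memory_gb

    def fits_in_containers(self):
        # Both the Job Manager and the Task Managers must fit their container.
        return (
            self.job_manager_memory_gb <= self.container_memory_gb
            and self.task_manager_memory_per_container_gb()
            <= self.container_memory_gb
        )


deployment = Deployment()
print(deployment.total_task_managers())                   # 8
print(deployment.total_task_slots())                      # 16
print(deployment.task_manager_memory_per_container_gb())  # 32
print(deployment.fits_in_containers())                    # True
```

With these figures, the cluster provides 8 Task Managers and 16 task slots, and the four Task Managers in a container account for that container's full 32 GB of memory.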
