Component Principles
Elasticsearch
Elasticsearch is a highly scalable, open-source full-text search and analytics engine that stores, searches, and analyzes large volumes of data in near real time. It is often used as the underlying engine for applications with complex search requirements. Figure 1 shows the component architecture of Elasticsearch.
Elasticsearch provides the following features:
- Linear node expansion
- Distributed real-time file storage, real-time analysis, and diversified search
- Scalability to hundreds of servers for processing petabytes of structured or unstructured data
- Rich geographical information search and geographical location aggregation
- Multiple copies (replicas) of data
- Documents are stored in indexes and can be added, deleted, modified, and queried. A variety of document processing capabilities are provided.
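The document operations in the last item can be illustrated with a small in-memory model. This is only a sketch of the idea of an inverted index, not the Elasticsearch API: the `ToyIndex` class and its `add`, `delete`, and `search` methods are hypothetical names made up for this example, and it ignores distribution, replicas, analyzers, and relevance scoring.

```python
from collections import defaultdict


class ToyIndex:
    """Hypothetical in-memory stand-in for one index (not Elasticsearch code)."""

    def __init__(self):
        self.docs = {}                    # doc_id -> document text
        self.postings = defaultdict(set)  # term -> ids of documents containing it

    def _terms(self, text):
        # Very naive analysis: lowercase and split on whitespace.
        return set(text.lower().split())

    def add(self, doc_id, text):
        # Adding an id that already exists replaces the old document (modify).
        if doc_id in self.docs:
            self.delete(doc_id)
        self.docs[doc_id] = text
        for term in self._terms(text):
            self.postings[term].add(doc_id)

    def delete(self, doc_id):
        text = self.docs.pop(doc_id, None)
        if text is None:
            return
        for term in self._terms(text):
            self.postings[term].discard(doc_id)
            if not self.postings[term]:
                del self.postings[term]

    def search(self, query):
        # Return the ids of documents that contain every query term.
        terms = self._terms(query)
        if not terms:
            return []
        result = set(self.docs)
        for term in terms:
            result &= self.postings.get(term, set())
        return sorted(result)


index = ToyIndex()
index.add(1, "real time search engine")
index.add(2, "distributed storage engine")
print(index.search("engine"))        # [1, 2]
index.add(2, "distributed search")   # modify document 2
print(index.search("search"))        # [1, 2]
index.delete(1)
print(index.search("search"))        # [2]
```

Looking up a term in the postings map avoids scanning every document, which is the basic reason an inverted index suits full-text search.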
HBase
HBase is responsible for data storage. It is an open-source, column-oriented, distributed storage system suited to storing massive amounts of unstructured or semi-structured data. It features high reliability, high performance, and flexible scalability, and supports real-time data reads and writes. Figure 2 shows the component architecture of HBase.
Typical features of a table stored in HBase are as follows:
- Big table: One table can contain hundreds of millions of rows and millions of columns.
- Column-oriented: Storage, retrieval, and permission control are organized by column.
- Sparse: Null columns in the table do not occupy any storage space.
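The sparse property can be sketched with a nested dictionary in which only cells that were actually written exist. This is a simplified model under stated assumptions, not HBase code: the `SparseTable` class, its methods, and the `family:qualifier` column strings are hypothetical names chosen for illustration, and the model leaves out cell versions (timestamps), regions, and on-disk storage.

```python
class SparseTable:
    """Hypothetical toy table: row key -> {"family:qualifier" -> value}."""

    def __init__(self):
        self.rows = {}

    def put(self, row_key, column, value):
        # Only the cell being written is stored; no empty slots are created.
        self.rows.setdefault(row_key, {})[column] = value

    def get(self, row_key, column):
        # A cell that was never written reads as None and takes no space.
        return self.rows.get(row_key, {}).get(column)

    def stored_cells(self):
        # Count the cells that actually exist across all rows.
        return sum(len(cells) for cells in self.rows.values())


table = SparseTable()
table.put("user1", "info:name", "Alice")
table.put("user1", "info:email", "alice@example.com")
table.put("user2", "info:name", "Bob")

print(table.get("user2", "info:email"))  # None
print(table.stored_cells())              # 3
```

A dense layout with two rows and two columns would reserve four slots. Here only the three written cells exist, which is why a table with millions of mostly empty columns stays affordable.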