Introduction to Core Components
The core components of Elasticsearch are nodes, clusters, indices, and shards. This section describes each of them.
Figure 1 shows the Elasticsearch core components. Nodes are classified into primary nodes and secondary nodes. In the figure, the letter P indicates a primary shard and the letter R indicates a replica shard.
Node
A node is a server in an Elasticsearch cluster. It stores data and processes client requests. Node roles (primary, data, ingest, coordinating) can be configured in the configuration file or via startup parameters. Table 1 describes the node types. Generally, a cluster contains one primary node and multiple secondary nodes.
| Node Type | Description |
|---|---|
| Primary node | The primary node is elected among all nodes in the cluster. It manages cluster-wide changes, such as adding or removing indices and nodes. |
| Data node | The data node stores data and performs operations such as indexing and search. |
| Ingest node | The ingest node preprocesses documents and performs various transformations before the documents are indexed. |
| Coordinating node | The coordinating node does not store data, process data, or participate in cluster management. It mainly routes requests, aggregates results, and offloads work from data nodes. |
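As a rough illustration of how the roles in the table map to configuration, the following sketch renders a `node.roles` line for the configuration file. It assumes Elasticsearch 7.9 or later, where roles are set through `node.roles` and the elected primary role is spelled `master`. The helper `render_node_roles` and the `ROLE_NAMES` mapping are hypothetical names used only for this example.

```python
# Hypothetical mapping from the role names used in this document to the
# role names assumed for the node.roles setting (Elasticsearch 7.9+).
ROLE_NAMES = {"primary": "master", "data": "data", "ingest": "ingest"}


def render_node_roles(roles):
    """Return a node.roles line for the given roles.

    An empty list yields a coordinating-only node, which holds no
    primary, data, or ingest role.
    """
    unknown = [role for role in roles if role not in ROLE_NAMES]
    if unknown:
        raise ValueError(f"unknown roles: {unknown}")
    names = [ROLE_NAMES[role] for role in roles]
    return "node.roles: [" + ", ".join(names) + "]"


print(render_node_roles(["primary", "data"]))
print(render_node_roles([]))  # coordinating-only node
```

A coordinating node is expressed here as a node with no roles at all, which matches the table: it neither stores data nor takes part in cluster management.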
Cluster
A cluster consists of one or more nodes that work together to provide data storage and search services. A cluster is identified by its cluster name (cluster.name). All nodes in a cluster must be configured with the same cluster name, and different clusters should use different names so that a node does not join the wrong cluster and share its data. Nodes within the cluster share the load of data storage and processing. When nodes are added or removed, the cluster automatically rebalances shards to keep the load evenly distributed across the nodes.
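The effect of rebalancing can be pictured with a deliberately simplified sketch. It spreads shards over nodes round-robin, which is an assumption made for illustration: the real allocator weighs more factors than shard count alone. The function `rebalance` and the node and shard names are hypothetical.

```python
def rebalance(shards, nodes):
    """Spread shard names across nodes as evenly as possible (round-robin)."""
    if not nodes:
        raise ValueError("at least one node is required")
    layout = {node: [] for node in nodes}
    for position, shard in enumerate(sorted(shards)):
        layout[nodes[position % len(nodes)]].append(shard)
    return layout


shards = [f"shard-{i}" for i in range(6)]

# Two nodes hold three shards each.
print(rebalance(shards, ["node-1", "node-2"]))

# After a third node joins, each node holds two shards.
print(rebalance(shards, ["node-1", "node-2", "node-3"]))
```

The point of the sketch is only the outcome: after a node joins or leaves, the per-node shard counts differ by at most one.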
Index
| Basic Concept | Description |
|---|---|
| Index | An index is a logical unit of storage (analogous to a database). It is a collection of documents with similar characteristics. It can contain one or more types. |
| Type | A type is analogous to a table in a database. A type provides the mapping information for its documents, so that documents with different structures can be stored in the same index. |
| Document | A document is the basic unit that can be indexed, specifically the top-level structure or root object serialized as JSON. It is analogous to a row in a database. A type can contain multiple documents. |
| Field | A field is the smallest unit within a document. It is analogous to a column in a database. Each document contains multiple fields. |
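To make the index, document, and field concepts concrete, the sketch below builds the request for indexing one JSON document. It assumes Elasticsearch 7 or later, where the typeless `_doc` endpoint is used in place of a custom type name. The index name `books`, the document ID, the field values, and the helper `index_request` are all hypothetical.

```python
import json


def index_request(index, doc_id, document):
    """Return (method, path, body) for indexing one document by ID."""
    return "PUT", f"/{index}/_doc/{doc_id}", json.dumps(document)


# A document is a JSON root object; each key is a field.
document = {"title": "Example Title", "author": "Example Author", "year": 2015}

method, path, body = index_request("books", "1", document)
print(method, path)
print(body)
```

In the database analogy from the table, `books` plays the role of the database, the JSON object is a row, and `title`, `author`, and `year` are columns.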
Shard
A shard is a subset of the data in an index. Sharding splits a large index into multiple shards that can be distributed across different nodes. Each shard is an instance of a Lucene index. A shard is either a primary shard or a replica shard. Replica shards are copies of primary shards. They improve data availability and can increase search throughput, because searches can be served by replicas as well as primaries. A replica is placed on a different node from its primary, so if a primary shard fails or becomes unavailable, a replica can be promoted to primary.
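The following sketch illustrates two ideas from this paragraph: choosing shard and replica counts when an index is created, and promoting a replica when the node holding the primary is lost. The settings keys `number_of_shards` and `number_of_replicas` are the standard index settings. The functions `index_settings` and `promote_on_failure`, and the rule of promoting the first surviving replica by node name, are assumptions made for illustration and not how Elasticsearch selects a replica.

```python
import json


def index_settings(primaries, replicas):
    """Build a create-index request body with shard and replica counts."""
    return {
        "settings": {
            "index": {
                "number_of_shards": primaries,
                "number_of_replicas": replicas,
            }
        }
    }


def promote_on_failure(copies, failed_node):
    """Remove the copy on the failed node and promote a replica if needed.

    `copies` maps a node name to "P" (primary) or "R" (replica) for the
    copies of a single shard.
    """
    survivors = {node: role for node, role in copies.items() if node != failed_node}
    if survivors and "P" not in survivors.values():
        # Illustrative rule: promote the first surviving replica by node name.
        survivors[sorted(survivors)[0]] = "P"
    return survivors


# Three primaries with one replica each gives six shard copies in total.
print(json.dumps(index_settings(3, 1)))

# node-1 held the primary; after it fails, the replica on node-2 takes over.
print(promote_on_failure({"node-1": "P", "node-2": "R"}, "node-1"))
```

If the failed node held only a replica, the primary stays where it is and no promotion takes place.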
