Introduction to HBase
HBase (short for Hadoop Database) is a distributed, scalable, column-oriented storage system that offers high reliability and high performance. With HBase, a large-scale storage cluster can be built on low-cost PC servers.
HBase consists of three components: HMaster, HRegionServer, and ZooKeeper. The three components have the following responsibilities:
HMaster
HMaster is the controller of the entire HBase cluster and has the following responsibilities:
- Perform load balancing.
- Manage permissions (using ACL).
- Clean up obsolete files in HDFS.
- Manage metadata of namespaces and tables.
- Create, delete, and modify tables (including changes to column families).
- Allocate regions: assign regions at startup, reassign the regions of a failed RegionServer, and assign the new regions produced by a region split.
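The region allocation duties above can be pictured with a small toy model. This is only a sketch under simplifying assumptions: the function names `assign_regions` and `reassign_from_failed` are hypothetical, regions and servers are plain strings, and the "least-loaded server" rule stands in for whatever balancing policy a real HMaster applies.

```python
def assign_regions(regions, servers):
    """Startup assignment: spread regions across servers round-robin."""
    assignment = {server: [] for server in servers}
    for i, region in enumerate(regions):
        assignment[servers[i % len(servers)]].append(region)
    return assignment


def reassign_from_failed(assignment, failed_server):
    """Move each region of a failed server to the least-loaded live server."""
    survivors = {s: list(r) for s, r in assignment.items() if s != failed_server}
    if not survivors:
        raise RuntimeError("no live RegionServer left")
    for region in assignment.get(failed_server, []):
        # Ties are broken by server name so the result is deterministic.
        target = min(survivors, key=lambda s: (len(survivors[s]), s))
        survivors[target].append(region)
    return survivors


if __name__ == "__main__":
    layout = assign_regions(["r1", "r2", "r3", "r4", "r5", "r6"], ["rs1", "rs2", "rs3"])
    print(layout)
    print(reassign_from_failed(layout, "rs2"))
```

The model ignores data locality and request load, which a production balancer would normally take into account.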
HRegionServer
HRegionServer serves the actual reads and writes in HBase and has the following responsibilities:
- Split regions.
- Interact with HDFS and manage table data.
- Respond to read and write requests from clients and perform I/O operations.
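To illustrate what splitting a region means, the following sketch models a region as a sorted run of row keys covering a key range and splits it at the middle key. The `Region` class and its methods are hypothetical names for illustration only. A real RegionServer decides when and where to split based on the size of the stored data, not on a row count as assumed here.

```python
import bisect


class Region:
    """Toy region holding sorted row keys in the range [start_key, end_key)."""

    def __init__(self, start_key="", end_key=None, rows=None):
        self.start_key = start_key
        self.end_key = end_key  # None means "no upper bound"
        self.rows = sorted(rows or [])

    def put(self, row_key):
        """Insert a row key, keeping the keys sorted and unique."""
        i = bisect.bisect_left(self.rows, row_key)
        if i == len(self.rows) or self.rows[i] != row_key:
            self.rows.insert(i, row_key)

    def split(self):
        """Split at the middle row key into two daughter regions."""
        if len(self.rows) < 2:
            raise ValueError("need at least two rows to split")
        mid_key = self.rows[len(self.rows) // 2]
        left = Region(self.start_key, mid_key, [r for r in self.rows if r < mid_key])
        right = Region(mid_key, self.end_key, [r for r in self.rows if r >= mid_key])
        return left, right


if __name__ == "__main__":
    region = Region()
    for key in ["row5", "row1", "row3", "row2", "row4", "row6"]:
        region.put(key)
    left, right = region.split()
    print(left.start_key, left.end_key, left.rows)
    print(right.start_key, right.end_key, right.rows)
```

After the split, the two daughter regions together cover the same key range as the parent, so every row key still maps to exactly one region.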
ZooKeeper
ZooKeeper is the coordinator of the HBase cluster and has the following responsibilities:
- Store table metadata in HBase.
- Ensure that only one HMaster in the cluster is in active state.
- Store the location of hbase:meta, the table that records the location information of all regions.
- Monitor the RegionServer status and report RegionServers going online or offline to HMaster.
- Keep node status consistent across the ZooKeeper cluster by using the ZAB (ZooKeeper Atomic Broadcast) protocol.
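The guarantee that only one HMaster is active can be sketched with an ephemeral node that only the first caller manages to create. The code below is a minimal in-memory stand-in, not a ZooKeeper client: `ToyZooKeeper`, `try_become_active`, and the `MASTER_PATH` value are assumptions made for illustration.

```python
MASTER_PATH = "/hbase/master"  # illustrative node path, assumed for this sketch


class ToyZooKeeper:
    """In-memory stand-in for a ZooKeeper ensemble holding ephemeral nodes."""

    def __init__(self):
        self._nodes = {}  # path -> name of the owning session

    def create_ephemeral(self, path, owner):
        """Create the node if it is absent; only the creator gets True."""
        if path in self._nodes:
            return False
        self._nodes[path] = owner
        return True

    def get(self, path):
        """Return the owner of the node, or None if it does not exist."""
        return self._nodes.get(path)

    def expire_session(self, owner):
        """Drop every ephemeral node of a session, as when its process dies."""
        self._nodes = {p: o for p, o in self._nodes.items() if o != owner}


def try_become_active(zk, master_name):
    """A master is active only if it managed to create the master node."""
    return zk.create_ephemeral(MASTER_PATH, master_name)


if __name__ == "__main__":
    zk = ToyZooKeeper()
    print(try_become_active(zk, "master-a"))  # first caller creates the node
    print(try_become_active(zk, "master-b"))  # node already exists
    zk.expire_session("master-a")             # active master goes away
    print(try_become_active(zk, "master-b"))  # standby can now take over
```

Because the node disappears when its owner's session ends, a standby master can take over without an operator removing anything by hand.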