Architecture
Yet Another Resource Negotiator (YARN) is the resource management and job scheduling framework of a Hadoop cluster. It allocates resources and schedules jobs so that multiple computing frameworks (such as MapReduce, Spark, and Tez) can share the same cluster resources. YARN uses the ResourceManager (RM), NodeManager (NM), and ApplicationMaster (AM) to manage resources and schedule jobs. It offers multiple schedulers, including the First In First Out (FIFO) Scheduler, Capacity Scheduler, and Fair Scheduler. Choose the scheduler that best fits your service environment.
OmniScheduler optimizes the open source Capacity Scheduler to schedule resources according to a weighted calculation over each cluster node's physical resources and the resulting node ranking. This optimized YARN load scheduling algorithm balances resource allocation across nodes and improves resource utilization. Figure 1 shows the overall architecture.
The architecture consists of five modules:
- Prometheus: This open-source event monitoring system and time series database is widely used to manage various infrastructure resources.
- Node Exporter: It is a component in the Prometheus ecosystem used to collect and expose machine-level metrics, including but not limited to CPU usage, memory usage, disk I/O, network I/O, and file system information.
- LoadsMetricApplication: This load collection and analysis tool obtains machine metric information from the Node Exporter, analyzes and processes the information, and reports the generated cluster load and balancing data to Prometheus.
- Grafana: It obtains cluster load and balancing information from Prometheus, and visualizes the information in charts and dashboards for display on the user interface.
- OmniScheduler: It obtains the node load ranking from LoadsMetricApplication and preferentially schedules jobs onto nodes with lower loads.
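
The weight-based ranking that drives scheduling can be illustrated with a minimal Python sketch. This is not OmniScheduler's actual algorithm: the `NodeLoad` class, the `WEIGHTS` values, the metric set, and the function names are all hypothetical, chosen only to show how per-node utilization could be combined into one score and sorted so that lightly loaded nodes come first.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeLoad:
    """Utilization of one node; each value is a fraction from 0.0 to 1.0."""
    host: str
    cpu: float
    memory: float
    disk_io: float
    network_io: float


# Hypothetical weights for illustration only; not taken from OmniScheduler.
WEIGHTS = {"cpu": 0.4, "memory": 0.3, "disk_io": 0.2, "network_io": 0.1}


def load_score(node: NodeLoad) -> float:
    """Combine the node's metrics into one weighted load score."""
    return sum(weight * getattr(node, name) for name, weight in WEIGHTS.items())


def rank_nodes(nodes):
    """Return the nodes sorted from the lowest load score to the highest."""
    return sorted(nodes, key=load_score)


# Sample data standing in for metrics collected from Node Exporter.
nodes = [
    NodeLoad("node-a", cpu=0.90, memory=0.70, disk_io=0.20, network_io=0.10),
    NodeLoad("node-b", cpu=0.20, memory=0.30, disk_io=0.10, network_io=0.10),
    NodeLoad("node-c", cpu=0.50, memory=0.50, disk_io=0.50, network_io=0.50),
]

ranked = rank_nodes(nodes)
print([node.host for node in ranked])  # ['node-b', 'node-c', 'node-a']
```

In this sketch, a scheduler that prefers lower loads would try `node-b` first. A real deployment would take its metrics from Prometheus rather than hard-coded values.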
