OmniScheduler
Yet Another Resource Negotiator (YARN) is the resource management and scheduling framework of the Hadoop ecosystem. It offers multiple schedulers, including the FIFO Scheduler, Capacity Scheduler, and Fair Scheduler. The OmniScheduler YARN load scheduling algorithm optimizes the open source Capacity Scheduler. Using specified parameters, it calculates a weight for each physical cluster node from the node's resources, sorts the nodes by that weight, and allocates resources based on the sorted result to keep resource distribution balanced and utilization efficient. Figure 1 shows the overall architecture of OmniScheduler.
OmniScheduler consists of five modules:
- Prometheus: This open source event monitoring system and time series database is widely used to monitor various infrastructure resources.
- Node Exporter: It is a component in the Prometheus ecosystem used to collect and expose machine-level metrics, including but not limited to CPU usage, memory usage, disk I/O, network I/O, and file system information.
- LoadsMetricApplication: This load collection and analysis tool obtains machine metric information from the Node Exporter, analyzes and processes the information, and reports the generated cluster load and balancing data to Prometheus.
- Grafana: It obtains cluster load and balancing data from Prometheus, and visualizes the data in charts and dashboards for display on the user interface.
- YARN load scheduling algorithm: It obtains the node load sorting information from LoadsMetricApplication and preferentially schedules jobs onto nodes with lower loads.
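The weight calculation and sorting step described above can be illustrated with a minimal Python sketch. It is not the OmniScheduler implementation: the metric names, the weight values in WEIGHTS, and the helpers load_score and sort_by_load are all hypothetical, and the utilization figures are made-up sample data. It only shows the general idea of combining per-node metrics into one score and offering the least-loaded node first.

```python
from dataclasses import dataclass

# Hypothetical weights: the actual OmniScheduler parameters are not
# given in this document. Each metric is a utilization ratio in [0, 1].
WEIGHTS = {"cpu": 0.5, "memory": 0.3, "disk_io": 0.1, "network_io": 0.1}


@dataclass(frozen=True)
class NodeLoad:
    """Per-node utilization ratios (illustrative sample values only)."""
    name: str
    cpu: float
    memory: float
    disk_io: float
    network_io: float


def load_score(node, weights=WEIGHTS):
    """Combine the node's metrics into one weighted load score."""
    return sum(w * getattr(node, metric) for metric, w in weights.items())


def sort_by_load(nodes, weights=WEIGHTS):
    """Return nodes ordered from lowest to highest load score.

    The node name is a tie-breaker so the order is deterministic.
    """
    return sorted(nodes, key=lambda n: (load_score(n, weights), n.name))


nodes = [
    NodeLoad("node-a", cpu=0.80, memory=0.60, disk_io=0.20, network_io=0.10),
    NodeLoad("node-b", cpu=0.20, memory=0.30, disk_io=0.10, network_io=0.10),
    NodeLoad("node-c", cpu=0.50, memory=0.50, disk_io=0.50, network_io=0.50),
]

# The scheduler would try the first node in this list before the others.
ranked = sort_by_load(nodes)
print([n.name for n in ranked])
```

With these sample values, node-b has the lowest score and would be considered first. Changing the weights changes which resource dominates the ordering, which is why the weights are treated as tunable parameters in this sketch.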
OmniScheduler Performance Data
When the cluster's average CPU load is low, uneven scheduling can cause severe load imbalance between nodes. In this scenario, OmniScheduler significantly improves node balancing.
With the open source scheduler, as the average CPU load rises, the cluster's idle resources shrink, which narrows the load difference between nodes and naturally improves node balancing. In this scenario, the benefit of OmniScheduler may be less pronounced.
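One way to make "node balancing" concrete is to measure how far per-node CPU utilization spreads around the cluster average. The sketch below is an assumption for illustration only: this document does not state which balance metric OmniScheduler uses, the helper balance_report is hypothetical, and the two sample clusters are invented values, not measured results.

```python
from statistics import mean, pstdev


def balance_report(cpu_utilization):
    """Summarize how evenly CPU load is spread across nodes.

    cpu_utilization maps node name to a utilization ratio in [0, 1].
    A smaller stdev or spread means a better balanced cluster.
    """
    values = list(cpu_utilization.values())
    return {
        "mean": mean(values),
        "stdev": pstdev(values),  # population standard deviation
        "spread": max(values) - min(values),
    }


# Invented sample data: low average load, jobs piled onto one node.
low_load = {"node-a": 0.60, "node-b": 0.05, "node-c": 0.10, "node-d": 0.05}

# Invented sample data: high average load, little idle capacity anywhere.
high_load = {"node-a": 0.95, "node-b": 0.85, "node-c": 0.90, "node-d": 0.90}

low_report = balance_report(low_load)
high_report = balance_report(high_load)
print(low_report)
print(high_report)
```

In this invented example, the lightly loaded cluster shows a much larger spread between nodes than the heavily loaded one, which matches the qualitative behavior described above: there is more room for a load-aware scheduler to help when most nodes are idle.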
