Installing OmniScheduler
OmniScheduler obtains node load ranking information from LoadsMetric and preferentially schedules jobs onto nodes with lower loads. Install OmniScheduler on the server1 node only. Before the installation, deploy Hadoop 3.3.4 on all nodes.
Background
OmniScheduler is deployed as a plugin in the form of a JAR package. The plugin overrides the open source Hadoop classes listed in Table 1.
| Class Name | JAR Package | JAR Package Directory |
|---|---|---|
| org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler | hadoop-yarn-server-resourcemanager-3.3.4.jar | ${HADOOP_HOME}/share/hadoop/yarn/ |
| org.apache.hadoop.yarn.server.resourcemanager.webapp.dao.SchedulerInfo | hadoop-yarn-server-resourcemanager-3.3.4.jar | ${HADOOP_HOME}/share/hadoop/yarn/ |
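To see where the stock versions of these classes live before installing the plugin, the JAR contents can be listed. The following is a minimal sketch, not part of the official procedure. It assumes that `unzip` is installed (`jar tf` would work similarly) and that `HADOOP_HOME` is set. The helper name `class_entry` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: look up the two classes from Table 1 in the stock ResourceManager JAR.

class_entry() {
    # Convert a fully qualified class name to its entry path inside a JAR.
    printf '%s.class\n' "$(printf '%s' "$1" | tr '.' '/')"
}

JAR="${HADOOP_HOME:-}/share/hadoop/yarn/hadoop-yarn-server-resourcemanager-3.3.4.jar"

for cls in \
    org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler \
    org.apache.hadoop.yarn.server.resourcemanager.webapp.dao.SchedulerInfo; do
    entry="$(class_entry "$cls")"
    # The listing is only attempted when the JAR file exists.
    if [ -f "$JAR" ] && unzip -l "$JAR" | grep -qF "$entry"; then
        echo "found:     $entry"
    else
        echo "not found: $entry"
    fi
done
```

A "not found" result on a node with Hadoop 3.3.4 deployed usually points to an unset `HADOOP_HOME` or a different Hadoop version.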
Procedure
- Place the OmniScheduler plugin JAR package in the corresponding Hadoop directory.

      cp /home/hadoop/loadsmetric-software/boostkit-yarn-schedule-load-evolution-1.0.0.jar ${HADOOP_HOME}/share/hadoop/yarn/lib

- Modify the yarn-site.xml configuration file.
  - Open the configuration file.

        vi ${HADOOP_HOME}/etc/hadoop/yarn-site.xml

  - Press i to enter the insert mode and add the following content to the file:

        <!-- Set the RM to use LoadBasedCapacityScheduler. -->
        <property>
          <name>yarn.resourcemanager.scheduler.class</name>
          <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LoadBasedCapacityScheduler</value>
        </property>

  - Press Esc, type :wq!, and press Enter to save the file and exit.
- Modify the capacity-scheduler.xml configuration file.
  - Open the configuration file.

        vi ${HADOOP_HOME}/etc/hadoop/capacity-scheduler.xml

  - Press i to enter the insert mode and add the following content to the file:

        <!-- Set whether to enable asynchronous scheduling. -->
        <property>
          <name>yarn.scheduler.capacity.schedule-asynchronously.enable</name>
          <value>true</value>
        </property>
        <!-- Asynchronous scheduling period -->
        <property>
          <name>yarn.scheduler.capacity.schedule-asynchronously.scheduling-interval-ms</name>
          <value>1000</value>
        </property>
        <!-- Percentage of Node Managers involved in generating containers -->
        <property>
          <name>yarn.scheduler.capacity.schedule-asynchronously.global.allocate-percent</name>
          <value>50.0</value>
        </property>
        <!-- Interval for requesting LOADS_METRIC_SERVER. The default value is 1000 milliseconds. -->
        <property>
          <name>yarn.scheduler.capacity.loads-metric-server.request-interval-ms</name>
          <value>1000</value>
        </property>
        <!-- Address for requesting LOADS_METRIC_SERVER (Domain_name:Port), for example, server1:9090. -->
        <property>
          <name>yarn.scheduler.capacity.loads-metric-server.address</name>
          <value>server1:9090</value>
        </property>

  - Press Esc, type :wq!, and press Enter to save the file and exit.
- Synchronize the configuration files to the other nodes.

      scp ${HADOOP_HOME}/etc/hadoop/yarn-site.xml agent1:${HADOOP_HOME}/etc/hadoop/
      scp ${HADOOP_HOME}/etc/hadoop/capacity-scheduler.xml agent1:${HADOOP_HOME}/etc/hadoop/
      scp ${HADOOP_HOME}/etc/hadoop/yarn-site.xml agent2:${HADOOP_HOME}/etc/hadoop/
      scp ${HADOOP_HOME}/etc/hadoop/capacity-scheduler.xml agent2:${HADOOP_HOME}/etc/hadoop/
      scp ${HADOOP_HOME}/etc/hadoop/yarn-site.xml agent3:${HADOOP_HOME}/etc/hadoop/
      scp ${HADOOP_HOME}/etc/hadoop/capacity-scheduler.xml agent3:${HADOOP_HOME}/etc/hadoop/

- Restart the YARN cluster.

      ${HADOOP_HOME}/sbin/stop-yarn.sh
      ${HADOOP_HOME}/sbin/start-yarn.sh

- Check whether OmniScheduler has taken effect. If it has, Scheduler Type is displayed as LoadBasedCapacityScheduler on the Hadoop management page (http://server1:8088/).
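The edit and synchronization steps above can be scripted to reduce copy errors. The following is a minimal sketch, not part of the official procedure. It checks that the properties from the procedure are present in the local files, then prints the scp commands for agent1 to agent3 as a dry run. The function names `check_conf` and `sync_configs` are hypothetical, and the `/opt/hadoop` fallback is only a placeholder for when `HADOOP_HOME` is unset.

```shell
#!/usr/bin/env bash
# Sketch: check the edited files on server1, then copy them to the agent nodes.
# /opt/hadoop is only a placeholder used when HADOOP_HOME is unset.
CONF_DIR="${HADOOP_HOME:-/opt/hadoop}/etc/hadoop"
NODES="agent1 agent2 agent3"    # node names used in the procedure

# "file:property" pairs added in the procedure.
REQUIRED="
yarn-site.xml:yarn.resourcemanager.scheduler.class
capacity-scheduler.xml:yarn.scheduler.capacity.schedule-asynchronously.enable
capacity-scheduler.xml:yarn.scheduler.capacity.schedule-asynchronously.scheduling-interval-ms
capacity-scheduler.xml:yarn.scheduler.capacity.schedule-asynchronously.global.allocate-percent
capacity-scheduler.xml:yarn.scheduler.capacity.loads-metric-server.request-interval-ms
capacity-scheduler.xml:yarn.scheduler.capacity.loads-metric-server.address
"

check_conf() {
    # $1: configuration directory. Plain-text match on <name> elements;
    # this does not parse XML or validate the values.
    local dir="$1" entry file name missing=0
    for entry in $REQUIRED; do
        file="${entry%%:*}"
        name="${entry#*:}"
        if ! grep -qF "<name>${name}</name>" "${dir}/${file}" 2>/dev/null; then
            echo "missing: ${name} in ${file}" >&2
            missing=1
        fi
    done
    return "$missing"
}

sync_configs() {
    # $1: "echo" prints the scp commands (dry run); "" runs them.
    local prefix="$1" node file
    for node in $NODES; do
        for file in yarn-site.xml capacity-scheduler.xml; do
            $prefix scp "${CONF_DIR}/${file}" "${node}:${CONF_DIR}/"
        done
    done
}

if check_conf "$CONF_DIR"; then
    sync_configs echo    # dry run; call `sync_configs ""` to copy for real
else
    echo "Fix the missing properties before synchronizing." >&2
fi
```

The dry run prints the same six scp commands as the manual step, in the same order, so the output can be compared with the procedure before any file is copied.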
