Installing OmniScheduler

Install OmniScheduler on the server1 node only. Before the installation, deploy Hadoop 3.3.4 on all nodes.

Description

OmniScheduler is deployed as a plugin through the OmniScheduler JAR package, which overwrites some classes of open source Hadoop.

Table 1 lists these open source Hadoop classes.

Table 1 Classes of the open source Hadoop

| Class Name | JAR Package | JAR Package Directory |
|---|---|---|
| org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler | hadoop-yarn-server-resourcemanager-3.3.4.jar | ${HADOOP_HOME}/share/hadoop/yarn/ |
| org.apache.hadoop.yarn.server.resourcemanager.webapp.dao.SchedulerInfo | hadoop-yarn-server-resourcemanager-3.3.4.jar | ${HADOOP_HOME}/share/hadoop/yarn/ |
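
Before installing the plugin, it can be useful to confirm that the stock JAR package actually contains the classes listed in Table 1. The following is a minimal sketch, assuming that the unzip tool is installed and that HADOOP_HOME is set. The helper names class_to_entry and jar_has_class are illustrative only and are not part of Hadoop or OmniScheduler.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of Hadoop or OmniScheduler.

# Convert a fully qualified class name to its entry path inside a JAR,
# for example a.b.C becomes a/b/C.class.
class_to_entry() {
  printf '%s.class\n' "$(printf '%s' "$1" | tr '.' '/')"
}

# Return 0 if the JAR ($1) lists the class ($2). Assumes unzip is installed.
jar_has_class() {
  unzip -l "$1" | grep -qF "$(class_to_entry "$2")"
}

# Example usage on server1 (uncomment to run):
# jar_has_class \
#   "${HADOOP_HOME}/share/hadoop/yarn/hadoop-yarn-server-resourcemanager-3.3.4.jar" \
#   org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler \
#   && echo "class found"
```

If a class is not found, the Hadoop version or installation directory probably differs from what this procedure assumes.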

Procedure

  1. Copy the OmniScheduler plugin JAR package to the lib directory of Hadoop YARN.
    cp /home/hadoop/loadsmetric-software/boostkit-yarn-schedule-load-evolution-1.0.0.jar ${HADOOP_HOME}/share/hadoop/yarn/lib
  2. Modify the yarn-site.xml configuration file.
    1. Open the configuration file.
      vi ${HADOOP_HOME}/etc/hadoop/yarn-site.xml
    2. Press i to enter the insert mode and add the following content inside the <configuration> element of the file:
      <!-- Set the RM to use LoadBasedCapacityScheduler. -->
      <property>
        <name>yarn.resourcemanager.scheduler.class</name>
        <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LoadBasedCapacityScheduler</value>
      </property>
      
    3. Press Esc, type :wq!, and press Enter to save the file and exit.
  3. Modify the capacity-scheduler.xml configuration file.
    1. Open the configuration file.
      vi ${HADOOP_HOME}/etc/hadoop/capacity-scheduler.xml
    2. Press i to enter the insert mode and add the following content inside the <configuration> element of the file:
      <!-- Set whether to enable asynchronous scheduling. -->
      <property>
        <name>yarn.scheduler.capacity.schedule-asynchronously.enable</name>
        <value>true</value>
      </property>
      <!-- Asynchronous scheduling interval, in milliseconds. -->
      <property>
        <name>yarn.scheduler.capacity.schedule-asynchronously.scheduling-interval-ms</name>
        <value>1000</value>
      </property>
      <!-- Percentage of Node Managers involved in generating containers -->
      <property>
        <name>yarn.scheduler.capacity.schedule-asynchronously.global.allocate-percent</name>
        <value>50.0</value>
      </property>
      <!-- Interval for requesting LOADS_METRIC_SERVER. The default value is 1000 milliseconds. -->
      <property>
        <name>yarn.scheduler.capacity.loads-metric-server.request-interval-ms</name>
        <value>1000</value>
      </property>
      <!-- Address for requesting LOADS_METRIC_SERVER (Domain_name:Port), for example, server1:9090. -->
      <property>
        <name>yarn.scheduler.capacity.loads-metric-server.address</name>
        <value>server1:9090</value>
      </property>
    3. Press Esc, type :wq!, and press Enter to save the file and exit.
  4. Synchronize both configuration files to the other nodes (agent1, agent2, and agent3).
    scp ${HADOOP_HOME}/etc/hadoop/yarn-site.xml agent1:${HADOOP_HOME}/etc/hadoop/
    scp ${HADOOP_HOME}/etc/hadoop/capacity-scheduler.xml agent1:${HADOOP_HOME}/etc/hadoop/
    scp ${HADOOP_HOME}/etc/hadoop/yarn-site.xml agent2:${HADOOP_HOME}/etc/hadoop/
    scp ${HADOOP_HOME}/etc/hadoop/capacity-scheduler.xml agent2:${HADOOP_HOME}/etc/hadoop/
    scp ${HADOOP_HOME}/etc/hadoop/yarn-site.xml agent3:${HADOOP_HOME}/etc/hadoop/
    scp ${HADOOP_HOME}/etc/hadoop/capacity-scheduler.xml agent3:${HADOOP_HOME}/etc/hadoop/
  5. Restart the YARN cluster.
    ${HADOOP_HOME}/sbin/stop-yarn.sh
    ${HADOOP_HOME}/sbin/start-yarn.sh
  6. Check whether the OmniScheduler JAR package has taken effect. Open the Hadoop management page (http://server1:8088/). If the plugin has taken effect, the value of Scheduler Type is LoadBasedCapacityScheduler.
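
The configuration edits in steps 2 and 3 and the file synchronization in step 4 can also be scripted. The following is a minimal sketch, not a definitive implementation. It assumes that each <name> and <value> element sits on its own line, as in the snippets above, and that passwordless scp to the agent nodes is already set up. The function names get_property, check_scheduler_class, and sync_configs and the DRY_RUN variable are hypothetical and are not part of Hadoop or OmniScheduler.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of Hadoop or OmniScheduler.

# Print the <value> that follows a given <name> in a Hadoop XML configuration
# file. Assumes <name> and <value> are each on their own line.
get_property() {
  awk -v n="<name>$2</name>" '
    index($0, n) { found = 1; next }
    found && /<value>/ {
      sub(/.*<value>/, ""); sub(/<\/value>.*/, "")
      print; exit
    }
  ' "$1"
}

# Return 0 if yarn-site.xml ($1) selects LoadBasedCapacityScheduler.
check_scheduler_class() {
  expected="org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LoadBasedCapacityScheduler"
  [ "$(get_property "$1" yarn.resourcemanager.scheduler.class)" = "$expected" ]
}

# Copy both configuration files from the directory $1 to every node listed
# after it. Set DRY_RUN=1 to print the scp commands instead of running them.
sync_configs() {
  conf_dir="$1"; shift
  for node in "$@"; do
    for f in yarn-site.xml capacity-scheduler.xml; do
      if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "scp ${conf_dir}/${f} ${node}:${conf_dir}/"
      else
        scp "${conf_dir}/${f}" "${node}:${conf_dir}/" || return 1
      fi
    done
  done
}

# Example usage on server1 (uncomment to run):
# check_scheduler_class "${HADOOP_HOME}/etc/hadoop/yarn-site.xml" \
#   && sync_configs "${HADOOP_HOME}/etc/hadoop" agent1 agent2 agent3
```

Running the check before the synchronization means that a mistyped scheduler class is caught on server1 rather than after the YARN restart in step 5.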