Installing the OmniScheduler Yarn Load-Based Scheduling Algorithm

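Perform the installation steps for the OmniScheduler Yarn load-based scheduling algorithm on the server1 node. Before you start, Hadoop 3.3.4 must be deployed on all nodes.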
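The Hadoop 3.3.4 prerequisite can be checked with a short script before installing. The following is a minimal sketch, not part of the product: it assumes passwordless SSH between nodes and that `hadoop` is on the remote PATH, and the function names `parse_hadoop_version` and `check_hadoop_version` are hypothetical.

```shell
#!/bin/sh
# Sketch: confirm that every node reports the expected Hadoop version.
# Assumptions: passwordless SSH and `hadoop` available on each node's PATH.

# Read `hadoop version` output on stdin and print the version from its first line.
parse_hadoop_version() {
  sed -n '1s/^Hadoop[[:space:]]*//p'
}

# Print one status line per host; return 1 if any host does not match.
check_hadoop_version() {
  expected="$1"
  shift
  status=0
  for host in "$@"; do
    # -n keeps ssh from reading the script's standard input.
    actual="$(ssh -n "$host" hadoop version 2>/dev/null | parse_hadoop_version)"
    if [ "$actual" = "$expected" ]; then
      echo "$host: OK ($actual)"
    else
      echo "$host: expected $expected, got '${actual:-unknown}'"
      status=1
    fi
  done
  return "$status"
}
```

With the node names used in this guide, the check would be invoked as `check_hadoop_version 3.3.4 server1 agent1 agent2 agent3`.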

Note

The OmniScheduler Yarn load-based scheduling algorithm is deployed as a plug-in JAR package that overrides native Hadoop classes.

Table 1 lists the native Hadoop classes that are overridden.

Table 1 Overridden native classes

| Class name | Native JAR package | JAR package directory |
| --- | --- | --- |
| org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler | hadoop-yarn-server-resourcemanager-3.3.4.jar | ${HADOOP_HOME}/share/hadoop/yarn/ |
| org.apache.hadoop.yarn.server.resourcemanager.webapp.dao.SchedulerInfo | hadoop-yarn-server-resourcemanager-3.3.4.jar | ${HADOOP_HOME}/share/hadoop/yarn/ |
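To see where the overridden classes live before installing the plug-in, the native JAR can be inspected. The sketch below is illustrative only: it assumes `unzip` is installed, and the helper names `class_to_entry` and `jar_has_class` are hypothetical.

```shell
#!/bin/sh
# Sketch: check that the overridden classes exist in the native JAR.
# Assumes `unzip` is installed; helper names are hypothetical.

# Convert a fully qualified class name into its entry path inside a JAR.
class_to_entry() {
  printf '%s.class\n' "$(printf '%s' "$1" | tr '.' '/')"
}

# Return 0 if the JAR ($1) contains the class ($2).
jar_has_class() {
  unzip -l "$1" 2>/dev/null | grep -qF -- "$(class_to_entry "$2")"
}

# Only run the check when HADOOP_HOME is set and the JAR is present.
jar="${HADOOP_HOME:-}/share/hadoop/yarn/hadoop-yarn-server-resourcemanager-3.3.4.jar"
if [ -n "${HADOOP_HOME:-}" ] && [ -f "$jar" ]; then
  for class in \
    org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler \
    org.apache.hadoop.yarn.server.resourcemanager.webapp.dao.SchedulerInfo
  do
    if jar_has_class "$jar" "$class"; then
      echo "found:   $class"
    else
      echo "missing: $class"
    fi
  done
fi
```

The same `jar_has_class` helper could be pointed at the plug-in JAR to see which of these classes it ships.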

Installation Procedure

  1. Copy the OmniScheduler Yarn load-based scheduling algorithm plug-in JAR package to the corresponding Hadoop directory.

    cp /home/hadoop/loadsmetric-software/boostkit-yarn-schedule-load-evolution-1.0.0.jar ${HADOOP_HOME}/share/hadoop/yarn/lib

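  2. Modify the yarn-site.xml configuration file.

    1. Open the configuration file.
      vi ${HADOOP_HOME}/etc/hadoop/yarn-site.xml
    2. Press "i" to enter insert mode and append the following content inside the <configuration> element of yarn-site.xml.
      <!-- Set the ResourceManager to use LoadBasedCapacityScheduler -->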
      <property>
        <name>yarn.resourcemanager.scheduler.class</name>
        <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LoadBasedCapacityScheduler</value>
      </property>
      
    3. Press "Esc", type :wq!, and press "Enter" to save the file and exit.

  3. Modify the capacity-scheduler.xml configuration file.

    1. Open the configuration file.
      vi ${HADOOP_HOME}/etc/hadoop/capacity-scheduler.xml
    2. Press "i" to enter insert mode and append the following content inside the <configuration> element of capacity-scheduler.xml.
      <!-- Whether to enable asynchronous scheduling -->
      <property>
        <name>yarn.scheduler.capacity.schedule-asynchronously.enable</name>
        <value>true</value>
      </property>
      <!-- Asynchronous scheduling interval, in milliseconds -->
      <property>
        <name>yarn.scheduler.capacity.schedule-asynchronously.scheduling-interval-ms</name>
        <value>1000</value>
      </property>
      <!-- Percentage of containers allocated evenly across NodeManagers (NMs) -->
      <property>
        <name>yarn.scheduler.capacity.schedule-asynchronously.global.allocate-percent</name>
        <value>50.0</value>
      </property>
      <!-- Interval for querying LOADS_METRIC_SERVER; the default is 1000 ms -->
      <property>
        <name>yarn.scheduler.capacity.loads-metric-server.request-interval-ms</name>
        <value>1000</value>
      </property>
      <!-- Address of LOADS_METRIC_SERVER (hostname:port), for example, server1:9090 -->
      <property>
        <name>yarn.scheduler.capacity.loads-metric-server.address</name>
        <value>server1:9090</value>
      </property>
    3. Press "Esc", type :wq!, and press "Enter" to save the file and exit.

  4. Synchronize the modified configuration files to the other nodes in the cluster.

    scp ${HADOOP_HOME}/etc/hadoop/yarn-site.xml agent1:${HADOOP_HOME}/etc/hadoop/
    scp ${HADOOP_HOME}/etc/hadoop/capacity-scheduler.xml agent1:${HADOOP_HOME}/etc/hadoop/
    scp ${HADOOP_HOME}/etc/hadoop/yarn-site.xml agent2:${HADOOP_HOME}/etc/hadoop/
    scp ${HADOOP_HOME}/etc/hadoop/capacity-scheduler.xml agent2:${HADOOP_HOME}/etc/hadoop/
    scp ${HADOOP_HOME}/etc/hadoop/yarn-site.xml agent3:${HADOOP_HOME}/etc/hadoop/
    scp ${HADOOP_HOME}/etc/hadoop/capacity-scheduler.xml agent3:${HADOOP_HOME}/etc/hadoop/

  5. Restart the Yarn cluster.

    ${HADOOP_HOME}/sbin/stop-yarn.sh 
    ${HADOOP_HOME}/sbin/start-yarn.sh

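  6. Verify that the OmniScheduler Yarn load-based scheduling algorithm JAR package has taken effect: open the Hadoop management page at http://server1:8088/ and check that Scheduler Type shows LoadBasedCapacityScheduler.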
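The configuration check and file synchronization in the procedure can also be scripted. The following is a minimal sketch under stated assumptions: it assumes each `<name>` and `<value>` element sits on a single line (as in the snippets in this guide), that passwordless `scp` works between nodes, and that `HADOOP_HOME` is identical on every node. The function names `get_property` and `sync_configs` are hypothetical.

```shell
#!/bin/sh
# Sketch: read a property from a Hadoop XML file and copy the edited files to other nodes.
# Assumes single-line <name>/<value> elements; helper names are hypothetical.

# Print the <value> that follows the <name> equal to $2 in file $1 (empty if absent).
get_property() {
  awk -v key="$2" '
    /<name>/ {
      line = $0
      sub(/.*<name>/, "", line)
      sub(/<\/name>.*/, "", line)
      found = (line == key)
    }
    found && /<value>/ {
      line = $0
      sub(/.*<value>/, "", line)
      sub(/<\/value>.*/, "", line)
      print line
      exit
    }
  ' "$1"
}

# Copy both edited configuration files to every host given as an argument.
sync_configs() {
  for host in "$@"; do
    for f in yarn-site.xml capacity-scheduler.xml; do
      scp "${HADOOP_HOME}/etc/hadoop/${f}" "${host}:${HADOOP_HOME}/etc/hadoop/" || return 1
    done
  done
}
```

For example, `get_property ${HADOOP_HOME}/etc/hadoop/yarn-site.xml yarn.resourcemanager.scheduler.class` should print the LoadBasedCapacityScheduler class name once step 2 is done, and `sync_configs agent1 agent2 agent3` mirrors the scp commands in step 4.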