
Preparations

Before starting Hadoop in YARN mode, complete the necessary Hadoop configurations, and then restart Hadoop.

  • Hadoop can be started in either normal mode or YARN mode.
  • When Hadoop is started in YARN mode and the Hadoop version is 3.2.2 or later, recompile container-executor first. Otherwise, Hadoop cannot be started. For details, see Failed to Start Hadoop Using Yarn.
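Because the recompilation requirement depends on the Hadoop version, it can help to check the version first. The following is a minimal sketch; the needs_recompile helper is illustrative, not part of Hadoop, and assumes plain x.y.z version strings:

```shell
# needs_recompile VERSION: succeed when VERSION >= 3.2.2, i.e. when
# container-executor must be recompiled before starting in YARN mode.
# (Illustrative helper; assumes plain x.y.z version strings.)
needs_recompile() {
  ver="$1"
  # sort -V orders version strings numerically; if 3.2.2 sorts first
  # (or ties), then $ver >= 3.2.2.
  [ "$(printf '%s\n' "3.2.2" "$ver" | sort -V | head -n 1)" = "3.2.2" ]
}

# The running version can be obtained with: hadoop version | head -n 1
needs_recompile "3.3.6" && echo "recompile container-executor first"
```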
  1. Modify the Hadoop configuration file on the server node.
    1. Modify ${HADOOP_HOME}/etc/hadoop/yarn-site.xml.
      1. Enable the container-executor configuration and set the user group to ockadmin, as shown in the following example:
        <property>
            <name>yarn.nodemanager.container-executor.class</name>
            <value>org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor</value>
        </property>
        <property>
            <name>yarn.nodemanager.linux-container-executor.group</name>
            <value>ockadmin</value>
        </property>
        <property>
            <name>yarn.nodemanager.linux-container-executor.nonsecure-mode.limit-users</name>
            <value>false</value>
        </property>
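As a quick sanity check after editing, you can grep yarn-site.xml for the executor class. The has_lce helper below is illustrative, not part of Hadoop, and is a coarse text check rather than a full XML validation:

```shell
# has_lce FILE: succeed if FILE mentions the LinuxContainerExecutor class.
# (Illustrative helper; a coarse grep check, not XML validation.)
has_lce() {
  grep -q 'org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor' "$1"
}

# Usage on the real file:
#   has_lce "$HADOOP_HOME/etc/hadoop/yarn-site.xml" && echo "executor configured"
```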
      2. Enable the Node Labels feature.
        Figure 1 Enabling Node Labels

        The following is a configuration example of yarn-site.xml:

        <property>
            <name>yarn.node-labels.enabled</name>
            <value>true</value>
        </property>
        <property>
            <name>yarn.node-labels.fs-store.root-dir</name>
            <value>/tmp/node-labels</value>
        </property>
        <property>
            <name>yarn.resourcemanager.scheduler.class</name>
            <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
        </property>
      3. Restart the ResourceManager for the configurations to take effect.
        • Query the node list to check whether the ResourceManager is running.
          yarn node -list
        • Stop the ResourceManager service.
          yarn --daemon stop resourcemanager
        • After the ResourceManager service is stopped, start it again.
          yarn --daemon start resourcemanager
    2. Modify the ${HADOOP_HOME}/etc/hadoop/core-site.xml file. Configure the proxy user properties for the Hadoop service user (${HADOOP_USER}, ockadmin in this example) so that it can impersonate users from any host and in any group.
      <property>
          <name>hadoop.proxyuser.ockadmin.hosts</name>
          <value>*</value>
      </property>
      <property>
          <name>hadoop.proxyuser.ockadmin.groups</name>
          <value>*</value>
      </property>
    3. Switch to the root user and create the /etc/hadoop/container-executor.cfg file.
      mkdir -p /etc/hadoop/
      touch /etc/hadoop/container-executor.cfg

      Add the following content to the file:

      yarn.nodemanager.linux-container-executor.group=ockadmin #configured value of yarn.nodemanager.linux-container-executor.group
      banned.users= #comma separated list of users who can not run applications
      min.user.id=1000 #Prevent other super-users
      allowed.system.users= #comma separated list of system users who CAN run applications
      feature.tc.enabled=false
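Instead of creating the file and editing it by hand, the same content can be written in one step with a here-document. This is a sketch: CFG points at a scratch path here for illustration, while the real target is /etc/hadoop/container-executor.cfg, written as root.

```shell
# Write the container-executor.cfg content in one step.
# CFG is a scratch path for illustration; the real target is
# /etc/hadoop/container-executor.cfg (created as root).
CFG=./container-executor.cfg
cat > "$CFG" <<'EOF'
yarn.nodemanager.linux-container-executor.group=ockadmin
banned.users=
min.user.id=1000
allowed.system.users=
feature.tc.enabled=false
EOF
```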
    4. Switch to the root user and change the permissions on the /etc/hadoop directory and its contents to 755.
      chmod -R 755 /etc/hadoop
    5. Perform the following operations on all nodes.
      1. Switch to the ockadmin user. Create the .hadooprc file in the user's home directory and change its permission to 640.
        su - ockadmin
        touch .hadooprc
        chmod 640 .hadooprc
      2. Add the following content to the file:
        export HADOOP_USER_NAME=ockadmin
        export HDFS_DATANODE_USER=ockadmin
        export HDFS_NAMENODE_USER=ockadmin
        export HDFS_SECONDARYNAMENODE_USER=ockadmin
        export YARN_RESOURCEMANAGER_USER=ockadmin
        export YARN_NODEMANAGER_USER=ockadmin
  2. Distribute yarn-site.xml, core-site.xml, and container-executor.cfg to all nodes. Then, run the following commands on each node to set the ownership and permissions of container-executor:
    chown root:ockadmin $HADOOP_HOME/bin/container-executor
    chmod 6050 $HADOOP_HOME/bin/container-executor
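Mode 6050 combines the setuid (4) and setgid (2) bits with permissions 050: the binary runs with its owner's (root's) identity, only members of the ockadmin group may execute it, and all other access is denied. A quick way to see what the mode looks like, using a scratch file and GNU coreutils stat:

```shell
# Inspect what mode 6050 means, using a scratch file (GNU coreutils stat).
f=$(mktemp)
chmod 6050 "$f"
stat -c '%a %A' "$f"   # octal mode and symbolic permission string
rm -f "$f"
```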
  3. Restart Hadoop.
    Go to the Hadoop sbin directory and restart Hadoop.
    cd $HADOOP_HOME/sbin
    ./stop-all.sh
    ./start-all.sh

    After Hadoop restarts, run the jps command to check whether the ResourceManager has started. If it has not, check the ResourceManager logs to find the cause. If startup failed because HDFS is in safe mode, leave safe mode and then restart Hadoop:

    hadoop dfsadmin -safemode leave
    ./start-all.sh