Solution for Yarn Failing to Start Hadoop

Symptom

When Yarn starts Hadoop (version 3.2.2 or later), the NodeManager log reports the following messages.

2024-09-19 15:41:42,256 INFO org.apache.hadoop.service.AbstractService: Service NodeManager failed in state INITED
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to initialize container executor
...
Caused by: ExitCodeException exitCode=24: File /home/sparkadmin/hadoop-3.2.0/etc/hadoop/container-executor.cfg must be owned by root, but is owned by 1002

Key Process and Root Cause Analysis

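LinuxContainerExecutor launches containers through the container-executor binary. For system security, the configuration file that the binary depends on, container-executor.cfg, and every parent directory on its path must be owned by the root user. The failure occurs because the configuration path compiled into container-executor is the default path, which here resolves to a file under the Hadoop installation directory owned by a non-root user (uid 1002 in the log above). To point the binary at a root-owned configuration directory, container-executor must be recompiled.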
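
The ownership requirement can be checked before restarting the NodeManager. The following is a minimal sketch, assuming a Linux host with GNU coreutils `stat`; the function name `list_non_root_owned` is hypothetical and not part of Hadoop. It walks from the given absolute path up to `/` and prints every component whose owner is not root.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Hadoop): print every component of an
# absolute path, the file itself and each parent directory, whose owner
# is not root (uid 0). Assumes GNU coreutils `stat` (the -c option).
list_non_root_owned() {
    local path="$1"
    local uid
    while :; do
        # Read the numeric owner id; fail if the path does not exist.
        uid=$(stat -c '%u' "$path") || return 2
        if [ "$uid" -ne 0 ]; then
            printf '%s owned by uid %s\n' "$path" "$uid"
        fi
        # Stop at the filesystem root (or at "." for a relative path).
        case "$path" in
            /|.) break ;;
        esac
        path=$(dirname "$path")
    done
}

# Example, using the path from the log above:
# list_non_root_owned /home/sparkadmin/hadoop-3.2.0/etc/hadoop/container-executor.cfg
```

An empty output means every component is owned by root; any printed line identifies a file or directory that would trigger the error shown above.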

Conclusion, Solution, and Result

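  1. Obtain the Hadoop source package for the matching version (Hadoop 3.2.0-RC1 is used as an example here): https://github.com/apache/hadoop/releases/tag/release-3.2.0-RC1
  2. Extract the source package and go to the “hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager” directory.
  3. Run the following commands to compile.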

    cmake src -DHADOOP_CONF_DIR=/etc/hadoop
    make

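  4. Go to the “target/native/target/usr/local/bin” directory, take the container-executor file, change its owner to root, and use it to replace the corresponding binary in the system environment.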
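
Step 4 can be sketched as a dry run that only prints the commands it would execute. This is an illustration under stated assumptions, not a definitive procedure: the function name `plan_install` and its two arguments are hypothetical, the installed binary is assumed to live at `bin/container-executor` under the Hadoop installation directory, and the backup copy is an extra precaution that the steps above do not require.

```shell
#!/usr/bin/env bash
# Hypothetical dry-run helper: print the commands for step 4 without
# running them. Review the output, then run the commands as root.
plan_install() {
    local build_dir="$1"     # nodemanager module directory from step 2
    local hadoop_home="$2"   # assumed Hadoop installation directory
    local built="$build_dir/target/native/target/usr/local/bin/container-executor"
    local target="$hadoop_home/bin/container-executor"   # assumed install location

    echo "cp -p $target $target.bak"   # back up the existing binary (extra precaution)
    echo "cp $built $target"           # replace it with the rebuilt binary
    echo "chown root $target"          # the binary must be owned by root
}

# Example with placeholder paths:
# plan_install /path/to/hadoop-yarn-server-nodemanager /path/to/hadoop
```

Printing the commands first makes it easy to confirm both paths before touching the installed binary.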