Hadoop所有的配置文件都在“$HADOOP_HOME/etc/hadoop”目录下,修改以下配置文件前,需要切换到“HADOOP_HOME/etc/hadoop”目录。
1 | cd $HADOOP_HOME/etc/hadoop |
修改环境变量JAVA_HOME为绝对路径,并修改用户为root。
1 2 3 4 | echo "export JAVA_HOME=/usr/local/jdk8u252-b09" >> hadoop-env.sh echo "export HDFS_NAMENODE_USER=root" >> hadoop-env.sh echo "export HDFS_SECONDARYNAMENODE_USER=root" >> hadoop-env.sh echo "export HDFS_DATANODE_USER=root" >> hadoop-env.sh |
修改用户为root。
1 2 3 | echo "export YARN_REGISTRYDNS_SECURE_USER=root" >> yarn-env.sh echo "export YARN_RESOURCEMANAGER_USER=root" >> yarn-env.sh echo "export YARN_NODEMANAGER_USER=root" >> yarn-env.sh |
1 | mkdir /home/hadoop_tmp_dir
|
1 | vi core-site.xml
|
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 | <configuration> <property> <name>fs.defaultFS</name> <value>hdfs://server1:9000</value> </property> <property> <name>hadoop.tmp.dir</name> <value>/home/hadoop_tmp_dir</value> </property> <property> <name>ipc.client.connect.max.retries</name> <value>100</value> </property> <property> <name>ipc.client.connect.retry.interval</name> <value>10000</value> </property> <property> <name>hadoop.proxyuser.root.hosts</name> <value>*</value> </property> <property> <name>hadoop.proxyuser.root.groups</name> <value>*</value> </property> </configuration> |
举例:
1 | mkdir -p /data/data{1,2,3,4,5,6,7,8,9,10,11,12}/hadoop |
1 | vi hdfs-site.xml
|
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 | <configuration> <property> <name>dfs.replication</name> <value>3</value> </property> <property> <name>dfs.namenode.name.dir</name> <value>/data/data1/hadoop/nn</value> </property> <property> <name>dfs.datanode.data.dir</name> <value>/data/data1/hadoop/dn,/data/data2/hadoop/dn,/data/data3/hadoop/dn,/data/data4/hadoop/dn,/data/data5/hadoop/dn,/data/data6/hadoop/dn,/data/data7/hadoop/dn,/data/data8/hadoop/dn,/data/data9/hadoop/dn,/data/data10/hadoop/dn,/data/data11/hadoop/dn,/data/data12/hadoop/dn</value> </property> <property> <name>dfs.http.address</name> <value>server1:50070</value> </property> <property> <name>dfs.namenode.http-bind-host</name> <value>0.0.0.0</value> </property> <property> <name>dfs.datanode.handler.count</name> <value>600</value> </property> <property> <name>dfs.namenode.handler.count</name> <value>600</value> </property> <property> <name>dfs.namenode.service.handler.count</name> <value>600</value> </property> <property> <name>ipc.server.handler.queue.size</name> <value>300</value> </property> <property> <name>dfs.webhdfs.enabled</name> <value>true</value> </property> </configuration> |
1 | vi mapred-site.xml
|
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 | <configuration> <property> <name>mapreduce.framework.name</name> <value>yarn</value> <final>true</final> <description>The runtime framework for executing MapReduce jobs</description> </property> <property> <name>mapreduce.job.reduce.slowstart.completedmaps</name> <value>0.88</value> </property> <property> <name>mapreduce.application.classpath</name> <value> /usr/local/hadoop/etc/hadoop, /usr/local/hadoop/share/hadoop/common/*, /usr/local/hadoop/share/hadoop/common/lib/*, /usr/local/hadoop/share/hadoop/hdfs/*, /usr/local/hadoop/share/hadoop/hdfs/lib/*, /usr/local/hadoop/share/hadoop/mapreduce/*, /usr/local/hadoop/share/hadoop/mapreduce/lib/*, /usr/local/hadoop/share/hadoop/yarn/*, /usr/local/hadoop/share/hadoop/yarn/lib/* </value> </property> <property> <name>mapreduce.map.memory.mb</name> <value>6144</value> </property> <property> <name>mapreduce.reduce.memory.mb</name> <value>6144</value> </property> <property> <name>mapreduce.map.java.opts</name> <value>-Xmx5530m</value> </property> <property> <name>mapreduce.reduce.java.opts</name> <value>-Xmx2765m</value> </property> <property> <name>mapred.child.java.opts</name> <value>-Xmx2048m -Xms2048m</value> </property> <property> <name>mapred.reduce.parallel.copies</name> <value>20</value> </property> <property> <name>yarn.app.mapreduce.am.env</name> <value>HADOOP_MAPRED_HOME=/usr/local/hadoop</value> </property> <property> <name>mapreduce.map.env</name> <value>HADOOP_MAPRED_HOME=/usr/local/hadoop</value> </property> <property> <name>mapreduce.reduce.env</name> <value>HADOOP_MAPRED_HOME=/usr/local/hadoop</value> </property> <property> <name>mapreduce.job.counters.max</name> <value>1000</value> </property> </configuration> |
举例:
1 | mkdir -p /data/data{1,2,3,4,5,6,7,8,9,10,11,12}/hadoop/yarn |
1 | vi yarn-site.xml
|
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 | <configuration> <property> <name>yarn.nodemanager.aux-services</name> <value>mapreduce_shuffle</value> <final>true</final> </property> <property> <name>yarn.nodemanager.aux-services</name> <value>mapreduce_shuffle</value> </property> <property> <name>yarn.resourcemanager.hostname</name> <value>server1</value> </property> <property> <name>yarn.resourcemanager.bind-host</name> <value>0.0.0.0</value> </property> <property> <name>yarn.nodemanager.resource.memory-mb</name> <value>371200</value> </property> <property> <name>yarn.scheduler.maximum-allocation-mb</name> <value>371200</value> </property> <property> <name>yarn.scheduler.minimum-allocation-mb</name> <value>1024</value> </property> <property> <name>yarn.nodemanager.resource.cpu-vcores</name> <value>64</value> </property> <property> <name>yarn.scheduler.maximum-allocation-vcores</name> <value>64</value> </property> <property> <name>yarn.scheduler.minimum-allocation-vcores</name> <value>1</value> </property> <property> <name>yarn.log-aggregation-enable</name> <value>true</value> </property> <property> <name>yarn.client.nodemanager-connect.max-wait-ms</name> <value>300000</value> </property> <property> <name>yarn.nodemanager.vmem-pmem-ratio</name> <value>2.1</value> </property> <property> <name>yarn.nodemanager.vmem-check-enabled</name> <value>false</value> </property> <property> <name>yarn.nodemanager.pmem-check-enabled</name> <value>false</value> </property> <property> <name>yarn.application.classpath</name> <value> /usr/local/hadoop/etc/hadoop, /usr/local/hadoop/share/hadoop/common/*, /usr/local/hadoop/share/hadoop/common/lib/*, /usr/local/hadoop/share/hadoop/hdfs/*, /usr/local/hadoop/share/hadoop/hdfs/lib/*, /usr/local/hadoop/share/hadoop/mapreduce/*, /usr/local/hadoop/share/hadoop/mapreduce/lib/*, /usr/local/hadoop/share/hadoop/yarn/*, /usr/local/hadoop/share/hadoop/yarn/lib/* </value> </property> <property> <name>yarn.nodemanager.local-dirs</name> <value>/data/data1/hadoop/yarn/local,/data/data2/hadoop/yarn/local,/data/data3/hadoop/yarn/local,/data/data4/hadoop/yarn/local,/data/data5/hadoop/yarn/local,/data/data6/hadoop/yarn/local,/data/data7/hadoop/yarn/local,/data/data8/hadoop/yarn/local,/data/data9/hadoop/yarn/local,/data/data10/hadoop/yarn/local,/data/data11/hadoop/yarn/local,/data/data12/hadoop/yarn/local</value> </property> <property> <name>yarn.nodemanager.log-dirs</name><value>/data/data1/hadoop/yarn/log,/data/data2/hadoop/yarn/log,/data/data3/hadoop/yarn/log,/data/data4/hadoop/yarn/log,/data/data5/hadoop/yarn/log,/data/data6/hadoop/yarn/log,/data/data7/hadoop/yarn/log,/data/data8/hadoop/yarn/log,/data/data9/hadoop/yarn/log,/data/data10/hadoop/yarn/log,/data/data11/hadoop/yarn/log,/data/data12/hadoop/yarn/log</value> </property> <property> <name>yarn.timeline-service.enabled</name> <value>true</value> </property> <property> <name>yarn.timeline-service.hostname</name> <value>server1</value> </property> <property> <name>yarn.timeline-service.http-cross-origin.enabled</name> <value>true</value> </property> <property> <name>yarn.resourcemanager.system-metrics-publisher.enabled</name> <value>true</value> </property> </configuration> |
1 | vi workers
|
agent1 agent2 agent3