Hadoop Cluster Configuration Guide (with Download URL)

hadoop-2.6.0.tar.gz (download: https://archive.apache.org/dist/hadoop/common/hadoop-2.6.0/)

Before configuring Hadoop, map each node's IP address to its hostname: vi /etc/hosts

           <IP address>  <hostname>

              ...
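
A minimal sketch of the hosts entries, assuming the master vwmaster at 20.0.0.100 (the address shown in step 9) plus two hypothetical workers vwslave1 and vwslave2; substitute your own IPs and hostnames:

  20.0.0.100  vwmaster
  20.0.0.101  vwslave1
  20.0.0.102  vwslave2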

1. Change the ownership of the Hadoop directory

  chown -R root:root hadoop260/

2. Configure the JDK

  [root@vwmaster hadoop260]# vi etc/hadoop/hadoop-env.sh

  export JAVA_HOME=/opt/bigdata/java/jdk180
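
To confirm the JAVA_HOME setting is picked up, you can run Hadoop's version command from the installation directory; it aborts with a JAVA_HOME error if the path is wrong:

  [root@vwmaster hadoop260]# bin/hadoop version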

3. Hadoop fs (file system); when configuring a cluster, use hostnames rather than raw IPs in these values

[root@vwmaster hadoop260]# vi etc/hadoop/core-site.xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://vwmaster:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/bigdata/hadoop/hadoop260</value>
  </property>
  <property>
    <name>hadoop.proxyuser.root.hosts</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.proxyuser.root.groups</name>
    <value>*</value>
  </property>
</configuration>
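
Once the environment variables from step 8 are in place, this file can be sanity-checked with getconf, which prints the value Hadoop actually sees:

  [root@vwmaster hadoop260]# hdfs getconf -confKey fs.defaultFS
  hdfs://vwmaster:9000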

4. HDFS replication

[root@vwmaster hadoop260]# vi etc/hadoop/hdfs-site.xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
  <!--
	  <property>
		<name>dfs.hosts</name>
		<value>/opt/bigdata/hadoop/hadoop260/etc/hadoop/slaves</value>
	  </property>
	  <property>
		<name>dfs.namenode.secondary.http-address</name>
		<value>vwmaster:50090</value>
	  </property>
	  <property>
		<name>dfs.namenode.name.dir</name>
		<value>/opt/bigdata/hadoop/hdfs/namenode</value>
	  </property>
	  <property>
		<name>dfs.datanode.data.dir</name>
		<value>/opt/bigdata/hadoop/hdfs/datanode</value>
	  </property>
  -->
</configuration>
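
A replication factor of 1 keeps a single copy of each block, which only suits a one-DataNode setup; a multi-node cluster normally uses the default of 3. The effective value can be checked the same way as above:

  [root@vwmaster hadoop260]# hdfs getconf -confKey dfs.replication
  1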

5. The Hadoop MapReduce framework

[root@vwmaster hadoop260]# cp etc/hadoop/mapred-site.xml.template etc/hadoop/mapred-site.xml
[root@vwmaster hadoop260]# vi etc/hadoop/mapred-site.xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <!--
	  <property>
		<name>mapreduce.jobhistory.address</name>
		<value>vwmaster:10020</value>
	  </property>
	  <property>
		<name>mapreduce.jobhistory.webapp.address</name>
		<value>vwmaster:19888</value>
	  </property>
  -->
</configuration>
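
Once the daemons from step 11 are up, the bundled pi example is a quick way to confirm that MapReduce jobs actually run on YARN:

  [root@vwmaster hadoop260]# hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar pi 2 5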

6. YARN scheduling and resource management

#---- yarn.log-aggregation.retain-seconds: keep aggregated YARN logs for 7 days (value in seconds) ----#
#---- yarn.nodemanager.aux-services.mapreduce.shuffle.class: the class implementing the shuffle service ----#
#---- yarn.resourcemanager.hostname: hostname of the YARN ResourceManager ----#
[root@vwmaster hadoop260]# vi etc/hadoop/yarn-site.xml
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>vwmaster</value>
  </property>
 <!--
	  <property>
		<name>yarn.log-aggregation-enable</name>
		<value>true</value>
	  </property>
	  <property>
		<name>yarn.log-aggregation.retain-seconds</name>
		<value>604800</value>
	  </property>
	  <property>
		<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
		<value>org.apache.hadoop.mapred.ShuffleHandler</value>
	  </property>
  -->
</configuration>
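
After YARN starts (step 11), you can verify that every NodeManager has registered with the ResourceManager:

  [root@vwmaster hadoop260]# yarn node -list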

7. The Hadoop slaves file (renamed to workers in Hadoop 3.0 and later)

 [root@vwmaster hadoop260]# vi etc/hadoop/slaves

  localhost  =>  list every worker node here; the workers are the nodes that perform data processing (a sketch follows below)
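
A sketch of a cluster slaves file, reusing the hypothetical worker hostnames from the /etc/hosts example above (one hostname per line; drop localhost unless the master should also store data):

  vwslave1
  vwslave2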

8. Hadoop environment variables

[root@vwmaster hadoop260]# vi /etc/profile
export HADOOP_HOME=/opt/bigdata/hadoop/hadoop260
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

(The last line adds bin/ and sbin/ to PATH so that hdfs, start-all.sh, and the other Hadoop commands used below resolve from any directory.)

 Remember to activate the environment variables after editing: source /etc/profile
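
To confirm the variables took effect:

  [root@vwmaster ~]# echo $HADOOP_HOME
  /opt/bigdata/hadoop/hadoop260
  [root@vwmaster ~]# which hdfs
  /opt/bigdata/hadoop/hadoop260/bin/hdfs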

9. Format HDFS (run from hadoop260/bin/)

[root@vwmaster bin]# hdfs namenode -format

If you see SHUTDOWN_MSG: Shutting down NameNode at vwmaster/20.0.0.100, the format succeeded.

10. Install the Hadoop native library

tar -xvf hadoop-native-64-2.6.0.tar -C /opt/bigdata/hadoop/hadoop260/lib/native/

This extracts it into the hadoop260/lib/native directory.
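
You can then check that Hadoop loads the native library instead of falling back to the pure-Java implementation:

  [root@vwmaster hadoop260]# hadoop checknative -a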

11. Start Hadoop and the JobHistory server

[root@vwmaster bin]# start-all.sh

[root@vwmaster sbin]# ./mr-jobhistory-daemon.sh start historyserver
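
If everything came up, jps on the master should list NameNode, SecondaryNameNode, ResourceManager, and JobHistoryServer, plus DataNode and NodeManager on any node listed in slaves:

  [root@vwmaster sbin]# jps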

Note: starting/stopping Hadoop prompts for passwords over SSH, so configure passwordless SSH on the master; the worker nodes need the same treatment.

[root@vwmaster bin]# cd ~
[root@vwmaster ~]# cd .ssh/
[root@vwmaster .ssh]# ls
authorized_keys  id_rsa  id_rsa.pub  known_hosts
[root@vwmaster .ssh]# cat id_rsa.pub >> authorized_keys 
[root@vwmaster .ssh]# ssh localhost
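
If ~/.ssh were empty, you would first generate a keypair with ssh-keygen -t rsa. To extend the same passwordless setup to the workers (hypothetical hostnames from earlier), push the public key to each of them:

  [root@vwmaster .ssh]# ssh-copy-id root@vwslave1
  [root@vwmaster .ssh]# ssh-copy-id root@vwslave2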

12. Configure the worker nodes

 Archive the configured Hadoop installation:

  tar -zcvf hadoop.tar.gz hadoop/

Copy it to each worker (requires passwordless SSH; see step 11):

  scp hadoop.tar.gz root@hostname:/dir

Extract it on the worker:

  tar -zxvf hadoop.tar.gz -C /dir

Note: if the paths on the worker differ from the master, adjust the configuration paths and environment variables accordingly, then reload:

  source /etc/profile
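
With several workers, a small loop avoids repeating the copy and extract steps; a sketch assuming the hypothetical worker hostnames and an /opt/bigdata target directory:

  for host in vwslave1 vwslave2; do
    scp hadoop.tar.gz root@$host:/opt/bigdata/
    ssh root@$host "tar -zxvf /opt/bigdata/hadoop.tar.gz -C /opt/bigdata/"
  done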

Original post: https://www.cnblogs.com/afeiiii/p/13518928.html