搭建hadoop+zookeeper HA集群

系统环境:CentOS6.3+JDK1.7

部署资源:hadoop-2.2.0  zookeeper-3.4.5  

集群规划

  主机名        IP        安装的软件        运行的进程

    reagina01  192.168.8.201  jdk,hadoop       NameNode,DFSZKFailoverController

    reagina02  192.168.8.202  jdk,hadoop       NameNode,DFSZKFailoverController

    reagina03  192.168.8.203  jdk,hadoop       ResourceManager

    reagina04  192.168.8.204  jdk,hadoop,zookeeper  DataNode,NodeManager,JournalNode,QuorumPeerMain

    reagina05  192.168.8.205  jdk,hadoop,zookeeper  DataNode,NodeManager,JournalNode,QuorumPeerMain

    reagina06  192.168.8.206  jdk,hadoop,zookeeper  DataNode,NodeManager,JournalNode,QuorumPeerMain

在reagina01节点上拆解Hadoop:

[root@reagina01 ~]# tar -zxvf hadoop-2.2.0-64bit.tar.gz -C /reagina/

修改配置文件:

1、hadoop-env.sh

2、core-site.xml

3、hdfs-site.xml

4、mapred-site.xml

5、yarn-site.xml

6、slaves

配置免密码登录:

  首先要配置reagina01到reagina01、reagina02、reagina03、reagina04、reagina05、reagina06的免密码登录。

[root@reagina01 ~]# ssh-keygen -t rsa

[root@reagina01 ~]# ssh-copy-id reagina01

[root@reagina01 ~]# ssh-copy-id reagina02

[root@reagina01 ~]# ssh-copy-id reagina03

[root@reagina01 ~]# ssh-copy-id reagina04

[root@reagina01 ~]# ssh-copy-id reagina05

[root@reagina01 ~]# ssh-copy-id reagina06

  然后配置reagina03到reagina03、reagina04、reagina05、reagina06的免密码登录

[root@reagina03 ~]# ssh-keygen -t rsa

[root@reagina03 ~]# ssh-copy-id reagina03

[root@reagina03 ~]# ssh-copy-id reagina04

[root@reagina03 ~]# ssh-copy-id reagina05

[root@reagina03 ~]# ssh-copy-id reagina06

  注意:两个namenode之间要配置ssh免密码登录,所以还要配置reagina02到reagina01的免密码登录:

[root@reagina02 ~]# ssh-keygen -t rsa

[root@reagina02 ~]# ssh-copy-id reagina01

将配置好的Hadoop拷贝到其他节点:

[root@reagina01 ~]# scp -r /reagina/ reagina02:/

[root@reagina01 ~]# scp -r /reagina/ reagina03:/

[root@reagina01 ~]# scp -r /reagina/hadoop-2.2.0/ root@reagina04:/reagina/

[root@reagina01 ~]# scp -r /reagina/hadoop-2.2.0/ root@reagina05:/reagina/

[root@reagina01 ~]# scp -r /reagina/hadoop-2.2.0/ root@reagina06:/reagina/

reagina04、reagina05、reagina06节点上的zookeeper集群的搭建:

  http://www.cnblogs.com/reagina/p/6366363.html

启动zookeeper集群:

在reagina01上启动所有journalnode:

[root@reagina03 ~]# /reagina/hadoop-2.2.0/sbin/hadoop-daemons.sh start journalnode

格式化HDFS:

[root@reagina01 ~]# /reagina/hadoop-2.2.0/bin/hdfs namenode -format

格式化后会再根据core-site.xml中的hadoop.tmp.dir配置生成一个目录,然后将这个目录拷贝到reagina02节点:

[root@reagina01 ~]# scp -r /reagina/hadoop-2.2.0/tmp/ reagina02:/reagina/hadoop-2.2.0/

格式化ZK:

[root@reagina01 ~]# /reagina/hadoop-2.2.0/bin/hdfs zkfc -formatZK

启动HDFS:

[root@reagina01 ~]# /reagina/hadoop-2.2.0/sbin/start-dfs.sh 

启动YARN(注意:是在reagina03上执行start-yarn.sh,把namenode和resourcemanager分开是因为性能问题,因为它们都要占用大量资源,所以把它们分开了,它们分开了就要分别在不同的机器上启动)

[root@reagina03 ~]# /reagina/hadoop-2.2.0/sbin/start-yarn.sh

在namenode节点上(reagina01、reagina02)启动zookeeper选举制度:

[root@reagina01 ~]# /reagina/hadoop-2.2.0/sbin/hadoop-daemon.sh start zkfc
starting zkfc, logging to /reagina/hadoop-2.2.0/logs/hadoop-root-zkfc-reagina01.out

[root@reagina02 ~]# /reagina/hadoop-2.2.0/sbin/hadoop-daemon.sh start zkfc
starting zkfc, logging to /reagina/hadoop-2.2.0/logs/hadoop-root-zkfc-reagina02.out
[root@reagina02 ~]# jps
2631 DFSZKFailoverController
2691 Jps
2183 NameNode
[root@reagina02 ~]#

到此,hadoop2.2.0配置完毕,可以统计浏览器访问:

 

验证HDFS  HA

[root@reagina01 sbin]# hadoop fs -put /etc/profile /profile

[root@reagina01 sbin]# hadoop fs -ls /

-rw-r--r-- 3 root supergroup 1924 2017-02-05 01:10 /profile

[root@reagina01 sbin]# jps
4727 NameNode
5260 DFSZKFailoverController
5491 Jps
[root@reagina01 sbin]# kill -9 4727

原文地址:https://www.cnblogs.com/reagina/p/6366902.html