ZooKeeper-3.3.4集群安装配置

ZooKeeper是一个分布式开源框架,提供了协调分布式应用的基本服务,它向外部应用暴露一组通用服务——分布式同步(Distributed Synchronization)、命名服务(Naming Service)、集群维护(Group Maintenance)等,简化分布式应用协调及其管理的难度,提供高性能的分布式服务。ZooKeeper本身可以以Standalone模式安装运行,不过它的长处在于通过分布式ZooKeeper集群(一个Leader,多个Follower),基于一定的策略来保证ZooKeeper集群的稳定性和可用性,从而实现分布式应用的可靠性。

有关ZooKeeper的介绍,网上很多,也可以参考文章后面,我整理的一些相关链接。

下面,我们简单说明一下ZooKeeper的配置。

ZooKeeper Standalone模式

从Apache网站上(zookeeper.apache.org)下载ZooKeeper软件包,我选择了3.3.4版本的(zookeeper-3.3.4.tar.gz),在一台Linux机器上安装非常容易,只需要解压缩后,简单配置一下即可以启动ZooKeeper服务器进程。

将zookeeper-3.3.4/conf目录下面的 zoo_sample.cfg修改为zoo.cfg,配置文件内容如下所示:

[plain] view plaincopy
 
  1. tickTime=2000  
  2. dataDir=/home/hadoop/storage/zookeeper  
  3. clientPort=2181  
  4. initLimit=5  
  5. syncLimit=2  

上面各个配置参数的含义也非常简单,引用如下所示:

参数说明:

  • tickTime: zookeeper中使用的基本时间单位, 毫秒值.
  • dataDir: 数据目录. 可以是任意目录.
  • dataLogDir: log目录, 同样可以是任意目录. 如果没有设置该参数, 将使用和dataDir相同的设置.
  • clientPort: 监听client连接的端口号.

下面启动ZooKeeper服务器进程:

[plain] view plaincopy
 
  1. cd zookeeper-3.3.4/  
  2. bin/zkServer.sh start  

通过jps命令可以查看ZooKeeper服务器进程,名称为QuorumPeerMain。

在客户端连接ZooKeeper服务器,执行如下命令:

[plain] view plaincopy
 
  1. bin/zkCli.sh -server dynamic:2181  

上面dynamic是我的主机名,如果在本机执行,则执行如下命令即可:

[plain] view plaincopy
 
  1. bin/zkCli.sh  

客户端连接信息如下所示:

[plain] view plaincopy
 
  1. hadoop@master:~/installation/zookeeper-3.3.4$ bin/zkCli.sh -server dynamic:2181  
  2. Connecting to dynamic:2181  
  3. 2012-01-08 21:30:06,178 - INFO  [main:Environment@97] - Client environment:zookeeper.version=3.3.3-1203054, built on 11/17/2011 05:47 GMT  
  4. 2012-01-08 21:30:06,188 - INFO  [main:Environment@97] - Client environment:host.name=master  
  5. 2012-01-08 21:30:06,191 - INFO  [main:Environment@97] - Client environment:java.version=1.6.0_30  
  6. 2012-01-08 21:30:06,194 - INFO  [main:Environment@97] - Client environment:java.vendor=Sun Microsystems Inc.  
  7. 2012-01-08 21:30:06,200 - INFO  [main:Environment@97] - Client environment:java.home=/home/hadoop/installation/jdk1.6.0_30/jre  
  8. 2012-01-08 21:30:06,203 - INFO  [main:Environment@97] - Client environment:java.class.path=/home/hadoop/installation/zookeeper-3.3.4/bin/../build/classes:/home/hadoop/installation/zookeeper-3.3.4/bin/../build/lib/*.jar:/home/hadoop/installation/zookeeper-3.3.4/bin/../zookeeper-3.3.4.jar:/home/hadoop/installation/zookeeper-3.3.4/bin/../lib/log4j-1.2.15.jar:/home/hadoop/installation/zookeeper-3.3.4/bin/../lib/jline-0.9.94.jar:/home/hadoop/installation/zookeeper-3.3.4/bin/../lib/commons-lang-2.4.jar:/home/hadoop/installation/zookeeper-3.3.4/bin/../lib/commons-collections-3.2.jar:/home/hadoop/installation/zookeeper-3.3.4/bin/../lib/commons-cli-1.1.jar:/home/hadoop/installation/zookeeper-3.3.4/bin/../lib/apache-rat-tasks-0.6.jar:/home/hadoop/installation/zookeeper-3.3.4/bin/../lib/apache-rat-core-0.6.jar:/home/hadoop/installation/zookeeper-3.3.4/bin/../src/java/lib/*.jar:/home/hadoop/installation/zookeeper-3.3.4/bin/../conf:/home/hadoop/installation/jdk1.6.0_30/lib/*.jar:/home/hadoop/installation/jdk1.6.0_30/jre/lib/*.jar  
  9. 2012-01-08 21:30:06,206 - INFO  [main:Environment@97] - Client environment:java.library.path=/home/hadoop/installation/jdk1.6.0_30/jre/lib/i386/client:/home/hadoop/installation/jdk1.6.0_30/jre/lib/i386:/home/hadoop/installation/jdk1.6.0_30/jre/../lib/i386:/usr/java/packages/lib/i386:/lib:/usr/lib  
  10. 2012-01-08 21:30:06,213 - INFO  [main:Environment@97] - Client environment:java.io.tmpdir=/tmp  
  11. 2012-01-08 21:30:06,216 - INFO  [main:Environment@97] - Client environment:java.compiler=<NA>  
  12. 2012-01-08 21:30:06,235 - INFO  [main:Environment@97] - Client environment:os.name=Linux  
  13. 2012-01-08 21:30:06,244 - INFO  [main:Environment@97] - Client environment:os.arch=i386  
  14. 2012-01-08 21:30:06,246 - INFO  [main:Environment@97] - Client environment:os.version=3.0.0-14-generic  
  15. 2012-01-08 21:30:06,251 - INFO  [main:Environment@97] - Client environment:user.name=hadoop  
  16. 2012-01-08 21:30:06,254 - INFO  [main:Environment@97] - Client environment:user.home=/home/hadoop  
  17. 2012-01-08 21:30:06,255 - INFO  [main:Environment@97] - Client environment:user.dir=/home/hadoop/installation/zookeeper-3.3.4  
  18. 2012-01-08 21:30:06,264 - INFO  [main:ZooKeeper@379] - Initiating client connection, connectString=dynamic:2181 sessionTimeout=30000 watcher=org.apache.zookeeper.ZooKeeperMain$MyWatcher@bf32c  
  19. 2012-01-08 21:30:06,339 - INFO  [main-SendThread():ClientCnxn$SendThread@1061] - Opening socket connection to server dynamic/192.168.0.107:2181  
  20. Welcome to ZooKeeper!  
  21. 2012-01-08 21:30:06,397 - INFO  [main-SendThread(dynamic:2181):ClientCnxn$SendThread@950] - Socket connection established to dynamic/192.168.0.107:2181, initiating session  
  22. JLine support is enabled  
  23. 2012-01-08 21:30:06,492 - INFO  [main-SendThread(dynamic:2181):ClientCnxn$SendThread@739] - Session establishment complete on server dynamic/192.168.0.107:2181, sessionid = 0x134b9b714f9000c, negotiated timeout = 30000  
  24.   
  25. WATCHER::  
  26.   
  27. WatchedEvent state:SyncConnected type:None path:null  
  28. [zk: dynamic:2181(CONNECTED) 0]   

接着,可以使用help查看Zookeeper客户端可以使用的基本操作命令。

ZooKeeper Distributed模式

ZooKeeper分布式模式安装(ZooKeeper集群)也比较容易,这里说明一下基本要点。

首先要明确的是,ZooKeeper集群是一个独立的分布式协调服务集群,“独立”的含义就是说,如果想使用ZooKeeper实现分布式应用的协调与管理,简化协调与管理,任何分布式应用都可以使用,这就要归功于Zookeeper的数据模型(Data Model)和层次命名空间(Hierarchical Namespace)结构,详细可以参考http://zookeeper.apache.org/doc/trunk/zookeeperOver.html。在设计你的分布式应用协调服务时,首要的就是考虑如何组织层次命名空间。

下面说明分布式模式的安装配置,过程如下所示:

第一步:主机名称到IP地址映射配置

ZooKeeper集群中具有两个关键的角色:Leader和Follower。集群中所有的结点作为一个整体对分布式应用提供服务,集群中每个结点之间都互相连接,所以,在配置的ZooKeeper集群的时候,每一个结点的host到IP地址的映射都要配置上集群中其它结点的映射信息。

例如,我的ZooKeeper集群中每个结点的配置,以slave-01为例,/etc/hosts内容如下所示:

[plain] view plaincopy
 
  1. 192.168.0.179   slave-01  
  2. 192.168.0.178   slave-02  
  3. 192.168.0.177   slave-03  

ZooKeeper采用一种称为Leader election的选举算法。在整个集群运行过程中,只有一个Leader,其他的都是Follower,如果ZooKeeper集群在运行过程中Leader出了问题,系统会采用该算法重新选出一个Leader。因此,各个结点之间要能够保证互相连接,必须配置上述映射。

ZooKeeper集群启动的时候,会首先选出一个Leader,在Leader election过程中,某一个满足选举算的结点就能成为Leader。整个集群的架构可以参考http://zookeeper.apache.org/doc/trunk/zookeeperOver.html#sc_designGoals

第二步:修改ZooKeeper配置文件

在其中一台机器(slave-01)上,解压缩zookeeper-3.3.4.tar.gz,修改配置文件conf/zoo.cfg,内容如下所示:

[plain] view plaincopy
 
  1. tickTime=2000  
  2. dataDir=/home/hadoop/storage/zookeeper  
  3. clientPort=2181  
  4. initLimit=5  
  5. syncLimit=2  
  6. server.1=slave-01:2888:3888  
  7. server.2=slave-02:2888:3888  
  8. server.3=slave-03:2888:3888  

上述配置内容说明,可以参考http://zookeeper.apache.org/doc/trunk/zookeeperStarted.html#sc_RunningReplicatedZooKeeper

新增了几个参数, 其含义如下:

  • initLimit: zookeeper集群中的包含多台server, 其中一台为leader, 集群中其余的server为follower. initLimit参数配置初始化连接时, follower和leader之间的最长心跳时间. 此时该参数设置为5, 说明时间限制为5倍tickTime, 即5*2000=10000ms=10s.
  • syncLimit: 该参数配置leader和follower之间发送消息, 请求和应答的最大时间长度. 此时该参数设置为2, 说明时间限制为2倍tickTime, 即4000ms.
  • server.X=A:B:C 其中X是一个数字, 表示这是第几号server. A是该server所在的IP地址. B配置该server和集群中的leader交换消息所使用的端口. C配置选举leader时所使用的端口. 由于配置的是伪集群模式, 所以各个server的B, C参数必须不同.

第三步:远程复制分发安装文件

上面已经在一台机器slave-01上配置完成ZooKeeper,现在可以将该配置好的安装文件远程拷贝到集群中的各个结点对应的目录下:

[plain] view plaincopy
 
  1. cd /home/hadoop/installation/  
  2. scp -r zookeeper-3.3.4/ hadoop@slave-02:/home/hadoop/installation/  
  3. scp -r zookeeper-3.3.4/ hadoop@slave-03:/home/hadoop/installation/  

第四步:设置myid

在我们配置的dataDir指定的目录下面,创建一个myid文件,里面内容为一个数字,用来标识当前主机,conf/zoo.cfg文件中配置的server.X中X为什么数字,则myid文件中就输入这个数字,例如:

[plain] view plaincopy
 
  1. hadoop@slave-01:~/installation/zookeeper-3.3.4$ echo "1" > /home/hadoop/storage/zookeeper/myid  
  2. hadoop@slave-02:~/installation/zookeeper-3.3.4$ echo "2" > /home/hadoop/storage/zookeeper/myid  
  3. hadoop@slave-03:~/installation/zookeeper-3.3.4$ echo "3" > /home/hadoop/storage/zookeeper/myid  

按照上述进行配置即可。

第五步:启动ZooKeeper集群

在ZooKeeper集群的每个结点上,执行启动ZooKeeper服务的脚本,如下所示:

[plain] view plaincopy
 
  1. hadoop@slave-01:~/installation/zookeeper-3.3.4$ bin/zkServer.sh start  
  2. hadoop@slave-02:~/installation/zookeeper-3.3.4$ bin/zkServer.sh start  
  3. hadoop@slave-03:~/installation/zookeeper-3.3.4$ bin/zkServer.sh start  

以结点slave-01为例,日志如下所示:

[plain] view plaincopy
 
  1. hadoop@slave-01:~/installation/zookeeper-3.3.4$ tail -500f zookeeper.out   
  2. 2012-01-08 06:51:19,117 - INFO  [main:QuorumPeerConfig@90] - Reading configuration from: /home/hadoop/installation/zookeeper-3.3.4/bin/../conf/zoo.cfg  
  3. 2012-01-08 06:51:19,133 - INFO  [main:QuorumPeerConfig@310] - Defaulting to majority quorums  
  4. 2012-01-08 06:51:19,167 - INFO  [main:QuorumPeerMain@119] - Starting quorum peer  
  5. 2012-01-08 06:51:19,227 - INFO  [main:NIOServerCnxn$Factory@143] - binding to port 0.0.0.0/0.0.0.0:2181  
  6. 2012-01-08 06:51:19,277 - INFO  [main:QuorumPeer@819] - tickTime set to 2000  
  7. 2012-01-08 06:51:19,278 - INFO  [main:QuorumPeer@830] - minSessionTimeout set to -1  
  8. 2012-01-08 06:51:19,279 - INFO  [main:QuorumPeer@841] - maxSessionTimeout set to -1  
  9. 2012-01-08 06:51:19,281 - INFO  [main:QuorumPeer@856] - initLimit set to 5  
  10. 2012-01-08 06:51:19,347 - INFO  [Thread-1:QuorumCnxManager$Listener@473] - My election bind port: 3888  
  11. 2012-01-08 06:51:19,393 - INFO  [QuorumPeer:/0:0:0:0:0:0:0:0:2181:QuorumPeer@621] - LOOKING  
  12. 2012-01-08 06:51:19,396 - INFO  [QuorumPeer:/0:0:0:0:0:0:0:0:2181:FastLeaderElection@663] - New election. My id =  1, Proposed zxid = 0  
  13. 2012-01-08 06:51:19,400 - INFO  [WorkerReceiver Thread:FastLeaderElection@496] - Notification: 1 (n.leader), 0 (n.zxid), 1 (n.round), LOOKING (n.state), 1 (n.sid), LOOKING (my state)  
  14. 2012-01-08 06:51:19,416 - WARN  [WorkerSender Thread:QuorumCnxManager@384] - Cannot open channel to 2 at election address slave-02/192.168.0.178:3888  
  15. java.net.ConnectException: Connection refused  
  16.         at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)  
  17.         at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)  
  18.         at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:100)  
  19.         at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:371)  
  20.         at org.apache.zookeeper.server.quorum.QuorumCnxManager.toSend(QuorumCnxManager.java:340)  
  21.         at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.process(FastLeaderElection.java:360)  
  22.         at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.run(FastLeaderElection.java:333)  
  23.         at java.lang.Thread.run(Thread.java:662)  
  24. 2012-01-08 06:51:19,420 - WARN  [WorkerSender Thread:QuorumCnxManager@384] - Cannot open channel to 3 at election address slave-03/192.168.0.177:3888  
  25. java.net.ConnectException: Connection refused  
  26.         at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)  
  27.         at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)  
  28.         at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:100)  
  29.         at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:371)  
  30.         at org.apache.zookeeper.server.quorum.QuorumCnxManager.toSend(QuorumCnxManager.java:340)  
  31.         at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.process(FastLeaderElection.java:360)  
  32.         at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.run(FastLeaderElection.java:333)  
  33.         at java.lang.Thread.run(Thread.java:662)  
  34. 2012-01-08 06:51:19,612 - WARN  [QuorumPeer:/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@384] - Cannot open channel to 2 at election address slave-02/192.168.0.178:3888  
  35. java.net.ConnectException: Connection refused  
  36.         at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)  
  37.         at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)  
  38.         at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:100)  
  39.         at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:371)  
  40.         at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectAll(QuorumCnxManager.java:404)  
  41.         at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:688)  
  42.         at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:622)  
  43. 2012-01-08 06:51:19,615 - WARN  [QuorumPeer:/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@384] - Cannot open channel to 3 at election address slave-03/192.168.0.177:3888  
  44. java.net.ConnectException: Connection refused  
  45.         at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)  
  46.         at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)  
  47.         at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:100)  
  48.         at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:371)  
  49.         at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectAll(QuorumCnxManager.java:404)  
  50.         at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:688)  
  51.         at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:622)  
  52. 2012-01-08 06:51:19,616 - INFO  [QuorumPeer:/0:0:0:0:0:0:0:0:2181:FastLeaderElection@697] - Notification time out: 400  
  53. 2012-01-08 06:51:20,019 - WARN  [QuorumPeer:/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@384] - Cannot open channel to 2 at election address slave-02/192.168.0.178:3888  
  54. java.net.ConnectException: Connection refused  
  55.         at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)  
  56.         at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)  
  57.         at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:100)  
  58.         at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:371)  
  59.         at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectAll(QuorumCnxManager.java:404)  
  60.         at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:688)  
  61.         at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:622)  
  62. 2012-01-08 06:51:20,021 - WARN  [QuorumPeer:/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@384] - Cannot open channel to 3 at election address slave-03/192.168.0.177:3888  
  63. java.net.ConnectException: Connection refused  
  64.         at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)  
  65.         at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)  
  66.         at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:100)  
  67.         at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:371)  
  68.         at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectAll(QuorumCnxManager.java:404)  
  69.         at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:688)  
  70.         at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:622)  
  71. 2012-01-08 06:51:20,022 - INFO  [QuorumPeer:/0:0:0:0:0:0:0:0:2181:FastLeaderElection@697] - Notification time out: 800  
  72. 2012-01-08 06:51:20,825 - WARN  [QuorumPeer:/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@384] - Cannot open channel to 2 at election address slave-02/192.168.0.178:3888  
  73. java.net.ConnectException: Connection refused  
  74.         at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)  
  75.         at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)  
  76.         at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:100)  
  77.         at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:371)  
  78.         at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectAll(QuorumCnxManager.java:404)  
  79.         at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:688)  
  80.         at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:622)  
  81. 2012-01-08 06:51:20,827 - WARN  [QuorumPeer:/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@384] - Cannot open channel to 3 at election address slave-03/192.168.0.177:3888  
  82. java.net.ConnectException: Connection refused  
  83.         at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)  
  84.         at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)  
  85.         at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:100)  
  86.         at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:371)  
  87.         at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectAll(QuorumCnxManager.java:404)  
  88.         at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:688)  
  89.         at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:622)  
  90. 2012-01-08 06:51:20,828 - INFO  [QuorumPeer:/0:0:0:0:0:0:0:0:2181:FastLeaderElection@697] - Notification time out: 1600  
  91. 2012-01-08 06:51:22,435 - WARN  [QuorumPeer:/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@384] - Cannot open channel to 2 at election address slave-02/192.168.0.178:3888  
  92. java.net.ConnectException: Connection refused  
  93.         at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)  
  94.         at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)  
  95.         at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:100)  
  96.         at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:371)  
  97.         at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectAll(QuorumCnxManager.java:404)  
  98.         at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:688)  
  99.         at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:622)  
  100. 2012-01-08 06:51:22,439 - WARN  [QuorumPeer:/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@384] - Cannot open channel to 3 at election address slave-03/192.168.0.177:3888  
  101. java.net.ConnectException: Connection refused  
  102.         at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)  
  103.         at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)  
  104.         at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:100)  
  105.         at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:371)  
  106.         at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectAll(QuorumCnxManager.java:404)  
  107.         at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:688)  
  108.         at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:622)  
  109. 2012-01-08 06:51:22,441 - INFO  [QuorumPeer:/0:0:0:0:0:0:0:0:2181:FastLeaderElection@697] - Notification time out: 3200  
  110. 2012-01-08 06:51:22,945 - INFO  [WorkerReceiver Thread:FastLeaderElection@496] - Notification: 2 (n.leader), 0 (n.zxid), 1 (n.round), LOOKING (n.state), 2 (n.sid), LOOKING (my state)  
  111. 2012-01-08 06:51:22,946 - INFO  [QuorumPeer:/0:0:0:0:0:0:0:0:2181:FastLeaderElection@721] - Updating proposal  
  112. 2012-01-08 06:51:22,949 - WARN  [WorkerSender Thread:QuorumCnxManager@384] - Cannot open channel to 3 at election address slave-03/192.168.0.177:3888  
  113. java.net.ConnectException: Connection refused  
  114.         at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)  
  115.         at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)  
  116.         at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:100)  
  117.         at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:371)  
  118.         at org.apache.zookeeper.server.quorum.QuorumCnxManager.toSend(QuorumCnxManager.java:340)  
  119.         at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.process(FastLeaderElection.java:360)  
  120.         at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.run(FastLeaderElection.java:333)  
  121.         at java.lang.Thread.run(Thread.java:662)  
  122. 2012-01-08 06:51:22,951 - INFO  [WorkerReceiver Thread:FastLeaderElection@496] - Notification: 2 (n.leader), 0 (n.zxid), 1 (n.round), LOOKING (n.state), 1 (n.sid), LOOKING (my state)  
  123. 2012-01-08 06:51:23,156 - INFO  [QuorumPeer:/0:0:0:0:0:0:0:0:2181:QuorumPeer@643] - FOLLOWING  
  124. 2012-01-08 06:51:23,170 - INFO  [QuorumPeer:/0:0:0:0:0:0:0:0:2181:Learner@80] - TCP NoDelay set to: true  
  125. 2012-01-08 06:51:23,206 - INFO  [QuorumPeer:/0:0:0:0:0:0:0:0:2181:Environment@97] - Server environment:zookeeper.version=3.3.3-1203054, built on 11/17/2011 05:47 GMT  
  126. 2012-01-08 06:51:23,207 - INFO  [QuorumPeer:/0:0:0:0:0:0:0:0:2181:Environment@97] - Server environment:host.name=slave-01  
  127. 2012-01-08 06:51:23,207 - INFO  [QuorumPeer:/0:0:0:0:0:0:0:0:2181:Environment@97] - Server environment:java.version=1.6.0_30  
  128. 2012-01-08 06:51:23,208 - INFO  [QuorumPeer:/0:0:0:0:0:0:0:0:2181:Environment@97] - Server environment:java.vendor=Sun Microsystems Inc.  
  129. 2012-01-08 06:51:23,208 - INFO  [QuorumPeer:/0:0:0:0:0:0:0:0:2181:Environment@97] - Server environment:java.home=/home/hadoop/installation/jdk1.6.0_30/jre  
  130. 2012-01-08 06:51:23,209 - INFO  [QuorumPeer:/0:0:0:0:0:0:0:0:2181:Environment@97] - Server environment:java.class.path=/home/hadoop/installation/zookeeper-3.3.4/bin/../build/classes:/home/hadoop/installation/zookeeper-3.3.4/bin/../build/lib/*.jar:/home/hadoop/installation/zookeeper-3.3.4/bin/../zookeeper-3.3.4.jar:/home/hadoop/installation/zookeeper-3.3.4/bin/../lib/log4j-1.2.15.jar:/home/hadoop/installation/zookeeper-3.3.4/bin/../lib/jline-0.9.94.jar:/home/hadoop/installation/zookeeper-3.3.4/bin/../lib/commons-lang-2.4.jar:/home/hadoop/installation/zookeeper-3.3.4/bin/../lib/commons-collections-3.2.jar:/home/hadoop/installation/zookeeper-3.3.4/bin/../lib/commons-cli-1.1.jar:/home/hadoop/installation/zookeeper-3.3.4/bin/../lib/apache-rat-tasks-0.6.jar:/home/hadoop/installation/zookeeper-3.3.4/bin/../lib/apache-rat-core-0.6.jar:/home/hadoop/installation/zookeeper-3.3.4/bin/../src/java/lib/*.jar:/home/hadoop/installation/zookeeper-3.3.4/bin/../conf:/home/hadoop/installation/jdk1.6.0_30/lib/*.jar:/home/hadoop/installation/jdk1.6.0_30/jre/lib/*.jar  
  131. 2012-01-08 06:51:23,210 - INFO  [QuorumPeer:/0:0:0:0:0:0:0:0:2181:Environment@97] - Server environment:java.library.path=/home/hadoop/installation/jdk1.6.0_30/jre/lib/i386/client:/home/hadoop/installation/jdk1.6.0_30/jre/lib/i386:/home/hadoop/installation/jdk1.6.0_30/jre/../lib/i386:/usr/java/packages/lib/i386:/lib:/usr/lib  
  132. 2012-01-08 06:51:23,210 - INFO  [QuorumPeer:/0:0:0:0:0:0:0:0:2181:Environment@97] - Server environment:java.io.tmpdir=/tmp  
  133. 2012-01-08 06:51:23,212 - INFO  [QuorumPeer:/0:0:0:0:0:0:0:0:2181:Environment@97] - Server environment:java.compiler=<NA>  
  134. 2012-01-08 06:51:23,212 - INFO  [QuorumPeer:/0:0:0:0:0:0:0:0:2181:Environment@97] - Server environment:os.name=Linux  
  135. 2012-01-08 06:51:23,212 - INFO  [QuorumPeer:/0:0:0:0:0:0:0:0:2181:Environment@97] - Server environment:os.arch=i386  
  136. 2012-01-08 06:51:23,213 - INFO  [QuorumPeer:/0:0:0:0:0:0:0:0:2181:Environment@97] - Server environment:os.version=3.0.0-14-generic  
  137. 2012-01-08 06:51:23,213 - INFO  [QuorumPeer:/0:0:0:0:0:0:0:0:2181:Environment@97] - Server environment:user.name=hadoop  
  138. 2012-01-08 06:51:23,214 - INFO  [QuorumPeer:/0:0:0:0:0:0:0:0:2181:Environment@97] - Server environment:user.home=/home/hadoop  
  139. 2012-01-08 06:51:23,214 - INFO  [QuorumPeer:/0:0:0:0:0:0:0:0:2181:Environment@97] - Server environment:user.dir=/home/hadoop/installation/zookeeper-3.3.4  
  140. 2012-01-08 06:51:23,223 - INFO  [QuorumPeer:/0:0:0:0:0:0:0:0:2181:ZooKeeperServer@151] - Created server with tickTime 2000 minSessionTimeout 4000 maxSessionTimeout 40000 datadir /home/hadoop/storage/zookeeper/version-2 snapdir /home/hadoop/storage/zookeeper/version-2  
  141. 2012-01-08 06:51:23,339 - INFO  [QuorumPeer:/0:0:0:0:0:0:0:0:2181:Learner@294] - Getting a snapshot from leader  
  142. 2012-01-08 06:51:23,358 - INFO  [QuorumPeer:/0:0:0:0:0:0:0:0:2181:Learner@325] - Setting leader epoch 1  
  143. 2012-01-08 06:51:23,358 - INFO  [QuorumPeer:/0:0:0:0:0:0:0:0:2181:FileTxnSnapLog@254] - Snapshotting: 0  
  144. 2012-01-08 06:51:25,511 - INFO  [WorkerReceiver Thread:FastLeaderElection@496] - Notification: 3 (n.leader), 0 (n.zxid), 1 (n.round), LOOKING (n.state), 3 (n.sid), FOLLOWING (my state)  
  145. 2012-01-08 06:51:42,584 - INFO  [WorkerReceiver Thread:FastLeaderElection@496] - Notification: 3 (n.leader), 0 (n.zxid), 2 (n.round), LOOKING (n.state), 3 (n.sid), FOLLOWING (my state)  

我启动的顺序是slave-01>slave-02>slave-03,由于ZooKeeper集群启动的时候,每个结点都试图去连接集群中的其它结点,先启动的肯定连不上后面还没启动的,所以上面日志前面部分的异常是可以忽略的。通过后面部分可以看到,集群在选出一个Leader后,最后稳定了。

其他结点可能也出现类似问题,属于正常。

第六步:安装验证

可以通过ZooKeeper的脚本来查看启动状态,包括集群中各个结点的角色(或是Leader,或是Follower),如下所示,是在ZooKeeper集群中的每个结点上查询的结果:

[plain] view plaincopy
 
  1. hadoop@slave-01:~/installation/zookeeper-3.3.4$ bin/zkServer.sh status  
  2. JMX enabled by default  
  3. Using config: /home/hadoop/installation/zookeeper-3.3.4/bin/../conf/zoo.cfg  
  4. Mode: follower  
  5.   
  6. hadoop@slave-02:~/installation/zookeeper-3.3.4$  bin/zkServer.sh status  
  7. JMX enabled by default  
  8. Using config: /home/hadoop/installation/zookeeper-3.3.4/bin/../conf/zoo.cfg  
  9. Mode: leader  
  10.   
  11. hadoop@slave-03:~/installation/zookeeper-3.3.4$  bin/zkServer.sh status  
  12. JMX enabled by default  
  13. Using config: /home/hadoop/installation/zookeeper-3.3.4/bin/../conf/zoo.cfg  
  14. Mode: follower  

通过上面状态查询结果可见,slave-02是集群的Leader,其余的两个结点是Follower。

另外,可以通过客户端脚本,连接到ZooKeeper集群上。对于客户端来说,ZooKeeper是一个整体(ensemble),连接到ZooKeeper集群实际上感觉在独享整个集群的服务,所以,你可以在任何一个结点上建立到服务集群的连接,例如:

[plain] view plaincopy
 
  1. hadoop@slave-03:~/installation/zookeeper-3.3.4$ bin/zkCli.sh -server slave-01:2181  
  2. Connecting to slave-01:2181  
  3. 2012-01-08 07:14:21,068 - INFO  [main:Environment@97] - Client environment:zookeeper.version=3.3.3-1203054, built on 11/17/2011 05:47 GMT  
  4. 2012-01-08 07:14:21,080 - INFO  [main:Environment@97] - Client environment:host.name=slave-03  
  5. 2012-01-08 07:14:21,085 - INFO  [main:Environment@97] - Client environment:java.version=1.6.0_30  
  6. 2012-01-08 07:14:21,089 - INFO  [main:Environment@97] - Client environment:java.vendor=Sun Microsystems Inc.  
  7. 2012-01-08 07:14:21,095 - INFO  [main:Environment@97] - Client environment:java.home=/home/hadoop/installation/jdk1.6.0_30/jre  
  8. 2012-01-08 07:14:21,104 - INFO  [main:Environment@97] - Client environment:java.class.path=/home/hadoop/installation/zookeeper-3.3.4/bin/../build/classes:/home/hadoop/installation/zookeeper-3.3.4/bin/../build/lib/*.jar:/home/hadoop/installation/zookeeper-3.3.4/bin/../zookeeper-3.3.4.jar:/home/hadoop/installation/zookeeper-3.3.4/bin/../lib/log4j-1.2.15.jar:/home/hadoop/installation/zookeeper-3.3.4/bin/../lib/jline-0.9.94.jar:/home/hadoop/installation/zookeeper-3.3.4/bin/../lib/commons-lang-2.4.jar:/home/hadoop/installation/zookeeper-3.3.4/bin/../lib/commons-collections-3.2.jar:/home/hadoop/installation/zookeeper-3.3.4/bin/../lib/commons-cli-1.1.jar:/home/hadoop/installation/zookeeper-3.3.4/bin/../lib/apache-rat-tasks-0.6.jar:/home/hadoop/installation/zookeeper-3.3.4/bin/../lib/apache-rat-core-0.6.jar:/home/hadoop/installation/zookeeper-3.3.4/bin/../src/java/lib/*.jar:/home/hadoop/installation/zookeeper-3.3.4/bin/../conf:/home/hadoop/installation/jdk1.6.0_30/lib/*.jar:/home/hadoop/installation/jdk1.6.0_30/jre/lib/*.jar  
  9. 2012-01-08 07:14:21,111 - INFO  [main:Environment@97] - Client environment:java.library.path=/home/hadoop/installation/jdk1.6.0_30/jre/lib/i386/client:/home/hadoop/installation/jdk1.6.0_30/jre/lib/i386:/home/hadoop/installation/jdk1.6.0_30/jre/../lib/i386:/usr/java/packages/lib/i386:/lib:/usr/lib  
  10. 2012-01-08 07:14:21,116 - INFO  [main:Environment@97] - Client environment:java.io.tmpdir=/tmp  
  11. 2012-01-08 07:14:21,124 - INFO  [main:Environment@97] - Client environment:java.compiler=<NA>  
  12. 2012-01-08 07:14:21,169 - INFO  [main:Environment@97] - Client environment:os.name=Linux  
  13. 2012-01-08 07:14:21,175 - INFO  [main:Environment@97] - Client environment:os.arch=i386  
  14. 2012-01-08 07:14:21,177 - INFO  [main:Environment@97] - Client environment:os.version=3.0.0-14-generic  
  15. 2012-01-08 07:14:21,185 - INFO  [main:Environment@97] - Client environment:user.name=hadoop  
  16. 2012-01-08 07:14:21,188 - INFO  [main:Environment@97] - Client environment:user.home=/home/hadoop  
  17. 2012-01-08 07:14:21,190 - INFO  [main:Environment@97] - Client environment:user.dir=/home/hadoop/installation/zookeeper-3.3.4  
  18. 2012-01-08 07:14:21,197 - INFO  [main:ZooKeeper@379] - Initiating client connection, connectString=slave-01:2181 sessionTimeout=30000 watcher=org.apache.zookeeper.ZooKeeperMain$MyWatcher@bf32c  
  19. 2012-01-08 07:14:21,305 - INFO  [main-SendThread():ClientCnxn$SendThread@1061] - Opening socket connection to server slave-01/192.168.0.179:2181  
  20. Welcome to ZooKeeper!  
  21. 2012-01-08 07:14:21,376 - INFO  [main-SendThread(slave-01:2181):ClientCnxn$SendThread@950] - Socket connection established to slave-01/192.168.0.179:2181, initiating session  
  22. JLine support is enabled  
  23. [zk: slave-01:2181(CONNECTING) 0] 2012-01-08 07:14:21,872 - INFO  [main-SendThread(slave-01:2181):ClientCnxn$SendThread@739] - Session establishment complete on server slave-01/192.168.0.179:2181, sessionid = 0x134bdcd6b730000, negotiated timeout = 30000  
  24.   
  25. WATCHER::  
  26.   
  27. WatchedEvent state:SyncConnected type:None path:null  
  28.   
  29. [zk: slave-01:2181(CONNECTED) 0] ls /  
  30. [zookeeper]  

当前根路径为/zookeeper。

总结说明

主机名与IP地址映射配置问题

启动ZooKeeper集群时,如果ZooKeeper集群中slave-01结点的日志出现如下错误:

[plain] view plaincopy
 
  1. java.net.SocketTimeoutException  
  2.         at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:109)  
  3.         at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:371)  
  4.         at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectAll(QuorumCnxManager.java:404)  
  5.         at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:688)  
  6.         at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:622)  
  7. 2012-01-08 06:37:46,026 - INFO  [QuorumPeer:/0:0:0:0:0:0:0:0:2181:FastLeaderElection@697] - Notification time out: 6400  
  8. 2012-01-08 06:37:57,431 - WARN  [QuorumPeer:/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@384] - Cannot open channel to 2 at election address slave-02/202.106.199.35:3888  
  9. java.net.SocketTimeoutException  
  10.         at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:109)  
  11.         at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:371)  
  12.         at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectAll(QuorumCnxManager.java:404)  
  13.         at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:688)  
  14.         at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:622)  
  15. 2012-01-08 06:38:02,442 - WARN  [QuorumPeer:/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@384] - Cannot open channel to 3 at election address slave-03/202.106.199.35:3888  

很显然,slave-01在启动时连接集群中其他结点(slave-02、slave-03)时,主机名映射的IP与我们实际配置的不一致,所以集群中各个结点之间无法建立链路,整个ZooKeeper集群启动是失败的。

上面错误日志中slave-02/202.106.199.35:3888实际应该是slave-02/202.192.168.0.178:3888就对了,但是在进行域名解析的时候映射有问题,修改每个结点的/etc/hosts文件,将ZooKeeper集群中所有结点主机名到IP地址的映射配置上。

参考链接

下面是我整理搜集的有关ZooKeeper相关内容的网址,可以学习参考。

中文链接:

英文链接:
原文地址:https://www.cnblogs.com/breg/p/4755015.html