Understanding ZooKeeper in Depth (Part 2): Simulating Application Scenarios

1. Implementing a distributed lock

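We can implement this with ephemeral nodes. Multiple processes all try to create the ephemeral node /lock, but only one process, P, succeeds. The processes that fail can set a watch on /lock, which amounts to waiting for the lock to be released. Once P finishes its work and disconnects, /lock is deleted automatically; the other processes are notified and try to create /lock again to compete for the lock.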

Steps:

Open a client and create the ephemeral node /lock:

[zk: 127.0.0.1:2182,127.0.0.1:2183,127.0.0.1:2184(CONNECTED) 0] create -e /lock "lock" 

Created /lock

Open a second client. Creating the ephemeral node /lock fails, which shows that /lock already exists:

[zk: 127.0.0.1:2182,127.0.0.1:2183,127.0.0.1:2184(CONNECTED) 2] create -e /lock "lock"

Node already exists: /lock

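Close the first client, wait a few seconds, then list the znode tree from the second client. The /lock node is gone. (A cleanly closed session removes its ephemeral nodes right away; after a crash they disappear only once the session times out.)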

[zk: 127.0.0.1:2182,127.0.0.1:2183,127.0.0.1:2184(CONNECTED) 9] ls /                  

[zookeeper]

The second client can now create /lock:

[zk: 127.0.0.1:2182,127.0.0.1:2183,127.0.0.1:2184(CONNECTED) 10] create -e /lock "lock"

Created /lock

These commands simulate, in a simple way, several agents sharing a distributed lock.
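
The same contention can be sketched in Python. This is only an in-memory stand-in, not a real ZooKeeper client: `ToyZk`, `try_lock`, and the session names "A" and "B" are hypothetical and exist purely to illustrate the create / watch / re-create cycle described above.

```python
class ToyZk:
    """In-memory stand-in for a ZooKeeper ensemble (not the real client API)."""

    def __init__(self):
        self.nodes = {}     # path -> (data, owning session)
        self.watches = {}   # path -> callbacks waiting for the node's deletion

    def create_ephemeral(self, session, path, data):
        if path in self.nodes:
            raise FileExistsError(f"Node already exists: {path}")
        self.nodes[path] = (data, session)

    def watch_delete(self, path, callback):
        self.watches.setdefault(path, []).append(callback)

    def close_session(self, session):
        # Ephemeral nodes disappear together with the session that created them.
        owned = [p for p, (_, owner) in self.nodes.items() if owner == session]
        for path in owned:
            del self.nodes[path]
            # Each registered watch fires once and is then discarded.
            for callback in self.watches.pop(path, []):
                callback(path)


def try_lock(zk, session):
    """Return True if this session managed to create /lock."""
    try:
        zk.create_ephemeral(session, "/lock", "lock")
        return True
    except FileExistsError:
        return False


zk = ToyZk()
events = []

got_a = try_lock(zk, "A")   # first client wins the lock
got_b = try_lock(zk, "B")   # second client is refused

# B waits for the lock by watching /lock, and retries when it is deleted.
zk.watch_delete("/lock", lambda path: events.append(try_lock(zk, "B")))

zk.close_session("A")       # A disconnects, /lock is removed, B is notified
holder = zk.nodes["/lock"][1]
```

With many waiting clients, every watcher would be notified at once and all of them would race to re-create /lock; the sketch keeps to two clients, as in the transcript.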

2. Implementing Master-Worker

The first session creates an ephemeral node named /master:

[zk: 127.0.0.1:2182,127.0.0.1:2183,127.0.0.1:2184(CONNECTED) 3] create -e /master "master1.example.com:2223"

Created /master

[zk: 127.0.0.1:2182,127.0.0.1:2183,127.0.0.1:2184(CONNECTED) 4] ls /

[zookeeper, master]

[zk: 127.0.0.1:2182,127.0.0.1:2183,127.0.0.1:2184(CONNECTED) 5] get /master

"master1.example.com:2223"

cZxid = 0x20000000f

ctime = Wed Mar 16 11:28:18 CST 2016

mZxid = 0x20000000f

mtime = Wed Mar 16 11:28:18 CST 2016

pZxid = 0x20000000f

cversion = 0

dataVersion = 0

aclVersion = 0

ephemeralOwner = 0x3537d63b3340001

dataLength = 26

numChildren = 0

Now suppose another process, acting as the backup master, tries to create /master and is told that the node already exists:

[zk: 127.0.0.1:2182,127.0.0.1:2183,127.0.0.1:2184(CONNECTED) 12] create -e /master "master2.example.com:2223"

Node already exists: /master

The primary master could crash at any moment, and when it does the backup master should take over immediately, so the backup needs to watch /master:

[zk: 127.0.0.1:2182,127.0.0.1:2183,127.0.0.1:2184(CONNECTED) 13] stat /master true

cZxid = 0x20000000f

ctime = Wed Mar 16 11:28:18 CST 2016

mZxid = 0x20000000f

mtime = Wed Mar 16 11:28:18 CST 2016

pZxid = 0x20000000f

cversion = 0

dataVersion = 0

aclVersion = 0

ephemeralOwner = 0x3537d63b3340001

dataLength = 26

numChildren = 0

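The stat command returns a node's attributes and can also watch whether the node exists; the argument true sets the watch. Now suppose the primary master (the first session) crashes and disconnects. The second session is notified that /master was deleted and immediately takes over as primary master.

Quit the client on the primary master: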

[zk: 127.0.0.1:2182,127.0.0.1:2183,127.0.0.1:2184(CONNECTED) 7] quit

Quitting...

2016-03-16 11:31:48,954 [myid:] - INFO  [main:ZooKeeper@684] - Session: 0x3537d63b3340001 closed

2016-03-16 11:31:48,955 [myid:] - INFO  [main-EventThread:ClientCnxn$EventThread@512] - EventThread shut down

[root@zookeeper1 ~]# 

Meanwhile, the backup session receives the watch event:

[zk: 127.0.0.1:2182,127.0.0.1:2183,127.0.0.1:2184(CONNECTED) 14] 

WATCHER::

WatchedEvent state:SyncConnected type:NodeDeleted path:/master

Listing the znode tree shows that /master is gone, so the backup creates it and becomes the primary master:

[zk: 127.0.0.1:2182,127.0.0.1:2183,127.0.0.1:2184(CONNECTED) 14] ls /

[zookeeper]

[zk: 127.0.0.1:2182,127.0.0.1:2183,127.0.0.1:2184(CONNECTED) 15] create -e /master "master2.example.com:2223"

Created /master

[zk: 127.0.0.1:2182,127.0.0.1:2183,127.0.0.1:2184(CONNECTED) 16] ls /

[zookeeper, master]
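
The failover above can be sketched as follows. This is a toy model under stated assumptions, not ZooKeeper client code: `ToyEnsemble`, `run_for_master`, and the session ids "s1" and "s2" are hypothetical, and only the single /master node is modelled.

```python
class ToyEnsemble:
    """Models only the /master znode (not the real client API)."""

    def __init__(self):
        self.master = None          # (address, session) stored at /master
        self.delete_watchers = []   # callbacks registered by standby masters

    def create_master(self, session, address):
        if self.master is not None:
            return False            # "Node already exists: /master"
        self.master = (address, session)
        return True

    def session_closed(self, session):
        # The ephemeral /master node goes away with its owning session.
        if self.master is not None and self.master[1] == session:
            self.master = None
            watchers, self.delete_watchers = self.delete_watchers, []
            for notify in watchers:
                notify()            # NodeDeleted event for /master


def run_for_master(zk, session, address, log):
    """Try to become primary; otherwise watch /master and retry on deletion."""
    if zk.create_master(session, address):
        log.append(f"{address} is primary")
    else:
        log.append(f"{address} is standby")
        zk.delete_watchers.append(
            lambda: run_for_master(zk, session, address, log)
        )


zk = ToyEnsemble()
log = []
run_for_master(zk, "s1", "master1.example.com:2223", log)
run_for_master(zk, "s2", "master2.example.com:2223", log)
zk.session_closed("s1")     # the primary crashes; the standby takes over
```

In a real client the node could vanish between the failed create and the watch registration, so code usually re-checks the node's existence when setting the watch and retries the create if it is already gone.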

3. Workers, Tasks, and Assignments

First create the nodes that hold workers, tasks, and assignments: /workers, /tasks, and /assign. (Note that these are persistent nodes.)

[zk: 127.0.0.1:2182,127.0.0.1:2183,127.0.0.1:2184(CONNECTED) 0] create /workers ""

Created /workers

[zk: 127.0.0.1:2182,127.0.0.1:2183,127.0.0.1:2184(CONNECTED) 1] create /tasks ""  

Created /tasks

[zk: 127.0.0.1:2182,127.0.0.1:2183,127.0.0.1:2184(CONNECTED) 2] create /assign ""

Created /assign

[zk: 127.0.0.1:2182,127.0.0.1:2183,127.0.0.1:2184(CONNECTED) 3] ls /

[zookeeper, workers, tasks, assign]

[zk: 127.0.0.1:2182,127.0.0.1:2183,127.0.0.1:2184(CONNECTED) 4] 

The master now needs to watch /workers and /tasks so that it can assign tasks to workers:

[zk: 127.0.0.1:2182,127.0.0.1:2183,127.0.0.1:2184(CONNECTED) 4] ls /workers true

[]

[zk: 127.0.0.1:2182,127.0.0.1:2183,127.0.0.1:2184(CONNECTED) 5] ls /tasks true 

[]

In the worker role

Open another session and suppose a worker becomes available:

[zk: 127.0.0.1:2182,127.0.0.1:2183,127.0.0.1:2184(CONNECTED) 18] create -e /workers/worker1.example.com "worker1.example.com:2224"

Created /workers/worker1.example.com

The master is then notified that the children of /workers changed:

WATCHER::

WatchedEvent state:SyncConnected type:NodeChildrenChanged path:/workers

To receive assigned tasks, the worker creates the node /assign/worker1.example.com and watches its children:

[zk: 127.0.0.1:2182,127.0.0.1:2183,127.0.0.1:2184(CONNECTED) 20] create /assign/worker1.example.com ""

Created /assign/worker1.example.com

[zk: 127.0.0.1:2182,127.0.0.1:2183,127.0.0.1:2184(CONNECTED) 21] ls /assign/worker1.example.com true

[]

In the client role

Now suppose a client submits a task. It must also watch the task node, because it needs to know whether the task has been executed and whether it succeeded:

[zk: 127.0.0.1:2182,127.0.0.1:2183,127.0.0.1:2184(CONNECTED) 0] create -s /tasks/task- "cmd"

Created /tasks/task-0000000000

[zk: 127.0.0.1:2182,127.0.0.1:2183,127.0.0.1:2184(CONNECTED) 1] ls /tasks/task-0000000000 true

[]

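Here we created a persistent sequential node, so an incrementing integer, 0000000000, was appended to its name. At this point the master is notified that a new task has been submitted, which it will assign to worker1: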

WATCHER::

WatchedEvent state:SyncConnected type:NodeChildrenChanged path:/tasks
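
The suffix on a sequential node comes from a counter kept by the parent znode, zero-padded to ten digits. A small sketch of how such a name is formed; `sequential_name` is a hypothetical helper rather than a ZooKeeper API, and the counter values are chosen only for illustration.

```python
def sequential_name(prefix, counter):
    """Append a ten-digit, zero-padded counter to the requested path prefix."""
    return f"{prefix}{counter:010d}"


# Three consecutive submissions under /tasks with the prefix "task-".
names = [sequential_name("/tasks/task-", n) for n in range(3)]
```

Because the padding is fixed-width, sorting the names as strings also sorts them in creation order.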

The master then checks the new task and the available workers, and assigns the task to a worker:

[zk: 127.0.0.1:2182,127.0.0.1:2183,127.0.0.1:2184(CONNECTED) 7] ls /tasks

[task-0000000000]

[zk: 127.0.0.1:2182,127.0.0.1:2183,127.0.0.1:2184(CONNECTED) 8] ls /workers

[worker1.example.com]

[zk: 127.0.0.1:2182,127.0.0.1:2183,127.0.0.1:2184(CONNECTED) 9] create /assign/worker1.example.com/task-0000000000 ""

Created /assign/worker1.example.com/task-0000000000

The worker is notified of the task assigned to it and checks:

WATCHER::

WatchedEvent state:SyncConnected type:NodeChildrenChanged path:/assign/worker1.example.com

[zk: 127.0.0.1:2182,127.0.0.1:2183,127.0.0.1:2184(CONNECTED) 24] ls /assign/worker1.example.com

[task-0000000000]

Once the worker finishes the task, it adds a status node under the corresponding task:

[zk: 127.0.0.1:2182,127.0.0.1:2183,127.0.0.1:2184(CONNECTED) 25] create /tasks/task-0000000000/status "done"

Created /tasks/task-0000000000/status

The client is then notified and checks the result of the task:

WATCHER::

WatchedEvent state:SyncConnected type:NodeChildrenChanged path:/tasks/task-0000000000

[zk: 127.0.0.1:2182,127.0.0.1:2183,127.0.0.1:2184(CONNECTED) 2] get /tasks/task-0000000000

"cmd"

cZxid = 0x20000001d

ctime = Wed Mar 16 13:28:05 CST 2016

mZxid = 0x20000001d

mtime = Wed Mar 16 13:28:05 CST 2016

pZxid = 0x20000001f

cversion = 1

dataVersion = 0

aclVersion = 0

ephemeralOwner = 0x0

dataLength = 5

numChildren = 1

[zk: 127.0.0.1:2182,127.0.0.1:2183,127.0.0.1:2184(CONNECTED) 3] get /tasks/task-0000000000/status

"done"

cZxid = 0x20000001f

ctime = Wed Mar 16 13:35:33 CST 2016

mZxid = 0x20000001f

mtime = Wed Mar 16 13:35:33 CST 2016

pZxid = 0x20000001f

cversion = 0

dataVersion = 0

aclVersion = 0

ephemeralOwner = 0x0

dataLength = 6

numChildren = 0

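The client now knows the outcome of the task; here the result is "done", meaning the task was executed successfully.

That is the main working mechanism of the Master-Worker architecture. Although this is only a simulation, it helps a great deal in understanding how Master-Worker works, and it lays good groundwork for studying a code implementation later.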
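
The whole exchange between master, worker, and client can be sketched in one place. This is a toy in-memory tree with one-shot child watches, not the real ZooKeeper API: `ToyTree`, its methods, and the three callback names are hypothetical, the master's watch on /workers is omitted for brevity, and all nodes are treated as persistent.

```python
from collections import deque


class ToyTree:
    """Toy znode tree with one-shot child watches (not the real client API)."""

    def __init__(self):
        self.data = {"/": ""}       # path -> node data
        self.child_watches = {}     # parent path -> callbacks awaiting a change
        self.pending = deque()      # watch events queued for delivery

    def create(self, path, data=""):
        parent = path.rsplit("/", 1)[0] or "/"
        if parent not in self.data:
            raise KeyError(f"Node does not exist: {parent}")
        if path in self.data:
            raise FileExistsError(f"Node already exists: {path}")
        self.data[path] = data
        # Each watch fires once, like a NodeChildrenChanged event.
        for callback in self.child_watches.pop(parent, []):
            self.pending.append((callback, parent))

    def children(self, path, watch=None):
        if watch is not None:
            self.child_watches.setdefault(path, []).append(watch)
        prefix = path.rstrip("/") + "/"
        names = [p[len(prefix):] for p in self.data if p.startswith(prefix)]
        return sorted(n for n in names if n and "/" not in n)

    def deliver(self):
        """Run queued watch callbacks until no events remain."""
        while self.pending:
            callback, path = self.pending.popleft()
            callback(path)


zk = ToyTree()
for path in ("/workers", "/tasks", "/assign"):
    zk.create(path)
events = []


def master_on_tasks(path):
    events.append("master: new task")
    worker = zk.children("/workers")[0]          # pick an available worker
    for task in zk.children("/tasks"):
        zk.create(f"/assign/{worker}/{task}")    # assign the task to it


def worker_on_assign(path):
    events.append("worker: task assigned")
    for task in zk.children(path):
        zk.create(f"/tasks/{task}/status", "done")   # report completion


def client_on_task(path):
    events.append("client: status is " + zk.data[path + "/status"])


zk.children("/tasks", watch=master_on_tasks)                      # master
zk.create("/workers/worker1.example.com", "worker1.example.com:2224")
zk.create("/assign/worker1.example.com")                          # worker
zk.children("/assign/worker1.example.com", watch=worker_on_assign)
zk.create("/tasks/task-0000000000", "cmd")                        # client
zk.children("/tasks/task-0000000000", watch=client_on_task)
zk.deliver()
```

After `zk.deliver()` returns, `events` holds one entry per role in the order master, worker, client. Because each watch fires only once, a long-running master or worker would have to register its watch again after every notification.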

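4. The sections above covered three main patterns for using ZooKeeper. A concise summary:

1) Distributed lock

Processes compete to create the ephemeral node /lock. Whoever creates it first holds the lock, whoever finishes with the lock releases it, and if the holder crashes the lock is released automatically.

2) Master-Worker

This is a commonly used primary/backup scheme. The primary node starts and takes the ephemeral node /master. The backup node starts and cannot take /master, but it watches /master. When the primary crashes, the backup is notified and immediately takes /master.

3) Workers, tasks, and assignments

Nodes exist for workers, tasks, and assignments: /workers, /tasks, and /assign.

The master watches /workers and /tasks.

When a worker is created, the master is notified.

The worker creates its node under /assign and watches it.

The client creates a task node and watches it.

The master sees the submitted task, determines an available worker, and assigns the task to that worker.

The worker sees the assignment and, once the task is processed, adds a status node.

The client sees that the task has been processed and that it completed successfully.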

Original article: https://www.cnblogs.com/run4life/p/5327231.html