postgresql 高可用 etcd + patroni 之三 switchover

os: centos 7.4
postgresql: 9.6.9
etcd: 3.2.18
patroni: 1.4.4

patroni + etcd 是在一个postgrsql 开源大会上 亚信的一个哥们讲解的高可用方案。
依然是基于 postgreql stream replication。

ip规划
192.168.56.101 node1 master
192.168.56.102 node2 slave
192.168.56.103 node3 slave

手动 switchover

切换前的状态

$ patronictl -c /usr/patroni/conf/patroni_postgresql.yml list pg96
+---------+----------+----------------+--------+---------+-----------+
| Cluster |  Member  |      Host      |  Role  |  State  | Lag in MB |
+---------+----------+----------------+--------+---------+-----------+
|   pg96  | pg96_101 | 192.168.56.101 |        | running |       0.0 |
|   pg96  | pg96_102 | 192.168.56.102 |        | running |       0.0 |
|   pg96  | pg96_103 | 192.168.56.103 | Leader | running |       0.0 |
+---------+----------+----------------+--------+---------+-----------+

执行手动切换

$ patronictl -c /usr/patroni/conf/patroni_postgresql.yml switchover
Master [pg96_103]: pg96_103
Candidate ['pg96_101', 'pg96_102'] []: pg96_101
When should the switchover take place (e.g. 2015-10-01T14:30)  [now]: now
Current cluster topology
+---------+----------+----------------+--------+---------+-----------+
| Cluster |  Member  |      Host      |  Role  |  State  | Lag in MB |
+---------+----------+----------------+--------+---------+-----------+
|   pg96  | pg96_101 | 192.168.56.101 |        | running |       0.0 |
|   pg96  | pg96_102 | 192.168.56.102 |        | running |       0.0 |
|   pg96  | pg96_103 | 192.168.56.103 | Leader | running |       0.0 |
+---------+----------+----------------+--------+---------+-----------+
Are you sure you want to switchover cluster pg96, demoting current master pg96_103? [y/N]: y
Switchover failed, details: 503, Switchover failed

node1 的日志如下

2018-07-11 23:26:34,635 INFO: received switchover request with leader=pg96_103 candidate=pg96_101 scheduled_at=None
2018-07-11 23:26:34,645 INFO: Got response from pg96_101 http://127.0.0.1:8008/patroni: {"database_system_identifier": "6576484813966394513", "postmaster_start_time": "2018-07-11 22:03:55.130 CST", "timeline": 3, "xlog": {"received_location": 50385856, "replayed_timestamp": "2018-07-11 22:34:29.725 CST", "paused": false, "replayed_location": 50385856}, "patroni": {"scope": "pg96", "version": "1.4.4"}, "state": "running", "role": "replica", "server_version": 90609}
2018-07-11 23:26:39,126 INFO: Lock owner: pg96_103; I am pg96_101
2018-07-11 23:26:39,126 INFO: does not have lock
2018-07-11 23:26:39,142 INFO: no action.  i am a secondary and i am following a leader

node3 的日志如下

2018-07-11 23:27:06,254 INFO: Lock owner: pg96_103; I am pg96_103
2018-07-11 23:27:06,274 INFO: Got response from pg96_101 http://127.0.0.1:8008/patroni: {"database_system_identifier": "6576484813966394513", "postmaster_start_time": "2018-07-11 17:38:41.768 CST", "timeline": 3, "xlog": {"location": 50385856}, "patroni": {"scope": "pg96", "version": "1.4.4"}, "replication": [{"sync_state": "potential", "sync_priority": 2, "client_addr": "192.168.56.102", "state": "streaming", "application_name": "pg96_102", "usename": "replicator"}, {"sync_state": "sync", "sync_priority": 1, "client_addr": "192.168.56.101", "state": "streaming", "application_name": "pg96_101", "usename": "replicator"}], "state": "running", "role": "master", "server_version": 90609}
2018-07-11 23:27:06,364 INFO: Member pg96_101 exceeds maximum replication lag
2018-07-11 23:27:06,365 WARNING: manual failover: no healthy members found, failover is not possible
2018-07-11 23:27:06,365 INFO: Cleaning up failover key
2018-07-11 23:27:06,389 INFO: no action.  i am the leader with the lock

在node3的日志输出发现 WARNING: manual failover: no healthy members found, failover is not possible
先记录下,研究明白后再补充。

原文地址:https://www.cnblogs.com/ctypyb2002/p/9792928.html