postgresql 高可用 etcd + patroni 之二 patroni

os: centos 7.4
postgresql: 9.6.9
etcd: 3.2.18
patroni: 1.4.4

patroni + etcd 是在一个postgrsql 开源大会上 亚信的一个哥们讲解的高可用方案。
依然是基于 postgreql stream replication。

ip规划
192.168.56.101 node1 master
192.168.56.102 node2 slave
192.168.56.103 node3 slave

安装postgresql并配置好stream

node1、node2、node3 节点上注意设置如下几个参数

synchronous_commit = on
full_page_writes = on
wal_log_hints = on
synchronous_standby_names = '*'
max_replication_slots = 10

主要是为了使用 pg_rewind,尽量不用 synchronous 方式复制数据,性能影响太大。

node1上创建复制槽,至关重要,patroni 用到了这个玩意

postgres=# create user replicator replication login encrypted password '1qaz2wsx';

postgres=# select * from pg_create_physical_replication_slot('pgsql96_node1');
postgres=# select * from pg_create_physical_replication_slot('pgsql96_node2');
postgres=# select * from pg_create_physical_replication_slot('pgsql96_node3');

node2、node3 配置stream replication

$ /usr/pgsql-9.6/bin/pg_ctl stop -m fast -D /var/lib/pgsql/9.6/main

$ cd /var/lib/pgsql/9.6/main
$ rm -rf ./*
$ /usr/pgsql-9.6/bin/pg_basebackup -h 192.168.56.101 -D /var/lib/pgsql/9.6/main -U replicator -v -P -R

$ vi recovery.conf
recovery_target_timeline = 'latest'
standby_mode = 'on'
primary_conninfo = 'host=192.168.56.101 port=5432 user=replicator password=1qaz2wsx'
primary_slot_name = 'pgsql96_node1'
trigger_file = '/tmp/postgresql.trigger.5432'

$ /usr/pgsql-9.6/bin/pg_ctl start -D /var/lib/pgsql/9.6/main -o "-c config_file=/etc/postgresql/9.6/main/postgresql.conf"

注意 recovery.conf 的 primary_slot_name 在不同节点值会不同。

添加复制功能条目 pg_hba.conf

$ vi pg_hba.conf

# Database administrative login by Unix domain socket
local   all             postgres                                peer

# TYPE  DATABASE        USER            ADDRESS                 METHOD

# "local" is for Unix domain socket connections only
local   all             all                                     peer

# IPv4 local connections:
host    all             postgres        127.0.0.1/32            trust
host    all             all             127.0.0.1/32            md5
host    all             all             192.168.56.0/24         md5

# IPv6 local connections:
host    all             all             ::1/128                 md5

# Allow replication connections from localhost, by a user with the
# replication privilege.
local   replication     replicator                              peer
host    replication     replicator      127.0.0.1/32            md5
host    replication     replicator      ::1/128                 md5

host	replication	    replicator	    192.168.56.101/32	         md5
host	replication	    replicator	    192.168.56.102/32	         md5
host	replication	    replicator	    192.168.56.103/32	         md5

$ psql -c "select pg_reload_conf();"

查看复制状态

postgres=#  select client_addr,
       pg_xlog_location_diff(sent_location, write_location) as write_delay,
       pg_xlog_location_diff(sent_location, flush_location) as flush_delay,
       pg_xlog_location_diff(sent_location, replay_location) as replay_delay 
 from pg_stat_replication;

  client_addr   | write_delay | flush_delay | replay_delay 
----------------+-------------+-------------+--------------
 192.168.56.102 |           0 |           0 |            0
 192.168.56.103 |           0 |           0 |            0
(2 row)

安装etcd

参考上一篇blog安装好etcd

下载、安装 patroni

用户也可以参考 https://www.linode.com/docs/databases/postgresql/create-a-highly-available-postgresql-cluster-using-patroni-and-haproxy/

个人觉得上篇文章中 etcd 做成单点不太合适,当然作为参考完全没有问题。

# cd /tmp
# curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
# python get-pip.py
# pip install patroni[dependencies]

patroni的一些依赖

urllib3>=1.19.1,!=1.21
boto
psycopg2>=2.5.4
PyYAML
requests
six>=1.7
kazoo>=1.3.1
python-etcd>=0.4.3,<0.5
python-consul>=0.7.0
click>=4.1
prettytable>=0.7
tzlocal
python-dateutil
psutil
cdiff
kubernetes>=2.0.0,<=6.0.0,!=4.0.*,!=5.0.*

patroni 的配置

# which patroni
/usr/bin/patroni

# patroni --help
/usr/lib64/python2.7/site-packages/psycopg2/__init__.py:144: UserWarning: The psycopg2 wheel package will be renamed from release 2.8; in order to keep installing from binary please use "pip install psycopg2-binary" instead. For details see: <http://initd.org/psycopg/docs/install.html#binary-install-from-pypi>.
  """)
Usage: /usr/bin/patroni config.yml
	Patroni may also read the configuration from the PATRONI_CONFIGURATION environment variable

错误提示 please use “pip install psycopg2-binary” instead

# pip install psycopg2-binary

patroni 配置文件

# mkdir -p /usr/patroni/conf
# cd /usr/patroni/conf/

# vi patroni_postgresql.yml

scope: pgsql96
namespace: /pgsql/
name: pgsql96_node1

restapi:
  listen: 192.168.56.101:8008
  connect_address: 192.168.56.101:8008

etcd:
  host: 192.168.56.101:2379

bootstrap:
  # this section will be written into Etcd:/<namespace>/<scope>/config after initializing new cluster
  # and all other cluster members will use it as a `global configuration`
  dcs:
    ttl: 30
    loop_wait: 10
    retry_timeout: 10
    maximum_lag_on_failover: 1048576
    master_start_timeout: 300
#    synchronous_mode: false
    postgresql:
      use_pg_rewind: true
      use_slots: true
      parameters:
        listen_addresses: "*"
        port: 5432
        wal_level: logical
        hot_standby: "on"
        wal_keep_segments: 1000
        max_wal_senders: 10
        max_replication_slots: 10
        wal_log_hints: "on"
#        archive_mode: "on"
#        archive_timeout: 1800s
#        archive_command: gzip < %p > /data/backup/pgwalarchive/%f.gz
#      recovery_conf:
#        restore_command: gunzip < /data/backup/pgwalarchive/%f.gz > %p

postgresql:
  listen: 0.0.0.0:5432
  connect_address: 192.168.56.101:5432
  data_dir: /var/lib/pgsql/9.6/data
  bin_dir: /usr/pgsql-9.6/bin
#  config_dir:
  authentication:
    replication:
      username: replicator
      password: 1qaz2wsx
    superuser:
      username: postgres
      password: 1qaz2wsx

#watchdog:
#  mode: automatic # Allowed values: off, automatic, required
#  device: /dev/watchdog
#  safety_margin: 5

tags:
    nofailover: false
    noloadbalance: false
    clonefrom: false
    nosync: false

上面的配置和后面的输出信息有细微差异,是因为当时实验完成后又对这个配置文件做了好几次修正,方便大家直接copy使用。

手动启动 patroni

参数将按以下顺序应用(运行时被赋予最高优先级):

1、从文件加载参数postgresql.base.conf(或从自定义conf文件(如果已设置)
2、从文件加载参数postgresql.conf
3、从文件加载参数postgresql.auto.conf
4、运行时参数使用-o-name=value

node1、node2、node3 三个节点依次启动

$ patroni /usr/patroni/conf/patroni_postgresql.yml

node1 的日志如下

2018-07-11 18:17:22,402 INFO: Lock owner: pg96_101; I am pg96_101
2018-07-11 18:17:22,430 INFO: no action.  i am the leader with the lock
2018-07-11 18:17:32,403 INFO: Lock owner: pg96_101; I am pg96_101
2018-07-11 18:17:32,432 INFO: no action.  i am the leader with the lock

node2 的日志如下

2018-07-11 18:17:22,421 INFO: Lock owner: pg96_101; I am pg96_102
2018-07-11 18:17:22,421 INFO: does not have lock
2018-07-11 18:17:22,435 INFO: no action.  i am a secondary and i am following a leader
2018-07-11 18:17:32,426 INFO: Lock owner: pg96_101; I am pg96_102
2018-07-11 18:17:32,426 INFO: does not have lock
2018-07-11 18:17:32,436 INFO: no action.  i am a secondary and i am following a leader

node3 的日志如下

2018-07-11 18:17:22,409 INFO: Lock owner: pg96_101; I am pg96_103
2018-07-11 18:17:22,410 INFO: does not have lock
2018-07-11 18:17:22,423 INFO: no action.  i am a secondary and i am following a leader
2018-07-11 18:17:32,415 INFO: Lock owner: pg96_101; I am pg96_103
2018-07-11 18:17:32,415 INFO: does not have lock
2018-07-11 18:17:32,425 INFO: no action.  i am a secondary and i am following a leader

查看集群状态

查看 patroni 集群状态

$ patronictl -c /usr/patroni/conf/patroni_postgresql.yml list pg96
+---------+----------+----------------+--------+---------+-----------+
| Cluster |  Member  |      Host      |  Role  |  State  | Lag in MB |
+---------+----------+----------------+--------+---------+-----------+
|   pg96  | pg96_101 | 192.168.56.101 | Leader | running |       0.0 |
|   pg96  | pg96_102 | 192.168.56.102 |        | running |       0.0 |
|   pg96  | pg96_103 | 192.168.56.103 |        | running |       0.0 |
+---------+----------+----------------+--------+---------+-----------+

$ patronictl -c /usr/patroni/conf/patroni_postgresql.yml show-config pg96
loop_wait: 10
maximum_lag_on_failover: 1048576
postgresql:
  parameters:
    listen_addresses: '*'
    port: 5432
  use_pg_rewind: true
retry_timeout: 10
ttl: 30

查看 etcd 的 信息

$ etcdctl ls /pg96/pg96/
/pg96/pg96/members
/pg96/pg96/initialize
/pg96/pg96/leader
/pg96/pg96/config
/pg96/pg96/optime
$ etcdctl get /pg96/pg96/members/pg96_101
{"conn_url":"postgres://192.168.56.101:5432/postgres","api_url":"http://127.0.0.1:8008/patroni","timeline":1,"state":"running","role":"master","xlog_location":50378640}
$ etcdctl get /pg96/pg96/members/pg96_102
{"conn_url":"postgres://192.168.56.102:5432/postgres","api_url":"http://127.0.0.1:8008/patroni","timeline":1,"state":"running","role":"replica","xlog_location":50378640}
$ etcdctl get /pg96/pg96/members/pg96_103
{"conn_url":"postgres://192.168.56.103:5432/postgres","api_url":"http://127.0.0.1:8008/patroni","timeline":1,"state":"running","role":"replica","xlog_location":50378640}
$ etcdctl get /pg96/pg96/initialize
6576484813966394513
$ etcdctl get /pg96/pg96/leader
pg96_101
$ etcdctl get /pg96/pg96/config
{"ttl":30,"maximum_lag_on_failover":1048576,"retry_timeout":10,"postgresql":{"use_pg_rewind":true,"parameters":{"listen_addresses":"*","port":5432}},"loop_wait":10}

$ etcdctl get /pg96/pg96/optime/leader
50378640

connction

using jdbc:

jdbc:postgresql://node1,node2,node3/postgres?targetServerType=master

libpq starting from postgresql 10:

postgresql://node1:port,node2:port,node3:port/?target_session_attrs=read-write

配置随OS启动

# vi /etc/rc.local
su - postgres -c "/usr/bin/patroni /usr/patroni/conf/patroni_postgresql.yml >> /var/log/postgresql/patroni.log  2>&1 &"

或者配置成 patroni.service

# vi /etc/systemd/system/patroni.service

[Unit]
Description=Runners to orchestrate a high-availability PostgreSQL
After=syslog.target network.target etcd.target

[Service]
Type=simple

User=postgres
Group=postgres

ExecStart=/usr/bin/patroni /usr/patroni/conf/patroni_postgresql.yml

KillMode=process

TimeoutSec=30

Restart=no

[Install]
WantedBy=multi-user.targ
# systemctl status patroni
# systemctl start patroni
# systemctl enable patroni

# systemctl status postgresql
# systemctl disable postgresql

# systemctl status etcd
# systemctl enable etcd

禁止 postgresql 的自启动,通过 patroni 来管理 postgresql。

总结:
个人感觉 etcd + patroni 还是相当不错的,会继续对patroni 研究下。

参考:
https://github.com/zalando/patroni
https://patroni.readthedocs.io/en/latest/
https://pypi.org/project/patroni/

https://github.com/zalando/patroni/blob/master/docs/replication_modes.rst
https://postgresconf.org/system/events/document/000/000/228/Patroni_tutorial_4x3-2.pdf

原文地址:https://www.cnblogs.com/ctypyb2002/p/9792939.html