MySQL的Auto-Failover功能

今天来体验一下MySQL的Auto-Failover功能,这里用到一个工具MySQL Utilities,它的功能很强大.此工具提供如下功能:
(1)管理工具 (克隆、复制、比较、差异、导出、导入)
(2)复制工具 (安装、配置)
(3)一般工具 (磁盘使用情况、冗余索引、搜索元数据)
而我们用它来实现Master-Slave的自动Failover,下面开始

Master:192.168.13.194
Slave:192.168.13.159
此Failover需要建立在GTID的基础上所以MySQL版本必须5.6即以上

先安装mysql-utilities
sudo yum install mysql-utilities

在my.cnf添加GTID参数(主从都要添加)

binlog-format = ROW
log-slave-updates = true
gtid-mode = on
enforce-gtid-consistency = true
report-host = 192.168.13.194
report-port = 3306
master-info-repository = TABLE
relay-log-info-repository = TABLE
sync-master-info = 1

在任意一个节点(此命令会自动完成主从配置)
[mysql@localhost ~]$  mysqlreplicate --master=root:XXXX@'192.168.13.194':3306 --slave=root:XXXX@'192.168.13.159':3306 --rpl-user=root:XXXX
# master on 192.168.13.194: ... connected.
# slave on 192.168.13.159: ... connected.
# Checking for binary logging on master...
# Setting up replication...
# ...done.

[mysql@localhost ~]$ mysql -uroot -pXXXX -h'192.168.13.194' -e 'show databases;'
Warning: Using a password on the command line interface can be insecure.
+--------------------+
| Database           |
+--------------------+
| information_schema |
| ss_test            |
| mysql              |
| performance_schema |
| pre_test_market    |
| test               |
+--------------------+
[mysql@localhost ~]$ mysql -uroot -pXXXX -h'192.168.13.159' -e 'show databases;'
Warning: Using a password on the command line interface can be insecure.
+--------------------+
| Database           |
+--------------------+
| information_schema |
| ss_test            |
| mysql              |
| performance_schema |
| pre_test_market    |
| test               |
+--------------------+

查看主从结构
[mysql@localhost ~]$ mysqlrplshow --master=root:XXXX@'192.168.13.194':3306 --discover-slaves-login=root:XXXX;
# master on 192.168.13.194: ... connected.
# Finding slaves for master: 192.168.13.194:3306

# Replication Topology Graph
192.168.13.194:3306 (MASTER)
   |
   +--- 192.168.13.159:3306 - (SLAVE)

查看主从状态
[mysql@localhost ~]$ mysqlrplcheck --master=root:XXXX@'192.168.13.194' --slave=root:XXXX@'192.168.13.159'
# master on 192.168.13.194: ... connected.
# slave on 192.168.13.159: ... connected.
Test Description                                                     Status
---------------------------------------------------------------------------
Checking for binary logging on master                                [pass]
Are there binlog exceptions?                                         [WARN]

+---------+---------------------------------------+------------+
| server  | do_db                                 | ignore_db  |
+---------+---------------------------------------+------------+
| master  | pre_test_market,test,ss_test          |            |
| slave   | pre_test_market,test,ss_test          |            |
+---------+---------------------------------------+------------+

Replication user exists?                                             [pass]
Checking server_id values                                            [pass]
Checking server_uuid values                                          [pass]
Is slave connected to master?                                        [pass]
Check master information file                                        [pass]
Checking InnoDB compatibility                                        [pass]
Checking storage engines compatibility                               [pass]
Checking lower_case_table_names settings                             [pass]
Checking slave delay (seconds behind master)                         [pass]
# ...done.

准备Failover(注意:mysqlfailover的时候有时候Failover Mode = fail,正常应该是auto解决此错误加参数--force)
例子如下:
[mysql@localhost ~]$ mysqlfailover --master=root:XXXX@'192.168.13.194':3306 --discover-slaves-login=root:XXXX --rediscover
# Discovering slaves for master at 192.168.13.194:3306
# Discovering slave at 192.168.13.159:3306
# Found slave: 192.168.13.159:3306
Multiple instances of failover console found for master 192.168.13.194:3306.
If this is an error, restart the console with --force. 
Failover mode changed to 'FAIL' for this instance. 
Console will start in 10 seconds..........starting Console.
# Checking privileges.
# Discovering slaves for master at 192.168.13.194:3306

MySQL Replication Failover Utility
Failover Mode = fail     Next Interval = Mon Jun  9 23:04:22 2014

正常模式:
[mysql@localhost ~]$ mysqlfailover --master=root:XXXX@'192.168.13.194':3306 --discover-slaves-login=root:XXXX --rediscover 
# Discovering slaves for master at 192.168.13.194:3306
# Discovering slave at 192.168.13.159:3306
# Found slave: 192.168.13.159:3306
# Checking privileges.
# Discovering slaves for master at 192.168.13.194:3306

MySQL Replication Failover Utility
Failover Mode = auto     Next Interval = Mon Jun  9 23:56:32 2014

Master Information
------------------
Binary Log File   Position  Binlog_Do_DB                          Binlog_Ignore_DB  
mysql-bin.000041  191       pre_test_market,test,ss_test                     

GTID Executed Set
a7e4c60d-62ca-11e3-8710-080027e08a30:1-8

Replication Health Status
+-----------------+-------+---------+--------+------------+---------+
| host            | port  | role    | state  | gtid_mode  | health  |
+-----------------+-------+---------+--------+------------+---------+
| 192.168.13.194  | 3306  | MASTER  | UP     | ON         | OK      |
| 192.168.13.159  | 3306  | SLAVE   | UP     | ON         | OK      |
+-----------------+-------+---------+--------+------------+---------+

关闭Master1
[mysql@master1 ~]$ mysqladmin -u root -proot shutdown
Warning: Using a password on the command line interface can be insecure.

开始自动Failover

Q-quit R-refresh H-health G-GTID Lists U-UUIDs
140609 23:30:22 mysqld_safe mysqld from pid file /mysql/mysqld.pid ended
Failed to reconnect to the master after 3 attemps.

Failover starting in 'auto' mode...
# Candidate slave 192.168.13.159:3306 will become the new master.
# Checking slaves status (before failover).
# Preparing candidate for failover.
# Creating replication user if it does not exist.
# Stopping slaves.
# Performing STOP on all slaves.
# Switching slaves to new master.
# Disconnecting new master as slave.
# Starting slaves.
# Performing START on all slaves.
# Checking slaves for errors.
# Failover complete.
# Discovering slaves for master at 192.168.13.159:3306
Error connecting to a slave as root@192.168.13.194: Cannot connect to the slave server.
Error Can't connect to MySQL server on '192.168.13.194:3306' (111 Connection refused)

WARNING: There are slaves that had connection errors.

Failover console will restart in 5 seconds.

MySQL Replication Failover Utility
Failover Mode = auto     Next Interval = Mon Jun  9 23:31:40 2014

Master Information
------------------
Binary Log File   Position  Binlog_Do_DB                          Binlog_Ignore_DB  
mysql-bin.000051  1688      pre_test_market,test,ss_test                    

GTID Executed Set
a7e4c60d-62ca-11e3-8710-080027e08a30:1-8

Replication Health Status
+-----------------+-------+---------+--------+------------+---------+
| host            | port  | role    | state  | gtid_mode  | health  |
+-----------------+-------+---------+--------+------------+---------+
| 192.168.13.159  | 3306  | MASTER  | UP     | ON         | OK      |
+-----------------+-------+---------+--------+------------+---------+
切换成功.Slave正式切换到Master

恢复Master1
[mysql@master1 ~]$ mysqld_safe &
140609 23:32:04 mysqld_safe Logging to '/mysql/mysqld.log'.
140609 23:32:04 mysqld_safe Starting mysqld daemon with databases from /mysql/data

把原Master1从新添加
[mysql@master1 ~]$ mysqlreplicate --master=root:XXXX@'192.168.13.159':3306 --slave=root:XXXX@'192.168.13.194':3306 --rpl-user=root:XXXX
# master on 192.168.13.159: ... connected.
# slave on 192.168.13.194: ... connected.
# Checking for binary logging on master...
# Setting up replication...
# ...done.

MySQL Replication Failover Utility
Failover Mode = auto     Next Interval = Mon Jun  9 23:34:04 2014

Master Information
------------------
Binary Log File   Position  Binlog_Do_DB                          Binlog_Ignore_DB  
mysql-bin.000051  1688      pre_test_market,test,ss_test                     

GTID Executed Set
a7e4c60d-62ca-11e3-8710-080027e08a30:1-8

Replication Health Status
+-----------------+-------+---------+--------+------------+---------+
| host            | port  | role    | state  | gtid_mode  | health  |
+-----------------+-------+---------+--------+------------+---------+
| 192.168.13.159  | 3306  | MASTER  | UP     | ON         | OK      |
| 192.168.13.194  | 3306  | SLAVE   | UP     | ON         | OK      |
+-----------------+-------+---------+--------+------------+---------+

主从切换恢复原状
[mysql@master1 ~]$ mysqlrpladmin --master=root:XXXX@'192.168.13.159':3306 --new-master=root:XXXX@'192.168.13.194':3306 --demote-master --discover-slaves-login=root:XXXX switchover
# Discovering slaves for master at 192.168.13.159:3306
# Discovering slave at 192.168.13.194:3306
# Found slave: 192.168.13.194:3306
# Checking privileges.
# Performing switchover from master at 192.168.13.159:3306 to slave at 192.168.13.194:3306.
# Checking candidate slave prerequisites.
# Checking slaves configuration to master.
# Waiting for slaves to catch up to old master.
# Stopping slaves.
# Performing STOP on all slaves.
# Demoting old master to be a slave to the new master.
# Switching slaves to new master.
# Starting all slaves.
# Performing START on all slaves.
# Checking slaves for errors.
# Switchover complete.
#
# Replication Topology Health:
+-----------------+-------+---------+--------+------------+---------+
| host            | port  | role    | state  | gtid_mode  | health  |
+-----------------+-------+---------+--------+------------+---------+
| 192.168.13.194  | 3306  | MASTER  | UP     | ON         | OK      |
| 192.168.13.159  | 3306  | SLAVE   | UP     | ON         | OK      |
+-----------------+-------+---------+--------+------------+---------+
# ...done.

由于我们从新恢复了.所以Failover交互模式失效从新在进入恢复正常
MySQL Replication Failover Utility
Failover Mode = auto     Next Interval = Mon Jun  9 23:35:35 2014

Master Information
------------------
Binary Log File   Position  Binlog_Do_DB                          Binlog_Ignore_DB  
mysql-bin.000051  1688      pre_test_market,test,ss_test                    

GTID Executed Set
a7e4c60d-62ca-11e3-8710-080027e08a30:1-8

Replication Health Status
+-----------------+-------+---------+--------+------------+------------------------------------+
| host            | port  | role    | state  | gtid_mode  | health                             |
+-----------------+-------+---------+--------+------------+------------------------------------+
| 192.168.13.159  | 3306  | MASTER  | UP     | ON         | OK                                 |
| 192.168.13.194  | 3306  | SLAVE   | WARN   |            | Slave is not connected to master.  |
+-----------------+-------+---------+--------+------------+------------------------------------+

[mysql@localhost ~]$ mysqlfailover --master=root:XXXX@'192.168.13.194':3306 --discover-slaves-login=root:XXXX --rediscover --force
# Discovering slaves for master at 192.168.13.194:3306
# Discovering slave at 192.168.13.159:3306
# Found slave: 192.168.13.159:3306
# Checking privileges.
# Discovering slaves for master at 192.168.13.194:3306

MySQL Replication Failover Utility
Failover Mode = auto     Next Interval = Mon Jun  9 23:36:10 2014

Master Information
------------------
Binary Log File   Position  Binlog_Do_DB                          Binlog_Ignore_DB  
mysql-bin.000041  191       pre_test_market,test,ss_test                   

GTID Executed Set
a7e4c60d-62ca-11e3-8710-080027e08a30:1-8

Replication Health Status
+-----------------+-------+---------+--------+------------+---------+
| host            | port  | role    | state  | gtid_mode  | health  |
+-----------------+-------+---------+--------+------------+---------+
| 192.168.13.194  | 3306  | MASTER  | UP     | ON         | OK      |
| 192.168.13.159  | 3306  | SLAVE   | UP     | ON         | OK      |
+-----------------+-------+---------+--------+------------+---------+

最后简单的说几句:
Auto Failover的前提首先是MySQL必须是5.6并且必须启动GTID.

Master Crash之后可以自动把Slave提升为New Master,但是Old Master恢复之后需要手动添加进来.并且Failover交互模式也会失效需要从新进一次显示才能正常.今天先到此了..^_^

原文地址:https://www.cnblogs.com/Yongzhouunknown/p/4857856.html