029: High Availability with MHA

I. Introduction to MHA

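MHA (Master High Availability) is a mature tool for failover and master promotion in MySQL high-availability environments. When a MySQL master fails, MHA can typically complete the failover automatically within 10 to 30 seconds, and during the failover it preserves data consistency as far as possible, which is what makes the setup highly available in practice.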

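1. Features

  • Automated master monitoring and failover that minimizes downtime, typically 10 to 30 seconds in total:

    • master failure is detected within 9 to 12 seconds;
    • the failed master is optionally powered off within 7 to 10 seconds to avoid split brain;
    • the differential relay logs are applied to the new master within a few seconds.
  • Interactive (manually initiated) master failover

    • MHA can be set up for manual failover without monitoring the master at all: once you have confirmed that the master has failed, you trigger the switch through MHA yourself.
  • Non-interactive master failover

    • The master is likewise not monitored by MHA, but once a failure has occurred the failover is run through MHA automatically, without prompts.
  • Online switch of the master to another server

    • Use cases: RAID controller or memory faults, hardware upgrades and maintenance, version upgrades, and other maintenance work that requires switching the master by hand.
    • MHA can complete the switch within 0.5 to 2 seconds. A write block of 0.5 to 2 seconds is usually acceptable, so you can even switch the master online outside a maintenance window. Tasks such as upgrading to a newer version or moving to a faster server become much easier.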

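2. Advantages of MHA

  • Open-source scripts written in Perl, easy to customize
  • Automated monitoring and failover, minimizing downtime
  • Automatically fills in missing data to keep the slaves consistent
  • Hooks for calling external scripts during the switch
  • Online master switch (only writes are blocked)
  • Centralized monitoring and management of multiple clusters
  • Supports GTID-based replication
  • Simple to deploy; no complex changes to the existing architecture or configuration
  • No performance penalty on the servers (the heartbeat defaults to one ping every 3 seconds)
  • Works with any storage engine
  • No need for a large number of spare or idle servers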

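3. Disadvantages of MHA

  • The VIP has to be managed by a script you write yourself or by a third-party tool
  • It relies on passwordless (key-based) SSH, which carries some security risk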

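II. How MHA works

1. Architecture diagram

2. MHA recovery procedure

(1) Save the binary log events (binlog events) from the crashed master;

(2) Read the replication positions to identify the slave that has received the most recent binlog events (the latest slave);

(3) Apply the differential relay logs from the latest slave to the other slaves;

(4) Apply the binary log events saved from the master;

(5) Promote one slave to be the new master;

(6) Point the other slaves at the new master and resume replication.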
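
The selection in step (2) can be sketched as follows. This is only an illustration, assuming each slave reports the master binlog file and position it has read (the Master_Log_File and Read_Master_Log_Pos fields of SHOW SLAVE STATUS). The SlaveState type, the pick_latest_slave function, and the binlog names and positions are hypothetical; they are not taken from MHA's code or from this environment.

```python
from typing import List, NamedTuple


class SlaveState(NamedTuple):
    host: str
    master_log_file: str      # e.g. "mysql-bin.000012" (hypothetical name)
    read_master_log_pos: int  # byte offset read within that file


def binlog_sequence(log_file: str) -> int:
    """Return the numeric suffix of a binlog file name (mysql-bin.000012 -> 12)."""
    return int(log_file.rsplit(".", 1)[1])


def pick_latest_slave(slaves: List[SlaveState]) -> SlaveState:
    """Return the slave that has read furthest into the master's binlog."""
    return max(
        slaves,
        key=lambda s: (binlog_sequence(s.master_log_file), s.read_master_log_pos),
    )


# Made-up positions for three of the slaves in this environment.
slaves = [
    SlaveState("192.168.222.246", "mysql-bin.000012", 4520),
    SlaveState("192.168.222.247", "mysql-bin.000012", 9811),
    SlaveState("192.168.222.248", "mysql-bin.000011", 99000),
]
latest = pick_latest_slave(slaves)
print(latest.host)
```

The file sequence number is compared before the offset, so a slave that is still on an older binlog file never wins over one that has moved on to a newer file, whatever its offset.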

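III. Environment

Role             | Hostname | IP address      | Server_ID | MHA Manager | MHA Node | Keepalived | LVS      | Script            | Notes
Master           | node01   | 192.168.222.245 | 49245     | -           | deployed | -          | -        | write VIP binding |
Slave A          | mycat01  | 192.168.222.246 | 49246     | -           | deployed | -          | -        | -                 | semi-sync
Master (spare)   | node02   | 192.168.222.247 | 49247     | -           | deployed | -          | -        | write VIP binding |
Slave B          | node03   | 192.168.222.248 | 49248     | -           | deployed | -          | -        | read ARP binding  |
Slave C          | node04   | 192.168.222.249 | 49249     | -           | deployed | -          | -        | read ARP binding  |
Manager          | node05   | 192.168.222.59  | -         | deployed    | -        | -          | -        | -                 | monitoring / MySQL master failover
LVS+Keepalived A | redis01  | 192.168.222.251 | 49251     | -           | -        | deployed   | deployed | -                 | read load balancing
LVS+Keepalived B | redis02  | 192.168.222.252 | 49252     | -           | -        | deployed   | deployed | -                 | LVS high-availability standby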

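IV. Set up master-slave replication

1. Configure replication

Set up a GTID-based replication environment:
0. On the master, create the replication account (IDENTIFIED BY takes a plaintext password; IDENTIFIED BY PASSWORD would expect a password hash): grant RELOAD, REPLICATION SLAVE, REPLICATION CLIENT on *.* to repl@'%' identified by 'xxxxxx'; flush privileges;
1. Dump the master's data with mysqldump, adding --master-data=2;
2. Import the dump into each slave;
3. change master to master_host="192.168.222.245", master_port=3306, master_user='repl', master_password='repl', master_auto_position=1;
4. start slave;
5. Set the slaves to read-only (read_only=1);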

2. Configure lossless semi-sync replication

  • master (primary and spare master)
(root@localhost) 11:51:54 [(none)]> INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';
Query OK, 0 rows affected (0.05 sec)

(root@localhost) 11:52:39 [(none)]>  INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';
Query OK, 0 rows affected (0.03 sec)

(root@localhost) 11:52:50 [(none)]> set global rpl_semi_sync_master_enabled = 1;
Query OK, 0 rows affected (0.01 sec)

(root@localhost) 11:52:56 [(none)]>  set global rpl_semi_sync_master_timeout = 5000;
Query OK, 0 rows affected (0.00 sec)

(root@localhost) 11:53:01 [(none)]>  show global status like "%rpl%";
+--------------------------------------------+-------+
| Variable_name                              | Value |
+--------------------------------------------+-------+
| Rpl_semi_sync_master_clients               | 0     |
| Rpl_semi_sync_master_net_avg_wait_time     | 0     |
| Rpl_semi_sync_master_net_wait_time         | 0     |
| Rpl_semi_sync_master_net_waits             | 0     |
| Rpl_semi_sync_master_no_times              | 0     |
| Rpl_semi_sync_master_no_tx                 | 0     |
| Rpl_semi_sync_master_status                | ON    |
| Rpl_semi_sync_master_timefunc_failures     | 0     |
| Rpl_semi_sync_master_tx_avg_wait_time      | 0     |
| Rpl_semi_sync_master_tx_wait_time          | 0     |
| Rpl_semi_sync_master_tx_waits              | 0     |
| Rpl_semi_sync_master_wait_pos_backtraverse | 0     |
| Rpl_semi_sync_master_wait_sessions         | 0     |
| Rpl_semi_sync_master_yes_tx                | 0     |
| Rpl_semi_sync_slave_status                 | OFF   |
+--------------------------------------------+-------+
15 rows in set (0.00 sec)

  • slave (Slave A) configuration
(root@localhost) 11:47:17 [(none)]> INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';
Query OK, 0 rows affected (0.14 sec)

(root@localhost) 11:49:32 [(none)]> INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';
Query OK, 0 rows affected (0.04 sec)

(root@localhost) 11:50:25 [(none)]> set global rpl_semi_sync_slave_enabled = 1;
Query OK, 0 rows affected (0.00 sec)

(root@localhost) 11:50:42 [(none)]>  show global variables like '%semi%';
+-------------------------------------------+------------+
| Variable_name                             | Value      |
+-------------------------------------------+------------+
| rpl_semi_sync_master_enabled              | ON         |
| rpl_semi_sync_master_timeout              | 5000       |
| rpl_semi_sync_master_trace_level          | 32         |
| rpl_semi_sync_master_wait_for_slave_count | 1          |
| rpl_semi_sync_master_wait_no_slave        | ON         |
| rpl_semi_sync_master_wait_point           | AFTER_SYNC |
| rpl_semi_sync_slave_enabled               | ON         |
| rpl_semi_sync_slave_trace_level           | 32         |
+-------------------------------------------+------------+
8 rows in set (0.00 sec)
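
In the master's status output above, Rpl_semi_sync_master_clients is still 0, so no semi-sync slave is attached yet. A slave only registers as a semi-sync client once its I/O thread has reconnected after rpl_semi_sync_slave_enabled was set (for example via STOP SLAVE IO_THREAD; START SLAVE IO_THREAD;). The sketch below is one way to evaluate those two status variables; the semi_sync_active helper is a hypothetical name, and the "after" values are assumed rather than taken from this environment.

```python
from typing import Dict


def semi_sync_active(status: Dict[str, str]) -> bool:
    """True when the master is in semi-sync mode and at least one slave is attached."""
    enabled = status.get("Rpl_semi_sync_master_status", "OFF").upper() == "ON"
    clients = int(status.get("Rpl_semi_sync_master_clients", "0"))
    return enabled and clients > 0


# Values copied from the master's output above: plugin enabled, no slave attached.
before = {"Rpl_semi_sync_master_status": "ON", "Rpl_semi_sync_master_clients": "0"}
# Assumed values once one semi-sync slave has reconnected.
after = {"Rpl_semi_sync_master_status": "ON", "Rpl_semi_sync_master_clients": "1"}

print(semi_sync_active(before), semi_sync_active(after))
```

In a real check the dictionary would be filled from the rows returned by show global status like "%rpl%".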

V. Install MHA

1. Download the packages and install the dependencies

mha4mysql-manager-0.57.tar.gz

mha4mysql-node-0.57.tar.gz

# yum install perl-DBD-MySQL perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager perl-ExtUtils-CBuilder perl-ExtUtils-MakeMaker perl-CPAN

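2. Install MHA Manager

  • Install the Node package on every MySQL host and the Manager package on the MHA management node. The Manager package depends on the Node package, so install Node on the management node as well.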
tar zxvf mha4mysql-manager-0.57.tar.gz -C /usr/local/
cd /usr/local/mha4mysql-manager-0.57
perl Makefile.PL
make && make install

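-- After the Manager installation completes, these files are generated under /usr/local/bin/
#masterha_check_ssh : checks the SSH configuration used by MHA.
#masterha_check_repl : checks MySQL replication.
#masterha_manager : starts MHA.
#masterha_check_status : reports the current running state of MHA.
#masterha_master_monitor : monitors whether the master is down.
#masterha_master_switch : controls failover (automatic or manual).
#masterha_conf_host : adds or removes server entries in the configuration.

cp  -r /usr/local/mha4mysql-manager-0.57/samples/scripts/* /usr/local/bin/
-- Copy the scripts under /usr/local/mha4mysql-manager-0.57/samples/scripts to /usr/local/bin/

#master_ip_failover : manages the VIP during automatic failover; optional. With keepalived you can write your own script to manage the VIP instead, for example one that monitors MySQL and stops keepalived when MySQL misbehaves, so that the VIP floats automatically.
#master_ip_online_change : manages the VIP during an online switch; optional, a simple shell script of your own also works.
#power_manager : powers off the host after a failure; optional.
#send_report : sends an alert after a failover; optional, a simple shell script of your own also works.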

3. Install MHA Node

tar zxvf  mha4mysql-node-0.57.tar.gz -C /usr/local
cd /usr/local/mha4mysql-node-0.57
perl Makefile.PL
make && make install

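-- After the Node installation completes, these files are generated under /usr/local/bin/
#save_binary_logs : saves and copies the master's binary logs.
#apply_diff_relay_logs : identifies differential relay log events and applies them to the other slaves.
#filter_mysqlbinlog : strips unnecessary ROLLBACK events (MHA no longer uses this tool).
#purge_relay_logs : purges relay logs (without blocking the SQL thread).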

4. Configure passwordless (key-based) SSH

  • MHA manager node:
ssh-keygen -t rsa

ssh-copy-id -i /root/.ssh/id_rsa.pub "root@192.168.222.245"
ssh-copy-id -i /root/.ssh/id_rsa.pub "root@192.168.222.246"
ssh-copy-id -i /root/.ssh/id_rsa.pub "root@192.168.222.247"
ssh-copy-id -i /root/.ssh/id_rsa.pub "root@192.168.222.248"
ssh-copy-id -i /root/.ssh/id_rsa.pub "root@192.168.222.249"

  • Each MySQL node generates its own key pair and copies the public key to the manager and to the other MySQL nodes
# 192.168.222.245
ssh-keygen -t rsa
ssh-copy-id -i /root/.ssh/id_rsa.pub "root@192.168.222.59"
ssh-copy-id -i /root/.ssh/id_rsa.pub "root@192.168.222.246"
ssh-copy-id -i /root/.ssh/id_rsa.pub "root@192.168.222.247"
ssh-copy-id -i /root/.ssh/id_rsa.pub "root@192.168.222.248"
ssh-copy-id -i /root/.ssh/id_rsa.pub "root@192.168.222.249"


# 192.168.222.246
ssh-keygen -t rsa
ssh-copy-id -i /root/.ssh/id_rsa.pub "root@192.168.222.59"
ssh-copy-id -i /root/.ssh/id_rsa.pub "root@192.168.222.245"
ssh-copy-id -i /root/.ssh/id_rsa.pub "root@192.168.222.247"
ssh-copy-id -i /root/.ssh/id_rsa.pub "root@192.168.222.248"
ssh-copy-id -i /root/.ssh/id_rsa.pub "root@192.168.222.249"

# 192.168.222.247
ssh-keygen -t rsa
ssh-copy-id -i /root/.ssh/id_rsa.pub "root@192.168.222.59"
ssh-copy-id -i /root/.ssh/id_rsa.pub "root@192.168.222.245"
ssh-copy-id -i /root/.ssh/id_rsa.pub "root@192.168.222.246"
ssh-copy-id -i /root/.ssh/id_rsa.pub "root@192.168.222.248"
ssh-copy-id -i /root/.ssh/id_rsa.pub "root@192.168.222.249"

# 192.168.222.248
ssh-keygen -t rsa
ssh-copy-id -i /root/.ssh/id_rsa.pub "root@192.168.222.59"
ssh-copy-id -i /root/.ssh/id_rsa.pub "root@192.168.222.245"
ssh-copy-id -i /root/.ssh/id_rsa.pub "root@192.168.222.246"
ssh-copy-id -i /root/.ssh/id_rsa.pub "root@192.168.222.247"
ssh-copy-id -i /root/.ssh/id_rsa.pub "root@192.168.222.249"

# 192.168.222.249
ssh-keygen -t rsa
ssh-copy-id -i /root/.ssh/id_rsa.pub "root@192.168.222.59"
ssh-copy-id -i /root/.ssh/id_rsa.pub "root@192.168.222.245"
ssh-copy-id -i /root/.ssh/id_rsa.pub "root@192.168.222.246"
ssh-copy-id -i /root/.ssh/id_rsa.pub "root@192.168.222.247"
ssh-copy-id -i /root/.ssh/id_rsa.pub "root@192.168.222.248"
  • Test that SSH logs in without a password
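
The command lists above follow one pattern: the manager copies its key to every MySQL node, and every MySQL node copies its key to the manager and to each of the other MySQL nodes. As a sketch, that plan can be generated instead of typed by hand. The key_distribution_plan helper is a hypothetical name; the addresses are the ones used in this environment.

```python
from typing import Dict, List

MANAGER = "192.168.222.59"
MYSQL_NODES = [
    "192.168.222.245",
    "192.168.222.246",
    "192.168.222.247",
    "192.168.222.248",
    "192.168.222.249",
]


def key_distribution_plan(manager: str, nodes: List[str]) -> Dict[str, List[str]]:
    """Map each source host to the hosts that must receive its public key."""
    plan = {manager: list(nodes)}
    for source in nodes:
        plan[source] = [manager] + [n for n in nodes if n != source]
    return plan


# Print the commands to run on each host, in the same form as above.
for source, targets in key_distribution_plan(MANAGER, MYSQL_NODES).items():
    print(f"# {source}")
    print("ssh-keygen -t rsa")
    for target in targets:
        print(f'ssh-copy-id -i /root/.ssh/id_rsa.pub "root@{target}"')
```

To test a connection non-interactively, something like ssh -o BatchMode=yes root@192.168.222.245 true fails instead of prompting when key authentication is not in place.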

5. MHA configuration

1. Create and edit the configuration files

  • On the manager node, create the application configuration file (app1.cnf)
[root@node05 mha4mysql-manager-0.57]# cd  /usr/local/mha4mysql-manager-0.57
[root@node05 mha4mysql-manager-0.57]# mkdir -p /etc/masterha
[root@node05 mha4mysql-manager-0.57]# vim /etc/masterha/app1.cnf

[server default]
manager_log=/var/log/masterha/app1/manager.log
manager_workdir=/var/log/masterha/app1.log

[server1]
hostname=192.168.222.245
port=3306

[server2]
hostname=192.168.222.246
port=3306

# spare master
[server3]
candidate_master=1      
check_repl_delay=0
hostname=192.168.222.247
port=3306

[server4]
hostname=192.168.222.248
port=3306

[server5]
hostname=192.168.222.249
port=3306
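
Before running the MHA checks it can help to list what the file actually declares. The sketch below reads an excerpt of app1.cnf with Python's configparser and reports each [serverN] block and whether it is marked candidate_master. MHA itself parses the file with its own Perl code, so this is only an inspection aid, and the list_servers helper is a hypothetical name.

```python
import configparser

# Excerpt of /etc/masterha/app1.cnf as shown above.
APP1_CNF = """
[server default]
manager_log=/var/log/masterha/app1/manager.log
manager_workdir=/var/log/masterha/app1.log

[server1]
hostname=192.168.222.245
port=3306

# spare master
[server3]
candidate_master=1
check_repl_delay=0
hostname=192.168.222.247
port=3306
"""


def list_servers(cnf_text):
    """Return (section, hostname, port, is_candidate) for every [serverN] block."""
    parser = configparser.ConfigParser()
    parser.read_string(cnf_text)
    servers = []
    for section in parser.sections():
        if section == "server default":
            continue  # application-wide defaults, not a server entry
        block = parser[section]
        servers.append(
            (
                section,
                block["hostname"],
                block.getint("port", fallback=3306),
                block.get("candidate_master", "0") == "1",
            )
        )
    return servers


for entry in list_servers(APP1_CNF):
    print(entry)
```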
  • On the manager node, create the global configuration file (masterha_default.cnf)
[root@node05 mha4mysql-manager-0.57]# vim /etc/masterha_default.cnf

[server default]
# MySQL account and password that MHA uses to log in
user=gcdb
password=iforgot

# SSH user
ssh_user=root

# Binlog directory on the master. Required: after the master dies, MHA reads the binlog events over SSH, because the binlog position can no longer be obtained through replication
master_binlog_dir= /r2/mysqldata
#master_binlog_dir= /r2/mysqldata
remote_workdir=/data/mysql/

# Checks connectivity to the master via other nodes; see the MHA parameters documentation for details
secondary_check_script= masterha_secondary_check -s 192.168.222.247 -s 192.168.222.245
#secondary_check_script= masterha_secondary_check -s 192.168.222.245 -s 192.168.222.247 -s 192.168.222.248 -s 192.168.222.249  -s 192.168.222.246
ping_interval=3

# Script that moves the VIP, followed by the shutdown and report scripts
master_ip_failover_script=/usr/local/bin/master_ip_failover
#shutdown_script=/usr/local/bin/power_manager
shutdown_script=
report_script=/usr/local/bin/send_master_failover_mail


# MySQL replication account
repl_user=repl
repl_password=repl

# Directory in which MHA Manager writes its state files
manager_workdir=/var/log/masterha

# Absolute path of the MHA Manager log file
manager_log=/var/log/masterha/app1.log
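
Both files define manager_log and manager_workdir. As far as I understand the MHA parameters documentation, the more specific scope wins: a [serverN] block overrides the application file's [server default], which in turn overrides the global file. The sketch below only illustrates that precedence with plain dictionaries; effective_settings is a hypothetical helper, not an MHA function.

```python
def effective_settings(global_default, app_default, server_block):
    """Merge the three scopes; later (more specific) scopes override earlier ones."""
    merged = dict(global_default)
    merged.update(app_default)
    merged.update(server_block)
    return merged


# A few values from /etc/masterha_default.cnf
global_default = {
    "user": "gcdb",
    "ssh_user": "root",
    "ping_interval": "3",
    "manager_log": "/var/log/masterha/app1.log",
}
# [server default] from /etc/masterha/app1.cnf
app_default = {"manager_log": "/var/log/masterha/app1/manager.log"}
# [server3] from /etc/masterha/app1.cnf
server3 = {"hostname": "192.168.222.247", "candidate_master": "1"}

settings = effective_settings(global_default, app_default, server3)
print(settings["manager_log"])
```

Under that assumption the manager log for app1 ends up in /var/log/masterha/app1/manager.log rather than in the path given in the global file.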

  • On the manager node, edit the failover script (master_ip_failover)
[root@node05 mha4mysql-manager-0.57]# vim  /usr/local/bin/master_ip_failover
#!/usr/bin/env perl

use strict;
use warnings FATAL => 'all';
use Getopt::Long;
use MHA::DBHelper;

my (
    $command,          $ssh_user,        $orig_master_host, $orig_master_ip,
    $orig_master_port, $new_master_host, $new_master_ip,    $new_master_port
);

# VIP that is moved between masters on failover
my $vip = '192.168.222.99/24';
my $key = '1';
my $ssh_start_vip = "/sbin/ifconfig ens256:$key $vip";
my $ssh_stop_vip = "/sbin/ifconfig ens256:$key down";
$ssh_user = "root";

# GetOptions stores each option through a reference to its variable
GetOptions(
    'command=s'          => \$command,
    'ssh_user=s'         => \$ssh_user,
    'orig_master_host=s' => \$orig_master_host,
    'orig_master_ip=s'   => \$orig_master_ip,
    'orig_master_port=i' => \$orig_master_port,
    'new_master_host=s'  => \$new_master_host,
    'new_master_ip=s'    => \$new_master_ip,
    'new_master_port=i'  => \$new_master_port,
);

exit &main();

sub main {

    print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";

    if ( $command eq "stop" || $command eq "stopssh" ) {

        # Take the VIP down on the old master
        my $exit_code = 1;
        eval {
            print "Disabling the VIP on old master: $orig_master_host \n";
            &stop_vip();
            $exit_code = 0;
        };
        if ($@) {
            warn "Got Error: $@\n";
            exit $exit_code;
        }
        exit $exit_code;
    }
    elsif ( $command eq "start" ) {

        # Bring the VIP up on the new master
        my $exit_code = 10;
        eval {
            print "Enabling the VIP - $vip on the new master - $new_master_host \n";
            &start_vip();
            $exit_code = 0;
        };
        if ($@) {
            warn $@;
            exit $exit_code;
        }
        exit $exit_code;
    }
    elsif ( $command eq "status" ) {
        print "Checking the Status of the script.. OK \n";
        exit 0;
    }
    else {
        &usage();
        exit 1;
    }
}

# The @ must be escaped inside backticks, otherwise Perl tries to
# interpolate @$new_master_host as an array dereference
sub start_vip() {
    `ssh $ssh_user\@$new_master_host " $ssh_start_vip "`;
}
sub stop_vip() {
     return 0  unless  ($ssh_user);
    `ssh $ssh_user\@$orig_master_host " $ssh_stop_vip "`;
}

sub usage {
    print
    "Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}
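
The script's behaviour comes down to a small dispatch on --command: stop and stopssh take the VIP down on the old master, start brings it up on the new master, and status does nothing remotely. The Python sketch below mirrors that dispatch so the resulting ssh commands can be inspected without touching any host. It is an illustration only; vip_command is a hypothetical helper, and nothing is executed.

```python
import shlex

VIP = "192.168.222.99/24"
KEY = "1"
INTERFACE = "ens256"


def vip_command(command, orig_master_host=None, new_master_host=None, ssh_user="root"):
    """Return the ssh argv for a given --command, or None when nothing is run."""
    if command in ("stop", "stopssh"):
        remote = f"/sbin/ifconfig {INTERFACE}:{KEY} down"
        return ["ssh", f"{ssh_user}@{orig_master_host}", remote]
    if command == "start":
        remote = f"/sbin/ifconfig {INTERFACE}:{KEY} {VIP}"
        return ["ssh", f"{ssh_user}@{new_master_host}", remote]
    if command == "status":
        return None
    raise ValueError(f"unknown command: {command}")


# Commands for a failover from the master to the spare master.
for argv in (
    vip_command("stop", orig_master_host="192.168.222.245"),
    vip_command("start", new_master_host="192.168.222.247"),
):
    print(" ".join(shlex.quote(part) for part in argv))
```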

2. Check the SSH, replication, and MHA Manager configuration

  • Check SSH: masterha_check_ssh --conf=/etc/masterha/app1.cnf
[root@node05 mha4mysql-manager-0.57]# masterha_check_ssh --conf=/etc/masterha/app1.cnf
Mon May 28 15:07:28 2018 - [info] Reading default configuration from /etc/masterha_default.cnf..
Mon May 28 15:07:28 2018 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
Mon May 28 15:07:28 2018 - [info] Reading server configuration from /etc/masterha/app1.cnf..
Mon May 28 15:07:28 2018 - [info] Starting SSH connection tests..
Mon May 28 15:07:31 2018 - [debug]
Mon May 28 15:07:29 2018 - [debug]  Connecting via SSH from root@192.168.222.245(192.168.222.245:22) to root@192.168.222.246(192.168.222.246:22)..
Mon May 28 15:07:29 2018 - [debug]   ok.
Mon May 28 15:07:29 2018 - [debug]  Connecting via SSH from root@192.168.222.245(192.168.222.245:22) to root@192.168.222.247(192.168.222.247:22)..
Mon May 28 15:07:29 2018 - [debug]   ok.
Mon May 28 15:07:29 2018 - [debug]  Connecting via SSH from root@192.168.222.245(192.168.222.245:22) to root@192.168.222.248(192.168.222.248:22)..
Mon May 28 15:07:30 2018 - [debug]   ok.
Mon May 28 15:07:30 2018 - [debug]  Connecting via SSH from root@192.168.222.245(192.168.222.245:22) to root@192.168.222.249(192.168.222.249:22)..
Mon May 28 15:07:30 2018 - [debug]   ok.
Mon May 28 15:07:31 2018 - [debug]
Mon May 28 15:07:29 2018 - [debug]  Connecting via SSH from root@192.168.222.246(192.168.222.246:22) to root@192.168.222.245(192.168.222.245:22)..
Mon May 28 15:07:30 2018 - [debug]   ok.
Mon May 28 15:07:30 2018 - [debug]  Connecting via SSH from root@192.168.222.246(192.168.222.246:22) to root@192.168.222.247(192.168.222.247:22)..
Mon May 28 15:07:30 2018 - [debug]   ok.
Mon May 28 15:07:30 2018 - [debug]  Connecting via SSH from root@192.168.222.246(192.168.222.246:22) to root@192.168.222.248(192.168.222.248:22)..
Mon May 28 15:07:30 2018 - [debug]   ok.
Mon May 28 15:07:30 2018 - [debug]  Connecting via SSH from root@192.168.222.246(192.168.222.246:22) to root@192.168.222.249(192.168.222.249:22)..
Mon May 28 15:07:31 2018 - [debug]   ok.
Mon May 28 15:07:32 2018 - [debug]
Mon May 28 15:07:30 2018 - [debug]  Connecting via SSH from root@192.168.222.248(192.168.222.248:22) to root@192.168.222.245(192.168.222.245:22)..
Mon May 28 15:07:30 2018 - [debug]   ok.
Mon May 28 15:07:30 2018 - [debug]  Connecting via SSH from root@192.168.222.248(192.168.222.248:22) to root@192.168.222.246(192.168.222.246:22)..
Mon May 28 15:07:31 2018 - [debug]   ok.
Mon May 28 15:07:31 2018 - [debug]  Connecting via SSH from root@192.168.222.248(192.168.222.248:22) to root@192.168.222.247(192.168.222.247:22)..
Mon May 28 15:07:31 2018 - [debug]   ok.
Mon May 28 15:07:31 2018 - [debug]  Connecting via SSH from root@192.168.222.248(192.168.222.248:22) to root@192.168.222.249(192.168.222.249:22)..
Mon May 28 15:07:32 2018 - [debug]   ok.
Mon May 28 15:07:32 2018 - [debug]
Mon May 28 15:07:30 2018 - [debug]  Connecting via SSH from root@192.168.222.247(192.168.222.247:22) to root@192.168.222.245(192.168.222.245:22)..
Mon May 28 15:07:30 2018 - [debug]   ok.
Mon May 28 15:07:30 2018 - [debug]  Connecting via SSH from root@192.168.222.247(192.168.222.247:22) to root@192.168.222.246(192.168.222.246:22)..
Mon May 28 15:07:30 2018 - [debug]   ok.
Mon May 28 15:07:30 2018 - [debug]  Connecting via SSH from root@192.168.222.247(192.168.222.247:22) to root@192.168.222.248(192.168.222.248:22)..
Mon May 28 15:07:31 2018 - [debug]   ok.
Mon May 28 15:07:31 2018 - [debug]  Connecting via SSH from root@192.168.222.247(192.168.222.247:22) to root@192.168.222.249(192.168.222.249:22)..
Mon May 28 15:07:31 2018 - [debug]   ok.
Mon May 28 15:07:33 2018 - [debug]
Mon May 28 15:07:31 2018 - [debug]  Connecting via SSH from root@192.168.222.249(192.168.222.249:22) to root@192.168.222.245(192.168.222.245:22)..
Mon May 28 15:07:31 2018 - [debug]   ok.
Mon May 28 15:07:31 2018 - [debug]  Connecting via SSH from root@192.168.222.249(192.168.222.249:22) to root@192.168.222.246(192.168.222.246:22)..
Mon May 28 15:07:31 2018 - [debug]   ok.
Mon May 28 15:07:31 2018 - [debug]  Connecting via SSH from root@192.168.222.249(192.168.222.249:22) to root@192.168.222.247(192.168.222.247:22)..
Mon May 28 15:07:32 2018 - [debug]   ok.
Mon May 28 15:07:32 2018 - [debug]  Connecting via SSH from root@192.168.222.249(192.168.222.249:22) to root@192.168.222.248(192.168.222.248:22)..
Mon May 28 15:07:32 2018 - [debug]   ok.
Mon May 28 15:07:33 2018 - [info] All SSH connection tests passed successfully.
[root@node05 mha4mysql-manager-0.57]#

  • Check replication: masterha_check_repl --conf=/etc/masterha/app1.cnf
[root@node05 mha4mysql-manager-0.57]# masterha_check_repl --conf=/etc/masterha/app1.cnf
Mon May 28 16:04:12 2018 - [info] Reading default configuration from /etc/masterha_default.cnf..
Mon May 28 16:04:12 2018 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
Mon May 28 16:04:12 2018 - [info] Reading server configuration from /etc/masterha/app1.cnf..
Mon May 28 16:04:12 2018 - [info] MHA::MasterMonitor version 0.57.
Mon May 28 16:04:13 2018 - [info] GTID failover mode = 1
Mon May 28 16:04:13 2018 - [info] Dead Servers:
Mon May 28 16:04:13 2018 - [info] Alive Servers:
Mon May 28 16:04:13 2018 - [info]   192.168.222.245(192.168.222.245:3306)
Mon May 28 16:04:13 2018 - [info]   192.168.222.246(192.168.222.246:3306)
Mon May 28 16:04:13 2018 - [info]   192.168.222.247(192.168.222.247:3306)
Mon May 28 16:04:13 2018 - [info]   192.168.222.248(192.168.222.248:3306)
Mon May 28 16:04:13 2018 - [info]   192.168.222.249(192.168.222.249:3306)
Mon May 28 16:04:13 2018 - [info] Alive Slaves:
Mon May 28 16:04:13 2018 - [info]   192.168.222.246(192.168.222.246:3306)  Version=5.7.18-log (oldest major version between slaves) log-bin:enabled
Mon May 28 16:04:13 2018 - [info]     GTID ON
Mon May 28 16:04:13 2018 - [info]     Replicating from 192.168.222.245(192.168.222.245:3306)
Mon May 28 16:04:13 2018 - [info]     Primary candidate for the new Master (candidate_master is set)
Mon May 28 16:04:13 2018 - [info]   192.168.222.247(192.168.222.247:3306)  Version=5.7.18-log (oldest major version between slaves) log-bin:enabled
Mon May 28 16:04:13 2018 - [info]     GTID ON
Mon May 28 16:04:13 2018 - [info]     Replicating from 192.168.222.245(192.168.222.245:3306)
Mon May 28 16:04:13 2018 - [info]   192.168.222.248(192.168.222.248:3306)  Version=5.7.18-log (oldest major version between slaves) log-bin:enabled
Mon May 28 16:04:13 2018 - [info]     GTID ON
Mon May 28 16:04:13 2018 - [info]     Replicating from 192.168.222.245(192.168.222.245:3306)
Mon May 28 16:04:13 2018 - [info]   192.168.222.249(192.168.222.249:3306)  Version=5.7.18-log (oldest major version between slaves) log-bin:enabled
Mon May 28 16:04:13 2018 - [info]     GTID ON
Mon May 28 16:04:13 2018 - [info]     Replicating from 192.168.222.245(192.168.222.245:3306)
Mon May 28 16:04:13 2018 - [info] Current Alive Master: 192.168.222.245(192.168.222.245:3306)
Mon May 28 16:04:13 2018 - [info] Checking slave configurations..
Mon May 28 16:04:13 2018 - [info]  read_only=1 is not set on slave 192.168.222.246(192.168.222.246:3306).
Mon May 28 16:04:13 2018 - [info]  read_only=1 is not set on slave 192.168.222.247(192.168.222.247:3306).
Mon May 28 16:04:13 2018 - [info]  read_only=1 is not set on slave 192.168.222.248(192.168.222.248:3306).
Mon May 28 16:04:13 2018 - [info]  read_only=1 is not set on slave 192.168.222.249(192.168.222.249:3306).
Mon May 28 16:04:13 2018 - [info] Checking replication filtering settings..
Mon May 28 16:04:13 2018 - [info]  binlog_do_db= , binlog_ignore_db=
Mon May 28 16:04:13 2018 - [info]  Replication filtering check ok.
Mon May 28 16:04:13 2018 - [info] GTID (with auto-pos) is supported. Skipping all SSH and Node package checking.
Mon May 28 16:04:13 2018 - [info] Checking SSH publickey authentication settings on the current master..
Mon May 28 16:04:13 2018 - [info] HealthCheck: SSH to 192.168.222.245 is reachable.
Mon May 28 16:04:13 2018 - [info]
192.168.222.245(192.168.222.245:3306) (current master)
 +--192.168.222.246(192.168.222.246:3306)
 +--192.168.222.247(192.168.222.247:3306)
 +--192.168.222.248(192.168.222.248:3306)
 +--192.168.222.249(192.168.222.249:3306)

Mon May 28 16:04:13 2018 - [info] Checking replication health on 192.168.222.246..
Mon May 28 16:04:13 2018 - [info]  ok.
Mon May 28 16:04:13 2018 - [info] Checking replication health on 192.168.222.247..
Mon May 28 16:04:13 2018 - [info]  ok.
Mon May 28 16:04:13 2018 - [info] Checking replication health on 192.168.222.248..
Mon May 28 16:04:13 2018 - [info]  ok.
Mon May 28 16:04:13 2018 - [info] Checking replication health on 192.168.222.249..
Mon May 28 16:04:13 2018 - [info]  ok.
Mon May 28 16:04:13 2018 - [info] Checking master_ip_failover_script status:
Mon May 28 16:04:13 2018 - [info]   /usr/local/bin/master_ip_failover --command=status --ssh_user=root --orig_master_host=192.168.222.245 --orig_master_ip=192.168.222.245 --orig_master_port=3306
Mon May 28 16:04:13 2018 - [info]  OK.
Mon May 28 16:04:13 2018 - [warning] shutdown_script is not defined.
Mon May 28 16:04:13 2018 - [info] Got exit code 0 (Not master dead).

MySQL Replication Health is OK.
[root@node05 mha4mysql-manager-0.57]#

  • Check the MHA status: masterha_check_status --conf=/etc/masterha/app1.cnf
[root@node05 mha4mysql-manager-0.57]# masterha_check_status --conf=/etc/masterha/app1.cnf
app1 is stopped(2:NOT_RUNNING).

3. Verify the MHA Manager configuration

  • session A
[root@node05 ~]# mkdir -p /var/log/masterha/app1/
  • session B
[root@node05 mha4mysql-manager-0.57]# nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/masterha/app1/manager.log 2>&1 &
[1] 19246

  • session A
[root@node05 ~]# tail -f /var/log/masterha/app1/manager.log
Mon May 28 18:01:42 2018 - [info] Reading default configuration from /etc/masterha_default.cnf..
Mon May 28 18:01:42 2018 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
Mon May 28 18:01:42 2018 - [info] Reading server configuration from /etc/masterha/app1.cnf..
Mon May 28 18:01:42 2018 - [info] MHA::MasterMonitor version 0.57.
Mon May 28 18:01:43 2018 - [info] GTID failover mode = 1
Mon May 28 18:01:43 2018 - [info] Dead Servers:
Mon May 28 18:01:43 2018 - [info] Alive Servers:
Mon May 28 18:01:43 2018 - [info]   192.168.222.245(192.168.222.245:3306)
Mon May 28 18:01:43 2018 - [info]   192.168.222.246(192.168.222.246:3306)
Mon May 28 18:01:43 2018 - [info]   192.168.222.247(192.168.222.247:3306)
Mon May 28 18:01:43 2018 - [info]   192.168.222.248(192.168.222.248:3306)
Mon May 28 18:01:43 2018 - [info]   192.168.222.249(192.168.222.249:3306)
Mon May 28 18:01:43 2018 - [info] Alive Slaves:
Mon May 28 18:01:43 2018 - [info]   192.168.222.246(192.168.222.246:3306)  Version=5.7.18-log (oldest major version between slaves) log-bin:enabled
Mon May 28 18:01:43 2018 - [info]     GTID ON
Mon May 28 18:01:43 2018 - [info]     Replicating from 192.168.222.245(192.168.222.245:3306)
Mon May 28 18:01:43 2018 - [info]   192.168.222.247(192.168.222.247:3306)  Version=5.7.18-log (oldest major version between slaves) log-bin:enabled
Mon May 28 18:01:43 2018 - [info]     GTID ON
Mon May 28 18:01:43 2018 - [info]     Replicating from 192.168.222.245(192.168.222.245:3306)
Mon May 28 18:01:43 2018 - [info]     Primary candidate for the new Master (candidate_master is set)
Mon May 28 18:01:43 2018 - [info]   192.168.222.248(192.168.222.248:3306)  Version=5.7.18-log (oldest major version between slaves) log-bin:enabled
Mon May 28 18:01:43 2018 - [info]     GTID ON
Mon May 28 18:01:43 2018 - [info]     Replicating from 192.168.222.245(192.168.222.245:3306)
Mon May 28 18:01:43 2018 - [info]   192.168.222.249(192.168.222.249:3306)  Version=5.7.18-log (oldest major version between slaves) log-bin:enabled
Mon May 28 18:01:43 2018 - [info]     GTID ON
Mon May 28 18:01:43 2018 - [info]     Replicating from 192.168.222.245(192.168.222.245:3306)
Mon May 28 18:01:43 2018 - [info] Current Alive Master: 192.168.222.245(192.168.222.245:3306)
Mon May 28 18:01:43 2018 - [info] Checking slave configurations..
Mon May 28 18:01:43 2018 - [info]  read_only=1 is not set on slave 192.168.222.246(192.168.222.246:3306).
Mon May 28 18:01:43 2018 - [info]  read_only=1 is not set on slave 192.168.222.247(192.168.222.247:3306).
Mon May 28 18:01:43 2018 - [info]  read_only=1 is not set on slave 192.168.222.248(192.168.222.248:3306).
Mon May 28 18:01:43 2018 - [info]  read_only=1 is not set on slave 192.168.222.249(192.168.222.249:3306).
Mon May 28 18:01:43 2018 - [info] Checking replication filtering settings..
Mon May 28 18:01:43 2018 - [info]  binlog_do_db= , binlog_ignore_db=
Mon May 28 18:01:43 2018 - [info]  Replication filtering check ok.
Mon May 28 18:01:43 2018 - [info] GTID (with auto-pos) is supported. Skipping all SSH and Node package checking.
Mon May 28 18:01:43 2018 - [info] Checking SSH publickey authentication settings on the current master..
Mon May 28 18:01:43 2018 - [info] HealthCheck: SSH to 192.168.222.245 is reachable.
Mon May 28 18:01:43 2018 - [info]
192.168.222.245(192.168.222.245:3306) (current master)
 +--192.168.222.246(192.168.222.246:3306)
 +--192.168.222.247(192.168.222.247:3306)
 +--192.168.222.248(192.168.222.248:3306)
 +--192.168.222.249(192.168.222.249:3306)

Mon May 28 18:01:43 2018 - [info] Checking master_ip_failover_script status:
Mon May 28 18:01:43 2018 - [info]   /usr/local/bin/master_ip_failover --command=status --ssh_user=root --orig_master_host=192.168.222.245 --orig_master_ip=192.168.222.245 --orig_master_port=3306


IN SCRIPT TEST====/sbin/ifconfig ens256:1 down==/sbin/ifconfig ens256:1 192.168.222.99/24===

Checking the Status of the script.. OK
Mon May 28 18:01:43 2018 - [info]  OK.
Mon May 28 18:01:43 2018 - [warning] shutdown_script is not defined.
Mon May 28 18:01:43 2018 - [info] Set master ping interval 3 seconds.
Mon May 28 18:01:43 2018 - [info] Set secondary check script: masterha_secondary_check -s 192.168.222.247 -s 192.168.222.245
Mon May 28 18:01:43 2018 - [info] Starting ping health check on 192.168.222.245(192.168.222.245:3306)..
Mon May 28 18:01:43 2018 - [info] Ping(SELECT) succeeded, waiting until MySQL doesn't respond..   -- monitoring has started
  • session B
[root@node05 mha4mysql-manager-0.57]# masterha_check_status --conf=/etc/masterha/app1.cnf
app1 (pid:1840) is running(0:PING_OK), master:192.168.222.245  -- the master is 192.168.222.245
[root@node05 mha4mysql-manager-0.57]#
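
For monitoring it can be handy to turn the masterha_check_status line into structured data. The sketch below parses the two line shapes shown in this section (stopped and running). The regular expression and the parse_status helper are assumptions based only on those two lines, so other states may need adjustments.

```python
import re

STATUS_RE = re.compile(
    r"^(?P<app>\S+) (?:\(pid:(?P<pid>\d+)\) )?is (?P<state>\w+)"
    r"\((?P<code>\d+):(?P<label>\w+)\)"
    r"(?:, master:(?P<master>\S+))?"
)


def parse_status(line):
    """Parse one masterha_check_status output line into a dict, or return None."""
    match = STATUS_RE.match(line.strip())
    if match is None:
        return None
    info = match.groupdict()
    info["code"] = int(info["code"])
    if info["pid"] is not None:
        info["pid"] = int(info["pid"])
    return info


stopped = parse_status("app1 is stopped(2:NOT_RUNNING).")
running = parse_status("app1 (pid:1840) is running(0:PING_OK), master:192.168.222.245")
print(stopped["label"], running["label"], running["master"])
```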

VI. Deploy load balancing

1. Install keepalived and LVS

yum install keepalived ipvsadm -y

2. Create the slave check script (check_slave.py)

  • The script must be created on both hosts
[root@redis01 mybak]# cat /etc/keepalived/check_slave.py
#!/usr/bin/env python
# encoding: utf-8
# Health check for keepalived MISC_CHECK.
# Usage: check_slave.py <ip> <port>
# Exit code 0 means the slave is healthy, 1 means it should be taken out.
import sys

import MySQLdb

ip = sys.argv[1]
port = int(sys.argv[2])
user = 'repl'
pwd = 'repl'
sbm = 200  # maximum tolerated Seconds_Behind_Master

Slave_IO_Running = ''
Slave_SQL_Running = ''
Seconds_Behind_Master = None

try:
    conn = MySQLdb.connect(host=ip, user=user, passwd=pwd, port=port, charset='utf8')
    cur = conn.cursor()
    cur.execute('show slave status')
    db_info = cur.fetchall()
    cur.close()
    conn.close()
except MySQLdb.Error as e:
    print("MySQLdb Error: %s" % e)
    sys.exit(1)  # db err

if not db_info:
    sys.exit(1)  # slave err: the server is not configured as a slave

# Column positions in the SHOW SLAVE STATUS result
for n in db_info:
    Slave_IO_Running = n[10]
    Slave_SQL_Running = n[11]
    Seconds_Behind_Master = n[32]

if Slave_IO_Running == "No" or Slave_SQL_Running == "No":
    sys.exit(1)  # thread err

# Seconds_Behind_Master is NULL (None) when the lag is unknown; treat that as a failure
if Seconds_Behind_Master is None or Seconds_Behind_Master > sbm:
    sys.exit(1)  # timeout err

sys.exit(0)  # OK
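
The decision the check script makes can be separated from the database access, which makes it easy to try out without a MySQL server. The sketch below is one way to express it; slave_is_healthy is a hypothetical helper, the 200-second threshold is the sbm value from the script, and treating an unknown lag (NULL) as unhealthy is an assumption of this sketch.

```python
from typing import Optional


def slave_is_healthy(
    io_running: str,
    sql_running: str,
    seconds_behind: Optional[int],
    max_lag: int = 200,
) -> bool:
    """Decide whether a slave should stay in the LVS pool."""
    if io_running == "No" or sql_running == "No":
        return False  # a replication thread is stopped
    if seconds_behind is None:
        return False  # lag unknown (NULL in SHOW SLAVE STATUS)
    return seconds_behind <= max_lag


def exit_code(healthy: bool) -> int:
    """keepalived MISC_CHECK treats exit code 0 as success and 1 as failure."""
    return 0 if healthy else 1


print(exit_code(slave_is_healthy("Yes", "Yes", 0)))
print(exit_code(slave_is_healthy("Yes", "No", 0)))
print(exit_code(slave_is_healthy("Yes", "Yes", 350)))
```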

3. Edit the configuration file (keepalived.conf)

  • Edit keepalived.conf on host redis01
[root@redis01 mybak]# vim  /etc/keepalived/keepalived.conf
! Configuration File for keepalived

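global_defs {
    notification_email {
        gczheng@139.com     # alert recipient address; several may be listed, one per line (requires the local sendmail service)

    }
    notification_email_from guanchang26@163.com  # sender address for alert mail
    smtp_server 127.0.0.1      # SMTP server address
    smtp_connect_timeout 30    # timeout, in seconds, for connecting to the SMTP server
    router_id MySQL_HA_222     # identifier of this keepalived host, shown in the mail subject; keep it the same in both configuration files
}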


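# db Read

vrrp_instance VI_1 {
    state MASTER              # keepalived role: MASTER marks this host as the primary, BACKUP as the standby
    interface ens256          # network interface used for HA monitoring
    virtual_router_id 51      # virtual router ID (a number); MASTER and BACKUP of the same vrrp_instance must use the same value
    priority 100              # priority, higher wins; within one vrrp_instance the MASTER's priority should be higher than the BACKUP's
    advert_int 1              # interval, in seconds, between the VRRP advertisements exchanged by MASTER and BACKUP
    authentication {          # authentication type and password
        auth_type PASS        # authentication type, PASS or AH
        auth_pass 1111        # password; MASTER and BACKUP of the same vrrp_instance must share it to communicate
    }
    virtual_ipaddress {       # virtual IP addresses; several may be listed, one per line
        192.168.222.100
    }
}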

# db_read_vip
virtual_server 192.168.222.100 3306 {  # virtual server: virtual IP and service port, separated by a space
    delay_loop 1              # health-check interval in seconds
    lb_algo rr                # scheduling algorithm; rr is round robin
    lb_kind DR                # LVS forwarding mode: NAT, TUN or DR
    nat_mask 255.255.255.0
    protocol TCP              # forwarded protocol, TCP or UDP

    real_server 192.168.222.246 3306 { # real server 1: its real IP and port, separated by a space
        weight 1              # weight of this node; a larger number means a higher weight, so faster servers
                              # can be given a larger share of the load than slower ones
        TCP_CHECK {           # TCP health check for the real server; times are in seconds
            connect_timeout 3    # give up after 3 seconds without a response
            nb_get_retry 3       # number of retries
            delay_before_retry 3 # delay between retries
            connect_port 3306
        }
        MISC_CHECK {
            # the check must target this real server itself (192.168.222.245 is the master, not a slave)
            misc_path "/etc/keepalived/check_slave.py 192.168.222.246 3306"
            misc_dynamic
        }
    }
     # real_server 192.168.222.247 3306 {
     # weight 1
        # TCP_CHECK {
        # connect_timeout 3
        # nb_get_retry 3
        # delay_before_retry 3
        # connect_port 3306
        # }
        # MISC_CHECK {
        # misc_path "/etc/keepalived/check_slave.py 192.168.222.247 3306"
        # misc_dynamic
        # }
    # }
    real_server 192.168.222.248 3306 {
        weight 1
        TCP_CHECK {
            connect_timeout 3
            nb_get_retry 3
            delay_before_retry 3
            connect_port 3306
            }
            MISC_CHECK {
            misc_path "/etc/keepalived/check_slave.py 192.168.222.248 3306"
            misc_dynamic
        }
    }
    real_server 192.168.222.249 3306 {
        weight 1
        TCP_CHECK {
            connect_timeout 3
            nb_get_retry 3
            delay_before_retry 3
            connect_port 3306
            }
            MISC_CHECK {
            misc_path "/etc/keepalived/check_slave.py 192.168.222.249 3306"
            misc_dynamic
        }
    }
}

[root@redis01 mybak]# systemctl start keepalived  -- start the keepalived service
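
The real_server blocks differ only in the IP address, and the address passed to check_slave.py has to match the real server it sits under. One way to keep the two in step is to render the blocks from a list. This is a sketch only; REAL_SERVER_TEMPLATE and render_real_servers are hypothetical names, and the option values are copied from the configuration above.

```python
REAL_SERVER_TEMPLATE = """\
    real_server {ip} {port} {{
        weight 1
        TCP_CHECK {{
            connect_timeout 3
            nb_get_retry 3
            delay_before_retry 3
            connect_port {port}
        }}
        MISC_CHECK {{
            misc_path "/etc/keepalived/check_slave.py {ip} {port}"
            misc_dynamic
        }}
    }}
"""


def render_real_servers(ips, port=3306):
    """Render one real_server block per slave, each checking its own address."""
    return "".join(REAL_SERVER_TEMPLATE.format(ip=ip, port=port) for ip in ips)


READ_SLAVES = ["192.168.222.246", "192.168.222.248", "192.168.222.249"]
print(render_real_servers(READ_SLAVES))
```

The output can be pasted inside the virtual_server block; the same list then serves both redis01 and redis02.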
  • Edit keepalived.conf on host redis02
[root@redis02 ~]# cat /etc/keepalived/keepalived.conf
! Configuration File for keepalived

global_defs {
    notification_email {
        gczheng@139.com     # alert recipient address; several may be listed, one per line (requires the local sendmail service)

    }
    notification_email_from guanchang26@163.com  # sender address for alert mail
    smtp_server 127.0.0.1      # SMTP server address
    smtp_connect_timeout 30    # timeout, in seconds, for connecting to the SMTP server
    router_id MySQL_HA_222     # identifier of this keepalived host, shown in the mail subject
}


# db Read

vrrp_instance VI_1 {
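    state BACKUP              #role of this keepalived node: MASTER marks the primary server, BACKUP marks the standby
    interface ens256          #network interface used for HA monitoring
    virtual_router_id 51      #virtual router ID, a number unique to each VRRP instance; MASTER and BACKUP of the same vrrp_instance must use the same value
    priority 100              #priority; a larger number means a higher priority. Within one vrrp_instance the MASTER's priority must be higher than the BACKUP's
    advert_int 1              #interval, in seconds, between the synchronization checks (VRRP advertisements) exchanged by the MASTER and BACKUP load balancers
    authentication {          #authentication type and password
        auth_type PASS        #authentication type; PASS and AH are available
        auth_pass 1111        #password; MASTER and BACKUP of the same vrrp_instance must use the same password to communicate
    }
    virtual_ipaddress {       #virtual IP addresses; several may be listed, one per line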
        192.168.222.100
    }
}

# db_read_vip
virtual_server 192.168.222.100 3306 {  #virtual server: the virtual IP address and service port, separated by a space
    delay_loop 1              #health-check interval, in seconds
    lb_algo rr                #load-balancing scheduler; rr is round robin
    lb_kind DR                #LVS forwarding method; NAT, TUN, and DR are available
    nat_mask 255.255.255.0
    protocol TCP              #forwarding protocol; TCP or UDP

    real_server 192.168.222.246 3306 { #real server 1: its real IP address and port, separated by a space
        weight 1              #weight of this real server; a larger number means a higher weight, so a more capable server
                              #can be given a higher weight and a less capable one a lower weight, which spreads the load sensibly
        TCP_CHECK {           #health-check settings for the real server; times are in seconds
            connect_timeout 3    #time out after 3 seconds without a response
            nb_get_retry 3       #number of retries
            delay_before_retry 3 #interval between retries
            connect_port 3306
        }
        MISC_CHECK {
            misc_path "/etc/keepalived/check_slave.py 192.168.222.245 3306"
            misc_dynamic
        }
    }
     # real_server 192.168.222.247 3306 {
     # weight 1
        # TCP_CHECK {
        # connect_timeout 3
        # nb_get_retry 3
        # delay_before_retry 3
        # connect_port 3306
        # }
        # MISC_CHECK {
        # misc_path "/etc/keepalived/check_slave.py 192.168.222.247 3306"
        # misc_dynamic
        # }
    # }
    real_server 192.168.222.248 3306 {
        weight 1
        TCP_CHECK {
            connect_timeout 3
            nb_get_retry 3
            delay_before_retry 3
            connect_port 3306
        }
        MISC_CHECK {
            misc_path "/etc/keepalived/check_slave.py 192.168.222.248 3306"
            misc_dynamic
        }
    }
    real_server 192.168.222.249 3306 {
        weight 1
        TCP_CHECK {
            connect_timeout 3
            nb_get_retry 3
            delay_before_retry 3
            connect_port 3306
        }
        MISC_CHECK {
            misc_path "/etc/keepalived/check_slave.py 192.168.222.249 3306"
            misc_dynamic
        }
    }
}
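
A small consistency check can help catch copy-and-paste slips in a file like the one above. The Python sketch below scans keepalived configuration text and reports every real_server whose MISC_CHECK script is pointed at a different IP address. The function name misc_check_mismatches is hypothetical, and the sketch assumes that the first argument inside misc_path is the host to check, as in the check_slave.py calls above; it is not part of keepalived itself.

```python
import re


def misc_check_mismatches(conf_text):
    """Return (real_server_ip, checked_ip) pairs where MISC_CHECK targets another host."""
    mismatches = []
    current_ip = None
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and commented-out blocks
        if not line:
            continue
        server = re.match(r"real_server\s+(\S+)\s+\d+", line)
        if server:
            current_ip = server.group(1)
            continue
        # Assumption: misc_path looks like "<script> <ip> <port>".
        check = re.match(r'misc_path\s+"\S+\s+(\S+)\s+\d+"', line)
        if check and current_ip and check.group(1) != current_ip:
            mismatches.append((current_ip, check.group(1)))
    return mismatches


# Trimmed sample using two of the real_server entries from the file above.
sample = '''
real_server 192.168.222.246 3306 {
    MISC_CHECK {
        misc_path "/etc/keepalived/check_slave.py 192.168.222.245 3306"
    }
}
real_server 192.168.222.248 3306 {
    MISC_CHECK {
        misc_path "/etc/keepalived/check_slave.py 192.168.222.248 3306"
    }
}
'''
print(misc_check_mismatches(sample))
```

On this file the sketch flags the 192.168.222.246 entry, whose check script is given 192.168.222.245. The text does not say whether that is intentional, so treat the result as a prompt to review the entry rather than as proof of an error.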

[root@redis02 ~]# systemctl stop keepalived
[root@redis02 ~]# systemctl start keepalived
[root@redis02 ~]# systemctl status keepalived
● keepalived.service - LVS and VRRP High Availability Monitor
   Loaded: loaded (/usr/lib/systemd/system/keepalived.service; disabled; vendor preset: disabled)
   Active: active (running) since Mon 2018-05-28 16:59:21 CST; 5s ago
  Process: 4525 ExecStart=/usr/sbin/keepalived $KEEPALIVED_OPTIONS (code=exited, status=0/SUCCESS)
 Main PID: 4526 (keepalived)
   CGroup: /system.slice/keepalived.service
           ├─4526 /usr/sbin/keepalived -D
           ├─4527 /usr/sbin/keepalived -D
           ├─4528 /usr/sbin/keepalived -D
           ├─4557 /usr/sbin/keepalived -D
           └─4558 python /etc/keepalived/check_slave.py 192.168.222.245 3306

May 28 16:59:23 redis02.db.com Keepalived_healthcheckers[4527]: Activating healthchecker for service [192.168.222.100]:3306
May 28 16:59:23 redis02.db.com Keepalived_healthcheckers[4527]: Activating healthchecker for service [192.168.222.100]:3306
May 28 16:59:23 redis02.db.com Keepalived_healthcheckers[4527]: Activating healthchecker for service [192.168.222.100]:3306
May 28 16:59:23 redis02.db.com Keepalived_healthcheckers[4527]: Activating healthchecker for service [192.168.222.100]:3306
May 28 16:59:25 redis02.db.com Keepalived_healthcheckers[4527]: pid 4538 exited with status 1
May 28 16:59:25 redis02.db.com Keepalived_healthcheckers[4527]: Misc check to [192.168.222.246] for [/etc/keepalived/check_slave.py 192.168.222.245 3306] failed.
May 28 16:59:25 redis02.db.com Keepalived_healthcheckers[4527]: Removing service [192.168.222.246]:3306 from VS [192.168.222.100]:3306
May 28 16:59:25 redis02.db.com Keepalived_healthcheckers[4527]: Remote SMTP server [127.0.0.1]:25 connected.
May 28 16:59:25 redis02.db.com Keepalived_healthcheckers[4527]: SMTP alert successfully sent.
May 28 16:59:26 redis02.db.com Keepalived_healthcheckers[4527]: pid 4549 exited with status 1
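
The "exited with status 1" lines above are how keepalived learns that a MISC_CHECK failed. With misc_dynamic set, the keepalived.conf man page describes the convention as: exit status 0 means healthy with the weight unchanged, 1 means failed, and 2 to 255 means healthy with the weight set to the exit status minus 2. The Python sketch below shows one way a check script might map replication state to such an exit status. It is not the check_slave.py used in this setup; the function name, the lag threshold, and the weight policy are all assumptions made for illustration.

```python
def exit_code_for(io_running, sql_running, seconds_behind, max_lag=60, base_weight=3):
    """Map replication state to a MISC_CHECK exit status (misc_dynamic convention)."""
    # A stopped replication thread or unknown lag: take the slave out of the pool.
    if io_running != "Yes" or sql_running != "Yes" or seconds_behind is None:
        return 1
    # Too far behind the master: also report failure.
    if seconds_behind > max_lag:
        return 1
    # Healthy: lower the weight as lag grows, but never below 1.
    step = max(1, max_lag // base_weight)
    weight = max(1, base_weight - seconds_behind // step)
    # Exit status N in 2..255 asks keepalived to set the weight to N - 2.
    return weight + 2


# Hypothetical inputs; a real script would read them from SHOW SLAVE STATUS
# and finish with sys.exit(exit_code_for(...)).
print(exit_code_for("Yes", "Yes", 0))   # 5: healthy, weight 3
print(exit_code_for("Yes", "Yes", 45))  # 3: healthy but lagging, weight 1
print(exit_code_for("Yes", "No", 0))    # 1: SQL thread stopped, check fails
```

Keeping the decision in a pure function makes the policy easy to test without a database connection.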

4. Bind the VIPs

  • Bind the write VIP on the master and spare-master nodes: 192.168.222.99 is bound to the ens256 NIC on the 222 subnet
[root@node01 scripts]# cat realserver_master_vip.sh
#!/bin/bash
#description: Config realserver

VIP=192.168.222.99
 
#/etc/rc.d/init.d/functions

case "$1" in
start)
       /sbin/ifconfig ens256:1 $VIP netmask 255.255.255.255 broadcast $VIP
       /sbin/route add -host $VIP dev ens256:1
       echo "1" >/proc/sys/net/ipv4/conf/lo/arp_ignore
       echo "2" >/proc/sys/net/ipv4/conf/lo/arp_announce
       echo "1" >/proc/sys/net/ipv4/conf/all/arp_ignore
       echo "2" >/proc/sys/net/ipv4/conf/all/arp_announce
       sysctl -p >/dev/null 2>&1
       echo "RealServer Start OK"
       ;;
stop)
       /sbin/ifconfig ens256:1 down
       /sbin/route del $VIP >/dev/null 2>&1
       echo "0" >/proc/sys/net/ipv4/conf/lo/arp_ignore
       echo "0" >/proc/sys/net/ipv4/conf/lo/arp_announce
       echo "0" >/proc/sys/net/ipv4/conf/all/arp_ignore
       echo "0" >/proc/sys/net/ipv4/conf/all/arp_announce
       echo "RealServer Stopped"
       ;;
*)
       echo "Usage: $0 {start|stop}"
       exit 1
esac

exit 0

[root@node01 scripts]# sh realserver_master_vip.sh start  --bind the VIP

[root@node01 scripts]# ifconfig
ens192: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        ether 00:50:56:9d:e7:22  txqueuelen 1000  (Ethernet)
        RX packets 10052  bytes 2431608 (2.3 MiB)
        RX errors 0  dropped 84  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

ens224: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.49.245  netmask 255.255.255.0  broadcast 192.168.49.255
        inet6 fe80::250:56ff:fe9d:16ce  prefixlen 64  scopeid 0x20<link>
        ether 00:50:56:9d:16:ce  txqueuelen 1000  (Ethernet)
        RX packets 9263  bytes 626840 (612.1 KiB)
        RX errors 0  dropped 5409  overruns 0  frame 0
        TX packets 1958  bytes 165412 (161.5 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

ens256: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.222.245  netmask 255.255.255.0  broadcast 192.168.222.255
        inet6 fe80::250:56ff:fe9d:57dd  prefixlen 64  scopeid 0x20<link>
        ether 00:50:56:9d:57:dd  txqueuelen 1000  (Ethernet)
        RX packets 7225  bytes 908959 (887.6 KiB)
        RX errors 0  dropped 61  overruns 0  frame 0
        TX packets 6008  bytes 655756 (640.3 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

ens256:1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500           --bound successfully
        inet 192.168.222.99  netmask 255.255.255.255  broadcast 192.168.222.99
        ether 00:50:56:9d:57:dd  txqueuelen 1000  (Ethernet)

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1  (Local Loopback)
        RX packets 216  bytes 24002 (23.4 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 216  bytes 24002 (23.4 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

  • Bind the read VIP on the slave nodes (node03, node04): 192.168.222.100 is bound to the lo interface
[root@node03 ~]# cat /scripts/realserver_slave_vip.sh
#!/bin/bash
#description: Config realserver

VIP=192.168.222.100

. /etc/rc.d/init.d/functions

case "$1" in
start)
       /sbin/ifconfig lo:0 $VIP netmask 255.255.255.255 broadcast $VIP
       /sbin/route add -host $VIP dev lo:0
       echo "1" >/proc/sys/net/ipv4/conf/lo/arp_ignore
       echo "2" >/proc/sys/net/ipv4/conf/lo/arp_announce
       echo "1" >/proc/sys/net/ipv4/conf/all/arp_ignore
       echo "2" >/proc/sys/net/ipv4/conf/all/arp_announce
       sysctl -p >/dev/null 2>&1
       echo "RealServer Start OK"
       ;;
stop)
       /sbin/ifconfig lo:0 down
       /sbin/route del $VIP >/dev/null 2>&1
       echo "0" >/proc/sys/net/ipv4/conf/lo/arp_ignore
       echo "0" >/proc/sys/net/ipv4/conf/lo/arp_announce
       echo "0" >/proc/sys/net/ipv4/conf/all/arp_ignore
       echo "0" >/proc/sys/net/ipv4/conf/all/arp_announce
       echo "RealServer Stopped"
       ;;
*)
       echo "Usage: $0 {start|stop}"
       exit 1
esac

exit 0
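
After running the script with "start", it can be useful to confirm that the four ARP kernel parameters really hold the values the script writes. The Python sketch below reads them back from /proc and lists any that differ. The function name arp_setting_errors and the base parameter (which allows pointing the check at another directory) are assumptions for illustration.

```python
import os

# Values the realserver script above writes on "start".
EXPECTED = {
    ("lo", "arp_ignore"): "1",
    ("lo", "arp_announce"): "2",
    ("all", "arp_ignore"): "1",
    ("all", "arp_announce"): "2",
}


def arp_setting_errors(base="/proc/sys/net/ipv4/conf"):
    """Return a list of human-readable problems; an empty list means all values match."""
    errors = []
    for (iface, key), want in sorted(EXPECTED.items()):
        path = os.path.join(base, iface, key)
        try:
            with open(path) as handle:
                got = handle.read().strip()
        except OSError as exc:
            errors.append(f"{path}: unreadable ({exc.strerror})")
            continue
        if got != want:
            errors.append(f"{path}: expected {want}, found {got}")
    return errors


for problem in arp_setting_errors():
    print(problem)
```

No output means the real server is configured as the script intends; each printed line names a parameter that still needs attention.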

[root@node03 ~]# sh /scripts/realserver_slave_vip.sh start --bind the VIP
[root@node03 ~]# ifconfig
ens192: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        ether 00:50:56:9d:e5:33  txqueuelen 1000  (Ethernet)
        RX packets 9135  bytes 2101074 (2.0 MiB)
        RX errors 0  dropped 21  overruns 0  frame 0
        TX packets 2918  bytes 516420 (504.3 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

ens224: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.49.248  netmask 255.255.255.0  broadcast 192.168.49.255
        inet6 fe80::250:56ff:fe9d:d918  prefixlen 64  scopeid 0x20<link>
        ether 00:50:56:9d:d9:18  txqueuelen 1000  (Ethernet)
        RX packets 8286  bytes 531170 (518.7 KiB)
        RX errors 0  dropped 5533  overruns 0  frame 0
        TX packets 1140  bytes 81700 (79.7 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

ens256: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.222.248  netmask 255.255.255.0  broadcast 192.168.222.255
        inet6 fe80::250:56ff:fe9d:3c15  prefixlen 64  scopeid 0x20<link>
        ether 00:50:56:9d:3c:15  txqueuelen 1000  (Ethernet)
        RX packets 3587  bytes 698465 (682.0 KiB)
        RX errors 0  dropped 52  overruns 0  frame 0
        TX packets 2308  bytes 337351 (329.4 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1  (Local Loopback)
        RX packets 24  bytes 2808 (2.7 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 24  bytes 2808 (2.7 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lo:0: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 192.168.222.100  netmask 255.255.255.255
        loop  txqueuelen 1  (Local Loopback)
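
Reading through ifconfig output by eye works for one host, but it is easy to script. The Python sketch below takes ifconfig text and returns the interface or alias that carries a given address. The function name interface_for_ip is hypothetical, and the sample is a trimmed mix of lines from the node01 and node03 outputs above.

```python
import re


def interface_for_ip(ifconfig_text, ip):
    """Return the interface (or alias) name that carries `ip`, or None if it is absent."""
    current = None
    for line in ifconfig_text.splitlines():
        header = re.match(r"^(\S+): flags=", line)
        if header:
            current = header.group(1)
            continue
        # "inet " with a trailing space skips the inet6 lines.
        inet = re.match(r"^\s+inet (\S+)", line)
        if inet and inet.group(1) == ip:
            return current
    return None


sample = """\
ens256: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.222.245  netmask 255.255.255.0  broadcast 192.168.222.255
        inet6 fe80::250:56ff:fe9d:57dd  prefixlen 64  scopeid 0x20<link>
ens256:1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.222.99  netmask 255.255.255.255  broadcast 192.168.222.99
lo:0: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 192.168.222.100  netmask 255.255.255.255
"""
print(interface_for_ip(sample, "192.168.222.99"))   # ens256:1
print(interface_for_ip(sample, "192.168.222.100"))  # lo:0
```

In this setup the write VIP is expected on ens256:1 of the current master and the read VIP on lo:0 of each read slave.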

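5. Verify the bindings

1. Verify the write VIP

  • Verify the write VIP from the manager node. Writes use VIP migration, so a single node answers on the VIP at any time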
[root@node05 app1.log]# mysql -ugcdb -piforgot -h192.168.222.99 -e "show variables like 'hostname'";
mysql: [Warning] Using a password on the command line interface can be insecure.
+---------------+------------------+
| Variable_name | Value            |
+---------------+------------------+
| hostname      | node01.mysql.com |
+---------------+------------------+
[root@node05 app1.log]# mysql -ugcdb -piforgot -h192.168.222.99 -e "show variables like 'hostname'";
mysql: [Warning] Using a password on the command line interface can be insecure.
+---------------+------------------+
| Variable_name | Value            |
+---------------+------------------+
| hostname      | node01.mysql.com |
+---------------+------------------+
[root@node05 app1.log]# mysql -ugcdb -piforgot -h192.168.222.99 -e "show variables like 'hostname'";
mysql: [Warning] Using a password on the command line interface can be insecure.
+---------------+------------------+
| Variable_name | Value            |
+---------------+------------------+
| hostname      | node01.mysql.com |
+---------------+------------------+
[root@node05 app1.log]#
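
The repeated manual queries above can also be scripted, which is handy for watching the VIP while a failover is in progress. The Python sketch below wraps the mysql command-line client and returns the hostname that answers on the VIP. The function names are hypothetical. Passing the password through the MYSQL_PWD environment variable is an assumption made here to avoid the command-line warning; it is not how the commands above were run.

```python
import os
import subprocess


def hostname_query_cmd(vip, user, port=3306):
    """Build the mysql CLI call: -N drops the header row, -B prints plain batch output."""
    return ["mysql", f"--user={user}", f"--host={vip}", f"--port={port}",
            "--connect-timeout=2", "-N", "-B", "-e", "SELECT @@hostname"]


def current_writer(vip, user, password):
    """Return the hostname answering on the VIP, or None if the query fails."""
    env = dict(os.environ, MYSQL_PWD=password)
    try:
        result = subprocess.run(hostname_query_cmd(vip, user), env=env,
                                capture_output=True, text=True, timeout=5)
    except (OSError, subprocess.TimeoutExpired):
        return None
    return result.stdout.strip() if result.returncode == 0 else None


# Show the command that would be run against the write VIP.
print(" ".join(hostname_query_cmd("192.168.222.99", "gcdb")))
```

Calling current_writer("192.168.222.99", "gcdb", password) in a loop with a short sleep would show node01.mysql.com until the master is stopped, possibly None during the switch, and then the new master's hostname.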

  • stop master (node01)
[root@node01 ~]# systemctl stop mysql
  • Check the manager log
[root@node05 ~]#  tail -f /var/log/masterha/app1/manager.log
Mon May 28 18:11:34 2018 - [warning] Got error on MySQL select ping: 2013 (Lost connection to MySQL server during query)
Mon May 28 18:11:34 2018 - [info] Executing SSH check script: exit 0
Mon May 28 18:11:34 2018 - [info] Executing secondary network check script: masterha_secondary_check -s 192.168.222.247 -s 192.168.222.245  --user=root  --master_host=192.168.222.245  --master_ip=192.168.222.245  --master_port=3306 --master_user=gcdb --master_password=iforgot --ping_type=SELECT
Mon May 28 18:11:35 2018 - [info] HealthCheck: SSH to 192.168.222.245 is reachable.
Monitoring server 192.168.222.247 is reachable, Master is not reachable from 192.168.222.247. OK.
Monitoring server 192.168.222.245 is reachable, Master is not reachable from 192.168.222.245. OK.
Mon May 28 18:11:35 2018 - [info] Master is not reachable from all other monitoring servers. Failover should start.
Mon May 28 18:11:37 2018 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '192.168.222.245' (111))
Mon May 28 18:11:37 2018 - [warning] Connection failed 2 time(s)..
Mon May 28 18:11:40 2018 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '192.168.222.245' (111))
Mon May 28 18:11:40 2018 - [warning] Connection failed 3 time(s)..
Mon May 28 18:11:43 2018 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '192.168.222.245' (111))
Mon May 28 18:11:43 2018 - [warning] Connection failed 4 time(s)..
Mon May 28 18:11:43 2018 - [warning] Master is not reachable from health checker!
Mon May 28 18:11:43 2018 - [warning] Master 192.168.222.245(192.168.222.245:3306) is not reachable!
Mon May 28 18:11:43 2018 - [warning] SSH is reachable.
Mon May 28 18:11:43 2018 - [info] Connecting to a master server failed. Reading configuration file /etc/masterha_default.cnf and /etc/masterha/app1.cnf again, and trying to connect to all servers to check server status..
Mon May 28 18:11:43 2018 - [info] Reading default configuration from /etc/masterha_default.cnf..
Mon May 28 18:11:43 2018 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
Mon May 28 18:11:43 2018 - [info] Reading server configuration from /etc/masterha/app1.cnf..
Mon May 28 18:11:44 2018 - [info] GTID failover mode = 1
Mon May 28 18:11:44 2018 - [info] Dead Servers:
Mon May 28 18:11:44 2018 - [info]   192.168.222.245(192.168.222.245:3306)
Mon May 28 18:11:44 2018 - [info] Alive Servers:
Mon May 28 18:11:44 2018 - [info]   192.168.222.246(192.168.222.246:3306)
Mon May 28 18:11:44 2018 - [info]   192.168.222.247(192.168.222.247:3306)
Mon May 28 18:11:44 2018 - [info]   192.168.222.248(192.168.222.248:3306)
Mon May 28 18:11:44 2018 - [info]   192.168.222.249(192.168.222.249:3306)
Mon May 28 18:11:44 2018 - [info] Alive Slaves:
Mon May 28 18:11:44 2018 - [info]   192.168.222.246(192.168.222.246:3306)  Version=5.7.18-log (oldest major version between slaves) log-bin:enabled
Mon May 28 18:11:44 2018 - [info]     GTID ON
Mon May 28 18:11:44 2018 - [info]     Replicating from 192.168.222.245(192.168.222.245:3306)
Mon May 28 18:11:44 2018 - [info]   192.168.222.247(192.168.222.247:3306)  Version=5.7.18-log (oldest major version between slaves) log-bin:enabled
Mon May 28 18:11:44 2018 - [info]     GTID ON
Mon May 28 18:11:44 2018 - [info]     Replicating from 192.168.222.245(192.168.222.245:3306)
Mon May 28 18:11:44 2018 - [info]     Primary candidate for the new Master (candidate_master is set)
Mon May 28 18:11:44 2018 - [info]   192.168.222.248(192.168.222.248:3306)  Version=5.7.18-log (oldest major version between slaves) log-bin:enabled
Mon May 28 18:11:44 2018 - [info]     GTID ON
Mon May 28 18:11:44 2018 - [info]     Replicating from 192.168.222.245(192.168.222.245:3306)
Mon May 28 18:11:44 2018 - [info]   192.168.222.249(192.168.222.249:3306)  Version=5.7.18-log (oldest major version between slaves) log-bin:enabled
Mon May 28 18:11:44 2018 - [info]     GTID ON
Mon May 28 18:11:44 2018 - [info]     Replicating from 192.168.222.245(192.168.222.245:3306)
Mon May 28 18:11:44 2018 - [info] Checking slave configurations..
Mon May 28 18:11:44 2018 - [info]  read_only=1 is not set on slave 192.168.222.246(192.168.222.246:3306).
Mon May 28 18:11:44 2018 - [info]  read_only=1 is not set on slave 192.168.222.247(192.168.222.247:3306).
Mon May 28 18:11:44 2018 - [info]  read_only=1 is not set on slave 192.168.222.248(192.168.222.248:3306).
Mon May 28 18:11:44 2018 - [info]  read_only=1 is not set on slave 192.168.222.249(192.168.222.249:3306).
Mon May 28 18:11:44 2018 - [info] Checking replication filtering settings..
Mon May 28 18:11:44 2018 - [info]  Replication filtering check ok.
Mon May 28 18:11:44 2018 - [info] Master is down!
Mon May 28 18:11:44 2018 - [info] Terminating monitoring script.
Mon May 28 18:11:44 2018 - [info] Got exit code 20 (Master dead).
Mon May 28 18:11:44 2018 - [info] MHA::MasterFailover version 0.57.
Mon May 28 18:11:44 2018 - [info] Starting master failover.
Mon May 28 18:11:44 2018 - [info]
Mon May 28 18:11:44 2018 - [info] * Phase 1: Configuration Check Phase..
Mon May 28 18:11:44 2018 - [info]
Mon May 28 18:11:46 2018 - [info] GTID failover mode = 1
Mon May 28 18:11:46 2018 - [info] Dead Servers:
Mon May 28 18:11:46 2018 - [info]   192.168.222.245(192.168.222.245:3306)
Mon May 28 18:11:46 2018 - [info] Checking master reachability via MySQL(double check)...
Mon May 28 18:11:46 2018 - [info]  ok.
Mon May 28 18:11:46 2018 - [info] Alive Servers:
Mon May 28 18:11:46 2018 - [info]   192.168.222.246(192.168.222.246:3306)
Mon May 28 18:11:46 2018 - [info]   192.168.222.247(192.168.222.247:3306)
Mon May 28 18:11:46 2018 - [info]   192.168.222.248(192.168.222.248:3306)
Mon May 28 18:11:46 2018 - [info]   192.168.222.249(192.168.222.249:3306)
Mon May 28 18:11:46 2018 - [info] Alive Slaves:
Mon May 28 18:11:46 2018 - [info]   192.168.222.246(192.168.222.246:3306)  Version=5.7.18-log (oldest major version between slaves) log-bin:enabled
Mon May 28 18:11:46 2018 - [info]     GTID ON
Mon May 28 18:11:46 2018 - [info]     Replicating from 192.168.222.245(192.168.222.245:3306)
Mon May 28 18:11:46 2018 - [info]   192.168.222.247(192.168.222.247:3306)  Version=5.7.18-log (oldest major version between slaves) log-bin:enabled
Mon May 28 18:11:46 2018 - [info]     GTID ON
Mon May 28 18:11:46 2018 - [info]     Replicating from 192.168.222.245(192.168.222.245:3306)
Mon May 28 18:11:46 2018 - [info]     Primary candidate for the new Master (candidate_master is set)
Mon May 28 18:11:46 2018 - [info]   192.168.222.248(192.168.222.248:3306)  Version=5.7.18-log (oldest major version between slaves) log-bin:enabled
Mon May 28 18:11:46 2018 - [info]     GTID ON
Mon May 28 18:11:46 2018 - [info]     Replicating from 192.168.222.245(192.168.222.245:3306)
Mon May 28 18:11:46 2018 - [info]   192.168.222.249(192.168.222.249:3306)  Version=5.7.18-log (oldest major version between slaves) log-bin:enabled
Mon May 28 18:11:46 2018 - [info]     GTID ON
Mon May 28 18:11:46 2018 - [info]     Replicating from 192.168.222.245(192.168.222.245:3306)
Mon May 28 18:11:46 2018 - [info] Starting GTID based failover.
Mon May 28 18:11:46 2018 - [info]
Mon May 28 18:11:46 2018 - [info] ** Phase 1: Configuration Check Phase completed.
Mon May 28 18:11:46 2018 - [info]
Mon May 28 18:11:46 2018 - [info] * Phase 2: Dead Master Shutdown Phase..
Mon May 28 18:11:46 2018 - [info]
Mon May 28 18:11:46 2018 - [info] Forcing shutdown so that applications never connect to the current master..
Mon May 28 18:11:46 2018 - [info] Executing master IP deactivation script:
Mon May 28 18:11:46 2018 - [info]   /usr/local/bin/master_ip_failover --orig_master_host=192.168.222.245 --orig_master_ip=192.168.222.245 --orig_master_port=3306 --command=stopssh --ssh_user=root


IN SCRIPT TEST====/sbin/ifconfig ens256:1 down==/sbin/ifconfig ens256:1 192.168.222.99/24===

Disabling the VIP on old master: 192.168.222.245
Mon May 28 18:11:46 2018 - [info]  done.
Mon May 28 18:11:46 2018 - [warning] shutdown_script is not set. Skipping explicit shutting down of the dead master.
Mon May 28 18:11:46 2018 - [info] * Phase 2: Dead Master Shutdown Phase completed.
Mon May 28 18:11:46 2018 - [info]
Mon May 28 18:11:46 2018 - [info] * Phase 3: Master Recovery Phase..
Mon May 28 18:11:46 2018 - [info]
Mon May 28 18:11:46 2018 - [info] * Phase 3.1: Getting Latest Slaves Phase..
Mon May 28 18:11:46 2018 - [info]
Mon May 28 18:11:46 2018 - [info] The latest binary log file/position on all slaves is binlog.000043:234
Mon May 28 18:11:46 2018 - [info] Latest slaves (Slaves that received relay log files to the latest):
Mon May 28 18:11:46 2018 - [info]   192.168.222.246(192.168.222.246:3306)  Version=5.7.18-log (oldest major version between slaves) log-bin:enabled
Mon May 28 18:11:46 2018 - [info]     GTID ON
Mon May 28 18:11:46 2018 - [info]     Replicating from 192.168.222.245(192.168.222.245:3306)
Mon May 28 18:11:46 2018 - [info]   192.168.222.247(192.168.222.247:3306)  Version=5.7.18-log (oldest major version between slaves) log-bin:enabled
Mon May 28 18:11:46 2018 - [info]     GTID ON
Mon May 28 18:11:46 2018 - [info]     Replicating from 192.168.222.245(192.168.222.245:3306)
Mon May 28 18:11:46 2018 - [info]     Primary candidate for the new Master (candidate_master is set)
Mon May 28 18:11:46 2018 - [info]   192.168.222.248(192.168.222.248:3306)  Version=5.7.18-log (oldest major version between slaves) log-bin:enabled
Mon May 28 18:11:46 2018 - [info]     GTID ON
Mon May 28 18:11:46 2018 - [info]     Replicating from 192.168.222.245(192.168.222.245:3306)
Mon May 28 18:11:46 2018 - [info]   192.168.222.249(192.168.222.249:3306)  Version=5.7.18-log (oldest major version between slaves) log-bin:enabled
Mon May 28 18:11:46 2018 - [info]     GTID ON
Mon May 28 18:11:46 2018 - [info]     Replicating from 192.168.222.245(192.168.222.245:3306)
Mon May 28 18:11:46 2018 - [info] The oldest binary log file/position on all slaves is binlog.000043:234
Mon May 28 18:11:46 2018 - [info] Oldest slaves:
Mon May 28 18:11:46 2018 - [info]   192.168.222.246(192.168.222.246:3306)  Version=5.7.18-log (oldest major version between slaves) log-bin:enabled
Mon May 28 18:11:46 2018 - [info]     GTID ON
Mon May 28 18:11:46 2018 - [info]     Replicating from 192.168.222.245(192.168.222.245:3306)
Mon May 28 18:11:46 2018 - [info]   192.168.222.247(192.168.222.247:3306)  Version=5.7.18-log (oldest major version between slaves) log-bin:enabled
Mon May 28 18:11:46 2018 - [info]     GTID ON
Mon May 28 18:11:46 2018 - [info]     Replicating from 192.168.222.245(192.168.222.245:3306)
Mon May 28 18:11:46 2018 - [info]     Primary candidate for the new Master (candidate_master is set)
Mon May 28 18:11:46 2018 - [info]   192.168.222.248(192.168.222.248:3306)  Version=5.7.18-log (oldest major version between slaves) log-bin:enabled
Mon May 28 18:11:46 2018 - [info]     GTID ON
Mon May 28 18:11:46 2018 - [info]     Replicating from 192.168.222.245(192.168.222.245:3306)
Mon May 28 18:11:46 2018 - [info]   192.168.222.249(192.168.222.249:3306)  Version=5.7.18-log (oldest major version between slaves) log-bin:enabled
Mon May 28 18:11:46 2018 - [info]     GTID ON
Mon May 28 18:11:46 2018 - [info]     Replicating from 192.168.222.245(192.168.222.245:3306)
Mon May 28 18:11:46 2018 - [info]
Mon May 28 18:11:46 2018 - [info] * Phase 3.3: Determining New Master Phase..
Mon May 28 18:11:46 2018 - [info]
Mon May 28 18:11:46 2018 - [info] Searching new master from slaves..
Mon May 28 18:11:46 2018 - [info]  Candidate masters from the configuration file:
Mon May 28 18:11:46 2018 - [info]   192.168.222.247(192.168.222.247:3306)  Version=5.7.18-log (oldest major version between slaves) log-bin:enabled
Mon May 28 18:11:46 2018 - [info]     GTID ON
Mon May 28 18:11:46 2018 - [info]     Replicating from 192.168.222.245(192.168.222.245:3306)
Mon May 28 18:11:46 2018 - [info]     Primary candidate for the new Master (candidate_master is set)
Mon May 28 18:11:46 2018 - [info]  Non-candidate masters:
Mon May 28 18:11:46 2018 - [info]  Searching from candidate_master slaves which have received the latest relay log events..
Mon May 28 18:11:46 2018 - [info] New master is 192.168.222.247(192.168.222.247:3306)
Mon May 28 18:11:46 2018 - [info] Starting master failover..
Mon May 28 18:11:46 2018 - [info]
From:
192.168.222.245(192.168.222.245:3306) (current master)
 +--192.168.222.246(192.168.222.246:3306)
 +--192.168.222.247(192.168.222.247:3306)
 +--192.168.222.248(192.168.222.248:3306)
 +--192.168.222.249(192.168.222.249:3306)

To:
192.168.222.247(192.168.222.247:3306) (new master)
 +--192.168.222.246(192.168.222.246:3306)
 +--192.168.222.248(192.168.222.248:3306)
 +--192.168.222.249(192.168.222.249:3306)
Mon May 28 18:11:46 2018 - [info]
Mon May 28 18:11:46 2018 - [info] * Phase 3.3: New Master Recovery Phase..
Mon May 28 18:11:46 2018 - [info]
Mon May 28 18:11:46 2018 - [info]  Waiting all logs to be applied..
Mon May 28 18:11:46 2018 - [info]   done.
Mon May 28 18:11:46 2018 - [info]  Replicating from the latest slave 192.168.222.246(192.168.222.246:3306) and waiting to apply..
Mon May 28 18:11:46 2018 - [info]  Waiting all logs to be applied on the latest slave..
Mon May 28 18:11:46 2018 - [info]  Resetting slave 192.168.222.247(192.168.222.247:3306) and starting replication from the new master 192.168.222.246(192.168.222.246:3306)..
Mon May 28 18:11:47 2018 - [info]  Executed CHANGE MASTER.
Mon May 28 18:11:48 2018 - [info]  Slave started.
Mon May 28 18:11:48 2018 - [info]  Waiting to execute all relay logs on 192.168.222.247(192.168.222.247:3306)..
Mon May 28 18:11:48 2018 - [info]  master_pos_wait(binlog.000007:442) completed on 192.168.222.247(192.168.222.247:3306). Executed 0 events.
Mon May 28 18:11:48 2018 - [info]   done.
Mon May 28 18:11:48 2018 - [info]   done.
Mon May 28 18:11:48 2018 - [info] Getting new master's binlog name and position..
Mon May 28 18:11:48 2018 - [info]  binlog.000437:234
Mon May 28 18:11:48 2018 - [info]  All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='192.168.222.247', MASTER_PORT=3306, MASTER_AUTO_POSITION=1, MASTER_USER='repl', MASTER_PASSWORD='xxx';
Mon May 28 18:11:48 2018 - [info] Master Recovery succeeded. File:Pos:Exec_Gtid_Set: binlog.000437, 234, 4945742e-4a97-11e8-9841-0050569dc4ab:1-1131,
b891d7c5-5ef0-11e8-9101-0050569d16ce:1
Mon May 28 18:11:48 2018 - [info] Executing master IP activate script:
Mon May 28 18:11:48 2018 - [info]   /usr/local/bin/master_ip_failover --command=start --ssh_user=root --orig_master_host=192.168.222.245 --orig_master_ip=192.168.222.245 --orig_master_port=3306 --new_master_host=192.168.222.247 --new_master_ip=192.168.222.247 --new_master_port=3306 --new_master_user='gcdb'   --new_master_password=xxx
Unknown option: new_master_user
Unknown option: new_master_password


IN SCRIPT TEST====/sbin/ifconfig ens256:1 down==/sbin/ifconfig ens256:1 192.168.222.99/24===

Enabling the VIP - 192.168.222.99/24 on the new master - 192.168.222.247
Mon May 28 18:11:48 2018 - [info]  OK.
Mon May 28 18:11:48 2018 - [info] ** Finished master recovery successfully.
Mon May 28 18:11:48 2018 - [info] * Phase 3: Master Recovery Phase completed.
Mon May 28 18:11:48 2018 - [info]
Mon May 28 18:11:48 2018 - [info] * Phase 4: Slaves Recovery Phase..
Mon May 28 18:11:48 2018 - [info]
Mon May 28 18:11:48 2018 - [info]
Mon May 28 18:11:48 2018 - [info] * Phase 4.1: Starting Slaves in parallel..
Mon May 28 18:11:48 2018 - [info]
Mon May 28 18:11:48 2018 - [info] -- Slave recovery on host 192.168.222.246(192.168.222.246:3306) started, pid: 3986. Check tmp log /var/log/masterha/app1.log/192.168.222.246_3306_20180528181144.log if it takes time..
Mon May 28 18:11:48 2018 - [info] -- Slave recovery on host 192.168.222.248(192.168.222.248:3306) started, pid: 3987. Check tmp log /var/log/masterha/app1.log/192.168.222.248_3306_20180528181144.log if it takes time..
Mon May 28 18:11:48 2018 - [info] -- Slave recovery on host 192.168.222.249(192.168.222.249:3306) started, pid: 3988. Check tmp log /var/log/masterha/app1.log/192.168.222.249_3306_20180528181144.log if it takes time..
Mon May 28 18:11:50 2018 - [info]
Mon May 28 18:11:50 2018 - [info] Log messages from 192.168.222.248 ...
Mon May 28 18:11:50 2018 - [info]
Mon May 28 18:11:48 2018 - [info]  Resetting slave 192.168.222.248(192.168.222.248:3306) and starting replication from the new master 192.168.222.247(192.168.222.247:3306)..
Mon May 28 18:11:50 2018 - [info]  Executed CHANGE MASTER.
Mon May 28 18:11:50 2018 - [info]  Slave started.
Mon May 28 18:11:50 2018 - [info]  gtid_wait(4945742e-4a97-11e8-9841-0050569dc4ab:1-1131,
b891d7c5-5ef0-11e8-9101-0050569d16ce:1) completed on 192.168.222.248(192.168.222.248:3306). Executed 0 events.
Mon May 28 18:11:50 2018 - [info] End of log messages from 192.168.222.248.
Mon May 28 18:11:50 2018 - [info] -- Slave on host 192.168.222.248(192.168.222.248:3306) started.
Mon May 28 18:11:50 2018 - [info]
Mon May 28 18:11:50 2018 - [info] Log messages from 192.168.222.249 ...
Mon May 28 18:11:50 2018 - [info]
Mon May 28 18:11:48 2018 - [info]  Resetting slave 192.168.222.249(192.168.222.249:3306) and starting replication from the new master 192.168.222.247(192.168.222.247:3306)..
Mon May 28 18:11:49 2018 - [info]  Executed CHANGE MASTER.
Mon May 28 18:11:49 2018 - [info]  Slave started.
Mon May 28 18:11:49 2018 - [info]  gtid_wait(4945742e-4a97-11e8-9841-0050569dc4ab:1-1131,
b891d7c5-5ef0-11e8-9101-0050569d16ce:1) completed on 192.168.222.249(192.168.222.249:3306). Executed 0 events.
Mon May 28 18:11:50 2018 - [info] End of log messages from 192.168.222.249.
Mon May 28 18:11:50 2018 - [info] -- Slave on host 192.168.222.249(192.168.222.249:3306) started.
Mon May 28 18:11:50 2018 - [info]
Mon May 28 18:11:50 2018 - [info] Log messages from 192.168.222.246 ...
Mon May 28 18:11:50 2018 - [info]
Mon May 28 18:11:48 2018 - [info]  Resetting slave 192.168.222.246(192.168.222.246:3306) and starting replication from the new master 192.168.222.247(192.168.222.247:3306)..
Mon May 28 18:11:50 2018 - [info]  Executed CHANGE MASTER.
Mon May 28 18:11:50 2018 - [info]  Slave started.
Mon May 28 18:11:50 2018 - [info]  gtid_wait(4945742e-4a97-11e8-9841-0050569dc4ab:1-1131,
b891d7c5-5ef0-11e8-9101-0050569d16ce:1) completed on 192.168.222.246(192.168.222.246:3306). Executed 0 events.
Mon May 28 18:11:50 2018 - [info] End of log messages from 192.168.222.246.
Mon May 28 18:11:50 2018 - [info] -- Slave on host 192.168.222.246(192.168.222.246:3306) started.
Mon May 28 18:11:50 2018 - [info] All new slave servers recovered successfully.
Mon May 28 18:11:50 2018 - [info]
Mon May 28 18:11:50 2018 - [info] * Phase 5: New master cleanup phase..
Mon May 28 18:11:50 2018 - [info]
Mon May 28 18:11:50 2018 - [info] Resetting slave info on the new master..
Mon May 28 18:11:51 2018 - [info]  192.168.222.247: Resetting slave info succeeded.
Mon May 28 18:11:51 2018 - [info] Master failover to 192.168.222.247(192.168.222.247:3306) completed successfully.
Mon May 28 18:11:51 2018 - [info] Deleted server1 entry from /etc/masterha/app1.cnf .
Mon May 28 18:11:51 2018 - [info]

----- Failover Report -----

app1: MySQL Master failover 192.168.222.245(192.168.222.245:3306) to 192.168.222.247(192.168.222.247:3306) succeeded

Master 192.168.222.245(192.168.222.245:3306) is down!

Check MHA Manager logs at node05.mysql.com:/var/log/masterha/app1/manager.log for details.

Started automated(non-interactive) failover.
Invalidated master IP address on 192.168.222.245(192.168.222.245:3306)
Selected 192.168.222.247(192.168.222.247:3306) as a new master.
192.168.222.247(192.168.222.247:3306): OK: Applying all logs succeeded.
192.168.222.247(192.168.222.247:3306): OK: Activated master IP address.
192.168.222.248(192.168.222.248:3306): OK: Slave started, replicating from 192.168.222.247(192.168.222.247:3306)
192.168.222.249(192.168.222.249:3306): OK: Slave started, replicating from 192.168.222.247(192.168.222.247:3306)
192.168.222.246(192.168.222.246:3306): OK: Slave started, replicating from 192.168.222.247(192.168.222.247:3306)
192.168.222.247(192.168.222.247:3306): Resetting slave info succeeded.
Master failover to 192.168.222.247(192.168.222.247:3306) completed successfully.
Mon May 28 18:11:51 2018 - [info] Sending mail..
sh: /usr/local/bin/send_master_failover_mail: No such file or directory
Mon May 28 18:11:51 2018 - [error][/usr/local/share/perl5/MHA/MasterFailover.pm, ln2066] Failed to send mail with return code 127:0
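
The only error in this run is the last line: return code 127 means the shell could not find /usr/local/bin/send_master_failover_mail, so the failover itself succeeded and only the notification was lost. A minimal sketch of such a report script follows, assuming MHA Manager calls it with the `--orig_master_host`, `--new_master_host`, `--new_slave_hosts`, `--subject` and `--body` arguments. The function name `build_report`, the message layout and the recipient address are hypothetical and not taken from the original setup.

```shell
#!/bin/bash
# Hypothetical report-script sketch, not the author's original script.
# Assumes MHA Manager passes --key=value arguments as described above.

build_report() {
    local orig_master="" new_master="" new_slaves="" subject="" body=""
    local arg
    for arg in "$@"; do
        case "$arg" in
            --orig_master_host=*) orig_master="${arg#*=}" ;;
            --new_master_host=*)  new_master="${arg#*=}" ;;
            --new_slave_hosts=*)  new_slaves="${arg#*=}" ;;
            --subject=*)          subject="${arg#*=}" ;;
            --body=*)             body="${arg#*=}" ;;
            *) ;;  # ignore arguments this sketch does not handle
        esac
    done
    # Print a plain-text report; delivery is left to the caller.
    printf 'Subject: %s\n' "$subject"
    printf 'Old master: %s\n' "$orig_master"
    printf 'New master: %s\n' "$new_master"
    printf 'New slaves: %s\n' "$new_slaves"
    printf '%s\n' "$body"
}

# Delivery is an assumption: pipe the report to whichever mailer is installed,
# for example (dba@example.com is a placeholder address):
#   build_report "$@" | mail -s "MHA failover" dba@example.com
```

The script has to exist at the path shown in the log and be executable by the user that runs the manager; otherwise the same error 127 appears again.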

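  • The log above shows that the master role was successfully switched to node02
  • Verify that the write VIP has moved to node02
[root@node05 app1]# mysql -ugcdb -piforgot -h192.168.222.99 -e "show variables like 'hostname'";   # the write VIP now resolves to node02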
mysql: [Warning] Using a password on the command line interface can be insecure.
+---------------+------------------+
| Variable_name | Value            |
+---------------+------------------+
| hostname      | node02.mysql.com |
+---------------+------------------+
[root@node05 app1]# mysql -ugcdb -piforgot -h192.168.222.99 -e "show variables like 'hostname'";
mysql: [Warning] Using a password on the command line interface can be insecure.
+---------------+------------------+
| Variable_name | Value            |
+---------------+------------------+
| hostname      | node02.mysql.com |
+---------------+------------------+
[root@node05 app1]#
  • On the new master (node02)
[root@node02 scripts]#  mysql -ugcdb -piforgot -e "show slave hosts"    # all three slaves now report Master_id 49247, so the MHA switchover succeeded
mysql: [Warning] Using a password on the command line interface can be insecure.
+-----------+------+------+-----------+--------------------------------------+
| Server_id | Host | Port | Master_id | Slave_UUID                           |
+-----------+------+------+-----------+--------------------------------------+
|     49249 |      | 3306 |     49247 | 2386666b-5a77-11e8-ae47-0050569d7d62 |
|     49246 |      | 3306 |     49247 | d13cfe69-4a96-11e8-a154-0050569d38ab |
|     49248 |      | 3306 |     49247 | 1e33bd2d-5a77-11e8-81c3-0050569dd918 |
+-----------+------+------+-----------+--------------------------------------+
[root@node02 scripts]#
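
The manual write-VIP check in this step can be turned into a repeatable test. The sketch below compares the hostname that answers on the VIP with the host expected to hold the master role; the function name `check_vip_host` is hypothetical.

```shell
#!/bin/bash
# check_vip_host EXPECTED ACTUAL
# Succeeds only when the host answering on the VIP is the expected one.
check_vip_host() {
    local expected="$1" actual="$2"
    if [ "$actual" = "$expected" ]; then
        echo "OK: VIP is served by $actual"
    else
        echo "FAIL: expected $expected, got ${actual:-nothing}"
        return 1
    fi
}

# Usage on the manager node (assumes the same account as above;
# -N -s make mysql print only the value, without a header or table borders):
#   actual="$(mysql -ugcdb -p -h192.168.222.99 -N -s -e 'select @@hostname')"
#   check_vip_host node02.mysql.com "$actual"
```

Passing the hostname in as an argument keeps the comparison separate from the MySQL call, so the same function works for the read VIP or for any other address.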

2. Verify the read VIP

  • Verify the read VIP from the manager node
[root@node05 app1.log]# mysql -ugcdb -piforgot -h192.168.222.100 -e "show variables like 'hostname'";
mysql: [Warning] Using a password on the command line interface can be insecure.
+---------------+------------------+
| Variable_name | Value            |
+---------------+------------------+
| hostname      | node03.mysql.com |
+---------------+------------------+
[root@node05 app1.log]# mysql -ugcdb -piforgot -h192.168.222.100 -e "show variables like 'hostname'";
mysql: [Warning] Using a password on the command line interface can be insecure.
+---------------+------------------+
| Variable_name | Value            |
+---------------+------------------+
| hostname      | node04.mysql.com |
+---------------+------------------+
[root@node05 app1.log]# mysql -ugcdb -piforgot -h192.168.222.100 -e "show variables like 'hostname'";
mysql: [Warning] Using a password on the command line interface can be insecure.
+---------------+------------------+
| Variable_name | Value            |
+---------------+------------------+
| hostname      | node03.mysql.com |
+---------------+------------------+
[root@node05 app1.log]#
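
Three manual queries are enough to see the round-robin alternation, but a larger sample makes the split easier to judge. A minimal sketch, assuming one hostname per line on standard input; the helper name `count_hosts` is hypothetical.

```shell
#!/bin/bash
# count_hosts: read one hostname per line on stdin and print
# "hostname count" pairs, sorted by hostname. Blank lines are skipped.
count_hosts() {
    awk 'NF { seen[$1]++ } END { for (h in seen) print h, seen[h] }' | sort
}

# Usage on the manager node (assumes the same account as above):
#   for i in $(seq 1 10); do
#       mysql -ugcdb -p -h192.168.222.100 -N -s -e 'select @@hostname'
#   done | count_hosts
```

With the `rr` scheduler and two healthy real servers, the counts for node03 and node04 should come out roughly equal.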
  • Check LVS on redis02 (node03 and node04 share the load)
[root@redis02 ~]# ipvsadm -Ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  192.168.222.100:3306 rr
  -> 192.168.222.248:3306         Route   1      0          0
  -> 192.168.222.249:3306         Route   1      0          0
[root@redis02 ~]#
  • Stop MySQL on node04
[root@node04 ~]# systemctl stop mysql
[root@node04 ~]#
  • Check LVS on redis02: node04 has been removed
[root@redis02 ~]# ipvsadm -Ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  192.168.222.100:3306 rr
  -> 192.168.222.248:3306         Route   1      0          0  
[root@redis02 ~]#
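
Comparing `ipvsadm -Ln` listings by eye becomes tedious once there are several virtual services. The sketch below extracts only the real servers currently attached to one virtual service; the function name `list_real_servers` is hypothetical, and the parsing assumes the output layout shown above.

```shell
#!/bin/bash
# list_real_servers VIP:PORT
# Reads `ipvsadm -Ln` output on stdin and prints the RemoteAddress:Port of
# every real server listed under the given virtual service.
list_real_servers() {
    local vip="$1"
    awk -v vip="$vip" '
        # A line starting with TCP or UDP opens a new virtual service.
        $1 == "TCP" || $1 == "UDP" { in_service = ($2 == vip); next }
        # Lines starting with "->" under that service are its real servers.
        in_service && $1 == "->"   { print $2 }
    '
}

# Usage on the LVS node:
#   ipvsadm -Ln | list_real_servers 192.168.222.100:3306
```

Run before and after stopping MySQL on node04, the output shrinks from two addresses to 192.168.222.248:3306 alone, matching the listing above.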
  • Only node03 is serving reads now
[root@node05 app1]# mysql -ugcdb -piforgot -h192.168.222.100 -e "show variables like 'hostname'";
mysql: [Warning] Using a password on the command line interface can be insecure.
+---------------+------------------+
| Variable_name | Value            |
+---------------+------------------+
| hostname      | node03.mysql.com |
+---------------+------------------+
[root@node05 app1]# mysql -ugcdb -piforgot -h192.168.222.100 -e "show variables like 'hostname'";
mysql: [Warning] Using a password on the command line interface can be insecure.
+---------------+------------------+
| Variable_name | Value            |
+---------------+------------------+
| hostname      | node03.mysql.com |
+---------------+------------------+
[root@node05 app1]# mysql -ugcdb -piforgot -h192.168.222.100 -e "show variables like 'hostname'";
mysql: [Warning] Using a password on the command line interface can be insecure.
+---------------+------------------+
| Variable_name | Value            |
+---------------+------------------+
| hostname      | node03.mysql.com |
+---------------+------------------+
[root@node05 app1]#
  • Start MySQL on node04
[root@node04 ~]# systemctl start mysql
[root@node04 ~]#
  • Check LVS on redis02 (node04 has rejoined)
[root@redis02 ~]# ipvsadm -Ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  192.168.222.100:3306 rr
  -> 192.168.222.248:3306         Route   1      0          0
  -> 192.168.222.249:3306         Route   1      0          0
[root@redis02 ~]#
  • Query the read VIP again (node03 and node04 share the load)
[root@node05 app1]# mysql -ugcdb -piforgot -h192.168.222.100 -e "show variables like 'hostname'";
mysql: [Warning] Using a password on the command line interface can be insecure.
+---------------+------------------+
| Variable_name | Value            |
+---------------+------------------+
| hostname      | node04.mysql.com |
+---------------+------------------+
[root@node05 app1]# mysql -ugcdb -piforgot -h192.168.222.100 -e "show variables like 'hostname'";
mysql: [Warning] Using a password on the command line interface can be insecure.
+---------------+------------------+
| Variable_name | Value            |
+---------------+------------------+
| hostname      | node03.mysql.com |
+---------------+------------------+
[root@node05 app1]# mysql -ugcdb -piforgot -h192.168.222.100 -e "show variables like 'hostname'";
mysql: [Warning] Using a password on the command line interface can be insecure.
+---------------+------------------+
| Variable_name | Value            |
+---------------+------------------+
| hostname      | node04.mysql.com |
+---------------+------------------+
[root@node05 app1]#
Original article: https://www.cnblogs.com/gczheng/p/9103882.html