MySQL 双主+keepalived 详细文档 M-M+keepalived

1. 操作系统,系统环境,目结结构,用户,权限,日志路径,脚本
2. 配置规范化

1.1 操作系统准备

操作系统 Kylin Linux release 3.3.1707 (Core)
数据库版本 mysql-5.6.15-linux
mysql相关路径 /data/mysqldata/{3306/{data,tmp,binlog,slave,log/iblog},backup,scripts}
软件路径 /data/soft
1.2 相关软件准备
选择磁盘挂载目录/data作为mysql数据路径以及其他相关路径
[root@localhost keepalived]# cat /etc/redhat-release
Kylin Linux release 3.3.1707 (Core)
[root@localhost keepalived]# mkdir -p /data/soft
[root@localhost etc]# uname -a
Linux localhost.localdomain 3.10.0-514.ky3.kb3.x86_64 #1 SMP Thu Dec 29 21:29:54 EST 2016 x86_64 x86_64 x86_64 GNU/Linux

2.1安装配置


2.1系统规划
Ip Hostname Role Server_id
* mysql1 master 201801701
* mysql2 master 201801702
*(vip)
*(远端) mysql3 slave 201801703

--(2台机器都要执行)
[root@localhost ~]# lscpu
[root@localhost ~]# free -m #查看内存大小,后面根据该值修改innodb_buffer_pool_size
[root@localhost ~]# df -h #查看磁盘大小,看是否满足/data的要求
[root@localhost ~]# mkdir -p /data/soft #创建放软件的目录,并上传相应的安装包和软件
[root@localhost ~]# sysctl -a |grep swap #查看swap的值,如果是30,修改为10
vm.swappiness = 30
[root@localhost ~]# vim /etc/sysctl.conf #修改并保存
vm.swappiness = 10
[root@localhost ~]# sysctl -p
[root@localhost ~]# cat /sys/block/sda/queue/rotational #查看是ssd还是hdd
[root@localhost ~]# cat /sys/block/sda/queue/scheduler #查看磁盘调度方式
noop [deadline] cfq #hdd默认deadline比较合适,ssd选择noop,修改方式echo deadline > /sys/block/sda/queue/scheduler
[root@localhost ~]# date #查看2个系统的时间,时区,时间是否一致,要保持一致
[root@localhost ~]# hostname #查看并修改hostname
[root@localhost ~]# vim /etc/hosts
192.168.19.223 mysql1
192.168.19.226 mysql2
[root@localhost ~]# hostnamectl set-hostname mysql1 #另外一台mysql2
[root@localhost ~]# systemctl stop firewalld.service #关闭防火墙或者添加3306规则
[root@localhost ~]# systemctl disable firewalld.service
[root@localhost ~]# vim /etc/selinux/config #关闭selinux,并保存,一定要是disabled
SELINUX=disabled
[root@localhost ~]# reboot

2.2Mysql环境配置
--(2台机器都要执行)
[root@mysql1 ~]# getenforce
Disabled
[root@mysql1 ~]# groupadd mysql
[root@mysql1 ~]# useradd -g mysql mysql
[root@mysql1 ~]# passwd mysql #密码这里设置为mysql
[root@mysql1 ~]# vim /etc/security/limits.conf
mysql soft nproc 65535
mysql hard nproc 65535
mysql soft nofile 65535
mysql hard nofile 65535
[root@mysql1 ~]# ulimit -n 65535
[root@mysql1 ~]# vim /home/mysql/.bash_profile
export LANG=en_US.UTF-8
export PATH=/usr/local/mysql/bin:$PATH
export MYSQL_PS1="(u@h:p) [d]> "

2.3二进制安装
安装依赖包
[root@mysql1 ~]# yum -y install lrzsz
[root@mysql1 ~]# yum install -y gcc gcc-* make cmake gcc-c++ libaio libaio-devel bison bison-devel autoconf automake zlib* libxml* ncurses-devel libmcrypt* libtool-ltdl-devel* --skip-broken
[root@mysql1 ~]# yum install -y openssl openssl-devel ncurses ncurses-devel
创建mysql相应的目录
[root@mysql1 ~]# mkdir -p /data/mysqldata/{3306/{data,tmp,binlog,slave,log/iblog},backup,scripts}
[root@mysql1 backup]# cd /data/soft/
[root@mysql1 soft]# ll
-rw-r--r-- 1 root root 304382512 Jun 4 23:46 mysql-5.6.15-linux-glibc2.5-x86_64.tar.gz
解压并安装mysql
[root@mysql1 soft]# tar -zxvf mysql-5.6.15-linux-glibc2.5-x86_64.tar.gz -C /usr/local/
[root@mysql1 soft]# cd /usr/local/
[root@mysql1 local]# ln -s mysql-5.6.15-linux-glibc2.5-x86_64 mysql

编辑配置文件my.cnf,详细内容见 tar_mysql_cnf.cnf
[root@mysql1 local]# touch /data/mysqldata/3306/my.cnf
[root@mysql1 soft]# mv tar_mysql_cnf.cnf /data/mysqldata/3306/my.cnf
另外一台mysql,除了修改
server_id=2018071202
innodb_buffer_pool_size为实际内存的50%-70%
auto-increment-offset = 2

初始化数据库
[mysql@mysql1 3306]$ /usr/local/mysql/scripts/mysql_install_db --defaults-file=/data/mysqldata/3306/my.cnf --datadir=/data/mysqldata/3306/data --basedir=/usr/local/mysql --user=mysql
Installing MySQL system tables...OK

Filling help tables...OK

修改文件目录权限
[root@mysql1 local]# chown -R mysql:mysql /usr/local/mysql/
[root@mysql1 local]# chown -R mysql:mysql /data/mysqldata/
启动mysql数据库
[mysql@mysql1 3306]$ /usr/local/mysql/bin/mysqld_safe --defaults-file=/data/mysqldata/3306/my.cnf &
查看启动进程
[mysql@mysql1 3306]$ ps -ef|grep mysql
root 14878 12988 0 09:41 pts/0 00:00:00 su - mysql
mysql 14879 14878 0 09:41 pts/0 00:00:00 -bash
mysql 15140 14879 0 09:59 pts/0 00:00:00 /bin/sh /usr/local/mysql/bin/mysqld_safe --defaults-file=/data/mysqldata/3306/my.cnf
mysql 16053 15140 0 09:59 pts/0 00:00:00 /usr/local/mysql/bin/mysqld --defaults-file=/data/mysqldata/3306/my.cnf --basedir=/usr/local/mysql --datadir=/data/mysqldata/3306/data --plugin-dir=/usr/local/mysql/lib/plugin --log-error=/data/mysqldata/3306/log/mysql-error.log --open-files-limit=65535 --pid-file=/data/mysqldata/3306/data/mysql1.pid --socket=/data/mysqldata/3306/mysql.sock --port=3306
root 16081 15022 0 09:59 pts/1 00:00:00 tail -f -n 100 mysql-error.log
mysql 16089 14879 0 10:00 pts/0 00:00:00 ps -ef
mysql 16090 14879 0 10:00 pts/0 00:00:00 grep --color=auto mysql
查看启动日志
[root@mysql1 log]# tail -f -n 100 /data/mysqldata/3306/log/mysql-error.log
2018-07-18 09:59:18 16679 [Note] Server socket created on IP: '::'.
2018-07-18 09:59:18 16679 [Warning] 'user' entry 'root@mysql2' ignored in --skip-name-resolve mode.
2018-07-18 09:59:18 16679 [Warning] 'user' entry '@mysql2' ignored in --skip-name-resolve mode.
2018-07-18 09:59:18 16679 [Warning] 'proxies_priv' entry '@ root@mysql2' ignored in --skip-name-resolve mode.
2018-07-18 09:59:18 16679 [Note] Event Scheduler: Loaded 0 events
2018-07-18 09:59:18 16679 [Note] /usr/local/mysql/bin/mysqld: ready for connections.
Version: '5.6.15-log' socket: '/data/mysqldata/3306/mysql.sock' port: 3306 MySQL Community Server (GPL)
2018-07-18 09:59:18 16679 [Note] Event Scheduler: scheduler thread started with id 1

2.4 创建mysql账号
登录mysql数据库,并创建相关用户
[mysql@mysql1 3306]$ mysql -S /data/mysqldata/3306/mysql.sock
(root@localhost:mysql.sock) [(none)]> show master status;
+------------------+----------+--------------+------------------+-------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
| mysql-bin.000003 | 120 | | | |
+------------------+----------+--------------+------------------+-------------------+
(root@localhost:mysql.sock) [(none)]> delete from mysql.user where (user,host) not in(select 'root','localhost');
(root@localhost:mysql.sock) [(none)]> update mysql.user set password=password('***'); #修改root密码为***
(root@localhost:mysql.sock) [(none)]> flush privileges;
(root@localhost:mysql.sock) [(none)]> truncate table mysql.db; #清除该db内容
创建超级权限账号
(root@localhost:mysql.sock) [(none)]> GRANT ALL PRIVILEGES ON *.* TO '*'@'127.0.0.1' IDENTIFIED BY "***";
(root@localhost:mysql.sock) [(none)]> GRANT ALL PRIVILEGES ON *.* TO '*'@'10.15.%' IDENTIFIED BY "***";
(root@localhost:mysql.sock) [(none)]> GRANT ALL PRIVILEGES ON *.* TO '*'@'192.168.%' IDENTIFIED BY "***";
创建复制账号
(root@localhost:mysql.sock) [(none)]> GRANT REPLICATION SLAVE,RELOAD,SUPER ON *.* to *@'192.168.%' identified by "***";
创建备份账号
(root@localhost:mysql.sock) [(none)]> GRANT select,PROCESS,RELOAD,SUPER, REPLICATION CLIENT ON *.* to *@'127.0.0.1' IDENTIFIED BY "***"
(root@localhost:mysql.sock) [(none)]> flush privileges;
(root@localhost:mysql.sock) [(none)]> select user,host,password from mysql.user;

#生产环境严格用户权限,密码控制(选择执行)
#grant select ,insert ,update,delete,execute on test.* to '*'@'%'identified by "***" ;
#flush privileges;
#show grants for *@'%';
#revoke insert ,update,delete,execute on test.* from *@'%';
#grant insert ,update,delete,execute on test.* to '*'@'%';


2.5 Mysql主从搭建
*(mysql1) master
*(mysql2) slave
--不使用gtid,在5.6.15中,使用gtid进行主从复制遇到了2个bug,所以这里不建议使用gtid
选择搭建主从,这里*作为master,*作为slave
1 如果是新安装的数据库,没有数据,这里就不用做备份还原来搭建主从,前提是从机已经做了上面的用户操作
mysql1(Master)上执行,获取binlog文件和位置点
(root@localhost:mysql.sock) [(none)]> show master status;
+------------------+----------+--------------+------------------+-------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
| mysql-bin.000003 | 2339 | | | |
+------------------+----------+--------------+------------------+-------------------+
在mysql2(Slave)上执行
(root@localhost:mysql.sock) [(none)]> change master to
master_host='*',
master_port=3306,
master_user='*',
master_password='***',
master_log_file='mysql-bin.000003',
master_log_pos=2339;
(root@localhost:mysql.sock) [(none)]> start slave;
(root@localhost:mysql.sock) [(none)]> show slave statusG;
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: *
Master_User: *
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000003
Read_Master_Log_Pos: 2339
Relay_Log_File: relaylog.000002
Relay_Log_Pos: 283
Relay_Master_Log_File: mysql-bin.000003
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 2339
Relay_Log_Space: 449
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 2018060701
Master_UUID: 2b876c4b-8a2e-11e8-a293-000c29af4e99
Master_Info_File: /data/mysqldata/3306/data/master.info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set:
Executed_Gtid_Set:
Auto_Position: 0
1 row in set (0.00 sec)

3 测试主从


mysql1(Master)上执行
(root@localhost:mysql.sock) [(none)]> show databases;
+--------------------+
| Database |
+--------------------+
| information_schema |
| mysql |
| performance_schema |
| test |
+--------------------+
4 rows in set (0.00 sec)

(root@localhost:mysql.sock) [(none)]> use test;
Database changed
(root@localhost:mysql.sock) [test]> create table t1(id int,name varchar(10));
Query OK, 0 rows affected (0.03 sec)

(root@localhost:mysql.sock) [test]> insert into t1 values(1,'AAAAA');
Query OK, 1 row affected (0.00 sec)

(root@localhost:mysql.sock) [test]> commit;
Query OK, 0 rows affected (0.00 sec)

(root@localhost:mysql.sock) [test]> select * from test.t1;
+------+-------+
| id | name |
+------+-------+
| 1 | AAAAA |
+------+-------+
1 row in set (0.00 sec)
在mysql2(Slave)上执行,数据一致
(root@localhost:mysql.sock) [(none)]> select * from test.t1;
+------+-------+
| id | name |
+------+-------+
| 1 | AAAAA |
+------+-------+

4 Mysql主主搭建

在原来的Master-Slave基础上,
在mysql2(slave)上执行,找到文件和位置先,比如是mysql-bin.000003,2651
(root@localhost:mysql.sock) [(none)]> show master status;
+------------------+----------+--------------+------------------+-------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
| mysql-bin.000003 | 2651 | | | |
+------------------+----------+--------------+------------------+-------------------+

在mysql1(master)上执行
(root@localhost:mysql.sock) [(none)]> change master to
master_host='*',
master_port=*,
master_user='*',
master_password='***',
master_log_file='mysql-bin.000003',
master_log_pos=2651;
(root@localhost:mysql.sock) [(none)]> start slave;
(root@localhost:mysql.sock) [test]> show slave statusG;
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: *
Master_User: *
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000003
Read_Master_Log_Pos: 2651
Relay_Log_File: relaylog.000002
Relay_Log_Pos: 283
Relay_Master_Log_File: mysql-bin.000003
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 2651
Relay_Log_Space: 449
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 2018060702
Master_UUID: 2e4e10fa-8a2e-11e8-a293-005056bfac47
Master_Info_File: /data/mysqldata/3306/data/master.info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set:
Executed_Gtid_Set:
Auto_Position: 0
1 row in set (0.00 sec)

分别在mysql1 和mysql2上查看进程
mysql1
(root@localhost:mysql.sock) [test]> show full processlist;

mysql2
(root@localhost:mysql.sock) [(none)]> show full processlist;

创建mysql用户
[mysql@mysql1 scripts]$ mysql -u*-p'*l' -S /data/mysqldata/3306/mysql.sock
(root@localhost:mysql.sock) [(none)]> GRANT ALL PRIVILEGES ON *.* TO '*'@'*' IDENTIFIED BY "***";
(root@localhost:mysql.sock) [(none)]> flush privileges;


测试主主的可用性
mysql2
(root@localhost:mysql.sock) [(none)]> use test;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
(root@localhost:mysql.sock) [test]> insert into t1 values(2,'BBBBBB');
Query OK, 1 row affected (0.00 sec)

(root@localhost:mysql.sock) [test]> commit;
Query OK, 0 rows affected (0.00 sec)

(root@localhost:mysql.sock) [test]> select * from test.t1;
+------+--------+
| id | name |
+------+--------+
| 1 | AAAAA |
| 2 | BBBBBB |
+------+--------+
2 rows in set (0.00 sec)

mysql1
(root@localhost:mysql.sock) [test]> select * from test.t1;
+------+--------+
| id | name |
+------+--------+
| 1 | AAAAA |
| 2 | BBBBBB |
+------+--------+
2 rows in set (0.00 sec)

这里主主的可用性已经测试完成,下面进行新的slave的搭建

新slave使用备份还原搭建主从(mysqldump)
*(远端)作为slave,连接到mysql2(*)
--mysqldump
[mysql@mysql2 backup]$ /usr/local/mysql/bin/mysqldump -h * -u*-p'***' -A -R -S /data/mysqldata/3306/mysql.sock
--single-transaction --default-character-set=utf8 --master-data=2 | gzip > /data/mysqldata/backup/mysqldumpfullbk_20180818.sql.gz
[mysql@mysql2 backup]$ ll
-rw-rw-r-- 1 mysql mysql 656 Jul 18 10:38 mysqldumpfullbk_20180818.sql.gz
复制到从机192.168.19.227
[mysql@mysql2 backup]$ scp mysqldumpfullbk_20180818.sql.gz *@*:/data/mysqldata/backup/.
从机192.168.19.227上执行,并导入sql文件
[mysql@mysql3 backup]$ gunzip mysqldumpfullbk_20180818.sql.gz
[mysql@mysql3 backup]$ mysql -usystem -pmysql -h127.0.0.1 -P3306 < mysqldumpfullbk_20180818.sql
--登录并flush privileges;
获取复制的文件和位置点
[mysql@mysql3 backup]$ head -n 30 mysqldumpfullbk_20180818.sql
-- CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000003', MASTER_LOG_POS=3510;
登录192.168.19.227(slave)并执行
(root@localhost:mysql.sock) [(none)]> change master to
master_host='*',
master_port=3306,
master_user='*',
master_password='***',
master_log_file='mysql-bin.000003',
master_log_pos=3510;
(root@localhost:mysql.sock) [(none)]> start slave;
(root@localhost:mysql.sock) [(none)]> show slave statusG;

5 Keepalived搭建


安装配置keepalived
[root@mysql1 ~]# yum install -y keepalived
配置文件
[root@mysql1 ~]# rpm -ql keepalived
/etc/keepalived
/etc/keepalived/keepalived.conf
/etc/sysconfig/keepalived

mysql1 keepalived配置文件
[root@mysql1 ~]# cd /etc/keepalived/
[root@mysql1 keepalived]# ll
总用量 16
-rwxr-xr-x. 1 root root 731 6月 15 17:59 ka_check_mysql.sh
-rw-r--r--. 1 root root 549 6月 19 15:24 keepalived.conf
[root@mysql1 keepalived]# cat /etc/keepalived/keepalived.conf

! Configuration File for keepalived

vrrp_script  check_run {
  script "/etc/keepalived/ka_check_mysql.sh"
  interval 10
}

vrrp_instance VI_1 { 
     state BACKUP  
     interface ens192 
     virtual_router_id 41 
     priority 100   
     advert_int 1 
     nopreempt                                                                                            
     authentication { 
         auth_type PASS 
         auth_pass 1111 
     } 
     virtual_ipaddress { 
         */24
     } 
     track_script {
      check_run
     }
}

mysql2 keepalived配置文件
[root@mysql2 ~]# cd /etc/keepalived/
[root@mysql2 keepalived]# cat /etc/keepalived/keepalived.conf

! Configuration File for keepalived

vrrp_script  check_run {
  script "/etc/keepalived/ka_check_mysql.sh"
  interval 10
}

vrrp_instance VI_1 {
     state BACKUP
     interface ens192
     virtual_router_id 41
     priority 90
     advert_int 1                                                                                        
     authentication {
         auth_type PASS
         auth_pass 1111
     }
     virtual_ipaddress {
         */24
     }
     track_script {
      check_run
     }
}

启动keepalived,keepalived有3个进程,一个主进程,一个health_check进程,一个vrrp进程
[root@mysql1 keepalived]# systemctl start keepalived
[root@mysql1 ~]# ps -ef|grep keepalived
[root@mysql1 backup]# ip addr
[root@mysql1 keepalived]# ip addr |grep 192.168
[root@mysql1 keepalived]# systemctl status keepalived
查看keepalived的日志
[root@mysql1 backup]# tail -f -n 100 /var/log/messages
登录vip进行mysql查看
[mysql@mysql1 scripts]$ /usr/local/mysql/bin/mysql -u* -p'*l' -P3306 -h* -S /data/mysqldata/3306/mysql.sock -e "select @@server_id;"
Warning: Using a password on the command line interface can be insecure.
+-------------+
| @@server_id |
+-------------+
| 2018060701 | #目前连接的是mysql1
+-------------+

mysql 心跳检查脚本
[mysql@mysql1 3306]$ /usr/local/mysql/bin/mysql -u* -p'***' -P3306 -h127.0.0.1 -S /data/mysqldata/3306/mysql.sock -e "show status;"
[mysql@mysql1 3306]$ /usr/local/mysql/bin/mysql -u* -p'***' -P3306 -h127.0.0.1 -S /data/mysqldata/3306/mysql.sock -e "select user();"
[root@mysql1 keepalived]# cat ka_check_mysql.sh

#!/bin/bash
host=127.0.0.1
username=*
password=*
port=3306
CHECK_TIME=3
#mysql  is working MYSQL_IS_OK is 1 , mysql down MYSQL_IS_OK is 0
MYSQL_IS_OK=1
function check_mysql_status (){
    /usr/local/mysql/bin/mysql -u$username -p"$password" -P$port  -h$host -e "select user();" >/dev/null 2>&1
    if [ $? = 0 ] ;then
    MYSQL_IS_OK=1
    else
    MYSQL_IS_OK=0
    fi
    return $MYSQL_IS_OK 
}
while [ $CHECK_TIME -ne 0 ]
do
    let "CHECK_TIME -= 1"
    
    check_mysql_status
if [ $MYSQL_IS_OK = 1 ] ; then
    CHECK_TIME=0
   #/bin/systemctl stop keepalived
    echo 1
    exit 0
fi

if [ $MYSQL_IS_OK -eq 0 ] &&  [ $CHECK_TIME -eq 0 ]
then
    /bin/systemctl stop keepalived
    echo 0
    exit 1
fi
sleep 3
done

运行心跳检测脚本,返回1则正常
[root@mysql1 keepalived]# ./ka_check_mysql.sh
1

mysql相关运维脚本,注意所以脚本都适合在二进制和脚本安装的mysql中,如果是yum安装的mysql,需要修改所有脚本的mysql路径
二进制,脚本安装都的mysql的路径,所有脚本里面涉及到/usr/local/mysql/bin/的地方全部替换为/usr/bin/
二进制安装与脚本安装的mysql的可执行文件的路径为/usr/local/mysql/bin/
yum安装后,mysql的可执行文件路径为/usr/bin/

脚本都需要可执行权限
[mysql@mysql1 scripts]$ chmod +x *.sh
[mysql@mysql1 scripts]$ ll /data/mysqldata/scripts/
total 28
-rw-rw-r-- 1 mysql mysql 0 Jun 22 16:00 0

相关自启动服务
[root@hongquan1 ~]# cat /etc/rc.d/rc.local
#mysql auto startup
sudo -i -u mysql /data/mysqldata/scripts/mysql_db_startup.sh > /home/mysql/mysql_db_startup.log 2>&1
#zabbix auto startup
#/usr/local/zabbix/sbin/zabbix_agentd -c /usr/local/zabbix/etc/zabbix_agentd.conf
[root@hongquan1 ~]#

mysql运维工具的安装
[root@mysql1 keepalived]# cd /data/soft/
[root@mysql1 keepalived]# tar zxvf percona-toolkit-2.2.17.tar.gz
[root@mysql1 soft]# cd percona-toolkit-2.2.17/bin/
简单命令测试
1 返回除information_schema|mysql|performance_schema之外的database的名字
[mysql@mysql1 scripts]$ /usr/local/mysql/bin/mysql -u* -p* -h127.0.0.1 -S /data/mysqldata/3306/mysql.sock -e "show databases" | sed '1d' | egrep -v "information_schema|mysql|performance_schema"
test
2 pt检查某个db的数据是否一致,比如test库
[root@mysql1 backup]# /data/soft/percona-toolkit-2.2.17/bin/pt-table-checksum --nocheck-replication-filters --no-check-binlog-format h=*,u=system,p=mysql,P=3306 --databases=test --replicate=test.checksums
TS ERRORS DIFFS ROWS CHUNKS SKIPPED TIME TABLE
07-18T11:49:35 0 0 2 1 0 0.006 test.t1

3 备份脚本,请将备份脚本加入crontab中,定时备份,目前采用mysqldump,也提供了xtrabackup的备份脚本
注意执行权限,路径,以及用户名等
[mysql@mysql1 scripts]$ ./mysqldump_per_db.sh
添加到crontab中
35 02 * * * mysql /data/mysql/scripts/mysqldump_per_db.sh >> /data/mysql/backup/mysql_back1.log 2>&1

4 提供zabbix监控mysql的项目,注意zabbix用户必须在数据库中创建
(root@localhost:mysql.sock) [(none)]> grant select,RELOAD,SUPER, REPLICATION CLIENT on *.* to '*'@'127.0.0.1' identified by '*';

[root@hongquan1 soft]# cat /usr/local/zabbix/etc/zabbix_agentd.conf | grep -v '^#' | grep -v '^$'
LogFile=/tmp/zabbix_agentd.log
Server=127.0.0.1
ListenPort=10050
Hostname=*
Include=/usr/local/zabbix/etc/zabbix_agentd.conf.d/*.conf
UnsafeUserParameters=1
UserParameter=mysql.ping,/usr/local/mysql/bin/mysqladmin -u*-p* -h127.0.0.1 -P3306 2>/dev/null ping|grep alive|wc -l
UserParameter=mysql.status[*],/usr/local/mysql/bin/mysql -h 127.0.0.1 -P3306 -u* -p'*' -N -e "show global status where Variable_name='$1'" 2>/dev/null | cut -f2
UserParameter=mysql_variable_status[*],/usr/local/mysql/bin/mysql -h 127.0.0.1 -P3306 -u* -p'*' -N -e "show variables where Variable_name='$1'" 2>/dev/null | cut -f2
UserParameter=mysql.version,/usr/local/mysql/bin/mysql -V
UserParameter=mysql.mysql_process_status,sh /usr/local/zabbix/lib/alertscripts/check_mysql_status_3306.sh

5 提供检测主从复制的脚本
[mysql@mysql1 scripts]$ cat check_slave_status_3306.sh

#!/bin/bash
host=127.0.0.1
username=*
password=*
port=3306
#mysql slave is working then return 2,others is error
/usr/local/mysql/bin/mysql -u$username -p"$password" -P$port -h$host  2>/dev/null 
-e "show slave statusG"  |grep -E "Slave_IO_Running|Slave_SQL_Running"|awk '{print $2}'|grep -c Yes

[mysql@mysql1 scripts]$ ./check_slave_status_3306.sh
2
zabbix监控建议
{check_slave_status_3306:mysql.check_slave_status_3306.last()} <>2

6 提供检测mysql状态的脚本
[root@hongquan1 scripts]# cat check_mysql_status_3306.sh

[mysql@mysql1 scripts]$ ./check_mysql_status_3306.sh
1

7 物理备份与逻辑备份策略,mysqldump,innobackupex


下载Percona-XtraBackup 软件
https://www.percona.com/downloads/XtraBackup/LATEST/
[root@mysql1 soft]# rpm -ivh libev-4.15-1.el6.rf.x86_64.rpm
[root@mysql1 soft]# whereis innobackupex
innobackupex: /usr/bin/innobackupex /usr/share/man/man1/innobackupex.1.gz
[mysql@mysql2 scripts]$ ./mysql_backup_dump_xtra.sh


mysql宕机测试,并完整恢复dual M-M+keepalived
noted:如果停掉*:3306,则*:3306会失效,必须重新指定新的master或者恢复226

1 * 关闭mysql
[mysql@mysql1 scripts]$ ./mysql_db_shutdown.sh
shutdown mysql service:localhost_3306
查看vip,已经不再*上面
[root@mysql1 soft]# ip addr |grep 192.168
inet */24 brd * scope global ens192
查看keepalived日志,
[root@mysql1 soft]# tail -f -n 100 /var/log/messages

2 登录 192.168.19.226 查看vip,已经漂移过来
[root@mysql2 ~]# ip addr |grep 192.168
根据vip查看,实例已经转移到*
[mysql@mysql1 scripts]$ /usr/local/mysql/bin/mysql -*-*' -P3306 -h* -S /data/mysqldata/3306/mysql.sock -e "select @@server_id;"
Warning: Using a password on the command line interface can be insecure.
+-------------+
| @@server_id |
+-------------+
| 2018060702 |
+-------------+
进行数据插入测试,连接*
use test;
create table tt6(id int,name varchar(10));
insert into tt6 values(1,'AAAAA');
commit;
select * from test.tt6;
启动宕机的mysql1 ,登录并查看复制状态
[mysql@mysql1 scripts]$ ./mysql_db_startup.sh
[mysql@mysql1 scripts]$ ./mysqlplus.sh
(system@127.0.0.1:3306) [(none)]> show slave statusG;
*************************** 1. row ***************************
Slave_IO_State:
Master_Host: *
Master_User: *
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000003
Read_Master_Log_Pos: 10374
Relay_Log_File: relaylog.000002
Relay_Log_Pos: 7056
Relay_Master_Log_File: mysql-bin.000003
Slave_IO_Running: No #这里都为no,需要手动开启slave
Slave_SQL_Running: No
(system@127.0.0.1:3306) [(none)]> start slave; #开启主从同步
Query OK, 0 rows affected (0.00 sec)
(system@127.0.0.1:3306) [(none)]> show slave statusG;
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: *
Master_User: *
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000003
Read_Master_Log_Pos: 10697
Relay_Log_File: relaylog.000004
Relay_Log_Pos: 606
Relay_Master_Log_File: mysql-bin.000003
Slave_IO_Running: Yes #这里为yes,表示已经连接主库
Slave_SQL_Running: Yes #这里为yes,表示从已经可以正确的进行同步
(system@127.0.0.1:3306) [(none)]> select * from test.tt6; #查看刚刚的数据,已经成功的同步到mysl1
+------+-------+
| id | name |
+------+-------+
| 1 | AAAAA |
这时候,需要开启keepalived,保证双主+keepalived
[root@mysql1 soft]# systemctl start keepalived
[root@mysql1 soft]# ip addr |grep 192.168 #再次查看ip,没有被强占过来,表示正确

常见问题处理

1 keepalived报错
Jun 8 18:07:17 localhost Keepalived_vrrp[18638]: VRRP_Instance(VI_1) ignoring received advertisment...
Jun 8 18:07:18 localhost Keepalived_vrrp[18638]: ip address associated with VRID not present in received packet :*
Jun 8 18:07:18 localhost Keepalived_vrrp[18638]: one or more VIP associated with VRID mismatch actual MASTER advert
Jun 8 18:07:18 localhost Keepalived_vrrp[18638]: bogus VRRP packet received on ens192 !!!
在同一网段内非同一套keepalive集群中的 virtual_router_id 值不能相同,如果相同会在messages中收到VRRP错误包 ,所以需要更改 virual_router_id
如果内容交换机有采用VRRP协议进行切换,保证router_id不能相同

2 网卡名字导致错误
[root@mysql1 soft]# ifconfig
ens192: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
修改/etc/keepalived/keepalived.conf,保证网卡名字一致

3 服务器未关闭SELinux 导致
Jun 19 17:21:41 localhost setroubleshoot: SELinux is preventing /usr/local/mysql/bin/mysql from read access on the directory /etc/my.cnf.d. For complete SELinux messages. run sealert -l b6c62ef5-9dc6-4a6a-9d13-fd2d7e91031a
Jun 19 17:21:41 localhost python: SELinux is preventing /usr/local/mysql/bin/mysql from read access on the directory /etc/my.cnf.d.#012#012***** Plugin restorecon_source (99.5 confidence) suggests *****************#012#012If you want to fix the label. #012/usr/local/mysql/bin/mysql default label should be bin_t.#012Then you can run restorecon.#012Do#012# /sbin/restorecon -v /usr/local/mysql/bin/mysql#012#012***** Plugin catchall (1.49 confidence) suggests **************************#012#012If you believe that mysql should be allowed read access on the my.cnf.d directory by default.#012Then you should report this as a bug.#012You can generate a local policy module to allow this access.#012Do#012allow this access for now by executing:#012# ausearch -c 'mysql' --raw | audit2allow -M my-mysql#012# semodule -i my-mysql.pp#012
Jun 19 17:21:44 localhost setroubleshoot: SELinux is preventing mysql from open access on the directory /etc/my.cnf.d. For complete SELinux messages. run sealert -l dd2b9eaa-eee0-4bc5-a3cd-a2c5252fff7f
Jun 19 17:21:44 localhost python: SELinux is preventing mysql from open access on the directory /etc/my.cnf.d.#012#012***** Plugin catchall (100. confidence) suggests **************************#012#012If you believe that mysql should be allowed open access on the my.cnf.d directory by default.#012Then you should report this as a bug.#012You can generate a local policy module to allow this access.#012Do#012allow this access for now by executing:#012# ausearch -c 'mysql' --raw | audit2allow -M my-mysql#012# semodule -i my-mysql.pp#012
Jun 19 17:27:06 localhost systemd: Stopping LVM2 PV scan on device 8:2...

[root@hqmysql1 ~]# vim /etc/selinux/config
SELINUX=disabled
[root@hongquan1 keepalived]# getenforce
Disabled

4 如果启用了iptables,不设置"系统接收VRRP协议"的规则,就会出现脑裂
centos6
-A INPUT -s 192.168.1.0/24 -p vrrp -j ACCEPT #允许VRRP(虚拟路由器冗余协)通信
用tcpdump抓包vrrp协议,这里关闭了firewalld,只在master1上看到vrrp
[root@mysql1 keepalived]# tcpdump -i ens192|grep VRRP
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ens192, link-type EN10MB (Ethernet), capture size 65535 bytes
09:18:29.228973 IP mysql1 > vrrp.mcast.net: VRRPv2, Advertisement, vrid 41, prio 100, authtype simple, intvl 1s, length 20
09:18:40.241449 IP mysql1 > vrrp.mcast.net: VRRPv2, Advertisement, vrid 41, prio 100, authtype simple, intvl 1s, length 20
09:18:41.242613 IP mysql1 > vrrp.mcast.net: VRRPv2, Advertisement, vrid 41, prio 100, authtype simple, intvl 1s, length 20
[root@mysql1 keepalived]# iptables -S
-P INPUT ACCEPT
-P FORWARD ACCEPT
-P OUTPUT ACCEPT
添加
iptables -I INPUT 4 -p vrrp -j ACCEPT

centos7
[root@mysql3 ~]# firewall-cmd --direct --permanent --add-rule ipv4 filter INPUT 0 --in-interface ens160 --destination 224.0.0.18 --protocol vrrp -j ACCEPT
success
[root@mysql3 ~]# firewall-cmd --reload
success
[root@mysql3 ~]# iptables -S

5预防keepalived脑裂
用cronta定时ping网关,比如一次5count,3次不通就停掉keepalived
vim ping_gateway.sh
#!/bin/bash
host=127.0.0.1
CHECK_TIME=3
VIP=*
GATEWAY=*
eth=ens192
#mysql is working MYSQL_IS_OK is 1 , mysql down MYSQL_IS_OK is 0
keepalived_IS_OK=1
function check_keepalived_status (){
/sbin/arping -I $eth -c 5 -s $VIP $GATEWAY >/dev/null 2>&1
if [ $? = 0 ] ;then
keepalived_IS_OK=1
else
keepalived_IS_OK=0
fi
return $keepalived_IS_OK
}
while [ $CHECK_TIME -ne 0 ]
do
let "CHECK_TIME -= 1"

check_keepalived_status
if [ $keepalived_IS_OK = 1 ] ; then
CHECK_TIME=0
#/bin/systemctl stop keepalived
echo 1
exit 0
fi

if [ $keepalived_IS_OK -eq 0 ] && [ $CHECK_TIME -eq 0 ]
then
#/bin/systemctl stop keepalived
echo 0
exit 1
fi
sleep 3
done

也可以检测ip漂移情况,发现backup有vip就发警告,此脚本部署在backup机器上
vim check_brain_keepalived.sh
#!/bin/bash
# 检查脑裂的脚本,在备节点上进行部署
KP01_VIP=*
KP01_IP=*
KP02_IP=*
while true
do
ping -c 2 -W 3 $KP01_VIP &>/dev/null
if [ $? -eq 0 -a `ip add|grep "$KP01_VIP"|wc -l` -eq 1 ];then
echo "ha is brain."
else
echo "ha is ok"
fi
sleep 5
done

原文地址:https://www.cnblogs.com/yhq1314/p/9930578.html