AWS多元复制到EC2机器

需求：把多个rds同步到一个rds,实现跨实例数据查询。

由于AWS rds本身不支持多元复制，只能利用ec2自建mysql，然后把rds数据同步下来。

(1)确定4个rds slave 需要同步的库
 select concat('replicate_wild_do_table=',TABLE_SCHEMA,'.%') from
(select distinct table_schema from information_schema.TABLES where TABLE_SCHEMA not in('information_schema','mysql','awsdms_control','sys','performance_schema'))ma;

然后添加到my.cnf,我的复制相关主要配置如下：

#gtid
master_info_repository = TABLE
relay_log_info_repository = TABLE
gtid_mode = on
binlog_checksum =NONE
enforce_gtid_consistency = 1
log_slave_updates=1
binlog_gtid_simple_recovery
enforce-gtid-consistency=on
transaction_write_set_extraction = XXHASH64
slave_parallel_workers=4
slave_parallel_type=LOGICAL_CLOCK

然后在目标端建表结构，如果用AWS同步工具也可以建表结构，但是很坑，工具转出来的表结构问题很多，导致后面复制经常出问题，因此推荐用mysqldump操作。

(2)用mysqldump导出表结构,然后在目标机器先把表结构建起来
#CALL mysql.rds_stop_replication;
#在rds slave 制度副本停止同步
mysql -hpro-cxxxx.rds.amazonaws.com -uadmin -pxxx -e "show databases;"| grep -vEw "Database|information_schema|mysql|awsdms_control|sys|performance_schema"|xargs mysqldump --single-transaction -hxxxx.rds.amazonaws.com -uadmin -pxxxxx -B -R -d > ./copy_$(date +%F).sql

然后倒到目标机器，有几个源，就这样操作几次。

数据初始化方法1：利用AWS自带收费同步统计DMS，开始尝试过一次，特别坑，好多字段都给我转错了，遇到不少坑，不推荐。

数据初始化方法2：利用mysqldump，但是太慢了，也不推荐。

数据初始化方法3：利用mydumper多线程导入。

系统是Ubuntu 18.04，mysql是5.7.28

(3)导数据
sudo apt-get update -y
apt-get install -y mydumper

mydumper -u admin -p xxxx -h pxxxxs.com -k -R -E -G -c -K -t 6 
myloader -u root -p xxx -S /tmp/mysql.sock -t 6 -v 2 -d /root/sql/export-20200717-160449

接着在源端建好同步用户，并且配置好权限：

create user repl_user@'%' identified by 'xx#';
grant replication slave, replication client on *.* to repl_user@'%';
show grants for repl_user@'%';

最后purge gtid,然后起同步进程：

reset master;
SET @@GLOBAL.GTID_PURGED='6d90ea98-9ff7-11ea-b5b4-0e134e486bba:1-205057735,
38a5a557-a189-11ea-a593-0a7a1a2d24b0:1-2694418,99d788b4-a189-11ea-a11d-0a6b6ede7002:1-15177358,
1fbf39d5-9c10-11ea-8bc2-0a6232fd7ba6:1-56184400,78eb9741-9cd3-11ea-8ce3-06e09c3a69e8:1-200';
CHANGE MASTER TO MASTER_HOST='host1', MASTER_PORT=3306, Master_User='repl_user',Master_Password='xxxx' for channel 'channel1';
CHANGE MASTER TO MASTER_HOST='host2', MASTER_PORT=3306, Master_User='repl_user',Master_Password='xxxx' for channel 'channel2';
CHANGE MASTER TO MASTER_HOST='host3', MASTER_PORT=3306, Master_User='repl_user',Master_Password='xxxx' for channel 'channel3';
CHANGE MASTER TO MASTER_HOST='host4', MASTER_PORT=3306, Master_User='repl_user',Master_Password='xxxx' for channel 'channel4';
start slave;

这里特别需要注意的地方就是gtid了，purge的gtid是上面从rds的slave倒数据停止从rds主库同步数据的时候gtid，从rds 只读副本倒数据不像自建的mysql可以加master-data=2,rds的没有这种操作，需要确保源端的数据没有写，类似利用mysqldump的时候加锁一样。

原理跟自建mysql的建slave类似，只是有一些操作不同，受制于平台，操作方式有一些不一样而已。

slave起来以后，可能会有一些报错，需要排查，比如我遇到的：

Worker 2 failed executing transaction '1fbf39d5-9c10-11ea-8bc2-0a6232fd7ba6:59345201' at master log mysql-bin-changelog.016295, end_log_pos 5465018; Column 8
of table 'fcoin.fcoinjournals' cannot be converted from type 'datetime' to type 'datetime'


mysqlbinlog -vv relay-bin.010875 --base64-output=decode-rows --start-position=694756|more
 relay-bin.000100

#200716 10:05:35 server id 372383214  end_log_pos 825728 CRC32 0xd9ed5587     Table_map: `fcoin`.`fcoinjournals` mapped to number 893
# at 825961
#200716 10:05:35 server id 372383214  end_log_pos 825821 CRC32 0xe8a9f75c     Write_rows: table id 893 flags: STMT_END_F
### INSERT INTO `fcoin`.`fcoinjournals`
### SET
###   @1=816067 /* LONGINT meta=0 nullable=0 is_null=0 */
###   @2=132231 /* INT meta=0 nullable=0 is_null=0 */
###   @3=0.01000 /* DECIMAL(18,5) meta=4613 nullable=0 is_null=0 */
###   @4=1 /* INT meta=0 nullable=0 is_null=0 */
###   @5='2020-07-16 10:05:35.097668' /* DATETIME(6) meta=6 nullable=0 is_null=0 */
###   @6='瓜分F钻活动获取' /* VARSTRING(1536) meta=1536 nullable=0 is_null=0 */
# at 826054
#200716 10:05:35 server id 372383214  end_log_pos 825885 CRC32 0x712ae0e1     Table_map: `fcoin`.`settledetails` mapped to number 895
# at 826118
#200716 10:05:35 server id 372383214  end_log_pos 825989 CRC32 0x23bce7c7     Update_rows: table id 895 flags: STMT_END_F

我遇到这个就是最开始利用aws同步工具，表结构不是自己建的时候，工具给我把字段建错了，后面都是我自己从源端导入。

可能需要用到：

select * from performance_schema.replication_applier_status_by_worker
show slave status;
start slave for channel 'channel1';
start slave for channel 'channel2';
start slave for channel 'channel3';
start slave for channel 'channel3';


STOP SLAVE;
set @@SESSION.GTID_NEXT= '1fbf39d5-9c10-11ea-8bc2-0a6232fd7ba6:62647265';
begin;commit;
SET SESSION GTID_NEXT = AUTOMATIC;
start slave;

类似跳过的，具体的看报错情况来处理。