MySQL MTS replication: hitting slave_pending_jobs_size_max

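Test steps:

Stop replication on the slave: stop slave;

On the master, create a large table with 4 million rows.

Start replication on the slave again: start slave;

The slave's error log then keeps printing: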

2018-12-06T10:40:52.616289+08:00 4 [Note] Multi-threaded slave: Coordinator has waited 2431 times hitting slave_pending_jobs_size_max; current event size = 8207.
2018-12-06T10:40:52.647618+08:00 4 [Note] Multi-threaded slave: Coordinator has waited 2441 times hitting slave_pending_jobs_size_max; current event size = 8207.
2018-12-06T10:40:52.679589+08:00 4 [Note] Multi-threaded slave: Coordinator has waited 2451 times hitting slave_pending_jobs_size_max; current event size = 8207.
2018-12-06T10:40:52.711510+08:00 4 [Note] Multi-threaded slave: Coordinator has waited 2461 times hitting slave_pending_jobs_size_max; current event size = 8207.
2018-12-06T10:40:52.750250+08:00 4 [Note] Multi-threaded slave: Coordinator has waited 2471 times hitting slave_pending_jobs_size_max; current event size = 8207.
2018-12-06T10:40:52.785731+08:00 4 [Note] Multi-threaded slave: Coordinator has waited 2481 times hitting slave_pending_jobs_size_max; current event size = 8207.
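
To get a feel for how often the coordinator is stalling, notes like the ones above can be parsed and compared. The following is only a rough sketch: the regular expression is written against the exact message format shown in this post, and the helper names `parse_note` and `summarize` are made up for illustration. Other MySQL versions may word the note differently.

```python
import re
from datetime import datetime

# Pattern matching the MTS note exactly as it appears in this post's error log.
NOTE_RE = re.compile(
    r"^(\S+) \d+ \[Note\] Multi-threaded slave: Coordinator has waited "
    r"(\d+) times hitting slave_pending_jobs_size_max; "
    r"current event size = (\d+)\."
)


def parse_note(line):
    """Return (timestamp, wait_count, event_size) or None if the line does not match."""
    m = NOTE_RE.match(line.strip())
    if m is None:
        return None
    return datetime.fromisoformat(m.group(1)), int(m.group(2)), int(m.group(3))


def summarize(lines):
    """Return (additional waits, elapsed seconds, largest event size) over the given lines."""
    notes = [n for n in (parse_note(l) for l in lines) if n is not None]
    if len(notes) < 2:
        raise ValueError("need at least two matching log lines")
    first, last = notes[0], notes[-1]
    elapsed = (last[0] - first[0]).total_seconds()
    return last[1] - first[1], elapsed, max(n[2] for n in notes)


if __name__ == "__main__":
    sample = [
        "2018-12-06T10:40:52.616289+08:00 4 [Note] Multi-threaded slave: Coordinator has waited 2431 times hitting slave_pending_jobs_size_max; current event size = 8207.",
        "2018-12-06T10:40:52.785731+08:00 4 [Note] Multi-threaded slave: Coordinator has waited 2481 times hitting slave_pending_jobs_size_max; current event size = 8207.",
    ]
    waits, seconds, biggest = summarize(sample)
    print(f"{waits} extra waits in {seconds:.3f}s, largest event {biggest} bytes")
```

Note that the event size reported here (8207 bytes) is far below the 16M limit, which suggests it is the total of pending jobs, not any single event, that keeps reaching the limit.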

A search shows that this problem is reported in two forms.
The first:
Last_Error: Cannot schedule event Rows_query, relay-log name ./db-s18-relay-bin.000448, position 419156572 to Worker thread because its size 18483519 exceeds 16777216 of slave_pending_jobs_size_max.
The second:
[Note] Multi-threaded slave: Coordinator has waited 701 times hitting slave_pending_jobs_size_max; current event size = 8167.

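Bug report: https://bugs.mysql.com/bug.php?id=68462
Both messages suggest that the problem lies in the size of slave_pending_jobs_size_max. The official default is 16M, and the value can be changed dynamically.
About the slave-pending-jobs-size-max parameter
With multi-threaded replication, this is the maximum amount of memory that events pending in the worker queues may occupy. The default is 16M. If memory is plentiful, or replication lag is large, it can be raised. Note that the value should be larger than max_allowed_packet on the master.
slave-pending-jobs-size-max comes into play in the following cases:
1.- If the size of a single event already exceeds the pending-jobs limit (slave-pending-jobs-size-max), an "event too large" error is reported and scheduling returns;
2.- If the event size plus the size of the jobs already pending exceeds slave-pending-jobs-size-max, the coordinator waits until the pending queue shrinks;
3.- If the current worker's queue is full, the coordinator also waits.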
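
The three cases above can be sketched as a small decision function. This is a simplified model for illustration, not the server's actual code: the names `Decision` and `schedule_decision`, and the `worker_queue_cap` parameter, are hypothetical, and real servers may handle oversized events differently depending on version.

```python
from enum import Enum


class Decision(Enum):
    ERROR_TOO_BIG = "error: event exceeds slave_pending_jobs_size_max"
    WAIT_PENDING = "wait: pending jobs would exceed the limit"
    WAIT_WORKER_FULL = "wait: worker queue is full"
    SCHEDULE = "schedule"


def schedule_decision(event_size, pending_size, worker_queue_len,
                      worker_queue_cap, pending_max=16 * 1024 * 1024):
    """Model the coordinator's choice for one event (all sizes in bytes).

    worker_queue_cap is an assumed per-worker queue capacity, used only
    to illustrate case 3.
    """
    # Case 1: a single event larger than the limit can never be queued.
    if event_size > pending_max:
        return Decision.ERROR_TOO_BIG
    # Case 2: the event fits on its own, but not on top of what is pending.
    if pending_size + event_size > pending_max:
        return Decision.WAIT_PENDING
    # Case 3: the target worker has no free slot.
    if worker_queue_len >= worker_queue_cap:
        return Decision.WAIT_WORKER_FULL
    return Decision.SCHEDULE


if __name__ == "__main__":
    # The first error form in this post: an 18483519-byte Rows_query event vs. the 16M limit.
    print(schedule_decision(18483519, 0, 0, 1024))
    # The second form: a small 8207-byte event while the pending total is almost at the limit.
    print(schedule_decision(8207, 16777216 - 100, 0, 1024))
```

In this model the first error form corresponds to case 1 and the repeated "Coordinator has waited" note corresponds to case 2, which is why raising the limit is the natural thing to try.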
 
Check that slave_pending_jobs_size_max is still at its default value:
+-----------------------------+----------+
| Variable_name               | Value    |
+-----------------------------+----------+
| slave_pending_jobs_size_max | 16777216 |
+-----------------------------+----------+
 
Increase the parameter (here to 8 times the default, i.e. 128M):
root@localhost:3306.sock [(none)]>set global slave_pending_jobs_size_max=16777216*8;
Query OK, 0 rows affected (0.02 sec)
root@localhost:3306.sock [(none)]>show variables like '%job%';
+-----------------------------+-----------+
| Variable_name               | Value     |
+-----------------------------+-----------+
| slave_pending_jobs_size_max | 134217728 |
+-----------------------------+-----------+
1 row in set (0.00 sec)
 
Re-running the test, the notes no longer appear.

 
WL#11348: Defaults: Increase Slave's Multi-Threaded Event Applier Buffer
It looks like the next MySQL 8.0 release will raise the default value.
 
 
 
Original article: https://www.cnblogs.com/DataArt/p/10075800.html