[PXC] Flow control parameters and status values explained

I. What is flow control (FC), and how does it work?

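Each node receives write-sets and arranges them in global order. Transactions that have been received but not yet applied and committed are held in the node's receive queue.
When the receive queue reaches a certain size, flow control is triggered: the node pauses replication and works through the write-sets already in its receive queue.
Once the receive queue has shrunk to a manageable size, replication resumes.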
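
The pause-and-resume cycle described above can be illustrated with a toy model. This is only a sketch in Python, not Galera's implementation: the function `simulate_flow_control`, its parameters, and the per-tick accounting are all assumptions made for illustration.

```python
def simulate_flow_control(arrivals, apply_rate, upper_limit, lower_limit):
    """Toy model of a receive queue with a pause and a resume threshold.

    arrivals    -- write-sets offered by the cluster on each tick (hypothetical)
    apply_rate  -- write-sets this node applies per tick (hypothetical)
    upper_limit -- queue length at which replication is paused
    lower_limit -- queue length at which replication resumes
    Returns a list of (queue_length, paused) pairs, one per tick.
    """
    queue = 0
    paused = False
    history = []
    for incoming in arrivals:
        if not paused:
            queue += incoming               # replication delivers new write-sets
        queue = max(0, queue - apply_rate)  # the node keeps applying its backlog
        if not paused and queue >= upper_limit:
            paused = True                   # queue too long: pause replication
        elif paused and queue <= lower_limit:
            paused = False                  # backlog drained: resume replication
        history.append((queue, paused))
    return history


history = simulate_flow_control([10, 10, 10, 0, 0], apply_rate=4,
                                upper_limit=12, lower_limit=4)
for tick, (queue, paused) in enumerate(history, start=1):
    print(tick, queue, "paused" if paused else "running")
```

Using separate upper and lower limits keeps the node from flapping between the two states when the queue hovers around a single threshold.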

Flow control is common to all Galera-based cluster systems.

II. What happens during flow control, and which global status values expose it?

mysql>  show global status like 'wsrep_flow%';
+----------------------------------+----------------+
| Variable_name                    | Value          |
+----------------------------------+----------------+
| wsrep_flow_control_paused_ns     | 0              |
| wsrep_flow_control_paused        | 0.000000       |
| wsrep_flow_control_sent          | 0              |
| wsrep_flow_control_recv          | 0              |
| wsrep_flow_control_interval      | [ 1024, 1024 ] |
| wsrep_flow_control_interval_low  | 1024           |
| wsrep_flow_control_interval_high | 1024           |
| wsrep_flow_control_status        | OFF            |
+----------------------------------+----------------+
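
For scripting, output like the table above can be turned into a dictionary. The helper below is a minimal sketch: `parse_status_table` and `SAMPLE` are hypothetical names, and it assumes the two-column layout printed by the mysql client with no `|` character inside a value.

```python
def parse_status_table(text):
    """Turn the two-column ASCII table printed by the mysql client into a dict."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip borders, prompts and "N rows in set" lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if len(cells) != 2 or cells[0] == "Variable_name":
            continue  # skip the header row and anything unexpected
        values[cells[0]] = cells[1]
    return values


SAMPLE = """
+----------------------------------+----------------+
| Variable_name                    | Value          |
+----------------------------------+----------------+
| wsrep_flow_control_paused        | 0.000000       |
| wsrep_flow_control_interval      | [ 1024, 1024 ] |
| wsrep_flow_control_status        | OFF            |
+----------------------------------+----------------+
"""

status = parse_status_table(SAMPLE)
print(status["wsrep_flow_control_status"])
```

All values stay strings; convert them with `int()` or `float()` where a number is needed.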

wsrep_flow_control_paused_ns

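The total time, in nanoseconds, that replication has been paused by flow control (any node may show a nonzero value, so the raw total is not a good monitoring item)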

wsrep_flow_control_paused

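The fraction of time, measured since the last FLUSH STATUS command, during which replication was paused by flow control (initial value 0.0).
It is a ratio between 0.0 and 1.0 rather than a percentage, and ideally it stays close to 0.0.
When it grows large (above 0.6), take steps to relieve flow control: add new nodes, remove slow nodes, or raise wsrep_slave_threads.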
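
A paused fraction for a chosen time window can also be derived from two readings of wsrep_flow_control_paused_ns. The sketch below assumes the two readings were taken `elapsed_seconds` apart on the same node without a restart in between; `paused_fraction` and `needs_attention` are hypothetical helper names, and 0.6 is the threshold quoted above.

```python
NANOSECONDS_PER_SECOND = 1_000_000_000


def paused_fraction(paused_ns_before, paused_ns_after, elapsed_seconds):
    """Share of the sampling window spent paused by flow control (0.0 to 1.0)."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    delta_ns = paused_ns_after - paused_ns_before
    if delta_ns < 0:
        raise ValueError("counter went backwards (server restarted?)")
    # Clamp to 1.0 because the two readings are never perfectly aligned in time.
    return min(1.0, delta_ns / (elapsed_seconds * NANOSECONDS_PER_SECOND))


def needs_attention(fraction, threshold=0.6):
    """True when the paused share exceeds the threshold used in this article."""
    return fraction > threshold


fraction = paused_fraction(0, 30_000_000_000, 60)
print(fraction, needs_attention(fraction))
```

Working from deltas keeps the result tied to the sampling window instead of the whole period since the counters were last reset.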

wsrep_flow_control_sent

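The number of flow control (FC_PAUSE) messages the local node has sent to the cluster. It works well as a monitoring item for identifying which node is causing flow control.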

wsrep_flow_control_recv

The number of flow control (FC_PAUSE) messages the local node has received from the cluster.
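
To find the node that triggers flow control, compare how much wsrep_flow_control_sent grew on each node over the same interval. The sketch below assumes the counters have already been collected into two dictionaries; the function name `flow_control_senders` and the node names are hypothetical.

```python
def flow_control_senders(before, after):
    """Return (node, increase) pairs for nodes whose sent counter grew.

    before, after -- dicts mapping node name to wsrep_flow_control_sent
    The result is sorted with the largest increase first.
    """
    increases = {node: after[node] - before.get(node, 0) for node in after}
    growing = [(node, delta) for node, delta in increases.items() if delta > 0]
    return sorted(growing, key=lambda pair: pair[1], reverse=True)


before = {"node1": 10, "node2": 5, "node3": 0}
after = {"node1": 10, "node2": 47, "node3": 3}
print(flow_control_senders(before, after))
```

The node at the head of the list is the first candidate to inspect for slow applying.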

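wsrep_flow_control_interval

The lower and upper receive-queue limits used by flow control, shown as [ low, high ]. Replication pauses when the queue reaches the upper limit and resumes once it falls back to the lower limit.

wsrep_flow_control_interval_low

The lower limit from wsrep_flow_control_interval.

wsrep_flow_control_interval_high

The upper limit from wsrep_flow_control_interval.

wsrep_flow_control_status

Whether flow control is currently active on this node (ON or OFF).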

III. How to tune flow control?

1. wsrep_slave_threads
The number of threads to use for applying slave write sets.
Sets the number of threads a node uses to apply replicated write-sets.

mysql> show global variables like 'wsrep_slave_threads';
+---------------------+-------+
| Variable_name       | Value |
+---------------------+-------+
| wsrep_slave_threads | 24    |
+---------------------+-------+

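The default value of 1 is far from enough; it should be tuned based on another status value.
On the PXC clusters installed at our company this parameter defaults to 16, which is still not enough.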

2. wsrep_cert_deps_distance
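The average distance between the highest and lowest seqno values that can be applied in parallel.
It indicates how many write-sets could potentially be applied at the same time.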

mysql> show global status like 'wsrep_cert_deps_distance';
+--------------------------+-----------+
| Variable_name            | Value     |
+--------------------------+-----------+
| wsrep_cert_deps_distance | 95.288623 |
+--------------------------+-----------+

wsrep_slave_threads can be sized according to wsrep_cert_deps_distance; there is little benefit in setting it higher than that value.

Note: right after an SST this status value is very high and then falls slowly; during that period it is not a useful reference.

mysql> show global status like 'wsrep_cert_deps_distance';
+--------------------------+-------------+
| Variable_name            | Value       |
+--------------------------+-------------+
| wsrep_cert_deps_distance | 6015.952973 |
+--------------------------+-------------+
1 row in set (0.00 sec)
 
mysql> show global status like 'wsrep_cert_deps_distance';
+--------------------------+-------------+
| Variable_name            | Value       |
+--------------------------+-------------+
| wsrep_cert_deps_distance | 5840.756210 |
+--------------------------+-------------+
1 row in set (0.00 sec)
 
mysql> show global status like 'wsrep_cert_deps_distance';
+--------------------------+-------------+
| Variable_name            | Value       |
+--------------------------+-------------+
| wsrep_cert_deps_distance | 5421.252076 |
+--------------------------+-------------+
1 row in set (0.00 sec)
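
The sizing rule and the SST caveat can be combined in a small check. This is a heuristic sketch only: the function `suggest_applier_threads`, the 5% stability tolerance, and the `max_threads` cap are assumptions, not values taken from Galera or PXC.

```python
def suggest_applier_threads(samples, max_threads, tolerance=0.05):
    """Suggest a wsrep_slave_threads value from wsrep_cert_deps_distance samples.

    samples     -- recent readings of wsrep_cert_deps_distance
    max_threads -- upper bound chosen by the operator (hypothetical)
    tolerance   -- largest relative spread still treated as stable (assumed)
    Returns None while the readings are still settling, e.g. after an SST.
    """
    if len(samples) < 2:
        return None  # a single reading cannot show whether the value is stable
    lowest, highest = min(samples), max(samples)
    if highest <= 0:
        return None
    if (highest - lowest) / highest > tolerance:
        return None  # still drifting, so the value is not a reliable guide yet
    average = sum(samples) / len(samples)
    return max(1, min(max_threads, int(average)))


# The three readings shown above are still falling, so no value is suggested.
print(suggest_applier_threads([6015.952973, 5840.756210, 5421.252076], 32))
```

Once the readings settle, the suggestion is the average distance rounded down and capped at `max_threads`.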

IV. Other parameters and status values

1. wsrep_local_recv_queue_%

mysql> show global status like 'wsrep_local_recv_queue_avg';
+----------------------------+----------+
| Variable_name              | Value    |
+----------------------------+----------+
| wsrep_local_recv_queue_avg | 0.110581 |
+----------------------------+----------+
1 row in set (0.00 sec)

When the node returns a value higher than 0.0 it means that the node cannot apply write-sets as fast as it receives them,
which can lead to replication throttling.
In short: a value above 0.0 means the node is lagging behind replication, which can trigger flow control.

mysql> show global status like 'wsrep_local_recv_queue_m%';
+----------------------------+-------+
| Variable_name              | Value |
+----------------------------+-------+
| wsrep_local_recv_queue_max | 3788  |
| wsrep_local_recv_queue_min | 0     |
+----------------------------+-------+
2 rows in set (0.00 sec)

In addition to this status variable, you can also use wsrep_local_recv_queue_max and wsrep_local_recv_queue_min
to see the maximum and minimum sizes the node recorded for the local received queue.
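
The three receive-queue values can be read together as a quick health summary. The sketch below is an assumption-laden illustration: `recv_queue_report` is a hypothetical helper, the input is a dict of status values as strings, and comparing the peak against an upper flow control limit is a rough heuristic rather than documented behaviour.

```python
def recv_queue_report(status, upper_limit):
    """Summarise the wsrep_local_recv_queue_* values for one node.

    status      -- dict of status variable name to string value
    upper_limit -- flow control upper limit to compare the peak against
    """
    average = float(status["wsrep_local_recv_queue_avg"])
    peak = int(status["wsrep_local_recv_queue_max"])
    lowest = int(status["wsrep_local_recv_queue_min"])
    return {
        "lagging": average > 0.0,                # applying slower than receiving
        "peak_reached_limit": peak >= upper_limit,
        "queue_was_empty": lowest == 0,          # the node caught up at some point
    }


report = recv_queue_report(
    {
        "wsrep_local_recv_queue_avg": "0.110581",
        "wsrep_local_recv_queue_max": "3788",
        "wsrep_local_recv_queue_min": "0",
    },
    upper_limit=1024,
)
print(report)
```

A small average combined with a high peak suggests short bursts of backlog instead of a node that is constantly behind.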

 
Original article: https://www.cnblogs.com/asea123/p/10096814.html