Prometheus 监控 mysql

MySQL Server Exporter

用于Prometheus收集MySQL 服务器的指标的 export

支持版本:

  • MySQL >= 5.6.
  • MariaDB >= 10.1

开始

下载地址:

https://github.com/prometheus/mysqld_exporter/releases/download/v0.12.1/mysqld_exporter-0.12.1.linux-amd64.tar.gz

添加一个用户授权

GRANT PROCESS, REPLICATION CLIENT ON *.* TO 'pron'@'localhost' IDENTIFIED BY 'Prometh.123' WITH MAX_USER_CONNECTIONS 3;

建议设置对用户的最大连接数限制,以免在高峰期导致业务挂掉。

运行

使用环境变量启动

export DATA_SOURCE_NAME='user:password@(hostname:3306)/'
# 通过socket连接
export DATA_SOURCE_NAME=pron:Prometh.123@unix(/mysql_pxc/mysql-smy.sock)/
nohup ./mysqld_exporter &

使用配置文件 ~/.my.cnf 启动

cat <<EOF> ~/.my.cnf
[client]
user=pron
password=Prometh.123
EOF

# 启动
nohup ./mysqld_exporter --config.my-cnf=~/.my.cnf &

curl localhost:9104/metrics 即可访问到数据

......
mysql_global_status_buffer_pool_page_changes_total{operation="flushed"} 1.270503e+07
mysql_global_status_buffer_pool_page_changes_total{operation="lru_flushed"} 0
mysql_global_status_buffer_pool_page_changes_total{operation="made_not_young"} 0
mysql_global_status_buffer_pool_page_changes_total{operation="made_young"} 6436
# HELP mysql_global_status_buffer_pool_pages Innodb buffer pool pages by state.
# TYPE mysql_global_status_buffer_pool_pages gauge
mysql_global_status_buffer_pool_pages{state="data"} 562310
mysql_global_status_buffer_pool_pages{state="free"} 2.563656e+06
mysql_global_status_buffer_pool_pages{state="misc"} 19754
mysql_global_status_buffer_pool_pages{state="old"} 207408
# HELP mysql_global_status_bytes_received Generic metric from SHOW GLOBAL STATUS.
# TYPE mysql_global_status_bytes_received untyped
mysql_global_status_bytes_received 1.18030182493e+11
# HELP mysql_global_status_bytes_sent Generic metric from SHOW GLOBAL STATUS.
# TYPE mysql_global_status_bytes_sent untyped
mysql_global_status_bytes_sent 1.410236025522e+12
# HELP mysql_global_status_commands_total Total number of executed MySQL commands.
# TYPE mysql_global_status_commands_total counter
mysql_global_status_commands_total{command="admin_commands"} 2.25534e+06
......

添加到 Prometheus

 - job_name: 'mysqld'
   static_configs:
    - targets:
      - 192.168.1.2:9104

重启Prometheus即可

grafana 推荐使用模板

https://grafana.com/grafana/dashboards/7362

告警规则

mysql.yaml

groups:
- name: GaleraAlerts
  rules:
  - alert: MySQLGaleraNotReady
    expr: mysql_global_status_wsrep_ready != 1
    for: 5m
    labels:
      severity: warning
    annotations:
      description: '{{$labels.job}} on {{$labels.instance}} is not ready.'
      summary: Galera cluster node not ready
  - alert: MySQLGaleraOutOfSync
    expr: (mysql_global_status_wsrep_local_state != 4 and mysql_global_variables_wsrep_desync
      == 0)
    for: 5m
    labels:
      severity: warning
    annotations:
      description: '{{$labels.job}} on {{$labels.instance}} is not in sync ({{$value}}
        != 4).'
      summary: Galera cluster node out of sync
  - alert: MySQLGaleraDonorFallingBehind
    expr: (mysql_global_status_wsrep_local_state == 2 and mysql_global_status_wsrep_local_recv_queue
      > 100)
    for: 5m
    labels:
      severity: warning
    annotations:
      description: '{{$labels.job}} on {{$labels.instance}} is a donor (hotbackup)
        and is falling behind (queue size {{$value}}).'
      summary: xtradb cluster donor node falling behind
  - alert: MySQLReplicationNotRunning
    expr: mysql_slave_status_slave_io_running == 0 or mysql_slave_status_slave_sql_running
      == 0
    for: 2m
    labels:
      severity: critical
    annotations:
      description: Slave replication (IO or SQL) has been down for more than 2 minutes.
      summary: Slave replication is not running
  - alert: MySQLReplicationLag
    expr: (mysql_slave_lag_seconds > 30) and on(instance) (predict_linear(mysql_slave_lag_seconds[5m],
      60 * 2) > 0)
    for: 1m
    labels:
      severity: critical
    annotations:
      description: The mysql slave replication has fallen behind and is not recovering
      summary: MySQL slave replication is lagging
  - alert: MySQLReplicationLag
    expr: (mysql_heartbeat_lag_seconds > 30) and on(instance) (predict_linear(mysql_heartbeat_lag_seconds[5m],
      60 * 2) > 0)
    for: 1m
    labels:
      severity: critical
    annotations:
      description: The mysql slave replication has fallen behind and is not recovering
      summary: MySQL slave replication is lagging
  - alert: MySQLInnoDBLogWaits
    expr: rate(mysql_global_status_innodb_log_waits[15m]) > 10
    labels:
      severity: warning
    annotations:
      description: The innodb logs are waiting for disk at a rate of {{$value}} /
        second
      summary: MySQL innodb log writes stalling

个人看法

个人觉得如果只是监控MySQL的话,用pmm2比较好,因为pmm2还有慢sql监控,送上几张图片

原文地址:https://www.cnblogs.com/fsckzy/p/13331200.html