Automated Operations Tools: Zabbix

I. Deploying Zabbix

1. Configure the master node

  • Prepare the LAMP environment and the Zabbix yum repository
# yum install httpd php mariadb-server -y
# vim /etc/my.cnf
[mysqld]
log-bin=master-log
innodb_file_per_table=ON
skip_name_resolve=ON
# systemctl start mariadb
# systemctl enable mariadb
# vim /etc/yum.repos.d/zabbix.repo
[zabbix]
name=zabbix
baseurl=https://mirrors.tuna.tsinghua.edu.cn/zabbix/zabbix/3.4/rhel/7/x86_64/
gpgcheck=0
[non-supported]
name=non-supported
baseurl=https://mirrors.tuna.tsinghua.edu.cn/zabbix/non-supported/rhel/7/x86_64/
gpgcheck=0
  • Install and configure Zabbix
# yum install zabbix-server-mysql zabbix-web-mysql zabbix-agent -y
# mysql
MariaDB [(none)]> create database zabbix character set utf8 collate utf8_bin;
MariaDB [(none)]> grant all privileges on zabbix.* to zabbix@localhost identified by 'zbxpass';
MariaDB [(none)]> grant all privileges on zabbix.* to zabbix@127.0.0.1 identified by 'zbxpass';
MariaDB [(none)]> grant all privileges on zabbix.* to zabbix@'192.168.0.%' identified by 'zbxpass';
MariaDB [(none)]> quit
# zcat /usr/share/doc/zabbix-server-mysql*/create.sql.gz | mysql -uzabbix -h192.168.0.8 -pzbxpass zabbix
# vim /etc/zabbix/zabbix_server.conf
DBHost=192.168.0.8
DBPassword=zbxpass
# systemctl start zabbix-server
# systemctl enable zabbix-server
# vim /etc/httpd/conf.d/zabbix.conf
php_value date.timezone Asia/Shanghai
# systemctl start httpd
# systemctl enable httpd

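Open http://192.168.0.8/zabbix/ in a browser. The default username and password are Admin/zabbix.

2. Configure the monitored node

  • Configure the Zabbix yum repository, the same as on the master node
  • Install zabbix-agent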
# yum install zabbix-agent zabbix-sender
  • Configure the agent parameters
# vim /etc/zabbix/zabbix_agentd.conf
# IP address of the Zabbix master; using a hostname is recommended
Server=192.168.0.8
ServerActive=192.168.0.8
Hostname=node01.zabbix.com
  • Start the agent
# systemctl start zabbix-agent
# systemctl enable zabbix-agent
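
The manual edits to zabbix_agentd.conf above can also be scripted. The following is a minimal sketch and not part of the original procedure: `set_conf_value` is a hypothetical helper that rewrites an existing `Key=value` line or appends one if no active line exists. It assumes GNU sed (as shipped with CentOS 7) and values that contain neither `|` nor `&`.

```shell
# Hypothetical helper (not part of Zabbix): set KEY=VALUE in a Zabbix-style config file.
# Usage: set_conf_value FILE KEY VALUE
set_conf_value() {
    conf_file=$1
    conf_key=$2
    conf_value=$3
    if grep -q "^${conf_key}=" "$conf_file"; then
        # An active (uncommented) line exists: rewrite it in place.
        sed -i "s|^${conf_key}=.*|${conf_key}=${conf_value}|" "$conf_file"
    else
        # No active line: append one at the end of the file.
        printf '%s=%s\n' "$conf_key" "$conf_value" >> "$conf_file"
    fi
}

# Example (run as root on the monitored node, then restart zabbix-agent):
# set_conf_value /etc/zabbix/zabbix_agentd.conf Server 192.168.0.8
# set_conf_value /etc/zabbix/zabbix_agentd.conf ServerActive 192.168.0.8
# set_conf_value /etc/zabbix/zabbix_agentd.conf Hostname node01.zabbix.com
```

Keeping comments on their own lines matters here: the agent configuration format does not strip a trailing `#comment` from a value.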

II. Monitoring systems

1. Basic capabilities

Data collection:

  • ssh/telnet
  • SNMP
  • IPMI
  • JMX
  • agent

Data storage:

  • SQL
  • NoSQL
  • rrd

Visualization:

  • grafana

Alerting:

2. Zabbix

  • zabbix server
  • zabbix database(MySQL)
  • zabbix web gui(LAMP)
  • zabbix proxy
  • zabbix agent

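3. Basic monitoring terminology

Host -- Host group

Item -- Application

Trigger: a threshold; crossing it generates a trigger event

Action: conditions and operations

III. Basic Zabbix monitoring workflow (the steps below are performed in the web GUI)

1. Add a host and a host group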

Configuration -- Hosts -- Create host -- Add

Host name: node01.zabbix.com
Visible name: node01
New group: MyServers
Agent interfaces: 
	IP address: 192.168.0.9
	Port: 10050

2. Create an item

Configuration -- Hosts -- Items -- Create item -- Add

Item:
	Name: inbound packets
	Type: Zabbix agent
	Key: net.if.in[eth0,packets]
	Host interface: 192.168.0.9:10050
	Type of information: Numeric(unsigned)  # unsigned integer
	Units: packets/second
	Update interval: 5s
	History storage period: 90d  # keep history data for 90 days
	Trend storage period: 365d  # trend data
	Show value: As is  # value mapping (no conversion)
	New application: net traffic
	Populates host inventory field: None  # whether to populate the host inventory
Preprocessing:  # data preprocessing
	Preprocessing steps:
		Name: Change per second  # compute the change per second
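
The net.if.in key returns an ever-growing counter, so the "Change per second" step is what turns it into a rate. A minimal sketch of that arithmetic follows; `change_per_second` is a hypothetical helper for illustration, not a Zabbix command, and the sample numbers are made up.

```shell
# Illustrative only: the arithmetic behind the "Change per second" step.
# Usage: change_per_second PREV_VALUE PREV_TIME CUR_VALUE CUR_TIME  (times in seconds)
change_per_second() {
    awk -v pv="$1" -v pt="$2" -v cv="$3" -v ct="$4" \
        'BEGIN { print (cv - pv) / (ct - pt) }'
}

# A counter that grew from 1000 to 1500 packets over a 5-second interval:
change_per_second 1000 100 1500 105   # prints 100
```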

3. Clone the item

Configuration -- Hosts -- Items -- inbound packets -- Clone -- Add

Name: inbound bytes
Key: net.if.in[eth0,bytes]
Units: Bps

Name: outbound packets
Key: net.if.out[eth0,packets]
Units: packets/second

Name: outbound bytes
Key: net.if.out[eth0,bytes]
Units: Bps

4. Create a trigger

Configuration -- Hosts -- Triggers -- Create trigger -- Add

Name: inbound packets too fast
Expression: {node01.zabbix.com:net.if.in[eth0,packets].last(#1)}>100
	Add:
		item: node01: inbound packets
		Function: last()-Last(most recent) T value
		Last of(T): 1 Count
		Result > 100
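OK event generation: Expression  # how recovery (OK) events are generated
PROBLEM event generation mode: Single  # the problem event is generated only once
OK event closes: All problems  # on recovery, close all problem events

5. Create an action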

An action is event driven: an event triggers it.

  • conditions
  • operations
  • OK -> PROBLEM: operations
  • PROBLEM -> OK: recovery operations
  • acknowledgement operations

Operation types:

  • remote command
  • send message

1. Install nginx on node01

# yum install nginx -y
# systemctl start nginx

2. Add an item for nginx

Configuration -- Hosts -- Items -- Create item -- Add

Name: nginx service state
Key: net.tcp.port[192.168.0.9,80]
Update interval: 5s
Show value: Service state
New application: nginx status

3. Define a trigger

Configuration -- Hosts -- Triggers -- Create trigger -- Add

Name: nginx down
Severity: High
Expression: {node01.zabbix.com:net.tcp.port[192.168.0.9,80].last(#3)}=0
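
In this trigger syntax, last(#3) refers to the third most recent value, not to the last three values, so the expression above fires when the check from three polls ago returned 0. The sketch below illustrates that selection; `last_nth` is a hypothetical helper and the sample history is made up.

```shell
# Illustrative helper: return the Nth most recent value from a history list.
# Usage: last_nth N VALUE...  (values listed oldest first, newest last)
last_nth() {
    nth=$1
    shift
    # Position of the wanted value, counted from the oldest entry.
    pos=$(( $# - nth + 1 ))
    [ "$pos" -ge 1 ] || return 1   # not enough history collected yet
    shift $(( pos - 1 ))
    printf '%s\n' "$1"
}

# Port checks, oldest to newest: up, up, down, up, up
last_nth 3 1 1 0 1 1   # prints 0: the third most recent check failed
last_nth 1 1 1 0 1 1   # prints 1: the latest check succeeded
```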

4. Define an action with event source Triggers

Configuration -- Actions -- Create action (note: select Triggers as the event source) -- Add

Action:
	Name: nginx service
	Type of calculation: And  # both conditions below must be met to trigger the operations
	Conditions:
		Trigger = node01: nginx down
		Maintenance status not in maintenance  # not during a maintenance period
Operations:
	New:
		Steps: 1-1
		Operation type: Remote command
		Target list: Current host
		Type: Custom script
		Execute on: Zabbix agent
		Commands: sudo /usr/bin/systemctl restart nginx.service
Recovery operations:
Acknowledgement operations:

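5. Running remote commands through the agent requires granting the zabbix user sudo privileges and enabling remote commands in the agent configuration file. On node01, configure the following: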

[root@node01 ~]# visudo
root    ALL=(ALL)   ALL
zabbix  ALL=(ALL)   NOPASSWD: ALL
[root@node01 ~]# vim /etc/zabbix/zabbix_agentd.conf
EnableRemoteCommands=1
[root@node01 ~]# systemctl restart zabbix-agent

6. Create a media type to send alerts by email

1. Install mailx on the master node so that mail can be sent from scripts

[root@master ~]# yum install mailx -y

2. Add the media type

Administration -- Media types -- Create media type -- Add

Media type:
	Name: local email
	Type: Email
	SMTP server: localhost
	SMTP server port: 25
	SMTP helo: localhost
	SMTP email: zabbix@localhost
	Connection security: None
	Authentication: None
Options:
	Concurrent sessions: Unlimited

3. Add media for the Admin user

Administration -- Users -- Admin -- Media -- Add -- Update

Type: local email
Send to: dongfei@localhost

4. Add an escalation step to the nginx service action

Configuration -- Actions -- nginx service -- Operations -- New -- Add -- Update

Steps: 2-2
Operation type: Send message
Send to Users: Admin (Zabbix Administrator)
Send only to: local email
Recovery operations:  # send an email after recovery
	Send to Users: Admin (Zabbix Administrator)
	Send only to: local email

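5. Test: change the nginx port to 8080, kill the nginx process, and watch the monitoring data. Then, on the master node, switch to the dongfei user and run the mail command to read the alert.

IV. Macros -- predefined text substitution patterns

Zabbix macros can be defined at three levels:

  • Global
  • Template
  • Host

1. Built-in macros

Syntax: {MACRO_NAME}

Reference: https://www.zabbix.com/documentation/3.4/manual/appendix/macros/supported_by_location

2. User-defined macros

Syntax: {$MACRO_NAME}

Global macros: Administration -- General -- Macros (drop-down list on the right) -- Add -- Update

Host macros: Configuration -- Hosts -- node01 -- Macros -- Add -- Update

Template macros: Configuration -- Templates -- Template OS Linux -- Macros -- Add -- Update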
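
When the same user macro is defined at more than one level, Zabbix uses the most specific definition: host first, then template, then global. The sketch below only illustrates that lookup order; `resolve_macro` and the macro `{$NGINX_PORT}` are hypothetical, and treating an empty string as "not defined" is a simplification.

```shell
# Illustrative only: pick the effective value of a user macro.
# Usage: resolve_macro HOST_VALUE TEMPLATE_VALUE GLOBAL_VALUE
# Pass an empty string for a level where the macro is not defined.
resolve_macro() {
    for macro_value in "$1" "$2" "$3"; do
        if [ -n "$macro_value" ]; then
            printf '%s\n' "$macro_value"
            return 0
        fi
    done
    return 1   # macro not defined at any level
}

# {$NGINX_PORT} (a hypothetical macro) set to 80 globally and 8080 on the host:
resolve_macro 8080 "" 80   # prints 8080
resolve_macro "" "" 80     # prints 80
```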

V. Templates

1. Link a template to a host: Configuration -- Hosts -- node01 -- Templates

2. Create a custom template: Configuration -- Templates -- Create template

Template name: my template
Visible name: template for os linux
New group: my template

3. Import a template: Configuration -- Templates -- Import

4. Download templates from https://share.zabbix.com/ and locate the project's GitHub repository

# yum install git -y
# git clone https://github.com/cuimingkun/zbx_tem_redis.git
# sz zbx_tem_redis/redis_templates_for_zbx_3.4.xml  # download to the local Windows machine, then import the template through the web GUI
# scp zbx_tem_redis/userparameter_redis_lld_plus.conf node01:/etc/zabbix/zabbix_agentd.d/  # the custom key configuration file must be placed on the agent

VI. Custom keys

1. Define a key directly

1. Define it on the agent

[root@node01 ~]# vim /etc/zabbix/zabbix_agentd.d/test.conf
UserParameter=memory.used,/usr/bin/free | /usr/bin/awk '/^Mem/{print $3}'
UserParameter=memory.shm,/usr/bin/free | /usr/bin/awk '/^Mem/{print $5}'
[root@node01 ~]# systemctl restart zabbix-agent.service

2. Test from the master

[root@master ~]# yum install zabbix-get -y
[root@master ~]# zabbix_get -s node01 -p 10050 -k "memory.used"
181076  # the value returned
[root@master ~]# zabbix_get -s node01 -p 10050 -k "memory.shm"

2. Passing parameters to a key

[root@node01 ~]# vim /etc/zabbix/zabbix_agentd.d/memory.conf
# awk's $2 must be escaped as $$2 so the agent does not treat it as a key parameter
UserParameter=memory.usage[*],/usr/bin/awk '/^$1/{print $$2}' /proc/meminfo
[root@node01 ~]# systemctl restart zabbix-agent.service
[root@master ~]# zabbix_get -s node01 -p 10050 -k "memory.usage[MemFree]"
[root@master ~]# zabbix_get -s node01 -p 10050 -k "memory.usage[Shmem]"
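
To see why the escape is needed, it helps to look at the command the agent ends up running. The sketch below mimics the substitution for the first parameter only: `$$` becomes a literal `$`, and `$1` becomes the value passed in the key's brackets. `expand_userparameter` is a hypothetical helper, not an agent feature, and it assumes the parameter contains no `/` characters.

```shell
# Hypothetical helper: mimic how a flexible UserParameter command is expanded.
# Usage: expand_userparameter COMMAND_TEMPLATE FIRST_PARAMETER
expand_userparameter() {
    printf '%s\n' "$1" | sed -e 's/\$\$/@@DOLLAR@@/g' \
                             -e "s/\\\$1/$2/g" \
                             -e 's/@@DOLLAR@@/$/g'
}

# The template from memory.conf with the key parameter MemFree:
expand_userparameter "/usr/bin/awk '/^\$1/{print \$\$2}' /proc/meminfo" MemFree
# prints: /usr/bin/awk '/^MemFree/{print $2}' /proc/meminfo
```

Without the doubled dollar sign, `$2` would be replaced by the (empty) second key parameter and awk would print the whole matching line instead of the number.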

3. Create items on the host

Name: memory MemFree
Key: memory.usage[MemFree]  # pass the parameter MemFree to the key to get the free memory value
New application: memory stats
Name: memory Buffers
Key: memory.usage[Buffers]

VII. Discovery

1. Create a discovery rule

Configuration -- Discovery -- Create discovery rule -- Add

Name: My Net 1
IP range: 192.168.0.1-20
Update interval: 30s  # for testing only: scan every 30s
Checks: Zabbix agent "system.uname"
Device uniqueness criteria: Zabbix agent "system.uname"
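
The IP range 192.168.0.1-20 is shorthand for every address from 192.168.0.1 to 192.168.0.20, each of which is probed on every run of the rule. The sketch below expands such a range for illustration; `expand_ip_range` is a hypothetical helper that handles only this "range in the last octet" form.

```shell
# Illustrative helper: expand a range such as 192.168.0.1-20 into single addresses.
# Only the "range in the last octet" form used above is handled.
expand_ip_range() {
    range_prefix=${1%.*}          # 192.168.0
    range_last=${1##*.}           # 1-20
    range_start=${range_last%-*}  # 1
    range_end=${range_last#*-}    # 20
    i=$range_start
    while [ "$i" -le "$range_end" ]; do
        printf '%s.%s\n' "$range_prefix" "$i"
        i=$(( i + 1 ))
    done
}

# List the addresses the rule will probe, one per line:
expand_ip_range 192.168.0.1-20
```

With a 30s update interval every listed address is contacted twice a minute, which is why the short interval is suitable for testing only.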

2. Install the agent on node02

[root@node02 ~]# yum install zabbix-agent zabbix-sender -y
[root@node02 ~]# vim /etc/zabbix/zabbix_agentd.conf
Server=master.zabbix.com
ServerActive=master.zabbix.com
Hostname=node02.zabbix.com
[root@node02 ~]# systemctl start zabbix-agent.service 
[root@node02 ~]# systemctl enable zabbix-agent.service

3. Add a discovery action

Configuration -- Actions -- Event source(Discovery) -- Create action -- Add

Action:
	Name: Add My Net Hosts
	Type of calculation: And
	Conditions:
		A	Discovery rule = My Net 1
		B	Discovery status = Discovered
Operations:
	Operations:
		Send message to users: Admin (Zabbix Administrator) via local email
		Add host
		Link to templates: Template OS Linux

VIII. Active monitoring (passive mode is the default)

Basic agent configuration:

  • ServerActive=master.zabbix.com
  • Hostname=node02.zabbix.com
  • HostnameItem=system.hostname

1. Sending data through active checks

Configuration -- Hosts -- Items -- Create item -- Add

Item:
	Name: net traffic in bytes
	Type: Zabbix agent (active)  # the agent pushes data to zabbix_server on its own initiative
	Units: Bps
	Applications: net traffic
Preprocessing:
	Preprocessing steps:
		Change per second

2. Sending data with zabbix_sender

Configuration -- Hosts -- Items -- Create item -- Add

Name: test sender metric
Type: Zabbix trapper
Key: test.sender.metric
New application: sender data

Send the data from node02

[root@node02 ~]# zabbix_sender -z master.zabbix.com -s "node02.zabbix.com" -k "test.sender.metric" -o "875"
[root@node02 ~]# zabbix_sender -z master.zabbix.com -s "node02.zabbix.com" -k "test.sender.metric" -o "`free -m |awk '/^Mem/{print $3}'`"
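
When several values need to be sent, zabbix_sender can also read them from a file with -i, one whitespace-separated `<hostname> <key> <value>` entry per line. The sketch below formats such lines; `sender_line`, the file path, and the second key `test.sender.memory` are hypothetical (the second key would need its own trapper item).

```shell
# Illustrative helper: format one line of a zabbix_sender input file.
# Each line is "<hostname> <key> <value>", separated by whitespace.
sender_line() {
    printf '%s %s %s\n' "$1" "$2" "$3"
}

sender_line node02.zabbix.com test.sender.metric 875

# Collect several values, then send them in one call:
# {
#     sender_line node02.zabbix.com test.sender.metric 875
#     sender_line node02.zabbix.com test.sender.memory "$(free -m | awk '/^Mem/{print $3}')"
# } > /tmp/sender_input.txt
# zabbix_sender -z master.zabbix.com -i /tmp/sender_input.txt
```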

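IX. Web monitoring

Monitor a given site's download speed, page response time, and response code.

web.test.in[Scenario,Step,bps]: download speed
web.test.time[Scenario,Step]: response time
web.test.rspcode[Scenario,Step]: response code

Create a web scenario: Configuration -- Hosts -- Web -- Create web scenario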

Scenario:
	Name: node02 web ui
	New application: node02 web ui performance
	Update interval: 10s
	Agent: Chrome 38.0(Linux)
Steps: 1:	home page	15s	http://192.168.0.10/index.html		200
	Add:
		Name: home page
		URL: http://192.168.0.10/index.html
		Retrieve only headers: √
		Required status codes: 200

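X. SNMP monitoring

SNMP: Simple Network Management Protocol

  • agent/manager
  • Net-SNMP
  • net-snmp-utils

SNMP has three versions:

  • v1
  • v2c: the community name serves as the password; the default is public
  • v3: supports authentication and encrypted transport

MIB: Management Information Base; OID: Object Identifier

1. Set up SNMP support for Zabbix

Install on the monitored host:

# yum install net-snmp net-snmp-utils -y  # net-snmp-utils is used for testing

Configure and start the service: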

# vim /etc/snmp/snmpd.conf
#view    systemview    included   .1.3.6.1.2.1.1
#view    systemview    included   .1.3.6.1.2.1.25.1.1
view    systemview    included   .1.3.6.1
# systemctl start snmpd
# systemctl enable snmpd

Test locally:

# snmptranslate -Tp .1.3.6.1.2.1 |more
# snmpget -v 2c -c public 192.168.0.10 .1.3.6.1.2.1.1.1.0  # get the system description
# snmpwalk -v 2c -c public 192.168.0.10 .1.3.6.1.2.1.25.4.2.1.2  # list the running processes

2. Configure monitoring in Zabbix

Configuration -- Hosts -- Create Host -- Add

Host name: node02
Visible name: node02
New group: my linux servers
SNMP interfaces: 192.168.0.10 DNS 161

-- Item -- Create Item -- Add

Item:
	Name: net traffic in bytes
	Type: SNMPv2 agent
	Key: net.if.in.bytes
	SNMP OID: .1.3.6.1.2.1.2.2.1.10.2
	SNMP community: public
	Units: Bps
	Update interval: 5s
	New application: net traffic
Preprocessing:
	Preprocessing steps: Change per second
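
The OID in the item above is the IF-MIB counter ifInOctets (column 10 of the interface table) followed by the interface index 2. The sketch below builds the inbound and outbound counter OIDs for any index; the helper names are hypothetical, and the index of a given interface has to be looked up on the target host.

```shell
# Illustrative helpers: build IF-MIB counter OIDs for a given interface index.
IF_TABLE_ENTRY=.1.3.6.1.2.1.2.2.1   # IF-MIB::ifEntry

if_in_octets_oid()  { printf '%s.10.%s\n' "$IF_TABLE_ENTRY" "$1"; }  # ifInOctets
if_out_octets_oid() { printf '%s.16.%s\n' "$IF_TABLE_ENTRY" "$1"; }  # ifOutOctets

if_in_octets_oid 2    # prints .1.3.6.1.2.1.2.2.1.10.2
if_out_octets_oid 2   # prints .1.3.6.1.2.1.2.2.1.16.2

# To find the index of an interface (requires net-snmp-utils and a reachable agent):
# snmpwalk -v 2c -c public 192.168.0.10 .1.3.6.1.2.1.2.2.1.2   # ifDescr
```

Both counters are in octets (bytes), so after "Change per second" the values are bytes per second.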

XI. JMX monitoring

JMX: Java Management Extensions

1. Install and configure Tomcat on node02

[root@node02 ~]# yum install java-1.8.0-openjdk-devel tomcat tomcat-admin-webapps tomcat-webapps tomcat-docs-webapps -y
[root@node02 ~]# vim /etc/tomcat/tomcat.conf  # add the following setting
CATALINA_OPTS="-Djava.rmi.server.hostname=192.168.0.10 -Djavax.management.builder.initial= -Dcom.sun.management.jmxremote=true -Dcom.sun.management.jmxremote.port=12345 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false"
[root@node02 ~]# systemctl start tomcat
[root@node02 ~]# ss -tnl |grep 12345
LISTEN     0      50          :::12345
[root@node02 ~]# systemctl enable tomcat
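
The CATALINA_OPTS value above is one long string, which makes typos easy. As a sketch, it can be assembled from a host address and a port; `jmx_opts` is a hypothetical helper that reproduces the same options. Authentication and SSL stay disabled as in the original setting, which is only reasonable on a trusted network.

```shell
# Illustrative helper: build the JMX-related JVM options from a host and a port.
# Usage: jmx_opts RMI_HOSTNAME JMX_PORT
jmx_opts() {
    printf '%s' "-Djava.rmi.server.hostname=$1"
    # printf repeats the format once per remaining argument.
    printf ' %s' \
        "-Djavax.management.builder.initial=" \
        "-Dcom.sun.management.jmxremote=true" \
        "-Dcom.sun.management.jmxremote.port=$2" \
        "-Dcom.sun.management.jmxremote.ssl=false" \
        "-Dcom.sun.management.jmxremote.authenticate=false"
    printf '\n'
}

# Print the line to place in /etc/tomcat/tomcat.conf:
printf 'CATALINA_OPTS="%s"\n' "$(jmx_opts 192.168.0.10 12345)"
```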

2. Install and configure zabbix-java-gateway on the Zabbix server (if many JVMs need to be monitored, install the Java gateway on a dedicated server)

[root@master ~]# yum install zabbix-java-gateway -y
[root@master ~]# vim /etc/zabbix/zabbix_java_gateway.conf
LISTEN_PORT=10052
START_POLLERS=5
[root@master ~]# systemctl start zabbix-java-gateway
[root@master ~]# ss -tnl |grep 10052
LISTEN     0      50          :::10052
[root@master ~]# vim /etc/zabbix/zabbix_server.conf
JavaGateway=192.168.0.8
JavaGatewayPort=10052
StartJavaPollers=5
[root@master ~]# systemctl restart zabbix-server

3. Configure monitoring in the Zabbix web GUI

Configuration -- Hosts -- Create Host -- Add

JMX interfaces: 192.168.0.10 12345
Linked templates: Template App Apache Tomcat JMX

XII. Distributed monitoring with Zabbix

1. Configure zabbix_proxy (192.168.0.11)

[root@zabbix_proxy ~]# yum install mariadb-server zabbix-proxy-mysql zabbix-get zabbix-agent zabbix-sender -y
[root@zabbix_proxy ~]# vim /etc/my.cnf
[mysqld]
skip_name_resolve=0
[root@zabbix_proxy ~]# systemctl start mariadb
[root@zabbix_proxy ~]# systemctl enable mariadb
[root@zabbix_proxy ~]# mysql
MariaDB [(none)]> CREATE DATABASE zbxproxy character set utf8 collate utf8_bin;
MariaDB [(none)]> GRANT ALL PRIVILEGES ON zbxproxy.* TO zabbix@localhost IDENTIFIED BY 'zbxpass';
[root@zabbix_proxy ~]# zcat /usr/share/doc/zabbix-proxy-mysql-3.4.13/schema.sql.gz |mysql -uzabbix -pzbxpass zbxproxy
[root@zabbix_proxy ~]# vim /etc/zabbix/zabbix_proxy.conf
Server=192.168.0.8
# Note: this hostname must be resolvable
Hostname=zabbix_proxy
ListenPort=10051
DBName=zbxproxy
DBUser=zabbix
DBPassword=zbxpass
HeartbeatFrequency=20
ConfigFrequency=10
DataSenderFrequency=1
[root@zabbix_proxy ~]# systemctl start zabbix-proxy.service 
[root@zabbix_proxy ~]# systemctl enable zabbix-proxy.service

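2. Configure the proxy in the Zabbix web GUI

Administration -- Proxies -- Create proxy -- Add

Proxy name: zabbix_proxy  # this name must be resolvable

3. Add a monitored host behind the proxy. Note: the agent on that host must allow monitoring by the proxy (its Server setting must include the proxy's address)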

Configuration -- Hosts -- Create host -- Add

Host:
	Host name: master.dongfei.tech
	Visible name: k8s_master
	New group: my linux servers
	Agent interfaces: 192.168.0.12 10050
	Monitored by proxy: zabbix_proxy
Templates:
	Linked templates: Template OS Linux

4. Discovery through the proxy

Configuration -- Discovery -- Create discovery rule -- Add

Name: My Net 2
Discovery by proxy: zabbix_proxy
IP range: 192.168.0.1-20
Update interval: 1h
Checks: Zabbix agent "system.uname"
Device uniqueness criteria: Zabbix agent "system.uname"
Original article: https://www.cnblogs.com/L-dongf/p/9589243.html