Installing the Nagios Service on Linux 6

Nagios

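Nagios is an application for monitoring systems and networks. It watches hosts and services under the conditions you define and sends alerts when their state gets worse and when it recovers.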

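Other Nagios features include:

  1. Monitoring of network services (SMTP, POP3, HTTP, NNTP, PING, and so on);
  2. Monitoring of host resources (processor load, disk usage, and so on);
  3. A simple plugin design that lets users easily add their own service checks;
  4. Parallel service checks;
  5. The ability to define a network hierarchy: "parent" host definitions express the relationships between hosts, which lets Nagios detect and distinguish hosts that are down from hosts that are unreachable;
  6. Notifications to contacts when service or host problems occur and when they are resolved (by email, SMS, or a user-defined method);
  7. Definable event handlers that run when a host or service event occurs and can gather more information for locating the problem;
  8. Automatic log rotation;
  9. Support for redundant monitoring of hosts;
  10. An optional web interface for viewing the current network status, notification and problem history, log files, and so on.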

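Many plugins are available for monitoring different devices and services, including:

  1. HTTP, POP3, IMAP, FTP, SSH, DHCP
  2. CPU load, disk usage, memory usage, current user count
  3. Unix/Linux, Windows, and Netware servers
  4. Routers and switches
  5. And more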
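
Since the plugin design is what makes Nagios extensible, a small example may help. By convention a Nagios plugin prints one status line and exits with 0 (OK), 1 (WARNING), 2 (CRITICAL), or 3 (UNKNOWN). The sketch below is only an illustration of that convention; the helper name check_value and its argument order are hypothetical and do not come from any shipped plugin.

```shell
#!/bin/sh
# Hypothetical helper: compare an integer value against warning and critical
# thresholds and follow the Nagios plugin exit-code convention
# (0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN).
check_value() {
    label=$1 value=$2 warn=$3 crit=$4
    # Reject empty or non-integer input as UNKNOWN.
    for n in "$value" "$warn" "$crit"; do
        case "$n" in
            ''|*[!0-9]*) echo "$label UNKNOWN - non-numeric input"; return 3 ;;
        esac
    done
    if [ "$value" -ge "$crit" ]; then
        echo "$label CRITICAL - value=$value"; return 2
    elif [ "$value" -ge "$warn" ]; then
        echo "$label WARNING - value=$value"; return 1
    fi
    echo "$label OK - value=$value"; return 0
}

# Usage sketch: percentage of / in use, warning at 80, critical at 90.
#   used=$(df -P / | awk 'NR==2 { sub("%", "", $5); print $5 }')
#   check_value DISK "$used" 80 90
```

A real plugin would be a standalone script that ends with the exit status returned by such a function.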

Server

Create the nagios user

# useradd nagios
# passwd nagios
Changing password for user nagios.
New password: 
BAD PASSWORD: it is WAY too short
BAD PASSWORD: is a palindrome
Retype new password: 
passwd: all authentication tokens updated successfully.
#
# groupadd nagcmd
# usermod -G nagcmd nagios
#
# id nagios
uid=501(nagios) gid=501(nagios) groups=501(nagios),502(nagcmd)
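
Note that usermod -G replaces the user's supplementary group list, while usermod -a -G appends to it. On a host where nagios may already belong to other groups, a guarded append is safer. The following is a minimal sketch; the helper name group_list_contains is hypothetical.

```shell
#!/bin/sh
# Return 0 if the space-separated group list in $1 contains group $2.
# The list is expected in the format printed by `id -nG <user>`.
group_list_contains() {
    for g in $1; do
        [ "$g" = "$2" ] && return 0
    done
    return 1
}

# Usage sketch (needs root, so it is left commented out):
#   group_list_contains "$(id -nG nagios)" nagcmd || usermod -a -G nagcmd nagios
```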

Install the prerequisites

Make sure the following packages are installed on the system before continuing.

  1. Apache
  2. GCC compiler
  3. GD library and its development library

These packages can be installed with yum by running:

yum install httpd

yum install gcc

yum install glibc glibc-common

yum install gd gd-devel
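
To install only what is missing, the package list can be checked with rpm -q first. This is a sketch under the assumption of an RPM-based system; the function name missing_packages is hypothetical.

```shell
#!/bin/sh
# Print the names of the packages given as arguments that `rpm -q`
# does not report as installed, one per line.
missing_packages() {
    for pkg in "$@"; do
        rpm -q "$pkg" >/dev/null 2>&1 || echo "$pkg"
    done
}

# Usage sketch (needs root and yum, so it is left commented out):
#   missing=$(missing_packages httpd gcc glibc glibc-common gd gd-devel)
#   [ -z "$missing" ] || yum install -y $missing
```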

Download

# wget http://prdownloads.sourceforge.net/sourceforge/nagios/nagios-4.0.7.tar.gz

# wget http://osdn.dl.sourceforge.net/sourceforge/nagios/nagios-3.0rc1.tar.gz

# wget http://nchc.dl.sourceforge.net/project/nagios/nrpe-2.x/nrpe-2.14/nrpe-2.14.tar.gz

Compile and install Nagios

Unpack the Nagios source archive
# tar -xvf nagios-4.0.7.tar.gz
...
# cd nagios-4.0.7

Run the Nagios configure script, using the user and group created earlier:
# ./configure --with-command-group=nagcmd
...
Review the options above for accuracy. If they look okay, type 'make all' to compile the main program and CGIs.

Compile the Nagios source code
# make all

Install the binaries, the init script, and the sample configuration files, and set the permissions on the runtime directory
# make install
# make install-init
# make install-commandmode
# make install-config
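
Because each build step depends on the previous one, it can help to stop at the first failure instead of typing the commands one by one. The following is a minimal sketch; run_steps is a hypothetical helper, not part of the Nagios build system.

```shell
#!/bin/sh
# Run each argument as a shell command, in order, and stop at the first failure.
run_steps() {
    for step in "$@"; do
        echo ">> $step"
        sh -c "$step" || { echo "step failed: $step" >&2; return 1; }
    done
}

# Usage sketch, from inside the unpacked nagios-4.0.7 directory (as root):
#   run_steps "./configure --with-command-group=nagcmd" "make all" \
#             "make install" "make install-init" \
#             "make install-commandmode" "make install-config"
```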

Configure the web interface

Install the Nagios configuration file for Apache
# make install-webconf
/usr/bin/install -c -m 644 sample-config/httpd.conf /etc/httpd/conf.d/nagios.conf

*** Nagios/Apache conf file installed ***

Copy the sample event handlers
# cp -R contrib/eventhandlers/ /usr/local/nagios/libexec/
# chown -R nagios:nagios /usr/local/nagios/libexec/eventhandlers

Check the configuration file for errors
# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
Nagios Core 4.0.7
...
Total Warnings: 0
Total Errors:   0
Things look okay - No serious problems were detected during the pre-flight check
#
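
The two summary counters in this output can also be checked by a script before the service is (re)started. The sketch below assumes the "Total Warnings:" and "Total Errors:" lines appear exactly as shown above; preflight_clean is a hypothetical name.

```shell
#!/bin/sh
# Read `nagios -v` output on stdin; succeed only if both the
# "Total Warnings" and "Total Errors" counters are present and zero.
preflight_clean() {
    awk '
        /^Total Warnings:/ { warnings = $3; seen_w = 1 }
        /^Total Errors:/   { errors = $3;   seen_e = 1 }
        END { exit !(seen_w && seen_e && warnings == 0 && errors == 0) }
    '
}

# Usage sketch (as root):
#   /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg \
#       | preflight_clean && service nagios restart
```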

Create a nagiosadmin user for logging in to the Nagios web interface

# htpasswd -c /usr/local/nagios/etc/htpasswd.users nagiosadmin
New password: 
Re-type new password: 
Adding password for user nagiosadmin

Start the Apache service so the settings take effect (use restart instead if it is already running)

# service httpd start
Starting httpd: httpd: apr_sockaddr_info_get() failed for nagios
httpd: Could not reliably determine the server's fully qualified domain name, using 127.0.0.1 for ServerName
                                                           [  OK  ]
# 

Enable the services at boot

# chkconfig --add nagios
# chkconfig nagios on
# chkconfig httpd on

Start the Nagios service

# /etc/init.d/nagios start
Starting nagios: done.
# 
# /etc/init.d/nagios status
nagios (pid 16066) is running...

Log in to the web interface

Open http://localhost/nagios/ in a browser and, when prompted, enter the user name (nagiosadmin) and the password you just set.

Monitored client

Install the prerequisites

yum install -y gcc glibc glibc-common gd gd-devel xinetd openssl-devel

Install the Nagios plugins and the NRPE plugin

# useradd nagios
# passwd nagios
#
# tar -xvf nagios-plugins-2.0.3.tar.gz
# cd nagios-plugins-2.0.3
# ./configure --with-nagios-user=nagios --with-nagios-group=nagios
# make
# make install
# chown -R nagios:nagios /usr/local/nagios

#  tar -xvf nrpe-2.14.tar.gz 
#  cd nrpe-2.14
#  ./configure
#  make
#  make install
#  make install-plugin
# make install-daemon    # per the installation guide, the NRPE daemon runs as a service under xinetd
# make install-daemon-config
# make install-xinetd
#
# chkconfig --add xinetd
# chkconfig xinetd on

Add the monitoring host's IP address to the only_from line, the last setting in /etc/xinetd.d/nrpe

# tail /etc/xinetd.d/nrpe 
        port            = 5666    
        wait            = no
        user            = nagios
        group           = nagios
        server          = /usr/local/nagios/bin/nrpe
        server_args     = -c /usr/local/nagios/etc/nrpe.cfg --inetd
        log_on_failure  += USERID
        disable         = no
        only_from       = 127.0.0.1 192.168.10.19
}
# 
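
A quick way to confirm the edit is to look for the monitoring host's address on the only_from line. The sketch below only does an exact string match on space-separated entries and does not understand network ranges; xinetd_allows is a hypothetical helper name.

```shell
#!/bin/sh
# Succeed if the only_from line of the xinetd service file $1 lists address $2.
# Entries are compared as plain strings; network ranges are not interpreted.
xinetd_allows() {
    awk -v ip="$2" '
        $1 == "only_from" {
            for (i = 3; i <= NF; i++) if ($i == ip) found = 1
        }
        END { exit !found }
    ' "$1"
}

# Usage sketch:
#   xinetd_allows /etc/xinetd.d/nrpe 192.168.10.19 || echo "monitoring host not allowed"
```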

Edit /etc/services and register the NRPE service by appending one line at the end of the file

# tail /etc/services 
3gpp-cbsp       48049/tcp               # 3GPP Cell Broadcast Service Protocol
isnetserv       48128/tcp               # Image Systems Network Services
isnetserv       48128/udp               # Image Systems Network Services
blp5            48129/tcp               # Bloomberg locator
blp5            48129/udp               # Bloomberg locator
com-bardac-dw   48556/tcp               # com-bardac-dw
com-bardac-dw   48556/udp               # com-bardac-dw
iqobject        48619/tcp               # iqobject
iqobject        48619/udp               # iqobject
nrpe            5666/tcp                # nrpe
# 
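
If this step is scripted, the append should happen only once. The sketch below adds the line shown above unless an entry named nrpe already exists; ensure_nrpe_service is a hypothetical helper name.

```shell
#!/bin/sh
# Append the NRPE entry to the services file $1 unless an entry
# whose first field is "nrpe" already exists.
ensure_nrpe_service() {
    if ! awk '$1 == "nrpe" { found = 1 } END { exit !found }' "$1"; then
        printf 'nrpe            5666/tcp                # nrpe\n' >> "$1"
    fi
}

# Usage sketch (as root):
#   ensure_nrpe_service /etc/services
```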

Restart the xinetd service, then check that NRPE has started and that its port is being listened on

# service xinetd restart
.Stopping xinetd:                                          [  OK  ]
Starting xinetd:                                           [  OK  ]
# netstat -lantup | grep 5666
tcp        0      0 :::5666                     :::*                        LISTEN      46773/xinetd        
# 

Test NRPE locally; if it started successfully, it returns its version number

# /usr/local/nagios/libexec/check_nrpe -H localhost
NRPE v2.14

By default only local access is allowed, so add the IP address of the Nagios server (192.168.10.19) to allowed_hosts

# cat /usr/local/nagios/etc/nrpe.cfg | grep -v "^$" | grep -v "^#"
log_facility=daemon
pid_file=/var/run/nrpe.pid
server_port=5666
nrpe_user=nagios
nrpe_group=nagios
allowed_hosts=127.0.0.1,192.168.10.19
 
dont_blame_nrpe=0
allow_bash_command_substitution=0
debug=0
command_timeout=60
connection_timeout=300
command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
command[check_hda1]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/hda1
command[check_zombie_procs]=/usr/local/nagios/libexec/check_procs -w 5 -c 10 -s Z
command[check_total_procs]=/usr/local/nagios/libexec/check_procs -w 150 -c 200 
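
Two things in this file are worth checking from a script: whether the Nagios server appears in allowed_hosts, and which command[...] names the server may request. The sketch below assumes the key=value layout shown above and compares addresses as plain strings; both function names are hypothetical.

```shell
#!/bin/sh
# Succeed if the allowed_hosts line of NRPE config $1 lists address $2.
nrpe_allows() {
    awk -F= -v ip="$2" '
        $1 == "allowed_hosts" {
            n = split($2, hosts, ",")
            for (i = 1; i <= n; i++) {
                gsub(/[ \t]/, "", hosts[i])
                if (hosts[i] == ip) found = 1
            }
        }
        END { exit !found }
    ' "$1"
}

# Print the names of the command[...] entries defined in NRPE config $1.
nrpe_commands() {
    sed -n 's/^command\[\([^]]*\)]=.*/\1/p' "$1"
}

# Usage sketch:
#   nrpe_allows /usr/local/nagios/etc/nrpe.cfg 192.168.10.19 && nrpe_commands /usr/local/nagios/etc/nrpe.cfg
```

One thing to verify on a real client: the sample check_hda1 command points at /dev/hda1, which may not exist on the machine.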

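Then test from the Nagios server, where 192.168.10.18 is the IP address of the monitored machine. If NRPE is running it returns its version number, and the monitored machine is ready.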

# /usr/local/nagios/libexec/check_nrpe -H 192.168.10.18
NRPE v2.14
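
Beyond the version check, check_nrpe can ask the client to run one of the commands defined in its nrpe.cfg by passing -c with the command name. The sketch below loops over several names and returns the highest exit status seen; probe_nrpe and the CHECK_NRPE variable are hypothetical names, and the plugin path is the one used in this article.

```shell
#!/bin/sh
# Path of the check_nrpe plugin; CHECK_NRPE can be overridden, e.g. with a stub.
CHECK_NRPE=${CHECK_NRPE:-/usr/local/nagios/libexec/check_nrpe}

# Run each named NRPE command against host $1, print the plugin output,
# and return the highest exit status seen
# (0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN).
probe_nrpe() {
    host=$1; shift
    worst=0
    for cmd in "$@"; do
        rc=0
        "$CHECK_NRPE" -H "$host" -c "$cmd" || rc=$?
        [ "$rc" -le "$worst" ] || worst=$rc
    done
    return "$worst"
}

# Usage sketch on the Nagios server:
#   probe_nrpe 192.168.10.18 check_users check_load check_total_procs
```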

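Add the monitored host on the Nagios server

On the monitoring server, add the monitored machine's IP address and the services to monitor to the configuration file /usr/local/nagios/etc/objects/localhost.cfg.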

# cat /usr/local/nagios/etc/objects/localhost.cfg | grep -v "^#" | grep -v "^$"
define host{
        use                     linux-server            ; Name of host template to use
                                                        ; This host definition will inherit all variables that are defined
                                                        ; in (or inherited by) the linux-server host template definition.
        host_name               localhost
        alias                   localhost
        address                 127.0.0.1,192.168.10.18
        }
define hostgroup{
        hostgroup_name  linux-servers ; The name of the hostgroup
        alias           Linux Servers ; Long name of the group
        members         localhost     ; Comma separated list of hosts that belong to this group
        }
define service{
        use                             local-service         ; Name of service template to use
        host_name                       localhost
        service_description             PING
        check_command                   check_ping!100.0,20%!500.0,60%
        }
define service{
        use                             local-service         ; Name of service template to use
        host_name                       localhost
        service_description             HTTP
        check_command                   check_http
        notifications_enabled           0
        }
...
#
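
The Nagios documentation describes the host address directive as a single address, so the combined value in the listing above may not be checked as intended. An alternative is to give the monitored machine its own host object and query it through check_nrpe. The sketch below writes such a fragment to a file. The host name remote-host, the service description, the choice of check_load, and the function name are illustrative assumptions; the check_nrpe command definition follows the form commonly given in the NRPE documentation, and generic-service is assumed to come from the sample templates.cfg.

```shell
#!/bin/sh
# Write a host, a check_nrpe command, and one NRPE-backed service for the
# monitored machine to file $1. The host name "remote-host" and the choice
# of check_load are illustrative assumptions, not from the original article.
write_remote_host_cfg() {
    cat > "$1" <<'EOF'
define host{
        use                     linux-server
        host_name               remote-host
        alias                   remote-host
        address                 192.168.10.18
        }
define command{
        command_name            check_nrpe
        command_line            $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
        }
define service{
        use                     generic-service
        host_name               remote-host
        service_description     Load via NRPE
        check_command           check_nrpe!check_load
        }
EOF
}

# Usage sketch (as root; the file name is an assumption):
#   write_remote_host_cfg /usr/local/nagios/etc/objects/remote-host.cfg
```

For Nagios to read the new file, a matching cfg_file= line would have to be added to /usr/local/nagios/etc/nagios.cfg before running the configuration check.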

Check the configuration; if there are no errors or warnings, restart the Nagios service

#  /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg 

Nagios Core 4.0.7
...

Total Warnings: 0
Total Errors:   0

Things look okay - No serious problems were detected during the pre-flight check
# service nagios restart

Check the result in the browser.

 

For more monitoring settings, see the Chinese Nagios manual at the address below; it is quite comprehensive.

http://nagios-cn.sourceforge.net/nagios-cn/cgiconfig.html

Architecture of the Nagios monitoring system

 

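Nagios uses NRPE to monitor services on remote machines

  1. Nagios runs the check_nrpe plugin installed on the server and tells it which service to check.
  2. check_nrpe connects over SSL to the NRPE daemon on the remote machine.
  3. NRPE runs the local plugins to check local services and state.
  4. NRPE returns the results to check_nrpe on the server, which places them in the Nagios status queue.
  5. Nagios reads the queue in order and displays the results.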

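The server has Nagios installed. It processes the monitoring data and provides a web interface for viewing and management, and it can also monitor itself.

The client has NRPE and the related client software installed. It runs checks at the request of the monitoring server and sends the results back.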

Original article: https://www.cnblogs.com/zwj-linux/p/11499004.html