(转)CentOS搭建Nagios监控

A.Nagios服务端
1.安装软件包

  1. yum install -y httpd

2.下载nagios

  1. wget http://syslab.comsenz.com/downloads/linux/nagios-3.0.5.tar.gz
  2. wget http://syslab.comsenz.com/downloads/linux/nagios-plugins-1.4.13.tar.gz
  3. wget http://syslab.comsenz.com/downloads/linux/nrpe-2.12.tar.gz

3.添加nagios账号

  1. useradd nagios

4.编译安装nagios

  1. mkdir /opt/hadoop/
  2. tar -xzvf nagios-3.0.5.tar.gz
  3. cd nagios-3.0.5
  4. ./configure --prefix=/opt/hadoop/nagios
  5. make all
  6. make fullinstall
  7. mkdir /opt/hadoop/nagios/etc
  8. mkdir /opt/hadoop/nagios/etc/objects
  9. cp ./sample-config/cgi.cfg /opt/hadoop/nagios/etc/
  10. cp ./sample-config/nagios.cfg /opt/hadoop/nagios/etc/
  11. cp ./sample-config/resource.cfg /opt/hadoop/nagios/etc/
  12. cp ./sample-config/template-object/commands.cfg /opt/hadoop/nagios/etc/objects/
  13. cp ./sample-config/template-object/contacts.cfg /opt/hadoop/nagios/etc/objects/
  14. cp ./sample-config/template-object/timeperiods.cfg /opt/hadoop/nagios/etc/objects/
  15. cp ./sample-config/template-object/templates.cfg /opt/hadoop/nagios/etc/objects/
  16. cp ./sample-config/template-object/localhost.cfg /opt/hadoop/nagios/etc/objects/
  17. touch /opt/hadoop/nagios/var/nagios.log
  18. chmod -R 755/opt/hadoop/nagios/etc/
  19. chown -R nagios:nagios /opt/hadoop/nagios

5.编译安装nagios-plugins

  1. tar zxvf nagios-plugins-1.4.13.tar.gz
  2. cd nagios-plugins-1.4.13
  3. ./configure --prefix=/opt/hadoop/nagios --with-nagios-user=nagios --with-nagios-group=nagios
  4. make && make install

检查是否已经安装成功,看这个目录下是否有插件文件

  1. ls /opt/hadoop/nagios/libexec/

6.安装nrpe

  1. tar zxvf nrpe-2.12.tar.gz
  2. cd nrpe-2.12
  3. ./configure --prefix=/opt/hadoop/nagios --enable-ssl --enable-command-args
  4. make all
  5. make install-plugin
  6. make install-daemon
  7. make install-daemon-config

7.配置httpd
添加web账号

  1. htpasswd -c /opt/hadoop/nagios/etc/htpasswd.users nagiosadmin

B.Nagios客户端
1.准备软件包

  1. wget http://syslab.comsenz.com/downloads/linux/nagios-plugins-1.4.13.tar.gz
  2. wget http://syslab.comsenz.com/downloads/linux/nrpe-2.12.tar.gz

2.添加nagios账号,准备安装目录

  1. mkdir /opt/hadoop/nagios
  2. useradd nagios

3.编译安装nrpe

  1. tar -xzvf nrpe-2.12.tar.gz
  2. cd nrpe-2.12
  3. ./configure --prefix=/opt/hadoop/nagios --enable-ssl --enable-command-args
  4. make all
  5. make install-plugin
  6. make install-daemon
  7. make install-daemon-config

4.安装nagios-plugin

  1. tar -xzvf nagios-plugins-1.4.13.tar.gz
  2. cd nagios-plugins-1.4.13
  3. ./configure --prefix=/opt/hadoop/nagios --with-nagios-user=nagios --with-nagios-group=nagios
  4. make && make install

检查是否已经安装成功,看这个目录下是否有插件文件

  1. ls /opt/hadoop/nagios/libexec/

5. 配置nrpe

  1. vim /opt/hadoop/nagios/etc/nrpe.cfg
  2. 找到”allowed_hosts=127.0.0.1改成allowed_hosts=127.0.0.1,10.130.2.72”,后边的IPnagios服务端IP
  3. 找到” dont_blame_nrpe=0改成dont_blame_nrpe=1

6.一段nrpe启停脚本,放在/etc/init.d/nrpe里

  1. #!/bin/bash
  2. #
  3. # chkconfig: 2345 55 25
  4. # description: NRPE Daemon
  5. #
  6.  
  7. # source function library
  8. ./etc/rc.d/init.d/functions
  9.  
  10. RETVAL=0
  11.  
  12. prog='nrpe'
  13. NRPE_CFG='/opt/hadoop/nagios/etc/nrpe.cfg'
  14. NRPE_PRG='/opt/hadoop/nagios/bin/nrpe'
  15. NRPE_OPT='-d'
  16. PID_FILE='/var/run/nrpe.pid'
  17.  
  18. start()
  19. {
  20. echo -n $"Starting $prog: "
  21. [-f $PID_FILE ]&& rm -f $PID_FILE
  22. $NRPE_PRG -c $NRPE_CFG $NRPE_OPT
  23. pid=`ps aux | grep -v grep | grep $NRPE_PRG | awk '{print $2}'`
  24. echo $pid > $PID_FILE
  25.  
  26. if ps aux | grep -v grep | grep -q $NRPE_PRG ;then
  27. RETVAL=0
  28. success
  29. else
  30. RETVAL=1
  31. failure
  32. fi
  33. echo
  34. }
  35.  
  36. stop()
  37. {
  38. echo -n $"Stopping $prog: "
  39. ps --pid=`cat $PID_FILE`&>/dev/null
  40. if[ $?-eq 0];then
  41. kill -9`cat $PID_FILE`
  42. RETVAL=0
  43. fi
  44. success
  45. echo
  46. RETVAL=0
  47. }
  48.  
  49. case"$1"in
  50. start)
  51. start
  52. ;;
  53. stop)
  54. stop
  55. ;;
  56. restart)
  57. stop
  58. start
  59. ;;
  60. status)
  61. status -p $PID_FILE $prog
  62. RETVAL=$?
  63. ;;
  64. *)
  65. echo $"Usage: $0 {start|stop|restart|status}"
  66. RETVAL=1
  67. esac
  68. exit $RETVAL

6. 启动nrpe

  1. /etc/init.d/nrpe start

C.Nagios服务端添加被监控机
1.配置监控机目录

  1. mkdir /opt/hadoop/nagios/etc/servers
  2. vim /opt/hadoop/nagios/etc/nagios.cfg 追加cfg_dir=/opt/hadoop/nagios/etc/servers

2.添加配置的机器

  1. vim /opt/hadoop/nagios/etc/servers/10.130.2.22.cfg
  2. define host{
  3. use linux-server
  4. host_name 10.130.2.22
  5. alias 10.130.2.22
  6. address 10.130.2.22
  7. }
  8. define service{
  9. use generic-service
  10. host_name 10.130.2.22
  11. service_description check_ping
  12. check_command check_ping!100.0,20%!200.0,50%
  13. max_check_attempts 5
  14. normal_check_interval 1
  15. }
  16. define service{
  17. use generic-service
  18. host_name 10.130.2.22
  19. service_description check_ssh
  20. check_command check_ssh
  21. max_check_attempts 5
  22. normal_check_interval 1
  23. }

3.reload nagios服务端使配置生效

  1. service nagios reload

重新加载nagios后就可以在nagios的界面上看到新的被监控的机器了
4.添加使用nrpe的监控

  1. 在/opt/hadoop/nagios/etc/objects/commands.cfg里增加如下行
  2. define command{
  3. command_name check_nrpe
  4. command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
  5. }

在服务器监控配置文件中加入如下行,确保被监控机的nrpe服务是开的

  1. define service{
  2. use generic-service
  3. host_name 10.130.2.22
  4. service_description check_load
  5. check_command check_nrpe!check_load
  6. max_check_attempts 5
  7. normal_check_interval 1
  8. }

重新加载nagios使配置生效。

  1. service nagios reload

5.自定义监控脚本
编写脚本check_diskmount.sh

  1. vim /opt/hadoop/nagios/libexec/check_diskmount.sh
  2. #!/bin/bash
  3. num=`cat /proc/mounts | grep '/disk' | wc -l`
  4. if[ $num -eq 12];then
  5. echo "OK - mount disk is $num"
  6. exit 0
  7. else
  8. echo "Critical - mount disk is $num"
  9. exit 1
  10. fi

加上可执行权限

  1. chmod +x /opt/hadoop/nagios/libexec/check_diskmount.sh

在被监控机的nrpe里加入自定义脚本路径

  1. vim /opt/hadoop/nagios/etc/nrpe.cfg
  2. command[check_diskmount]=/opt/hadoop/nagios/libexec/check_diskmount.sh

重启nrpe

  1. /etc/init.d/nrpe restart

在nagios服务端加入配置

  1. vim /opt/hadoop/nagios/etc/servers/10.130.2.22.cfg
  2. define service{
  3. use generic-service
  4. host_name s9xplan2.isv.cm6
  5. service_description check_diskmount
  6. check_command check_nrpe!check_diskmount
  7. max_check_attempts 3
  8. normal_check_interval 1
  9. }

重新加载nagios,使得配置生效

    1. service nagios reload

摘自:http://www.opstool.com/article/236

原文地址:https://www.cnblogs.com/newmanzhang/p/3270094.html