Simulating a cluster on a Windows PC (3): cluster-wide monitoring

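Here we use Xiaomi's open-falcon for monitoring. The server side is the host 192.168.5.200. The agent submits data to the server and has to run on every host in the cluster. The dashboard displays the collected data on a web page as charts.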

References: https://book.open-falcon.org/zh_0_2/quick_install/backend.html

      http://www.cnblogs.com/benjamin77/p/8472632.html#auto_id_2

1. Environment preparation

Set the time zone to Asia/Shanghai on all hosts

[root@mage-monitor-01 ~]# ansible all -m shell -a "timedatectl  set-timezone Asia/Shanghai"
[root@mage-monitor-01 ~]# ansible all -m shell -a "timedatectl"

Check in the output whether the time is in sync across hosts
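To put a number on the comparison instead of eyeballing the timedatectl output, one option is to collect `date +%s` from every host and look at the spread. This is only a rough sketch: `max_drift` is a hypothetical helper (not part of ansible or open-falcon), and the field extraction in the usage comment assumes that ansible's one-line (`-o`) output ends with the command's stdout.

```shell
# max_drift: read one epoch timestamp per line on stdin and print the
# difference in seconds between the largest and the smallest value.
max_drift() {
    awk '{ t = $1 + 0 }
         NR == 1 { min = t; max = t }
         t < min { min = t }
         t > max { max = t }
         END { if (NR) print max - min }'
}

# Assumed usage (depends on ansible's one-line output format):
#   ansible all -o -m shell -a "date +%s" | awk '{print $NF}' | max_drift
```

ansible does not query all hosts at exactly the same instant, so a difference of a second or two is probably noise rather than real drift.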

Install redis

yum install redis -y

Install mysql-server

rpm -ivh http://repo.mysql.com/mysql-community-release-el7-5.noarch.rpm  
yum install -y mysql-server

Start mysql and enable it at boot

systemctl start mysql;systemctl enable mysql;systemctl status mysql

Set the initial root password

[root@mage-monitor-01 ~]# mysql_secure_installation

Authorize the networks that may access the database. This is a test environment, so access is simply opened to every host.

mysql -uroot -p123

grant all privileges on *.* to 'root'@'%' identified by '123';
flush privileges;

Install git

yum install git -y

Download the open-falcon table schemas

 cd /tmp/ && git clone https://github.com/open-falcon/falcon-plus.git 

Import the table schemas

cd /tmp/falcon-plus/scripts/mysql/db_schema/
mysql -h 127.0.0.1 -u root -p < 1_uic-db-schema.sql
mysql -h 127.0.0.1 -u root -p < 2_portal-db-schema.sql
mysql -h 127.0.0.1 -u root -p < 3_dashboard-db-schema.sql
mysql -h 127.0.0.1 -u root -p < 4_graph-db-schema.sql
mysql -h 127.0.0.1 -u root -p < 5_alarms-db-schema.sql
rm -rf /tmp/falcon-plus/
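The five imports differ only in the file name, so they can also be written as a loop (to be run before the cleanup, of course). A minimal sketch: `import_schemas` and the `MYSQL_CMD` override are hypothetical names introduced here; by default the loop calls the same client command as above, so it still prompts for the password once per file.

```shell
# import_schemas DIR: feed every N_*-db-schema.sql file in DIR to the
# MySQL client, in the order given by the numeric prefix.
# MYSQL_CMD is an assumed override hook; the default is the client
# invocation used in this section.
import_schemas() {
    dir=$1
    for f in "$dir"/[0-9]_*-db-schema.sql; do
        [ -f "$f" ] || continue          # the glob matched nothing
        echo "importing $(basename "$f")"
        ${MYSQL_CMD:-mysql -h 127.0.0.1 -u root -p} < "$f" || return 1
    done
}

# Example: import_schemas /tmp/falcon-plus/scripts/mysql/db_schema
```

The loop stops at the first file that fails to import, which makes a broken schema easier to spot than in a pasted block of five commands.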

Install the Go development package

yum install golang -y

Set the Go environment variables

export GOROOT=/usr/lib/golang
export GOPATH=/home
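Typed at the prompt, these exports last only for the current shell. The next section runs `source /etc/profile`, which suggests (an assumption, the post does not say so) that the two lines were added to /etc/profile. A small sketch of doing that without creating duplicates on a second run; `add_export` is a hypothetical helper name.

```shell
# add_export FILE NAME VALUE: append "export NAME=VALUE" to FILE unless
# a line exporting NAME is already present, so reruns add nothing.
add_export() {
    file=$1; name=$2; value=$3
    grep -q "^export $name=" "$file" 2>/dev/null ||
        echo "export $name=$value" >> "$file"
}

# Example (assumes /etc/profile is where the variables should live):
#   add_export /etc/profile GOROOT /usr/lib/golang
#   add_export /etc/profile GOPATH /home
```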

2. Installing the open-falcon server and agent on a single host

Download

[root@mage-monitor-01 db_schema]# source /etc/profile
[root@mage-monitor-01 db_schema]# cd
[root@mage-monitor-01 ~]# export FALCON_HOME=/home/work
[root@mage-monitor-01 ~]# export WORKSPACE=$FALCON_HOME/open-falcon

 [root@mage-monitor-01 ~]# cd /home/work/open-falcon/
 [root@mage-monitor-01 open-falcon]# wget https://github.com/open-falcon/falcon-plus/releases/download/v0.2.1/open-falcon-v0.2.1.tar.gz
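The transcript changes into /home/work/open-falcon and downloads the archive, but it shows neither the directory being created nor the archive being unpacked, and the following steps edit files such as aggregator/config/cfg.json that only exist after unpacking. A minimal sketch of those two steps; `unpack_release` is a hypothetical helper name, and the example paths are the ones used in this section.

```shell
# unpack_release TARBALL DEST: create DEST if it does not exist yet and
# extract the gzip-compressed release archive into it.
unpack_release() {
    tarball=$1; dest=$2
    mkdir -p "$dest" && tar -xzf "$tarball" -C "$dest"
}

# Example:
#   unpack_release open-falcon-v0.2.1.tar.gz /home/work/open-falcon
```

The `mkdir -p` has to happen before the `cd` shown above can succeed on a fresh machine.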

Set the MySQL user password in the configuration files

[root@mage-monitor-01 open-falcon]# sed -i 's/root:/root:123/g' aggregator/config/cfg.json
[root@mage-monitor-01 open-falcon]# sed -i 's/root:/root:123/g' graph/config/cfg.json
[root@mage-monitor-01 open-falcon]# sed -i 's/root:/root:123/g' hbs/config/cfg.json
[root@mage-monitor-01 open-falcon]# sed -i 's/root:/root:123/g' nodata/config/cfg.json
[root@mage-monitor-01 open-falcon]# sed -i 's/root:/root:123/g' api/config/cfg.json
[root@mage-monitor-01 open-falcon]# sed -i 's/root:/root:123/g' alarm/config/cfg.json
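The six sed commands can be folded into one loop. One caveat with the pattern `root:`: running it a second time turns `root:123` into `root:123123`. The sketch below matches `root:@` instead, on the assumption that the shipped configs write the empty password as `root:@tcp(...)`; check a cfg.json before relying on it. `set_db_password` is a hypothetical helper name.

```shell
# set_db_password PASSWORD BASE: in each module's cfg.json under BASE,
# turn the empty-password prefix "root:@" into "root:PASSWORD@".
# Matching "root:@" rather than "root:" keeps a second run from
# doubling the password.
set_db_password() {
    pass=$1; base=$2
    for m in aggregator graph hbs nodata api alarm; do
        cfg="$base/$m/config/cfg.json"
        [ -f "$cfg" ] || { echo "missing $cfg" >&2; continue; }
        sed -i "s/root:@/root:$pass@/g" "$cfg"
    done
}

# Example: set_db_password 123 /home/work/open-falcon
```

A password containing `/` or `&` would need escaping before it is put into the sed expression; the sketch does not handle that.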

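Reload the configuration (port 1988 is the agent's HTTP port, so this call only succeeds while the agent is running)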

curl 127.0.0.1:1988/config/reload

Modify the agent configuration

[root@mage-monitor-01 config]# pwd
/home/work/open-falcon/agent/config
[root@mage-monitor-01 config]# sed -i 's/0.0.0.0/192.168.5.200/g' cfg.json       

Start the server and the agent, then check their status

[root@mage-monitor-01 open-falcon]# ./open-falcon start
[falcon-graph] 8793
[falcon-hbs] 8799
[falcon-judge] 8802
[falcon-transfer] 8808
[falcon-nodata] 8814
[falcon-aggregator] 8822
[falcon-agent] 8835
[falcon-gateway] 8843
[falcon-api] 8847
[falcon-alarm] 8856
[root@mage-monitor-01 open-falcon]# ./open-falcon start agent
[falcon-agent] 8835
[root@mage-monitor-01 open-falcon]#  ./open-falcon check
        falcon-graph         UP            8793 
          falcon-hbs         UP            8799 
        falcon-judge         UP            8802 
     falcon-transfer         UP            8808 
       falcon-nodata         UP            8814 
   falcon-aggregator         UP            8822 
        falcon-agent         UP            8835 
      falcon-gateway         UP            8843 
          falcon-api         UP            8847 
        falcon-alarm         UP            8856 

3. Starting the agent on the other hosts

Use ansible to create the open-falcon working directory on the remote hosts, then copy the agent directory and the open-falcon launcher script to them

[root@mage-monitor-01 ~]# cd /home/work/open-falcon/
[root@mage-monitor-01 open-falcon]# ansible all -m file -a "path=/home/work/open-falcon state=directory group=501 owner=501 mode=0755"
[root@mage-monitor-01 open-falcon]# ansible all -m copy -a "src=/home/work/open-falcon/agent dest=/home/work/open-falcon group=501 owner=501 mode=0755"
[root@mage-monitor-01 open-falcon]# ansible all -m copy -a "src=/home/work/open-falcon/open-falcon dest=/home/work/open-falcon group=501 owner=501 mode=0755"

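After starting the programs, open 192.168.5.200:8081 in a browser. You need to register a user; the first user to register becomes the administrator and can manage other users.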

For now we use the templates that ship with open-falcon; monitoring for databases, caches and the like will be added later.

5. Adding cron jobs that start the services

The server host starts three things: server, dashboard and agent

 ansible 192.168.5.200 -m cron -a "name='start open-falcon agent' special_time=reboot job='cd /home/work/open-falcon/;./open-falcon start agent'"
[root@mage-monitor-01 ~]# ansible 192.168.5.200 -m cron -a "name='start open-falcon server' special_time=reboot job='cd /home/work/open-falcon;./open-falcon start;./open-falcon check'" 
# Starting this one from a cron job is unreliable; run check again after boot and, if necessary, start it manually.
ansible 192.168.5.200 -m cron -a "name='start open-falcon dashboard' special_time=reboot job='cd  /home/work/open-falcon/dashboard;bash control start'" 
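The comment on the server job says the start at boot does not always work. The post does not give the cause; one plausible explanation (an assumption) is that mysql or redis is not ready yet when cron fires the @reboot job. A rough sketch of a start that retries until every module reports UP. `retry`, `all_up` and the `FALCON_DIR` variable are hypothetical names, and the sketch relies on `./open-falcon start` being safe to repeat, as the transcript in section 2 suggests (the second start of the agent prints the same pid).

```shell
# retry MAX DELAY CMD...: run CMD until it succeeds, trying at most MAX
# times and sleeping DELAY seconds between attempts.
retry() {
    max=$1; delay=$2; shift 2
    n=1
    until "$@"; do
        [ "$n" -ge "$max" ] && return 1
        n=$((n + 1))
        sleep "$delay"
    done
}

# all_up: start the modules, then succeed only if `check` printed
# something and every non-empty line has UP in its second column.
# Runs in a subshell so the cd does not leak into the caller.
# FALCON_DIR is an assumed variable; it defaults to the path used here.
all_up() (
    cd "${FALCON_DIR:-/home/work/open-falcon}" || exit 1
    ./open-falcon start >/dev/null 2>&1
    out=$(./open-falcon check 2>/dev/null)
    [ -n "$out" ] || exit 1
    ! printf '%s\n' "$out" | awk 'NF && $2 != "UP"' | grep -q .
)

# Example: up to 5 attempts, 10 seconds apart
#   retry 5 10 all_up
```

Saved as a script, this could replace the `./open-falcon start;./open-falcon check` part of the cron job; whether it removes the need for a manual start depends on what actually goes wrong at boot.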

Reboot to verify that this trick actually works

It does, reliably.

 

Now add the same job in bulk on all the other nodes, which only need to start an agent

[root@mage-monitor-01 ~]# ansible all -m cron -a "name='start open-falcon agent' special_time=reboot job='cd /home/work/open-falcon/;./open-falcon start agent'" 

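Now, in theory, as long as a machine is alive and healthy it gets monitored, unless the agent process dies unexpectedly.

A full disk, exhausted CPU, excessive load: aren't those exactly the accidents that monitoring is supposed to catch first? So the statement above holds.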

Original article: https://www.cnblogs.com/benjamin77/p/9267061.html