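Lecture 9: Enterprise-Grade Monitoring Data Collection

This lecture covers the following:

1) Installing the Prometheus server and running it reliably in the background

2) Writing the Prometheus server configuration file

3) Installing node_exporter and running it in the background

4) Inspecting the data that node_exporter collects

5) Querying the collected data in Prometheus

6) Using the Prometheus query command line from earlier lectures to build monitoring graphs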

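(1) Installing the Prometheus server and running it reliably in the background

Download: https://github.com/prometheus/prometheus/releases/tag/v2.10.0

Move the tarball to /usr/local and unpack it: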

mv prometheus-2.10.0.linux-amd64.tar.gz /usr/local/
cd /usr/local/
tar -xf prometheus-2.10.0.linux-amd64.tar.gz
mv prometheus-2.10.0.linux-amd64 prometheus

Run it in the foreground:

cd prometheus
./prometheus

The Prometheus server should run in the background rather than in the foreground.

Method 1: install screen

yum -y install screen

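Start Prometheus inside a screen session:

# start a new screen session
screen
# start Prometheus inside it
./prometheus

Press Ctrl+a, then d, to detach from the session and leave Prometheus running.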

List the sessions that are running in the background:

screen -ls

Another benefit of screen is that you can reattach at any time to bring the program back to the foreground and read its debug output:

screen -r

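screen also has drawbacks:

• It is not a proper service manager and always feels like a stopgap.
• The session list from screen -ls is not very readable, and it is often hard to remember which session is which.
• Sessions are easy to close by accident: press Ctrl+d instead of Ctrl+a d and the shell exits, taking Prometheus with it.

Method 2: run it in the background with daemonize

Build and install daemonize from source: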

git clone https://github.com/bmc/daemonize.git && cd daemonize
./configure && make && make install
daemonize -v

Create a startup script that launches Prometheus with explicit flags:

# cat /usr/local/prometheus/up.sh 
#!/bin/bash
/usr/local/prometheus/prometheus \
  --config.file="prometheus.yml" \
  --web.listen-address="0.0.0.0:9090" \
  --web.read-timeout=5m \
  --web.max-connections=10 \
  --storage.tsdb.retention=15d \
  --storage.tsdb.path="/usr/local/prometheus/data" \
  --query.max-concurrency=20 \
  --query.timeout=2m

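Flag reference:

--config.file="prometheus.yml": the configuration file to load. The path is relative, so Prometheus must be started from /usr/local/prometheus.

--web.listen-address="0.0.0.0:9090": the address and port that the web UI and HTTP API listen on.

--web.read-timeout=5m: how long the server waits while reading a request before timing out. It also closes idle connections so that they do not tie up resources.

--web.max-connections=10: the maximum number of simultaneous connections to the Prometheus web server, capped so that too many connections cannot exhaust resources. The default is 512.

--storage.tsdb.retention=15d: an important setting that controls how long collected samples are kept. Too long a period strains disk and memory; too short, and historical data is gone when you need it. 15 days, which is also the default, is a reasonable choice for most companies. Newer releases deprecate this flag in favor of --storage.tsdb.retention.time.

--storage.tsdb.path="/usr/local/prometheus/data": where the time series data is stored. Choose this path deliberately and do not change it casually.

--query.max-concurrency=20: the maximum number of queries executed concurrently.

--query.timeout=2m: the maximum time a query may run before it is forcibly aborted, which stops slow queries.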

Make up.sh executable:

chmod +x /usr/local/prometheus/up.sh

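Start it in the background. The -c option sets the working directory before the script runs, which is what lets the relative --config.file path resolve: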

daemonize -c /usr/local/prometheus /usr/local/prometheus/up.sh

Check that the process has started.
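The original does not show a command for this step, so here is one possible check. It is only a sketch: the function names are invented for this example, it assumes pgrep and curl are installed, and it assumes Prometheus listens on the default localhost:9090. /-/healthy is the health endpoint that Prometheus serves itself.

```shell
#!/usr/bin/env bash
# Sketch only: function names and the default URL are assumptions.

# Print the PID(s) of any process whose name is exactly "prometheus".
prometheus_pid() {
    pgrep -x prometheus
}

# Return 0 if the health endpoint answers with a success status.
prometheus_healthy() {
    local base_url="${1:-http://localhost:9090}"
    # -f: non-zero exit on HTTP errors; --max-time: do not hang forever
    curl -sf --max-time 5 "${base_url}/-/healthy" >/dev/null
}
```

After the daemonize command, `prometheus_pid && prometheus_healthy && echo ok` prints the PID and then ok if both checks pass.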

Start it automatically at boot by adding the command to /etc/rc.local:

# cat /etc/rc.local 
touch /var/lock/subsys/local
daemonize -c /usr/local/prometheus /usr/local/prometheus/up.sh

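Reboot the machine and confirm that Prometheus starts on its own.

Prometheus is sensitive to clock drift, so keep the system clock synchronized, for example by running ntpdate on a schedule.

Data directory

In the data directory, the subdirectories with long alphanumeric names hold persisted historical data. The most recent data is kept in memory and is also written at intervals to the wal/ directory, so that the in-memory data can be recovered after a sudden power loss or restart.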
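To see this layout on a running server, a small helper can list just the persisted blocks. This is a sketch: the function name is made up, and the default path is the --storage.tsdb.path used earlier. It relies on the fact that block directories are named with 26-character ULIDs, while wal/ and the lock file are not.

```shell
#!/usr/bin/env bash
# Sketch only: list_blocks is a hypothetical helper name.

# Print the names of persisted block directories inside a data directory.
list_blocks() {
    local data_dir="${1:-/usr/local/prometheus/data}"
    local entry name
    for entry in "$data_dir"/*/; do
        # Skip the unexpanded glob when the directory has no subdirectories
        [ -d "$entry" ] || continue
        name="$(basename "$entry")"
        # Block directories use 26-character ULID names; wal/ does not match
        if [[ "$name" =~ ^[0-9A-Z]{26}$ ]]; then
            echo "$name"
        fi
    done
}
```

Calling `list_blocks` with no argument inspects /usr/local/prometheus/data. A freshly started server may print nothing yet, because the first block is only written after data has accumulated for a while.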

(2) Writing the Prometheus server configuration file

The configuration file:

# cat prometheus.yml 
# my global config
global:
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
    - targets: ['localhost:9090','192.168.1.101:9100','192.168.1.102:9100','192.168.1.11:9100']

Note: targets may also be given as domain names, provided DNS is set up or the names have been added to the local hosts file.
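Before restarting Prometheus with an edited file, it can help to validate it. The sketch below wraps promtool, the checker binary that ships in the same release tarball as prometheus. The function name is invented for this example, and the default directory is the /usr/local/prometheus install path used above.

```shell
#!/usr/bin/env bash
# Sketch only: check_prometheus_config is a hypothetical helper name.

# Validate prometheus.yml with the promtool binary from the install directory.
check_prometheus_config() {
    local install_dir="${1:-/usr/local/prometheus}"
    local promtool="${install_dir}/promtool"
    if [ ! -x "$promtool" ]; then
        echo "promtool not found in ${install_dir}" >&2
        return 2
    fi
    # promtool exits non-zero when the configuration is invalid
    "$promtool" check config "${install_dir}/prometheus.yml"
}
```

A typical use is `check_prometheus_config && echo "config looks valid"`, run before restarting the server.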

(3) Installing node_exporter and running it in the background

Download: https://github.com/prometheus/node_exporter/releases/tag/v0.15.2

Unpack and run it:

tar -xf node_exporter-0.15.2.linux-amd64.tar.gz
mv node_exporter-0.15.2.linux-amd64 node_exporter
cd node_exporter
./node_exporter &

Set it up as a systemd service. The unit file assumes node_exporter was unpacked under /usr/local:

# cat /usr/lib/systemd/system/node_exporter.service 
[Unit]
Description=Prometheus Node Exporter
After=network.target

[Service]
ExecStart=/usr/local/node_exporter/node_exporter
User=nobody

[Install]
WantedBy=multi-user.target

Start the service and enable it at boot:

systemctl start node_exporter 
systemctl enable node_exporter

node_exporter listens on port 9100 by default.

On the host itself you can view the exposed data with curl:

curl localhost:9100/metrics
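The raw output is long, because every metric comes with "# HELP" and "# TYPE" comment lines and one line per label combination. A small filter can reduce it to the distinct metric names. This is only a sketch, and metric_names is a name made up for this example.

```shell
#!/usr/bin/env bash
# Sketch only: metric_names is a hypothetical helper name.

# Read Prometheus text-format metrics on stdin and print each distinct
# metric name once, skipping comment lines and blank lines.
metric_names() {
    # Cut each sample line at the first "{" or space, leaving only the name
    awk '!/^#/ && NF { sub(/[{ ].*/, "", $0); print }' | sort -u
}
```

For example, `curl -s localhost:9100/metrics | metric_names | grep '^node_cpu'` shows which CPU-related names this exporter version exposes.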

Next, visit the node_exporter repository on GitHub to see which useful collectors the community developers provide:

https://github.com/prometheus/node_exporter

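(5) Querying the collected data in Prometheus

Next, go back to the Prometheus main page and check that node_exporter on the newly monitored machines is returning data correctly. Picking a few keys at random is enough.

The Prometheus query box also has a suggest feature that completes key names as you type.

Query any key and confirm that a graph is drawn. node_exporter exposes a very large number of keys, because it digs them out of many low-level corners of the Linux system. There is neither the time nor the need to master every one of them; knowing the important, essential ones is enough.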

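(6) Using the Prometheus query command line from earlier lectures to build monitoring graphs

Next, pick an important key and use the query syntax we have learned to turn it into an ad hoc monitoring graph.

For example, node_cpu, or the keys that begin with node_memory and node_disk:

node_cpu

node_memory

node_disk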
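As one illustration for node_cpu, the sketch below builds a PromQL expression for per-instance CPU usage in percent and sends it to the Prometheus HTTP API. The function names are invented for this example, and the default URL assumes the server from section (1). The metric name node_cpu matches node_exporter 0.15.x; later exporter releases renamed it, so adjust the expression if you run a newer version.

```shell
#!/usr/bin/env bash
# Sketch only: cpu_usage_expr and prom_query are hypothetical helper names.

# Print a PromQL expression for CPU usage (percent) per instance.
# irate() gives the idle fraction per CPU; 1 minus the average is the busy share.
cpu_usage_expr() {
    local window="${1:-5m}"
    printf '(1 - avg by (instance) (irate(node_cpu{mode="idle"}[%s]))) * 100\n' "$window"
}

# Run an instant query against the Prometheus HTTP API and print the JSON reply.
prom_query() {
    local expr="$1"
    local base_url="${2:-http://localhost:9090}"
    curl -sG --max-time 10 --data-urlencode "query=${expr}" "${base_url}/api/v1/query"
}
```

The same expression can be pasted into the query box on the Prometheus web page to draw the graph, or run from the shell with `prom_query "$(cpu_usage_expr 5m)"`.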