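Monitoring Process Liveness with Custom Prometheus Metrics

1. Monitoring process liveness

Sometimes we need to monitor the state of specific processes. The commonly used node_exporter does not cover every monitoring item, so here we monitor processes with custom metrics instead.

2. Defining metric values with a custom Python script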

2.1 Install pip and the required libraries

yum install -y python-pip
pip install psutil prometheus_client

2.2 Write the Python script

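# coding: utf-8
# Count matching processes and export the counts for node_exporter's textfile collector.
import psutil

from prometheus_client import CollectorRegistry, Gauge, write_to_textfile

# Processes to monitor; 'name' is matched as a case-insensitive substring of the process name.
monitor_list = [
    {'name': 'gitlab', 'desc': 'gitlab-process'},
    {'name': 'nginx', 'desc': 'nginx-process'},
]


def checkProcessCount(process_name):
    """Return the number of running processes whose name contains process_name."""
    count = 0
    for proc in psutil.process_iter():
        try:
            if process_name.lower() in proc.name().lower():
                count += 1
        except (psutil.NoSuchProcess, psutil.AccessDenied, psutil.ZombieProcess):
            # The process exited or is not readable; skip it.
            pass
    print(count)  # parenthesized so it runs on both Python 2 and Python 3
    return count


def save_metrics():
    """Write one process_number{name="..."} sample per monitored process."""
    registry = CollectorRegistry()
    gauge = Gauge('process_number', 'Number of Process', ['name'], registry=registry)
    for p in monitor_list:
        count = checkProcessCount(p['name'])
        gauge.labels(name=p['name']).set(count)
    # node_exporter must read this directory via its textfile collector.
    write_to_textfile('/var/lib/node_exporter/textfile_collector/metadata.prom', registry)


if __name__ == '__main__':
    save_metrics()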
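
To make it easier to see what ends up in metadata.prom, here is a minimal sketch that renders the same text format by hand using only the standard library. The helper render_gauge and the sample counts (3 and 0) are hypothetical and exist only for illustration; the actual script relies on write_to_textfile from prometheus_client to produce the file.

```python
# Sketch only: hand-render the text format for a gauge with a single "name" label.
# render_gauge and the sample counts below are illustrative, not part of the script.
def render_gauge(name, help_text, samples):
    """Render one gauge family; samples maps a process name to its count."""
    lines = [
        "# HELP {} {}".format(name, help_text),
        "# TYPE {} gauge".format(name),
    ]
    for label_value, value in sorted(samples.items()):
        # Doubled braces are literal braces in str.format.
        lines.append('{}{{name="{}"}} {}'.format(name, label_value, float(value)))
    return "\n".join(lines) + "\n"


text = render_gauge("process_number", "Number of Process", {"gitlab": 3, "nginx": 0})
print(text)
```

Each monitored process becomes one sample line distinguished by its name label, so a single metric name can carry the counts for every entry in monitor_list.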

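Add the script to a scheduled task (for example, cron) so the metrics file is refreshed regularly, then write Prometheus alerting rules that match on the custom metric values. Note that node_exporter only picks up the file if it is started with --collector.textfile.directory pointing at the directory the script writes to.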
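
The original does not show the alerting rules themselves. As a rough illustration of the condition such a rule might check, the sketch below parses metrics text in the format the script writes and lists the processes whose count is 0, which is roughly what an expression like process_number == 0 would select in Prometheus. The function find_dead_processes, the sample text, and the choice of "count equals 0" as the alert condition are all assumptions made for this example.

```python
import re

# Matches sample lines such as: process_number{name="nginx"} 0.0
SAMPLE_RE = re.compile(r'^process_number\{name="([^"]*)"\}\s+(\S+)$')


def find_dead_processes(prom_text):
    """Return the sorted process names whose process_number sample is 0."""
    dead = []
    for line in prom_text.splitlines():
        match = SAMPLE_RE.match(line.strip())
        if match and float(match.group(2)) == 0:
            dead.append(match.group(1))
    return sorted(dead)


# Hypothetical file contents, assuming gitlab is running and nginx is not.
sample = (
    "# HELP process_number Number of Process\n"
    "# TYPE process_number gauge\n"
    'process_number{name="gitlab"} 3.0\n'
    'process_number{name="nginx"} 0.0\n'
)
dead = find_dead_processes(sample)
print(dead)
```

Running a check like this by hand against metadata.prom can help confirm that the cron job is producing sensible values before any rules are deployed.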

Original article: https://www.cnblogs.com/zhangzihong/p/10684319.html