VictoriaMetrics vmalert overview

vmalert executes a given list of rules (written in MetricsQL) and sends the resulting alerts to Alertmanager.

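Features

  • Integration with the VictoriaMetrics TSDB
  • MetricsQL expression validation
  • Support for the Prometheus alerting rule format
  • Integration with Alertmanager
  • Lightweight, with no additional dependencies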

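Usage

  • Build
    At present you have to build it yourself, which is simple: run make vmalert.
  • Start
    Starting vmalert requires a set of rules (PromQL and MetricsQL), the datasource address, and the notifier address (the Alertmanager address), which takes care of processing and aggregating alerts and sending notifications.
    Command: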
 
./bin/vmalert -rule=alert.rules \
        -datasource.url=http://localhost:8428 \
        -notifier.url=http://localhost:9093
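
The start command combines a rule file, a datasource URL and a notifier URL. As a minimal sketch, assuming the binary sits at ./bin/vmalert, the same argument list could be assembled from Python before launching the process. The helper name build_vmalert_cmd is hypothetical and not part of vmalert; only the -rule, -datasource.url and -notifier.url flags come from the command itself.

```python
import shlex


def build_vmalert_cmd(rule_paths, datasource_url, notifier_url, binary="./bin/vmalert"):
    """Assemble the vmalert argument list (hypothetical helper, not part of vmalert)."""
    if not rule_paths:
        raise ValueError("at least one -rule path is required")
    cmd = [binary]
    # -rule can be repeated, once per file or glob pattern.
    cmd += [f"-rule={path}" for path in rule_paths]
    cmd.append(f"-datasource.url={datasource_url}")
    cmd.append(f"-notifier.url={notifier_url}")
    return cmd


if __name__ == "__main__":
    cmd = build_vmalert_cmd(
        ["alert.rules"],
        "http://localhost:8428",
        "http://localhost:9093",
    )
    # Print a shell-safe rendering of the command instead of running it.
    print(shlex.join(cmd))
```

Passing the returned list to subprocess.run would start vmalert without going through a shell, which avoids the line-continuation and quoting issues of a hand-typed multi-line command.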

An example rule file

groups:
  - name: groupGorSingleAlert
    rules:
      - alert: VMRows
        for: 10s
        expr: vm_rows > 0
        labels:
          label: bar
          host: "{{ $labels.instance }}"
        annotations:
          summary: "{{ $value|humanize }}"
          description: "{{$labels}}"
  - name: TestGroup
    rules:
      - alert: Conns
        expr: sum(vm_tcplistener_conns) by(instance) > 1
        annotations:
          summary: "Too high connection number for {{$labels.instance}}"
          description: "It is {{ $value }} connections for {{$labels.instance}}"
      - alert: ExampleAlertAlwaysFiring
        expr: sum by(job)
          (up == 1)
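
The VMRows rule sets for: 10s. In the Prometheus rule format this means the expression has to stay true for that long before the alert moves from pending to firing. The sketch below is a simplified model of that behaviour, not vmalert's actual code; the function name alert_states and the (timestamp, is_true) sample layout are assumptions made for illustration.

```python
def alert_states(samples, for_seconds):
    """Return the alert state after each evaluation.

    samples: list of (timestamp_seconds, expr_is_true) pairs, ordered by time.
    for_seconds: value of the rule's `for` field, in seconds.
    Simplified model for illustration; not taken from vmalert.
    """
    states = []
    active_since = None  # time at which the expression first became true
    for ts, is_true in samples:
        if not is_true:
            # The expression stopped matching: the alert resets.
            active_since = None
            states.append("inactive")
            continue
        if active_since is None:
            active_since = ts
        if ts - active_since >= for_seconds:
            states.append("firing")
        else:
            states.append("pending")
    return states


if __name__ == "__main__":
    evaluations = [(0, True), (5, True), (10, True), (15, False), (20, True)]
    print(alert_states(evaluations, for_seconds=10))
```

A rule without a for field, such as Conns in the example, corresponds to for_seconds=0 in this model, so it fires on the first evaluation where the expression is true.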
  • Command-line flags
vmalert-20200511-085826-heads-cluster-0-g6c88e352
Usage of ./vmalert:
  -datasource.basicAuth.password string
        Optional basic auth password for -datasource.url
  -datasource.basicAuth.username string
        Optional basic auth username for -datasource.url
  -datasource.url string
        Victoria Metrics or VMSelect url. Required parameter. E.g. http://127.0.0.1:8428
  -enableTCP6
        Whether to enable IPv6 for listening and dialing. By default only IPv4 TCP is used
  -envflag.enable
        Whether to enable reading flags from environment variables additionally to command line. Command line flag values have priority over values from environment vars. Flags are read only from command line if this flag isn't set
  -envflag.prefix string
        Prefix for environment variables if -envflag.enable is set
  -evaluationInterval duration
        How often to evaluate the rules. Default 1m (default 1m0s)
  -external.url string
        External URL is used as alert's source for sent alerts to the notifier
  -http.disableResponseCompression
        Disable compression of HTTP responses for saving CPU resources. By default compression is enabled to save network bandwidth
  -http.maxGracefulShutdownDuration duration
        The maximum duration for graceful shutdown of HTTP server. Highly loaded server may require increased value for graceful shutdown (default 7s)
  -http.pathPrefix string
        An optional prefix to add to all the paths handled by http server. For example, if '-http.pathPrefix=/foo/bar' is set, then all the http requests will be handled on '/foo/bar/*' paths. This may be useful for proxied requests. See https://www.robustperception.io/using-external-urls-and-proxies-with-prometheus
  -http.shutdownDelay duration
        Optional delay before http server shutdown. During this dealy the servier returns non-OK responses from /health page, so load balancers can route new requests to other servers
  -httpListenAddr string
        Address to listen for http connections (default ":8880")
  -loggerFormat string
        Format for logs. Possible values: default, json (default "default")
  -loggerLevel string
        Minimum level of errors to log. Possible values: INFO, WARN, ERROR, FATAL, PANIC (default "INFO")
  -loggerOutput string
        Output for the logs. Supported values: stderr, stdout (default "stderr")
  -memory.allowedPercent float
        Allowed percent of system memory VictoriaMetrics caches may occupy. Too low value may increase cache miss rate, which usually results in higher CPU and disk IO usage. Too high value may evict too much data from OS page cache, which will result in higher disk IO usage (default 60)
  -notifier.url string
        Prometheus alertmanager URL. Required parameter. e.g. http://127.0.0.1:9093
  -remoteread.basicAuth.password string
        Optional basic auth password for -remoteread.url
  -remoteread.basicAuth.username string
        Optional basic auth username for -remoteread.url
  -remoteread.lookback duration
        Lookback defines how far to look into past for alerts timeseries. For example, if lookback=1h then range from now() to now()-1h will be scanned. (default 1h0m0s)
  -remoteread.url vmalert
        Optional URL to Victoria Metrics or VMSelect that will be used to restore alerts state. This configuration makes sense only if vmalert was configured with `remotewrite.url` before and has been successfully persisted its state. E.g. http://127.0.0.1:8428
  -remotewrite.basicAuth.password string
        Optional basic auth password for -remotewrite.url
  -remotewrite.basicAuth.username string
        Optional basic auth username for -remotewrite.url
  -remotewrite.url string
        Optional URL to Victoria Metrics or VMInsert where to persist alerts state in form of timeseries. E.g. http://127.0.0.1:8428
  -rule value
        Path to the file with alert rules. 
        Supports patterns. Flag can be specified multiple times. 
        Examples:
         -rule /path/to/file. Path to a single file with alerting rules
         -rule dir/*.yaml -rule /*.yaml. Relative path to all .yaml files in "dir" folder, 
        absolute path to all .yaml files in root.
  -rule.validateTemplates
        Indicates to validate annotation and label templates (default true)
  -version
        Show VictoriaMetrics version
 

Endpoints exposed by vmalert

http://<vmalert-addr>/api/v1/alerts - list of all active alerts
http://<vmalert-addr>/api/v1/<groupName>/<alertID>/status - alert status by ID
http://<vmalert-addr>/metrics - application metrics
http://<vmalert-addr>/-/reload - hot reload of the configuration (not implemented yet at the time of writing)
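
As a rough sketch of how these endpoints might be used from a script, the helpers below build the URLs listed above and fetch the active alerts with the Python standard library. The function names are hypothetical, the address http://localhost:8880 assumes the default -httpListenAddr, and the response is returned as parsed JSON without assuming anything about its fields.

```python
import json
import urllib.request
from urllib.parse import quote


def vmalert_endpoints(addr, group_name=None, alert_id=None):
    """Build the vmalert endpoint URLs for a given address (hypothetical helper)."""
    base = addr.rstrip("/")
    endpoints = {
        "alerts": f"{base}/api/v1/alerts",
        "metrics": f"{base}/metrics",
        "reload": f"{base}/-/reload",
    }
    if group_name is not None and alert_id is not None:
        # Escape path segments so unusual group names do not break the URL.
        group = quote(str(group_name), safe="")
        alert = quote(str(alert_id), safe="")
        endpoints["status"] = f"{base}/api/v1/{group}/{alert}/status"
    return endpoints


def fetch_active_alerts(addr, timeout=5.0):
    """GET /api/v1/alerts and return the decoded JSON body as-is."""
    url = vmalert_endpoints(addr)["alerts"]
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    for name, url in vmalert_endpoints("http://localhost:8880", "TestGroup", 1).items():
        print(name, url)
```

Calling fetch_active_alerts("http://localhost:8880") needs a running vmalert instance; the URL builder itself works offline.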

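Notes

vmalert is a very useful tool that fills a gap in VictoriaMetrics, which previously had to rely on external components for alerting. Before, we had to handle alerting with native Prometheus, promxy, Grafana or similar tools. Building on the extensions that MetricsQL adds, vmalert rounds out VictoriaMetrics.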

References

https://github.com/VictoriaMetrics/VictoriaMetrics/tree/master/app/vmalert

Original article: https://www.cnblogs.com/rongfengliang/p/12878392.html