Collecting nginx access logs with filebeat and parsing them with an elasticsearch pipeline

Some setups have nginx write its access log as JSON, so that filebeat/logstash can parse, collect, and ship the entries to es directly.

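The approach in this post has nginx write its standard log file. filebeat ships the raw lines and names an es pipeline in its output, and es uses that pipeline to grok each line into key/value fields before storing it.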

Introduction to elasticsearch pipelines: https://www.elastic.co/guide/en/elasticsearch/reference/current/pipeline.html

file -> filebeat -> es pipeline grok -> es
file -> filebeat -> logstash grok -> es

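This puts the comparatively expensive grok step on the server side, so the client only runs the lightweight filebeat; it is also why filebeat itself has never supported grok. The approach suits cases where the client is underpowered and its load should be kept low, for example embedded devices.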

Without further ado, here is the configuration.

  • nginx settings

        log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                          '$status $body_bytes_sent "$http_referer" '
                          '"$http_user_agent" "$http_x_forwarded_for"';
    
        access_log  /usr/local/nginx/logs/access.log  main;  
    

    Sample log line

    110.242.68.3 - kibana [15/Jan/2021:11:29:59 +0800] "POST /api/v1 GET HTTP/1.1" 200 1044234 "https://www.github.com/app/kibana" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.141 Safari/537.36" "-"
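
To make the field layout of the `main` format concrete, here is a minimal Python sketch that splits the sample line above with a regular expression. It is only an illustration, not part of the setup in this post: the function name `parse_access_line` and the group names are made up here, and the regex assumes the exact `log_format main` shown above.

```python
import re
from datetime import datetime

# Regex mirroring the `main` log_format; group names are illustrative only.
LOG_RE = re.compile(
    r'(?P<remote_addr>\S+) - (?P<remote_user>\S+) '
    r'\[(?P<time_local>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<body_bytes_sent>\d+) '
    r'"(?P<http_referer>[^"]*)" '
    r'"(?P<http_user_agent>[^"]*)" '
    r'"(?P<http_x_forwarded_for>[^"]*)"'
)


def parse_access_line(line):
    """Return a dict of fields for one access-log line, or None if it does not match."""
    m = LOG_RE.match(line)
    if m is None:
        return None
    fields = m.groupdict()
    fields["status"] = int(fields["status"])
    fields["body_bytes_sent"] = int(fields["body_bytes_sent"])
    # $time_local looks like 15/Jan/2021:11:29:59 +0800.
    # %b relies on English month names (the default C locale).
    fields["time_local"] = datetime.strptime(
        fields["time_local"], "%d/%b/%Y:%H:%M:%S %z"
    )
    return fields


# The sample line from this post, split across string literals for readability.
SAMPLE = (
    '110.242.68.3 - kibana [15/Jan/2021:11:29:59 +0800] '
    '"POST /api/v1 GET HTTP/1.1" 200 1044234 '
    '"https://www.github.com/app/kibana" '
    '"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_5) AppleWebKit/537.36 '
    '(KHTML, like Gecko) Chrome/87.0.4280.141 Safari/537.36" "-"'
)
```

Calling `parse_access_line(SAMPLE)` returns one entry per nginx variable, with the status and byte count as integers and the timestamp as a timezone-aware `datetime`.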

  • filebeat configuration

    # cat /usr/local/nginx/filebeat_conf/nginx.yaml
    filebeat.inputs:
    - type: log
      enabled: true
      paths:
        - /usr/local/nginx/logs/access*.log
      fields: # add custom key/value pairs as needed
        kw_nginxHost: "192.168.100.200" 
        kw_group: "frond-end-api"
    processors:
      - drop_fields: # remove fields that carry no useful information
          fields: ["offset","prospector", "source", "input","beat"]      
    filebeat.registry_file: /usr/local/nginx/filebeat_registry/nginx_registry # registry file that tracks read offsets
    output.elasticsearch:
      hosts: ["192.168.100.111:9200","192.168.100.112:9200","192.168.100.113:9200"]
      pipeline: nginx-pipeline
    setup.kibana:
      host: "192.168.100.222:5601"
    
  • Register the es pipeline

    PUT http://192.168.100.111:9200/_ingest/pipeline/nginx-pipeline

    {
        "description": "Parse nginx access logs",
        "processors": [
            {
                "grok": {
                    "field": "message",
                    "patterns": ["%{IPV4:kw_remoteAddr} - %{USER:kw_remoteUser} \\[%{HTTPDATE:date_timeLocal}\\] \"%{WORD:kw_requestType} %{DATA:kw_requestUrl}\" %{NUMBER:long_status} %{NUMBER:long_bodyBytesSent} \"%{DATA:kw_httpReferer}\" \"%{DATA:kw_httpUserAgent}\" \"%{DATA:kw_httpXForwardedFor}\""]
                }
            },
            {
                "remove": {
                    "field": [
                        "message"
                    ]
                }
            },
            {
                "date": {
                    "field": "date_timeLocal",
                    "target_field": "@timestamp",
                    "formats": [
                        "dd/MMM/yyyy:HH:mm:ss Z"
                    ]
                }
            }
        ]
    }
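
Before pointing filebeat at the pipeline, it can help to run a sample line through elasticsearch's ingest simulate endpoint (`POST /_ingest/pipeline/<id>/_simulate`) and inspect the result. The Python sketch below builds such a request with the standard library only. The helper names `build_simulate_request` and `simulate` are made up for this example, and the host and pipeline name in the usage note are simply the ones used in this post.

```python
import json
import urllib.request


def build_simulate_request(es_url, pipeline_id, message):
    """Build (but do not send) a simulate request for one raw log line."""
    # The simulate API takes a list of documents; each carries a `_source`,
    # here with the raw line in `message`, the field the grok processor reads.
    body = {"docs": [{"_source": {"message": message}}]}
    return urllib.request.Request(
        url=f"{es_url}/_ingest/pipeline/{pipeline_id}/_simulate",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def simulate(es_url, pipeline_id, message, timeout=10):
    """Send the simulate request and return the decoded JSON response."""
    req = build_simulate_request(es_url, pipeline_id, message)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)
```

For example, `simulate("http://192.168.100.111:9200", "nginx-pipeline", line)` with a line taken from access.log returns a `docs` list. A successful entry shows the parsed fields under its `_source`, while a line that the grok pattern cannot match shows an error instead.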
    
    

End

Original article: https://www.cnblogs.com/zihunqingxin/p/14459640.html