Elasticsearch Basic Operations and Building an ELK Log Query and Analysis System

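I. Setting Up Elasticsearch and Basic Operations
1. Introduction to Elasticsearch
Elasticsearch (hereafter "es") needs little introduction: the tagline on the official site, "You Know, for Search", sums up its core function. The underlying concepts (indices, clusters, shards, and so on) are not the focus of this article and are covered well elsewhere.
Differences between es versions (taken from the official es site):
Elasticsearch 5.6.0   Multiple types per index are supported; setting index.mapping.single_type: true restricts an index to a single type
Elasticsearch 6.x    Indices created in 6.x only allow a single-type per index
Elasticsearch 7.X    Specifying types in requests is deprecated.
The type segment of existing requests is fixed to _doc. In effect there is one fixed type named _doc, which the official documentation calls a dummy type; it is best understood as an endpoint name, like _search or _source
Elasticsearch 8.x    Specifying types in requests is no longer supported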

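2. Installation
a. Pull the image
docker pull elasticsearch:7.3.0
b. Run the container
docker run -d -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" --name es01 elasticsearch:7.3.0
c. Modify the configuration
Enter the container (for example with docker exec -it es01 /bin/bash) and edit /usr/share/elasticsearch/config/elasticsearch.yml so that it contains:
network.host: 0.0.0.0

d. Restart the container (for example with docker restart es01)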

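3. Creating indices and documents
REST API syntax
a. curl -X PUT http://host:port/index?pretty=true   creates an index
b. curl -X PUT http://host:port/index/type/id -H 'Content-Type:application/json' -d 'json content'   (on 7.x, use _doc as the type segment)
c. curl -X POST http://host:port/index/type -H 'Content-Type:application/json' -d 'json content'   (on 7.x, use _doc as the type segment)
Note the difference between b and c. The HTTP method differs, and b specifies an id while c does not: b creates a document with an id you choose, and c creates a document with an id generated by es.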
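The difference between forms b and c can be sketched in Python. This is only an illustration using the standard library: the helper name index_request and the default base URL are assumptions made for this example, not part of any es client, and the request objects are built but never sent.

```python
import json
import urllib.request


def index_request(index, doc, doc_id=None, base="http://localhost:9200"):
    """Build (but do not send) the request that creates a document.

    With doc_id: PUT  /index/_doc/id  (caller chooses the id, form b)
    Without:     POST /index/_doc     (es generates the id, form c)
    """
    if doc_id is None:
        method, url = "POST", f"{base}/{index}/_doc"
    else:
        method, url = "PUT", f"{base}/{index}/_doc/{doc_id}"
    return urllib.request.Request(
        url,
        data=json.dumps(doc).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method=method,
    )


book = {"name": "thinking in java", "ver": "1.0"}
explicit = index_request("book", book, doc_id=1)   # form b
generated = index_request("book", book)            # form c
print(explicit.get_method(), explicit.full_url)
print(generated.get_method(), generated.full_url)
```

Passing either object to urllib.request.urlopen would send it to a running es node.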


Create a document with id 1 in the book index
curl -X PUT http://localhost:9200/book/_doc/1?pretty=true -H 'Content-Type:application/json' -d '{"name":"thinking in java","ver":"1.0"}'

{
"_index" : "book",
"_type" : "_doc",
"_id" : "1",
"_version" : 1,
"result" : "created",
"_shards" : {
  "total" : 2,
  "successful" : 1,
  "failed" : 0
},
"_seq_no" : 0,
"_primary_term" : 1
}

Repeating the PUT for the same document updates it, and the response looks like this:
{
"_index" : "book",
"_type" : "_doc",
"_id" : "1",
"_version" : 2,
"result" : "updated",
"_shards" : {
  "total" : 2,
  "successful" : 1,
  "failed" : 0
},
"_seq_no" : 2,
"_primary_term" : 1
}

Next, try to create a document with id 1 under a type named python in the same book index
curl -X PUT http://localhost:9200/book/python/1?pretty=true -H 'Content-Type:application/json' -d '{"name":"python1","version":"1.0.0","description":"this is for python"}'
{
"error" : {
  "root_cause" : [
    {
    "type" : "illegal_argument_exception",
    "reason" : "Rejecting mapping update to [book] as the final mapping would have more than 1 type: [_doc, python]"
    }
  ],
  "type" : "illegal_argument_exception",
  "reason" : "Rejecting mapping update to [book] as the final mapping would have more than 1 type: [_doc, python]"
},
"status" : 400
}
The request fails because es 6.0 and later do not allow more than one type in an index.

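4. Querying documents
REST API syntax
curl http://host:port/index/_search   searches the index; with no query body it matches every document and returns the first 10 hits by default
curl http://host:port/index/type/1   retrieves the document with id=1

Query examples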
curl http://localhost:9200/book/_search?pretty=true
{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
        {
          "_index" : "book",
          "_type" : "_doc",
          "_id" : "1",
          "_score" : 1.0,
          "_source" : {
            "name" : "thinking in java",
            "ver" : "1.0"
          }
        }
    ]
  }
}
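To show where the useful parts of this response live, here is a small Python sketch that pulls the total and the documents out of it. The function name extract_documents is an assumption for illustration, and the response text is a trimmed copy of the output above rather than a live call.

```python
import json

# Trimmed copy of the _search response shown above.
response_text = """
{
  "took": 3,
  "timed_out": false,
  "hits": {
    "total": {"value": 1, "relation": "eq"},
    "max_score": 1.0,
    "hits": [
      {"_index": "book", "_type": "_doc", "_id": "1", "_score": 1.0,
       "_source": {"name": "thinking in java", "ver": "1.0"}}
    ]
  }
}
"""


def extract_documents(search_response):
    """Return (total, [(id, source), ...]) from a response shaped like the one above."""
    hits = search_response["hits"]
    total = hits["total"]["value"]  # here "total" is an object with value and relation
    docs = [(hit["_id"], hit["_source"]) for hit in hits["hits"]]
    return total, docs


total, docs = extract_documents(json.loads(response_text))
print(total, docs)
```

The original document is always under _source; the outer hits object carries the metadata, and the inner hits array carries one entry per matching document.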

curl http://localhost:9200/book/_doc/1?pretty=true
{
  "_index" : "book",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 1,
  "_seq_no" : 0,
  "_primary_term" : 1,
  "found" : true,
  "_source" : {
    "name" : "thinking in java",
    "ver" : "1.0"
  }
}
Conditional query syntax
a. Query condition
"query": {
  "term": { "user": "kimchy" }
}
b. Sorting
"sort": [
  { "post_date": { "order": "asc" } },
  { "name": "desc" }
]
c. Paging window
"from": 0,
"size": 10
d. Field projection (source filtering)
"_source": {
  "includes": [ "field1", "field2" ],
  "excludes": [ "field3" ]
}
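The four fragments above are meant to be combined into one JSON body and sent to the _search endpoint. The following Python sketch assembles such a body. The function name build_search_body and its parameters are assumptions for illustration, and it uses the includes/excludes spelling from the current es documentation.

```python
import json


def build_search_body(field, value, sort_field, order="asc",
                      page=0, page_size=10, includes=None, excludes=None):
    """Combine query, sort, paging window and projection into one search body."""
    body = {
        "query": {"term": {field: value}},
        "sort": [{sort_field: {"order": order}}],
        "from": page * page_size,   # offset of the first hit to return
        "size": page_size,          # number of hits to return
    }
    if includes or excludes:
        body["_source"] = {
            "includes": includes or [],
            "excludes": excludes or [],
        }
    return body


body = build_search_body("user", "kimchy", "post_date",
                         includes=["field1", "field2"], excludes=["field3"])
print(json.dumps(body, indent=2))
```

The printed JSON can be passed to curl with -d, together with -H 'Content-Type:application/json', against http://host:port/index/_search.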

5. Deleting indices and documents
REST API syntax
curl -X DELETE http://host:port/index   deletes the whole index
curl -X DELETE http://host:port/index/type/id   deletes a single document

Delete examples
curl -X DELETE http://localhost:9200/book?pretty=true
curl -X DELETE http://localhost:9200/book/_doc/1?pretty=true

6. Updating documents
REST API syntax
curl -X PUT http://host:port/index/type/id -H 'Content-Type:application/json' -d 'json content'   creating a document that already exists replaces it, which acts as an update
curl -X POST http://host:port/index/type/id/_update -H 'Content-Type:application/json' -d '{"doc":{"field":"value"}}'   partial update through the _update endpoint, syntax before 7.x
curl -X POST http://host:port/index/_update/id -H 'Content-Type:application/json' -d '{"doc":{"field":"value"}}'   partial update through the _update endpoint, syntax for 7.x and later
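The two _update forms differ only in where the endpoint name sits in the path. A minimal Python sketch of that rule follows; the helper names update_path and partial_update_body are hypothetical and exist only for this example.

```python
def update_path(index, doc_id, major_version, doc_type="_doc"):
    """Return the partial-update path for the given es major version."""
    if major_version >= 7:
        # 7.x and later: /index/_update/id
        return f"/{index}/_update/{doc_id}"
    # before 7.x: /index/type/id/_update
    return f"/{index}/{doc_type}/{doc_id}/_update"


def partial_update_body(**fields):
    """Wrap the changed fields in the "doc" envelope that _update expects."""
    return {"doc": fields}


print(update_path("book", 1, 7))
print(update_path("book", 1, 6, doc_type="python"))
print(partial_update_body(name="python programming"))
```

Unlike a full PUT, the "doc" body only needs the fields that change; fields that are not mentioned keep their current values.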

Update example
curl http://localhost:9200/book/_update/1?pretty=true -X POST -H 'Content-Type:application/json' -d '{"doc":{"name":"python programming"}}'
{
  "_index" : "book",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 3,
  "result" : "updated",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 5,
  "_primary_term" : 1
}


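II. Building an ELK Log Collection and Analysis System
1. The log collection and analysis pipeline
Data input (collection) -> data processing -> data output (storage)
a. Collect log file contents, tracking changes to the files in real time and picking up new entries: filebeat
b. Log formatting: logstash
c. Log storage and querying: es
d. Visualization: kibana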

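2. Setting up and configuring filebeat
logstash can itself read data directly from files. filebeat is still used for collection because it is lightweight and does not need a Java runtime.

a. Pull the image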
docker pull prima/filebeat:6.4.2

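b. Write the configuration file
filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /home/shared_disk/tomcat_logs/*.log
#output.logstash:
#  enabled: true
#  hosts: ["localhost:5044"]
output.console:
  enabled: true
  pretty: true
This minimal configuration prints the collected log entries to the console. I have not yet explored more complex configurations; other resources cover those.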

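c. Start the container
docker run -d --name filebeat01 -v /home/shared_disk/filebeat/filebeat.yml:/filebeat.yml -v /home/shared_disk/tomcat_logs:/home/shared_disk/tomcat_logs prima/filebeat:6.4.2
Pay attention to the mounted directory and file so that filebeat can actually reach the log files.
Once a .log file under /home/shared_disk/tomcat_logs/ receives new entries, the filebeat console (for example via docker logs filebeat01) prints output like the following, which shows that collection works: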
{
  "@timestamp": "2019-08-23T06:51:37.889Z",
  "@metadata": {
    "beat": "filebeat",
    "type": "doc",
    "version": "6.4.2"
  },
  "source": "/home/shared_disk/tomcat_logs/docker-learn-info.log",
  "offset": 47547,
  "message": "2019-08-23 06:51:34,030 [http-nio-8083-exec-9] INFO com.allen.dockerlearn.DockerLearnApplication - this is the message for test",
  "prospector": {
    "type": "log"
  },
  "input": {
  "type": "log"
  },
  "beat": {
    "hostname": "efd575c7aa99",
    "version": "6.4.2",
    "name": "efd575c7aa99"
  },
  "host": {
    "name": "efd575c7aa99"
  }
}

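3. Setting up and configuring logstash
logstash processes incoming data (formatting, type conversion, adding fields, removing fields, and so on) to produce the data format the user wants, and writes it to a chosen destination.

a. Pull the image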
docker pull logstash:7.3.0

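b. Start the container
docker run -d -p 5044:5044 -p 9600:9600 -v /home/shared_disk/tomcat_logs/:/home/shared_disk/tomcat_logs/ --name logstash01 logstash:7.3.0
Because logstash reads the log files directly at this stage, check the permissions on the tomcat_logs directory: the container runs as the logstash user, and without permission it cannot read the mounted directory.

c. Enter the container and edit the configuration files under /usr/share/logstash/config
Edit pipelines.yml:
- pipeline.id: logstash-1
  path.config: "/usr/share/logstash/config/*.conf"
Create my-logstash.conf: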
input {
  file {
    path => "/home/shared_disk/tomcat_logs/*.log"
    type => "system"
    start_position => "beginning"
  }
}

filter {

}

output {
  stdout {
  }

}

d. Restart the docker container. The container log then shows console output like the following:
{
"path" => "/home/shared_disk/tomcat_logs/docker-learn-info.log",
"@timestamp" => 2019-08-23T09:32:57.085Z,
"host" => "d677fed8b760",
"@version" => "1",
"message" => "2019-08-23 09:32:56,976 [http-nio-8083-exec-3] INFO com.allen.dockerlearn.DockerLearnApplication - this is the message for test",
"type" => "system"
}

4. Setting up kibana
a. Pull the image
docker pull kibana:7.3.0
b. Start the container
docker run -d -p 5601:5601 --name kibana01 kibana:7.3.0
c. Modify the configuration file
/usr/share/kibana/config/kibana.yml
server.host: "0.0.0.0"
elasticsearch.hosts: [ "http://192.168.16.84:9200" ]
d. Restart the docker container

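5. ELK integration
Building on the setup above, only the filebeat output and the logstash input and output need to change to complete the ELK integration.

a. filebeat output change (in filebeat.yml, replacing output.console, since filebeat allows only one enabled output)
output.logstash:
  enabled: true
  hosts: ["192.168.16.84:5044"]

b. logstash input change (inside the input block)
beats {
  port => 5044
}

c. logstash output change (inside the output block)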
elasticsearch {
  hosts => "192.168.16.84:9200"
  index => "api-log"
}

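d. Log formatting configuration
An event that logstash receives from filebeat carries many fields, some added by filebeat and some by logstash, but the part we care about most is message:
"message" => "2019-08-23 09:32:56,976 [http-nio-8083-exec-3] INFO com.allen.dockerlearn.DockerLearnApplication - this is the message for test"
So the data needs further processing. Add the following to the logstash filter: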
filter {
  grok {
    match => {
      "message" => "%{TIMESTAMP_ISO8601:log_time}\s+\[%{DATA:thread_name}\]\s+%{LOGLEVEL:log_level}\s+%{DATA:class_name}\s+-\s+(?<log_info>(\w+\s*)*)"
    }
  }
  mutate {
    add_field => {
      "host_name" => "%{[host][name]}"
    }
    remove_field => ["source","beat","host","@version","@timestamp","prospector","input","tags","offset","message"]
  }
}
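This filter extracts several fields from message and names them log_time, thread_name, log_level, class_name and log_info.
It also adds a host_name field, whose value comes from the existing host.name field, and removes source, beat and the other listed fields through to message. The final event looks like this: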
{
"thread_name" => "http-nio-8083-exec-1",
"log_time" => "2019-08-26 13:31:11,029",
"host_name" => "efd575c7aa99",
"log_info" => "this is the message for test",
"class_name" => "com.allen.dockerlearn.DockerLearnApplication",
"log_level" => "INFO"
}
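To make the effect of the filter easier to follow, here is a rough Python approximation of the grok match and the mutate step. It is a sketch, not how logstash works internally: simplified regular expressions stand in for grok's TIMESTAMP_ISO8601, DATA and LOGLEVEL patterns, log_info is matched with a plain .* instead of the word-based group, and the function name transform is an assumption for illustration.

```python
import re

# Simplified stand-ins for the grok patterns used in the filter.
LOG_LINE = re.compile(
    r"(?P<log_time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"\[(?P<thread_name>.*?)\]\s+"
    r"(?P<log_level>[A-Z]+)\s+"
    r"(?P<class_name>.*?)\s+-\s+"
    r"(?P<log_info>.*)"
)


def transform(event):
    """Extract fields from message, add host_name, and drop every other field."""
    match = LOG_LINE.match(event["message"])
    if match is None:
        # logstash's grok would tag the event with _grokparsefailure
        # rather than return nothing; this sketch just signals the miss.
        return None
    result = match.groupdict()
    result["host_name"] = event["host"]["name"]
    return result


event = {
    "message": "2019-08-23 09:32:56,976 [http-nio-8083-exec-3] INFO "
               "com.allen.dockerlearn.DockerLearnApplication - this is the message for test",
    "host": {"name": "efd575c7aa99"},
    "offset": 47547,
}
print(transform(event))
```

Testing a pattern against one sample line in this way is a quick check before editing the logstash configuration and restarting the container.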

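ELK integration is now complete.
Open the Kibana home page and create an index pattern in the settings. Logs can then be searched and viewed in the Discover section, and the Visualize section can be used for log statistics and analysis.

Original article: https://www.cnblogs.com/xiao-tao/p/11412790.html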