Quickly Setting Up an Application Log Collection System (Filebeat + ElasticSearch + Kibana)

Overview

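Requirements: the systems run CentOS, and several applications are deployed across multiple servers, which makes it very inconvenient to view application logs and troubleshoot problems day to day. So I decided to set up a server log collection system. Because the daily log volume is only at the gigabyte level, I am not building a cluster for now.
The technical approach is Filebeat + ElasticSearch + Kibana (ElasticSearch and Kibana are installed on the log server; Filebeat is installed on the other application servers). I did not add Logstash, Flume, Kafka, Redis, or the like. First, Filebeat is lightweight, uses few resources, and can ship logs directly to Elasticsearch. Second, the goal is only to make production service logs easy to view; the log fields do not need to be parsed, so I did not want to introduce too many technologies and add complexity. Redis was left out only because the existing Redis cluster mainly serves the trading system, and I did not want to add extra risk to it.
In addition, the log server chosen for this has modest memory and other specs, so it is not suited to running too much software. That is all there is to it.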


See also an earlier blog post of mine:
Setting up an ELK 5.4.3 environment on Windows http://www.cnblogs.com/huligong1234/p/7108109.html

I. Install ElasticSearch

1. Install the JDK 8 environment

[root@app-001 src]# cd /usr/local/src/
[root@app-001 src]# rpm -qa | grep jdk
java-1.6.0-openjdk-1.6.0.41-1.13.13.1.el6_8.x86_64
[root@app-001 src]# rpm -e java-1.6.0-openjdk-1.6.0.41-1.13.13.1.el6_8.x86_64
[root@app-001 src]# curl -L -o jdk-8u144-linux-x64.rpm "http://download.oracle.com/otn-pub/java/jdk/8u144-b01/090f390dda5b47b9b721c7dfaa008135/jdk-8u144-linux-x64.rpm?AuthParam=1506173332_32b98eb52c8955419974ec3efcba2209"
[root@app-001 src]# rpm -ivh jdk-8u144-linux-x64.rpm
[root@app-001 src]# java -version

2. Install ElasticSearch

[root@app-001 src]# curl -L -O https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-5.6.1.rpm
[root@app-001 src]# rpm -ivh elasticsearch-5.6.1.rpm
[root@app-001 src]# chkconfig --add elasticsearch

Install the plugin that adds Chinese word segmentation support
[root@app-001 src]# /usr/share/elasticsearch/bin/elasticsearch-plugin install https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v5.6.1/elasticsearch-analysis-ik-5.6.1.zip

3. Configure ElasticSearch

[root@app-001 src]# vi /etc/elasticsearch/elasticsearch.yml

# ======================== Elasticsearch Configuration =========================
#
# NOTE: Elasticsearch comes with reasonable defaults for most settings.
#       Before you set out to tweak and tune the configuration, make sure you
#       understand what are you trying to accomplish and the consequences.
#
# The primary way of configuring a node is via this file. This template lists
# the most important settings you may want to configure for a production cluster.
#
# Please consult the documentation for further information on configuration options:
# https://www.elastic.co/guide/en/elasticsearch/reference/index.html
#
# ---------------------------------- Cluster -----------------------------------
#
# Use a descriptive name for your cluster:
#
#cluster.name: my-application
#
# ------------------------------------ Node ------------------------------------
#
# Use a descriptive name for the node:
#
node.name: node-1
#
# Add custom attributes to the node:
#
#node.attr.rack: r1
#
# ----------------------------------- Paths ------------------------------------
#
# Path to directory where to store the data (separate multiple locations by comma):
#
#path.data: /data/elasticsearch/data
#
# Path to log files:
#
#path.logs: /data/elasticsearch/logs
#
# ----------------------------------- Memory -----------------------------------
#
# Lock the memory on startup:
#
bootstrap.memory_lock: true

#
# Make sure that the heap size is set to about half the memory available
# on the system and that the owner of the process is allowed to use this
# limit.
#
# Elasticsearch performs poorly when the system is swapping the memory.
#
# ---------------------------------- Network -----------------------------------
#
# Set the bind address to a specific IP (IPv4 or IPv6):
#
network.host: 192.168.1.106
#
# Set a custom port for HTTP:
#
http.port: 9200
#
# For more information, consult the network module documentation.
#
# --------------------------------- Discovery ----------------------------------
#
# Pass an initial list of hosts to perform discovery when new node is started:
# The default list of hosts is ["127.0.0.1", "[::1]"]
#
#discovery.zen.ping.unicast.hosts: ["host1", "host2"]
#
# Prevent the "split brain" by configuring the majority of nodes (total number of master-eligible nodes / 2 + 1):
#
#discovery.zen.minimum_master_nodes: 3
#
# For more information, consult the zen discovery module documentation.
#
# ---------------------------------- Gateway -----------------------------------
#
# Block initial recovery after a full cluster restart until N nodes are started:
#
#gateway.recover_after_nodes: 3
#
# For more information, consult the gateway module documentation.
#
# ---------------------------------- Various -----------------------------------
#
# Require explicit names when deleting indices:
#
#action.destructive_requires_name: true
http.cors.enabled: true
http.cors.allow-origin: "*"
#http.cors.allow-origin: "esmgr.domain.com"

Note: if the CentOS version is lower than 7.0, the following configuration adjustments are also required:

bootstrap.memory_lock: false
bootstrap.system_call_filter: false

If access from outside the internal network is needed, change network.host to the following:

network.host: 0.0.0.0

4. Start ElasticSearch

[root@app-001 src]# service elasticsearch start
[root@app-001 src]# curl "http://192.168.1.106:9200" # check that it started

5. Open port 9200 in the iptables firewall so that other machines on the internal network can connect

[root@app-001 src]# vi /etc/sysconfig/iptables
Add the following line:
-A INPUT -s 192.168.1.0/24 -p tcp -m state --state NEW -m tcp --dport 9200 -j ACCEPT

[root@app-001 src]# service iptables restart
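
The `-s 192.168.1.0/24` part of the rule above limits access to source addresses whose first 24 bits match 192.168.1.0. As a rough illustration of that matching arithmetic only (this is not how iptables is implemented, and the class and method names are made up for this sketch), the check can be written as:

```java
// Sketch: does an IPv4 address fall inside a CIDR block such as 192.168.1.0/24?
// CidrCheck, toInt and inCidr are hypothetical names used only for illustration.
class CidrCheck {

    // Convert a dotted-quad IPv4 literal into a 32-bit integer.
    static int toInt(String ipv4) {
        String[] parts = ipv4.split("\\.");
        if (parts.length != 4) {
            throw new IllegalArgumentException("not an IPv4 address: " + ipv4);
        }
        int value = 0;
        for (String part : parts) {
            int octet = Integer.parseInt(part);
            if (octet < 0 || octet > 255) {
                throw new IllegalArgumentException("octet out of range: " + part);
            }
            value = (value << 8) | octet;
        }
        return value;
    }

    // Compare only the leading "prefix" bits of the address and the network.
    static boolean inCidr(String ip, String cidr) {
        String[] parts = cidr.split("/");
        int prefix = Integer.parseInt(parts[1]);
        // A shift by 32 is a no-op for int in Java, so /0 is handled separately.
        int mask = prefix == 0 ? 0 : -1 << (32 - prefix);
        return (toInt(ip) & mask) == (toInt(parts[0]) & mask);
    }

    public static void main(String[] args) {
        System.out.println(inCidr("192.168.1.50", "192.168.1.0/24")); // same /24 network
        System.out.println(inCidr("192.168.2.50", "192.168.1.0/24")); // different network
    }
}
```

An application server at 192.168.1.x would therefore be allowed to reach port 9200, while a host on 192.168.2.x would not match this rule.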

II. Install Filebeat

1. Download

[root@app-001 src]# curl -L -O https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-5.6.1-x86_64.rpm

2. Install

[root@app-001 src]# rpm -ivh filebeat-5.6.1-x86_64.rpm
[root@app-001 src]# chkconfig --add filebeat

3. Configure

[root@app-001 src]# vi /etc/filebeat/filebeat.yml

###################### Filebeat Configuration Example #########################

# This file is an example configuration file highlighting only the most common
# options. The filebeat.full.yml file from the same directory contains all the
# supported options with more comments. You can use it as a reference.
#
# You can find the full configuration reference here:
# https://www.elastic.co/guide/en/beats/filebeat/index.html

#=========================== Filebeat prospectors =============================

filebeat.prospectors:

# Each - is a prospector. Most options can be set at the prospector level, so
# you can use different prospectors for various configurations.
# Below are the prospector specific configurations.

- input_type: log

  # Paths that should be crawled and fetched. Glob based paths.
  paths:
    #- /var/log/*.log
    - /opt/tomcat-myapp/logs/myapp.log
    - /data/production/tomcat-myapp/logs/catalina.out
    #- c:\programdata\elasticsearch\logs\*
  fields_under_root: true
  fields:
    log_type: myapp
  tags: ["myapp","tomcat-log"]
  # Exclude lines. A list of regular expressions to match. It drops the lines that are
  # matching any regular expression from the list.
  exclude_lines: ["^DBG"]

  # Include lines. A list of regular expressions to match. It exports the lines that are
  # matching any regular expression from the list.
  #include_lines: ["^ERR", "^WARN"]

  # Exclude files. A list of regular expressions to match. Filebeat drops the files that
  # are matching any regular expression from the list. By default, no files are dropped.
  #exclude_files: [".gz$"]

  # Optional additional fields. These field can be freely picked
  # to add additional information to the crawled log files for filtering
  #fields:
  #  level: debug
  #  review: 1

  ### Multiline options

  # Multiline can be used for log messages spanning multiple lines. This is common
  # for Java Stack Traces or C-Line Continuation

  # The regexp Pattern that has to be matched. The example pattern matches all lines starting with [
  multiline.pattern: ^\[

  # Defines if the pattern set under pattern should be negated or not. Default is false.
  #multiline.negate: false

  # Match can be set to "after" or "before". It is used to define if lines should be append to a pattern
  # that was (not) matched before or after or as long as a pattern is not matched based on negate.
  # Note: After is the equivalent to previous and before is the equivalent to next in Logstash
  multiline.match: after
  encoding: utf-8
  
- input_type: log
  paths:
    - /opt/tomcat-apiserver/logs/apiserver.log
    - /data/production/tomcat-apiserver/logs/catalina.out
  fields_under_root: true
  fields:
    log_type: apiserver
  tags: ["tomcat-log"]
  encoding: utf-8
  exclude_lines: ["^DBG"]
  multiline.pattern: ^\[
  multiline.match: after
  
- input_type: log
  paths:
    - /usr/local/tengine/logs/error.log
  fields_under_root: true
  fields:
    log_type: nginx-error
  tags: ["nginx-log"]
  encoding: utf-8
  
- input_type: log
  paths:
    - /var/log/*.log
  fields_under_root: true
  fields:
    log_type: system
  tags: ["system-log"]
  encoding: utf-8
#================================ General =====================================

# The name of the shipper that publishes the network data. It can be used to group
# all the transactions sent by a single shipper in the web interface.
#name:

# The tags of the shipper are included in their own field with each
# transaction published.
#tags: ["service-X", "web-tier"]
# Optional fields that you can specify to add additional information to the
# output.
fields:
    log_host: ip-106
#  env: staging

#================================ Outputs =====================================

# Configure what outputs to use when sending the data collected by the beat.
# Multiple outputs may be used.

#-------------------------- Elasticsearch output ------------------------------
output.elasticsearch:
  # Array of hosts to connect to.
  hosts: ["192.168.1.106:9200"]

  # Optional protocol and basic auth credentials.
  #protocol: "https"
  #username: "elastic"
  #password: "changeme"

#----------------------------- Logstash output --------------------------------
#output.logstash:
  # The Logstash hosts
  #hosts: ["localhost:5044"]

  # Optional SSL. By default is off.
  # List of root certificates for HTTPS server verifications
  #ssl.certificate_authorities: ["/etc/pki/root/ca.pem"]

  # Certificate for SSL client authentication
  #ssl.certificate: "/etc/pki/client/cert.pem"

  # Client Certificate Key
  #ssl.key: "/etc/pki/client/cert.key"

#================================ Logging =====================================

# Sets log level. The default log level is info.
# Available log levels are: critical, error, warning, info, debug
#logging.level: debug

# At debug level, you can selectively enable logging only for some components.
# To enable all selectors use ["*"]. Examples of other selectors are "beat",
# "publish", "service".
#logging.selectors: ["*"]
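
To make the effect of the multiline and exclude_lines settings above more concrete, here is a simplified Java model of what they do to a stream of log lines. It assumes the multiline pattern is meant to match lines starting with `[`, with `negate` left at its default of false and `match: after`, so matching lines are appended to the preceding event; it also assumes, as the Filebeat documentation describes, that lines are combined into multiline events before exclude_lines is applied. The class name, method names and sample log lines are invented for this sketch, and real Filebeat behaviour (timeouts, maximum line counts, Go regular expressions) is not modelled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Simplified model of multiline (negate: false, match: after) followed by exclude_lines.
// MultilineSketch is a hypothetical name; this is not Filebeat code.
class MultilineSketch {

    // Lines that match the pattern are appended to the event started by the
    // previous non-matching line; any other line starts a new event.
    static List<String> group(List<String> lines, Pattern pattern) {
        List<String> events = new ArrayList<>();
        StringBuilder current = null;
        for (String line : lines) {
            boolean matches = pattern.matcher(line).find();
            if (matches && current != null) {
                current.append('\n').append(line);
            } else {
                if (current != null) {
                    events.add(current.toString());
                }
                current = new StringBuilder(line);
            }
        }
        if (current != null) {
            events.add(current.toString());
        }
        return events;
    }

    // Drop every combined event that matches the exclusion pattern.
    static List<String> exclude(List<String> events, Pattern excluded) {
        List<String> kept = new ArrayList<>();
        for (String event : events) {
            if (!excluded.matcher(event).find()) {
                kept.add(event);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "ERR request failed",
                "[detail] order id missing",
                "[detail] retry scheduled",
                "DBG cache warm-up",
                "INFO started");
        List<String> events = exclude(
                group(lines, Pattern.compile("^\\[")),
                Pattern.compile("^DBG"));
        for (String event : events) {
            System.out.println("--- event ---");
            System.out.println(event);
        }
    }
}
```

In this model the two `[detail]` lines are folded into the `ERR` event, the `DBG` event is dropped, and the `INFO` line is shipped on its own.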

4. Start

[root@app-001 src]# /usr/bin/filebeat.sh -configtest
[root@app-001 src]# service filebeat start

[root@app-001 src]# service filebeat status # check the status

[root@app-001 src]# tail -f /var/log/filebeat/filebeat

III. Install Kibana

1. Download

[root@app-001 src]# curl -L -O https://artifacts.elastic.co/downloads/kibana/kibana-5.6.1-x86_64.rpm

2. Install

[root@app-001 src]# rpm -ivh kibana-5.6.1-x86_64.rpm
[root@app-001 src]# chkconfig --add kibana

3. Configure

[root@app-001 src]# vi /etc/kibana/kibana.yml

server.port: 5601
server.host: "192.168.1.106"
elasticsearch.url: "http://192.168.1.106:9200"

4. Start

[root@app-001 src]# service kibana start

Open http://192.168.1.106:5601/ in a browser.

IV. Install plugins

1. elasticsearch-head

1.1. Install the NodeJS environment

[root@app-001 src]# curl --silent --location https://rpm.nodesource.com/setup_8.x | bash -
[root@app-001 src]# yum install -y nodejs
[root@app-001 src]# node -v 
[root@app-001 src]# npm -v

1.2. Download elasticsearch-head

[root@app-001 src]# wget https://codeload.github.com/mobz/elasticsearch-head/zip/master
[root@app-001 src]# unzip master

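1.3. Modify the configuration

1.3.1. Edit elasticsearch.yml and add the CORS settings (ES must be restarted for them to take effect)
http.cors.enabled: true
http.cors.allow-origin: "*"

1.3.2. Edit elasticsearch-head/Gruntfile.js to change the address the server listens on: add a hostname property to the connect node and set its value to *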

connect: {
    server: {
        options: {
            hostname:'*',
            port: 9100,
            base: '.',
            keepalive: true
        }
    }
}

1.3.3. Edit elasticsearch-head/_site/app.js
and change the default ES address http://localhost:9200/ to http://192.168.1.106:9200/.

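1.4. Start

[root@app-001 src]# cd elasticsearch-head-master
[root@app-001 elasticsearch-head-master]# npm install
[root@app-001 elasticsearch-head-master]# npm run start
Open http://192.168.1.106:9100/ in a browser.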

2.bigdesk

http://bigdesk.org/

V. Use Basic Auth to restrict access to ElasticSearch and Kibana

1. Create the password

[root@app-001 src]# htpasswd -c /usr/local/tengine/db/passwd.db loguser

2. Edit the Nginx configuration file nginx.conf

[root@app-001 src]# vi /usr/local/tengine/conf/nginx.conf

		server {
			listen          80;
			server_name     esmgr.domain.com;
			auth_basic "basic auth esmgr";
			auth_basic_user_file /usr/local/tengine/db/passwd.db;
			location / { 
				proxy_pass        http://192.168.1.106:9200; 
			}
			location /head/ {
				proxy_pass    http://192.168.1.106:9100/; 
			}
		}

3. Reload Nginx so the change takes effect

[root@app-001 src]# /usr/local/tengine/sbin/nginx -t
nginx: the configuration file /usr/local/tengine/conf/nginx.conf syntax is ok
nginx: configuration file /usr/local/tengine/conf/nginx.conf test is successful
[root@app-001 src]# /usr/local/tengine/sbin/nginx -s reload
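
Once the proxy requires Basic Auth, any client calling ElasticSearch through esmgr.domain.com has to send an `Authorization` header. A minimal sketch of building that header in Java follows; the user name `loguser` comes from the htpasswd command above, while the password `changeit` and the class name are placeholders chosen for this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Sketch: build the value of an HTTP Basic "Authorization" header.
// BasicAuthHeader is a hypothetical helper name; the password is a placeholder.
class BasicAuthHeader {

    // Basic auth is "Basic " followed by base64("user:password").
    static String basicAuth(String user, String password) {
        byte[] raw = (user + ":" + password).getBytes(StandardCharsets.UTF_8);
        return "Basic " + Base64.getEncoder().encodeToString(raw);
    }

    public static void main(String[] args) {
        // "changeit" stands in for whatever password was entered at the htpasswd prompt.
        System.out.println("Authorization: " + basicAuth("loguser", "changeit"));
    }
}
```

With Apache HttpClient the value could be attached with `httpPost.setHeader("Authorization", ...)`. Note that Basic Auth only encodes the credentials, so over plain HTTP on port 80 they travel unencrypted.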

VI. Example queries with curl

curl -XGET "http://192.168.1.106:9200/_search" -H 'Content-Type: application/json' -d'
{
  "query": {
    "match": {"message": "JD_08011137015349778"}
  }
}'

curl -XGET "http://192.168.1.106:9200/_search" -H 'Content-Type: application/json' -d'
{
  "query": {
    "match": {"log_type": "myapp"}
  },
  "size": 5
}'

curl -XGET "http://192.168.1.106:9200/_search" -H 'Content-Type: application/json' -d'
{"query":{"bool":{"must":[{"match":{"message":"JD_08011137015349778"}}],"filter":[{"range":{"@timestamp":{"from":"now-1d","to":"now"}}}]}}}
'

curl -XGET "http://192.168.1.106:9200/_search" -H 'Content-Type: application/json' -d'
{"query":{"bool":{"must":[{"match":{"message":"JD_08011137015349778"}}],"filter":[{"range":{"@timestamp":{"gte":"1506441600000","lte":"1506527999000"}}}]}}}
'
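
The `gte` and `lte` values in the last query are epoch milliseconds. As a sketch of where such numbers come from, the snippet below computes the start and end of a calendar day with `java.time`. It assumes the time zone is Asia/Shanghai (UTC+8), under which the two values in the query correspond to 2017-09-27 00:00:00 and 2017-09-27 23:59:59; the class and method names are made up for this example.

```java
import java.time.LocalDate;
import java.time.LocalTime;
import java.time.ZoneId;

// Sketch: epoch-millisecond bounds of one calendar day in a given time zone.
// DayRange is a hypothetical name; Asia/Shanghai is an assumed time zone.
class DayRange {

    // 00:00:00 of the given day, as epoch milliseconds.
    static long startOfDayMillis(LocalDate day, ZoneId zone) {
        return day.atStartOfDay(zone).toInstant().toEpochMilli();
    }

    // 23:59:59 of the given day, as epoch milliseconds.
    static long endOfDayMillis(LocalDate day, ZoneId zone) {
        return day.atTime(LocalTime.of(23, 59, 59)).atZone(zone).toInstant().toEpochMilli();
    }

    public static void main(String[] args) {
        ZoneId zone = ZoneId.of("Asia/Shanghai");
        LocalDate day = LocalDate.of(2017, 9, 27);
        System.out.println(startOfDayMillis(day, zone)); // 1506441600000
        System.out.println(endOfDayMillis(day, zone));   // 1506527999000
    }
}
```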

VII. A simple wrapper example that calls the API with Java HttpClient

package org.jeedevframework.common.es;

import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

import javax.servlet.http.HttpServletRequest;

import org.apache.commons.lang3.StringUtils;
import org.apache.http.HttpResponse;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.entity.ContentType;
import org.apache.http.entity.StringEntity;
import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.http.params.BasicHttpParams;
import org.apache.http.params.HttpConnectionParams;
import org.apache.http.util.EntityUtils;
import org.json.JSONArray;
import org.json.JSONException;
import org.json.JSONObject;

import org.jeedevframework.util.DateUtil;
/**
 * Java wrapper example for the following curl request:
 * curl -XGET "http://192.168.1.106:9200/_search" -H 'Content-Type: application/json' -d'
 * {"query":{"bool":{"must":[{"match":{"message":"JD_08011137015349778"}}],
 * "filter":[{"range":{"@timestamp":{"gte":"1506441600000","lte":"1506527999000"}}}]}}}
 * '
 */
public class EsQueryService {
	
	public static DefaultHttpClient httpClient = null;
	public static synchronized DefaultHttpClient getHttpClientInstance() {
		if (null == httpClient) {
			BasicHttpParams httpParams = new BasicHttpParams();
			HttpConnectionParams.setConnectionTimeout(httpParams, 120000);
			HttpConnectionParams.setSoTimeout(httpParams, 120000);
			// Assign to the static field rather than a new local variable,
			// so the client is actually cached and reused across calls
			httpClient = new DefaultHttpClient(httpParams);
			httpClient.getParams().setParameter("http.protocol.content-charset", "UTF-8");
		}
		return httpClient;
	}

	public static String query(HttpServletRequest request) throws JSONException, ParseException {
		 String keywords = request.getParameter("keywords");
		 String startTime = request.getParameter("startTime");
		 String endTime = request.getParameter("endTime");
		 int pageSize = Integer.valueOf(StringUtils.defaultIfEmpty(request.getParameter("pageSize"), "10"));
		 String log_type = StringUtils.defaultIfEmpty(request.getParameter("log_type"), "");

		if(StringUtils.isEmpty(startTime)){
			startTime = DateUtil.format(new Date(),DateUtil.C_DATE_PATTON_DEFAULT)+" 00:00:00";
		}
		if(StringUtils.isEmpty(endTime)){
			endTime = DateUtil.format(new Date(),DateUtil.C_DATE_PATTON_DEFAULT)+" 23:59:59";
		}

		long startTimeDt = new SimpleDateFormat(DateUtil.C_TIME_PATTON_DEFAULT).parse(startTime).getTime();
		long endTimeDt = new SimpleDateFormat(DateUtil.C_TIME_PATTON_DEFAULT).parse(endTime).getTime();

		DefaultHttpClient httpClient = getHttpClientInstance();

		HttpPost httpPost = new HttpPost("http://192.168.1.106:9200/_search");

		JSONObject esQueryJo = new JSONObject();
		JSONObject queryJo = new JSONObject();
		JSONObject boolJo = new JSONObject();


		//esQueryJo.put("min_score", 1.2);

		JSONArray mustJoArr = new JSONArray();
		if(StringUtils.isNotEmpty(keywords)){
			JSONObject matchJo = new JSONObject();
			matchJo.put("message", keywords);
			JSONObject matchWrapJo = new JSONObject();
			matchWrapJo.put("match", matchJo);
			mustJoArr.put(matchWrapJo);
		}

		if(StringUtils.isNotEmpty(log_type)){
			JSONObject matchJo = new JSONObject();
			matchJo.put("log_type", log_type);
			JSONObject matchWrapJo = new JSONObject();
			matchWrapJo.put("match", matchJo);
			mustJoArr.put(matchWrapJo);
		}


		JSONArray filterJoArr = new JSONArray();
		JSONObject rangeJo = new JSONObject();
		JSONObject timestampJo = new JSONObject();
		timestampJo.put("gte", startTimeDt);
		timestampJo.put("lte", endTimeDt);
		//timestampJo.put("from", "now-1d");
		//timestampJo.put("to", "now");
		rangeJo.put("@timestamp", timestampJo);
		//mustJoArr.put("match", matchJo);
		JSONObject rangeWrapJo = new JSONObject();
		rangeWrapJo.put("range", rangeJo);
		filterJoArr.put(rangeWrapJo);

		boolJo.put("must",mustJoArr);
		boolJo.put("filter",filterJoArr);

		queryJo.put("bool", boolJo);
		esQueryJo.put("query", queryJo);
		esQueryJo.put("size", pageSize);
		String esQueryString = esQueryJo.toString();
		String resultContent = "";
		if(mustJoArr.length()>0){
			StringEntity reqEntity = new StringEntity(esQueryString ,ContentType.APPLICATION_JSON);
			httpPost.setEntity(reqEntity);
			try{
				 HttpResponse resp = httpClient.execute(httpPost);
				 resultContent = EntityUtils.toString(resp.getEntity(), "UTF-8");
				 return resultContent;
			}catch(Exception e){
				// Print the failure and fall through to return an empty result
				e.printStackTrace();
			}
		}

		return "";
	}
		

}
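
For readers who would rather not depend on Apache HttpClient and org.json, the same request can be sketched with only the JDK standard library. This is an alternative illustration, not the wrapper above: the class and method names are hypothetical, the JSON is assembled by hand with a small escaping helper, the 10-second timeouts are arbitrary, and only the keyword match and the time range filter are covered.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Sketch: build the bool query by hand and POST it with HttpURLConnection.
// EsQuerySketch and its methods are hypothetical names used for illustration.
class EsQuerySketch {

    // Escape characters that may not appear raw inside a JSON string literal.
    static String jsonEscape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters use the 4-digit hex escape form.
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    // Same shape as the curl example: match on message, range filter on @timestamp.
    static String buildQuery(String keywords, long gteMillis, long lteMillis, int size) {
        return "{\"query\":{\"bool\":{"
                + "\"must\":[{\"match\":{\"message\":\"" + jsonEscape(keywords) + "\"}}],"
                + "\"filter\":[{\"range\":{\"@timestamp\":{\"gte\":" + gteMillis
                + ",\"lte\":" + lteMillis + "}}}]"
                + "}},\"size\":" + size + "}";
    }

    // POST a JSON body and return the response body as a UTF-8 string.
    static String post(String url, String body) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestMethod("POST");
        conn.setConnectTimeout(10000);
        conn.setReadTimeout(10000);
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-Type", "application/json; charset=UTF-8");
        try (OutputStream out = conn.getOutputStream()) {
            out.write(body.getBytes(StandardCharsets.UTF_8));
        }
        try (InputStream in = conn.getInputStream()) {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buf.write(chunk, 0, n);
            }
            return new String(buf.toByteArray(), StandardCharsets.UTF_8);
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        String query = buildQuery("JD_08011137015349778", 1506441600000L, 1506527999000L, 10);
        System.out.println(query);
        // Uncomment to send the query to the log server from this article:
        // System.out.println(post("http://192.168.1.106:9200/_search", query));
    }
}
```

Building JSON by string concatenation is only reasonable for a query this small; for anything larger a JSON library, as in the wrapper above, is the safer choice.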

VIII. Related references

ELK Performance (1): Logstash Performance and Its Alternatives
http://www.cnblogs.com/richaaaard/p/6109595.html

Installing ElasticSearch via RPM on CentOS
http://www.netpc.com.cn/2361.html

Installing and Configuring Elasticsearch on CentOS 7
https://www.biaodianfu.com/centos-7-install-elasticsearch.html

Common Errors and Problems When Installing and Deploying ElasticSearch 5.0.0
http://www.dajiangtai.com/community/18136.do?origin=csdn-geek&dt=1214

Errors When Starting elasticsearch 5.0
http://blog.csdn.net/qq942477618/article/details/53414983

Commonly Used ElasticSearch Query and Filter Statements
http://www.cnblogs.com/ghj1976/p/5293250.html

filebeat Topic Collection
http://www.cnblogs.com/louis2008/p/filebeat.html

filebeat.yml (Detailed Configuration Guide, in Chinese)
http://www.cnblogs.com/zlslch/p/6622079.html

28. Advanced Filebeat Configuration: the Filebeat Part
http://blog.csdn.net/a464057216/article/details/51233375

ELK + Filebeat + Nginx Centralized Logging Solution (Part 1)
http://zhengmingjing.blog.51cto.com/1587142/1907456

Using the ELK Log Service: Sending Multiple Files with filebeat
http://bbotte.com/logs-service/use-elk-processing-logs-multiple-log-file-send/

A First Look at ELK: Basic Steps for Setting Up an ELK Environment, Using nginx Log Collection as an Example
http://nosmoking.blog.51cto.com/3263888/1855680

Installing and Deploying filebeat + kafka + ELK 5.4
http://xiangcun168.blog.51cto.com/4788340/1933509

Building a Logging System with Filebeat5 + Kafka + ELK on Docker
http://www.jianshu.com/p/9dfac37885cb

Searching Data in elasticsearch via the HTTP RESTful API
http://blog.csdn.net/stark_summer/article/details/48830493

Elasticsearch + Logstash + Kibana Tutorial
http://www.cnblogs.com/xing901022/p/4704319.html