Installing and using Flume

1. Download

[linyouyi@hadoop01 software]$ wget https://mirrors.aliyun.com/apache/flume/1.9.0/apache-flume-1.9.0-bin.tar.gz
--2019-09-05 14:39:06--  https://mirrors.aliyun.com/apache/flume/1.9.0/apache-flume-1.9.0-bin.tar.gz
Resolving mirrors.aliyun.com (mirrors.aliyun.com)... 27.148.180.227, 119.147.111.230, 119.147.111.231, ...
Connecting to mirrors.aliyun.com (mirrors.aliyun.com)|27.148.180.227|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 67938106 (65M) [application/gzip]
Saving to: ‘apache-flume-1.9.0-bin.tar.gz’

100%[=======================================================================>] 67,938,106  30.0MB/s   in 2.2s   

2019-09-05 14:39:08 (30.0 MB/s) - ‘apache-flume-1.9.0-bin.tar.gz’ saved [67938106/67938106]

2. Extract

[linyouyi@hadoop01 software]$ tar -zxvf apache-flume-1.9.0-bin.tar.gz -C /hadoop/module/
[linyouyi@hadoop01 software]$ cd /hadoop/module/
[linyouyi@hadoop01 module]$ cd apache-flume-1.9.0-bin/
[linyouyi@hadoop01 apache-flume-1.9.0-bin]$ ll
total 176
drwxr-xr-x  2 linyouyi linyouyi  4096 Sep  5 14:40 bin
-rw-rw-r--  1 linyouyi linyouyi 85602 Nov 29  2018 CHANGELOG
drwxr-xr-x  2 linyouyi linyouyi  4096 Sep  5 14:40 conf
-rw-r--r--  1 linyouyi linyouyi  5681 Nov 16  2017 DEVNOTES
-rw-r--r--  1 linyouyi linyouyi  2873 Nov 16  2017 doap_Flume.rdf
drwxrwxr-x 12 linyouyi linyouyi  4096 Dec 18  2018 docs
drwxrwxr-x  2 linyouyi linyouyi  4096 Sep  5 14:40 lib
-rw-rw-r--  1 linyouyi linyouyi 43405 Dec 10  2018 LICENSE
-rw-r--r--  1 linyouyi linyouyi   249 Nov 29  2018 NOTICE
-rw-r--r--  1 linyouyi linyouyi  2483 Nov 16  2017 README.md
-rw-rw-r--  1 linyouyi linyouyi  1958 Dec 10  2018 RELEASE-NOTES
drwxrwxr-x  2 linyouyi linyouyi  4096 Sep  5 14:40 tools
[linyouyi@hadoop01 apache-flume-1.9.0-bin]$ ll conf/
total 16

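3. Start the agent

An agent is started with a shell script named flume-ng, located in the bin directory of the Flume distribution. You need to specify the agent name, the configuration directory, and the configuration file on the command line: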

bin/flume-ng agent -n $agent_name -c conf -f conf/flume-conf.properties.template

-n $agent_name                                # agent name (must match the name used in the configuration file)
-c conf                                       # configuration directory
-f conf/flume-conf.properties.template        # configuration file
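The short flags above also have long forms (--name, --conf, --conf-file), which are the ones used in the later examples. As a minimal sketch, the command line can be assembled from its three parts; the values here (a1 and conf/example.conf) are taken from the example in section 4 and are placeholders for your own setup. The command is printed rather than executed, so the sketch runs without Flume installed.

```shell
# Sketch: assemble the flume-ng command line from its three required parts.
agent_name="a1"                # must match the prefix used in the configuration file
conf_dir="conf"                # directory that may hold flume-env.sh and log4j settings
conf_file="conf/example.conf"  # agent configuration file (placeholder)

# --name, --conf and --conf-file are the long forms of -n, -c and -f.
cmd="bin/flume-ng agent --name $agent_name --conf $conf_dir --conf-file $conf_file"

# Print instead of running, so this works on a machine without Flume.
echo "$cmd"
```

To launch the agent for real, run the printed command from the Flume installation directory.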

4. A simple example

http://flume.apache.org/FlumeUserGuide.html#netcat-tcp-source

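Here we give an example configuration file describing a single-node Flume deployment. This configuration lets a user generate events and then logs them to the console.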

[linyouyi@hadoop01 apache-flume-1.9.0-bin]$ vim conf/example.conf
# example.conf: a single-node Flume configuration

# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444

# Describe the sink
a1.sinks.k1.type = logger

# Use a channel that buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

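This configuration defines a single agent named a1. a1 has a source that listens for data on port 44444, a channel that buffers event data in memory, and a sink that logs event data to the console. The configuration file names the various components, then describes their types and configuration parameters. A given configuration file may define several named agents; when a Flume process is launched, a flag tells it which named agent to run.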
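To illustrate the point about several named agents in one file, here is a minimal sketch. The second agent (a2), its port, and the temporary file location are hypothetical and not part of the original example. The last step lists the agent names by taking the first dot-separated field of each property key.

```shell
# Sketch: one properties file that defines two agents, a1 and a2.
conf_file="$(mktemp)"   # hypothetical location; normally a file under conf/
cat > "$conf_file" <<'EOF'
# Agent a1: netcat source -> memory channel -> logger sink
a1.sources = r1
a1.sinks = k1
a1.channels = c1
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444
a1.sinks.k1.type = logger
a1.channels.c1.type = memory
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

# Agent a2 (hypothetical): same layout, listening on another port
a2.sources = r1
a2.sinks = k1
a2.channels = c1
a2.sources.r1.type = netcat
a2.sources.r1.bind = localhost
a2.sources.r1.port = 44445
a2.sinks.k1.type = logger
a2.channels.c1.type = memory
a2.sources.r1.channels = c1
a2.sinks.k1.channel = c1
EOF

# Drop comment lines, keep property lines, take the agent prefix, de-duplicate.
agents="$(grep -v '^[[:space:]]*#' "$conf_file" | grep '=' | cut -d. -f1 | sort -u | paste -sd' ' -)"
echo "agents defined: $agents"
```

With such a file, passing --name a1 or --name a2 to flume-ng selects which of the two agents a given process runs.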

Given this configuration file, we can start Flume as follows:

[linyouyi@hadoop01 apache-flume-1.9.0-bin]$ bin/flume-ng agent --conf conf --conf-file conf/example.conf --name a1 -Dflume.root.logger=INFO,console

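Note that in a full deployment we would typically include one more option: --conf=<conf-dir>. The <conf-dir> directory would contain a shell script, flume-env.sh, and possibly a log4j properties file. In this example we pass a Java option to force Flume to log to the console, and we do not use a custom environment script.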

From a separate terminal, we can telnet to port 44444 and send Flume an event:

$ telnet localhost 44444
Trying 127.0.0.1...
Connected to localhost.localdomain (127.0.0.1).
Escape character is '^]'.
Hello world! <ENTER>
OK

The original Flume terminal will output the event in a log message.

12/06/19 15:32:19 INFO source.NetcatSource: Source starting
12/06/19 15:32:19 INFO source.NetcatSource: Created serverSocket:sun.nio.ch.ServerSocketChannelImpl[/127.0.0.1:44444]
12/06/19 15:32:34 INFO sink.LoggerSink: Event: { headers:{} body: 48 65 6C 6C 6F 20 77 6F 72 6C 64 21 0D          Hello world!. }
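The logger sink prints each event body as hexadecimal bytes followed by a text rendering. As a small sketch, the bytes from the log line above can be decoded back into text with POSIX printf. The trailing 0D is a carriage return, most likely left over from the CR LF line ending that telnet sends, which would explain the "." after "Hello world!" in the log.

```shell
# Sketch: decode the hex bytes of the event body shown in the log line above.
hex="48 65 6C 6C 6F 20 77 6F 72 6C 64 21 0D"

decoded=""
for byte in $hex; do
  # Convert the hex byte to a 3-digit octal escape, which POSIX printf understands.
  octal=$(printf '%03o' "0x$byte")
  decoded="$decoded$(printf "\\$octal")"
done

# Strip the trailing carriage return before printing, to keep the terminal tidy.
printf '%s\n' "$decoded" | tr -d '\r'
```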

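Congratulations: you have successfully configured and deployed a Flume agent! Subsequent sections cover agent configuration in more detail.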

5. Collecting data with the exec source

http://flume.apache.org/FlumeUserGuide.html#exec-source

[linyouyi@hadoop01 apache-flume-1.9.0-bin]$ cp conf/example.conf conf/example-exec.conf
[linyouyi@hadoop01 apache-flume-1.9.0-bin]$ vim conf/example-exec.conf
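# example-exec.conf: a single-node Flume configuration with an exec source

# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source: follow a log file with tail -F
a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /hadoop/module/text.log

# Describe the sink
a1.sinks.k1.type = logger

# Use a channel that buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1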

Start the agent

[linyouyi@hadoop01 apache-flume-1.9.0-bin]$ bin/flume-ng agent --conf conf --conf-file conf/example-exec.conf --name a1 -Dflume.root.logger=INFO,console

Open another terminal and keep appending data to /hadoop/module/text.log; the events show up in the output of the original Flume terminal.

[linyouyi@hadoop01 module]$ echo "flume" >> text.log
[linyouyi@hadoop01 module]$ echo "flume" >> text.log
[linyouyi@hadoop01 module]$ cat text.log 
flume
flume
flume
[linyouyi@hadoop01 module]$ echo "flume" >> text.log
[linyouyi@hadoop01 module]$ echo "hello linyouyi" >> text.log
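Instead of typing echo commands by hand, a short loop can generate test lines. This is only a sketch: the LOG_FILE variable, the line text, and the count are made up for illustration, and the default target is a temporary file so that the snippet is harmless to run. Point LOG_FILE at /hadoop/module/text.log to feed the agent configured above.

```shell
# Sketch: append a few numbered test lines to a log file.
log_file="${LOG_FILE:-$(mktemp)}"   # hypothetical variable; defaults to a temp file
count=3                             # number of lines to write (arbitrary)

i=1
while [ "$i" -le "$count" ]; do
  # Each appended line should become one Flume event when tail -F follows this file.
  echo "flume event $i" >> "$log_file"
  i=$((i + 1))
done

echo "wrote $count lines to $log_file"
```

For example: LOG_FILE=/hadoop/module/text.log sh write-lines.sh, where write-lines.sh is whatever name you save the snippet under.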

Output in the original terminal:

2019-09-05 15:38:29,208 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.source.ExecSource.start(ExecSource.java:170)] Exec source starting with command: tail -F /hadoop/module/text.log
2019-09-05 15:38:29,209 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.register(MonitoredCounterGroup.java:119)] Monitored counter group for type: SOURCE, name: r1: Successfully registered new MBean.
2019-09-05 15:38:29,209 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.start(MonitoredCounterGroup.java:95)] Component type: SOURCE, name: r1 started
2019-09-05 15:38:33,213 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:95)] Event: { headers:{} body: 66 6C 75 6D 65                                  flume }
2019-09-05 15:38:33,213 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:95)] Event: { headers:{} body: 66 6C 75 6D 65                                  flume }
2019-09-05 15:38:33,213 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:95)] Event: { headers:{} body: 66 6C 75 6D 65                                  flume }
2019-09-05 15:38:35,263 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:95)] Event: { headers:{} body: 66 6C 75 6D 65                                  flume }
2019-09-05 15:39:05,265 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:95)] Event: { headers:{} body: 68 65 6C 6C 6F 20 6C 69 6E 79 6F 75 79 69       hello linyouyi }

Using environment variables in configuration files

http://flume.apache.org/FlumeUserGuide.html
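The linked user guide describes substituting environment variables into property values. The following is a minimal sketch, assuming the behaviour as that guide describes it: substitution applies to values only (not keys) and is switched on by setting the Java system property propertiesImplementation to org.apache.flume.node.EnvVarResolverProperties. The variable name NC_PORT and the temporary file location are illustrative; verify the details against the guide for your Flume version. The command is printed rather than executed.

```shell
# Sketch: a netcat agent whose port is read from the NC_PORT environment variable.
conf_file="$(mktemp)"   # hypothetical location; normally a file under conf/
cat > "$conf_file" <<'EOF'
a1.sources = r1
a1.sinks = k1
a1.channels = c1
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
# The value is resolved from the environment when the agent starts
a1.sources.r1.port = ${NC_PORT}
a1.sinks.k1.type = logger
a1.channels.c1.type = memory
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
EOF

# Set the variable for this invocation and enable the environment-aware resolver.
cmd="NC_PORT=44444 bin/flume-ng agent --conf conf --conf-file $conf_file --name a1 -Dflume.root.logger=INFO,console -DpropertiesImplementation=org.apache.flume.node.EnvVarResolverProperties"
echo "$cmd"
```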

Original article: https://www.cnblogs.com/linyouyi/p/11466171.html