2. Kafka messages to HDFS

Goal: deliver Kafka messages to HDFS via Flume.

  Logs reported on a given day may include entries generated on earlier days, but everything reported that day is written under that day's directory (year/month/day).

1. Flume configuration on s102

  kafka_hdfs.txt

# One source, one channel, one sink
a1.sources = r1
a1.channels = c1
a1.sinks = k1

# Kafka source: consume topic raw-logs from the broker on s102
a1.sources.r1.type = org.apache.flume.source.kafka.KafkaSource
a1.sources.r1.batchSize = 5000
a1.sources.r1.batchDurationMillis = 2000
a1.sources.r1.kafka.bootstrap.servers = s102:9092
a1.sources.r1.kafka.topics = raw-logs
a1.sources.r1.kafka.consumer.group.id = g10

# Memory channel. transactionCapacity must be at least the source
# batchSize (5000); the defaults of 100 would reject each put batch.
a1.channels.c1.type = memory
a1.channels.c1.capacity = 10000
a1.channels.c1.transactionCapacity = 5000

# HDFS sink: write plain-text files into per-minute directories,
# rolling a file every 30 s, 10 KB, or 500 events, whichever comes first
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = /user/centos/umeng/raw-logs/%Y%m/%d/%H%M
a1.sinks.k1.hdfs.filePrefix = events-
a1.sinks.k1.hdfs.round = true
a1.sinks.k1.hdfs.roundValue = 1
a1.sinks.k1.hdfs.roundUnit = minute
a1.sinks.k1.hdfs.rollInterval = 30
a1.sinks.k1.hdfs.rollSize = 10240
a1.sinks.k1.hdfs.rollCount = 500
# Resolve the %Y%m/%d/%H%M escapes from the local clock instead of
# requiring a timestamp header on each event
a1.sinks.k1.hdfs.useLocalTimeStamp = true
a1.sinks.k1.hdfs.fileType = DataStream

# Bind source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
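
With these settings, an event consumed at, say, 09:31 on 2018-10-15 lands under /user/centos/umeng/raw-logs/201810/15/0931/. Once the agent is running (step 3 below), a quick way to feed it is Kafka's console producer; a minimal sketch, assuming Kafka's bin directory is on the PATH and a pre-2.5 Kafka release (newer releases use --bootstrap-server instead of --broker-list):

kafka-console-producer.sh --broker-list s102:9092 --topic raw-logs

Type a few test lines, then press Ctrl+C; each line becomes one event in the raw-logs topic.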

2. Prepare the HDFS directory

hdfs dfs -mkdir -p /user/centos/umeng/raw-logs

3. Start Flume
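
The agent can be launched with the standard flume-ng command; a minimal sketch, assuming kafka_hdfs.txt sits in the current directory and $FLUME_HOME points at the Flume install:

flume-ng agent -n a1 -c $FLUME_HOME/conf -f kafka_hdfs.txt -Dflume.root.logger=INFO,console

The -n value must match the agent name used in the configuration (a1).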

4. Check HDFS
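
List the target tree recursively to confirm files are being rolled; the per-minute directory names will vary with the current time:

hdfs dfs -ls -R /user/centos/umeng/raw-logs
hdfs dfs -cat /user/centos/umeng/raw-logs/*/*/*/events-*

Files still being written carry a .tmp suffix until the sink rolls them.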

Original article: https://www.cnblogs.com/lybpy/p/9872547.html