Kafka Installation

I. Download Kafka:

http://kafka.apache.org/downloads

II. Extract the archive

tar -zxvf kafka_2.10-0.10.0.1.tgz 

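III. Kafka requires ZooKeeper, which can be either a single node or a ZooKeeper cluster.

(1) Single-node ZooKeeper

Kafka ships with a bundled ZooKeeper for testing, and you can use that bundled node to try things out.

1. Start the single-node ZooKeeper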

bin/zookeeper-server-start.sh config/zookeeper.properties

2. Start the Kafka server:

bin/kafka-server-start.sh config/server.properties

3. Create a topic

bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test

4. Start a producer, which generates data and sends it to Kafka

bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test

5. Start a consumer, which receives the data that the producer generated and Kafka delivered

bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic test --from-beginning

Type some text in the producer console and it will show up in the consumer console.

[hadoop@master2 kafka_2.10-0.10.0.1]$ bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic test --from-beginning
java
This is a message
This is another message

(2) ZooKeeper cluster mode:

1. Edit config/server.properties (vi config/server.properties): change the ZooKeeper address to the ZooKeeper cluster nodes and set the Kafka data directory

#zookeeper.connect=localhost:2181
zookeeper.connect=node1:2181,node2:2181,node3:2181

# Kafka data directory
# A comma separated list of directories under which to store log files
log.dirs=/data/kafka_2.10-0.10.0.1/message-folder

delete.topic.enable=true
# Set host.name; otherwise you may get an org.apache.kafka.common.errors.TimeoutException error
# https://blog.csdn.net/lifuxiangcaohui/article/details/73350940
host.name=192.168.232.128
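
The edits to server.properties above can also be scripted. The following is a minimal sketch, not part of the original setup: set_prop is a hypothetical helper name, and it assumes settings are written as key=value with no spaces around the equals sign. It replaces an active setting in place, leaves commented-out lines alone, and appends the key if it is missing.

```shell
# set_prop FILE KEY VALUE   (hypothetical helper, not shipped with Kafka)
# Rewrite the line "KEY=..." in FILE, or append "KEY=VALUE" if KEY is absent.
# Lines such as "#KEY=..." are not touched because their first field differs.
set_prop() {
    awk -F= -v k="$2" -v v="$3" '
        $1 == k { print k "=" v; found = 1; next }   # replace the active setting
        { print }                                    # copy every other line unchanged
        END { if (!found) print k "=" v }            # append when the key was missing
    ' "$1" > "$1.tmp" && mv "$1.tmp" "$1"
}

# Example (paths and hosts taken from the text above):
#   set_prop config/server.properties zookeeper.connect node1:2181,node2:2181,node3:2181
#   set_prop config/server.properties log.dirs /data/kafka_2.10-0.10.0.1/message-folder
```

Keeping a copy of the original server.properties before running such a helper makes it easy to compare the result.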

  

2. Start the ZooKeeper cluster

3. Start Kafka with the modified server.properties file

 bin/kafka-server-start.sh config/server.properties 

Or run it in the background:

nohup bin/kafka-server-start.sh config/server.properties > kafka_run.log 2>&1 &

Startup log:

[hadoop@master2 kafka_2.10-0.10.0.1]$ bin/kafka-server-start.sh config/server.properties
[2016-10-09 01:21:38,298] INFO KafkaConfig values: 
        request.timeout.ms = 30000
        log.roll.hours = 168
        inter.broker.protocol.version = 0.10.0-IV1
        log.preallocate = false
        security.inter.broker.protocol = PLAINTEXT
.......
 (kafka.server.KafkaConfig)
[2016-10-09 01:21:38,373] INFO starting (kafka.server.KafkaServer)
[2016-10-09 01:21:38,383] INFO Connecting to zookeeper on node1:2181,node2:2181,node3:2181 (kafka.server.KafkaServer)
[2016-10-09 01:21:38,414] INFO Client environment:zookeeper.version=3.4.6-1569965, built on 02/20/2014 09:09 GMT (org.apache.zookeeper.ZooKeeper)
[2016-10-09 01:21:38,414] INFO Client environment:host.name=master2 (org.apache.zookeeper.ZooKeeper)
[2016-10-09 01:21:38,415] INFO Client environment:java.version=1.7.0_79 (org.apache.zookeeper.ZooKeeper)
[2016-10-09 01:21:38,428] INFO Client environment:java.vendor=Oracle Corporation (org.apache.zookeeper.ZooKeeper)
[2016-10-09 01:21:38,428] INFO Client environment:java.home=/data/jdk1.7.0_79/jre (org.apache.zookeeper.ZooKeeper)
[2016-10-09 01:21:38,429] INFO Client environment:java.class.path=:/data/kafka_2.10-0.10.0.1/bin/../libs/aopalliance-repackaged-2.4.0-b34.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/argparse4j-0.5.0.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/connect-api-0.10.0.1.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/connect-file-0.10.0.1.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/connect-json-0.10.0.1.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/connect-runtime-0.10.0.1.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/guava-18.0.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/hk2-api-2.4.0-b34.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/hk2-locator-2.4.0-b34.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/hk2-utils-2.4.0-b34.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/jackson-annotations-2.6.0.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/jackson-core-2.6.3.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/jackson-databind-2.6.3.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/jackson-jaxrs-base-2.6.3.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/jackson-jaxrs-json-provider-2.6.3.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/jackson-module-jaxb-annotations-2.6.3.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/javassist-3.18.2-GA.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/javax.annotation-api-1.2.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/javax.inject-1.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/javax.inject-2.4.0-b34.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/javax.servlet-api-3.1.0.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/javax.ws.rs-api-2.0.1.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/jersey-client-2.22.2.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/jersey-common-2.22.2.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/jersey-container-servlet-2.22.2.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/jersey-container-servlet-core-2.22.2.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/jersey-guava-2.22.2.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/jersey-media-jaxb-2.22.2.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/jersey-server-2.22.2.jar:/data/kafka_2.10-0.10.0.1/bin/../
libs/jetty-continuation-9.2.15.v20160210.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/jetty-http-9.2.15.v20160210.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/jetty-io-9.2.15.v20160210.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/jetty-security-9.2.15.v20160210.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/jetty-server-9.2.15.v20160210.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/jetty-servlet-9.2.15.v20160210.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/jetty-servlets-9.2.15.v20160210.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/jetty-util-9.2.15.v20160210.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/jopt-simple-4.9.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/kafka_2.10-0.10.0.1.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/kafka_2.10-0.10.0.1-sources.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/kafka_2.10-0.10.0.1-test-sources.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/kafka-clients-0.10.0.1.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/kafka-log4j-appender-0.10.0.1.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/kafka-streams-0.10.0.1.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/kafka-streams-examples-0.10.0.1.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/kafka-tools-0.10.0.1.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/log4j-1.2.17.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/lz4-1.3.0.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/metrics-core-2.2.0.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/osgi-resource-locator-1.0.1.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/reflections-0.9.10.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/rocksdbjni-4.8.0.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/scala-library-2.10.6.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/slf4j-api-1.7.21.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/slf4j-log4j12-1.7.21.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/snappy-java-1.1.2.6.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/validation-api-1.1.0.Final.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/zkclient-0.8.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/zookeeper-3.4.6.jar (org.apache.zookeeper.ZooKeeper)
[2016-10-09 01:21:38,430] INFO Client environment:java.library.path=/usr/java/packages/lib/i386:/lib:/usr/lib (org.apache.zookeeper.ZooKeeper)
[2016-10-09 01:21:38,430] INFO Client environment:java.io.tmpdir=/tmp (org.apache.zookeeper.ZooKeeper)
[2016-10-09 01:21:38,430] INFO Client environment:java.compiler=<NA> (org.apache.zookeeper.ZooKeeper)
[2016-10-09 01:21:38,430] INFO Client environment:os.name=Linux (org.apache.zookeeper.ZooKeeper)
[2016-10-09 01:21:38,431] INFO Client environment:os.arch=i386 (org.apache.zookeeper.ZooKeeper)
[2016-10-09 01:21:38,431] INFO Client environment:os.version=2.6.18-92.el5 (org.apache.zookeeper.ZooKeeper)
[2016-10-09 01:21:38,431] INFO Client environment:user.name=hadoop (org.apache.zookeeper.ZooKeeper)
[2016-10-09 01:21:38,431] INFO Client environment:user.home=/home/hadoop (org.apache.zookeeper.ZooKeeper)
[2016-10-09 01:21:38,431] INFO Client environment:user.dir=/data/kafka_2.10-0.10.0.1 (org.apache.zookeeper.ZooKeeper)
[2016-10-09 01:21:38,433] INFO Starting ZkClient event thread. (org.I0Itec.zkclient.ZkEventThread)
...........
[2016-10-09 01:21:39,870] INFO Kafka commitId : a7a17cdec9eaa6c5 (org.apache.kafka.common.utils.AppInfoParser)
[2016-10-09 01:21:39,872] INFO [Kafka Server 0], started (kafka.server.KafkaServer)

4. Create a topic

bin/kafka-topics.sh --create --zookeeper node1:2181,node2:2181,node3:2181 --replication-factor 1 --partitions 1 --topic test

If the topic already exists, an error is reported:

[2016-10-09 01:23:35,106] ERROR kafka.common.TopicExistsException: Topic "test" already exists.
        at kafka.admin.AdminUtils$.createOrUpdateTopicPartitionAssignmentPathInZK(AdminUtils.scala:420)
        at kafka.admin.AdminUtils$.createTopic(AdminUtils.scala:404)
        at kafka.admin.TopicCommand$.createTopic(TopicCommand.scala:110)
        at kafka.admin.TopicCommand$.main(TopicCommand.scala:61)
        at kafka.admin.TopicCommand.main(TopicCommand.scala)
 (kafka.admin.TopicCommand$)

5. List the topics that have been created

[hadoop@master2 kafka_2.10-0.10.0.1]$ bin/kafka-topics.sh --list --zookeeper node1:2181,node2:2181,node3:2181
test

6. Start a data producer

bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test

7. Start a data consumer

bin/kafka-console-consumer.sh --zookeeper node1:2181,node2:2181,node3:2181 --topic test --from-beginning

Test:

Type data in the producer console.

The corresponding data appears in the consumer console:

[hadoop@master2 kafka_2.10-0.10.0.1]$ bin/kafka-console-consumer.sh --zookeeper node1:2181,node2:2181,node3:2181 --topic test --from-beginning
java
This is a message
This is another message

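IV. Install a Kafka cluster

I installed two Kafka nodes on two machines.

1. Copy Kafka to the other machines.

2. On each node, edit config/server.properties and change broker.id to a different number. It must be a non-negative integer and must not duplicate another node's ID.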

broker.id=2

3. Start Kafka on each node

bin/kafka-server-start.sh config/server.properties

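4. If the Kafka data directory configured in server.properties (the log.dirs setting) already contains metadata, either empty that directory, or change broker.id in the meta.properties file under the data directory so that it matches broker.id in server.properties. Otherwise startup may fail with the following error.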

[2016-10-12 00:09:10,898] FATAL Fatal error during KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer)
kafka.common.InconsistentBrokerIdException: Configured broker.id 1 doesn't match stored broker.id 0 in meta.properties. If you moved your data, make sure your configured broker.id matches. If you intend to create a new broker, you should remove all data in your data directories (log.dirs).
	at kafka.server.KafkaServer.getBrokerId(KafkaServer.scala:648)
	at kafka.server.KafkaServer.startup(KafkaServer.scala:187)
	at kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:37)
	at kafka.Kafka$.main(Kafka.scala:67)
	at kafka.Kafka.main(Kafka.scala)
[2016-10-12 00:09:10,900] INFO shutting down (kafka.server.KafkaServer)
[2016-10-12 00:09:10,914] INFO Shutting down. (kafka.log.LogManager)
[2016-10-12 00:09:11,113] INFO Shutdown complete. (kafka.log.LogManager)
[2016-10-12 00:09:11,115] INFO Terminate ZkClient event thread. (org.I0Itec.zkclient.ZkEventThread)
[2016-10-12 00:09:11,136] INFO EventThread shut down (org.apache.zookeeper.ClientCnxn)
[2016-10-12 00:09:11,136] INFO Session: 0x257b7b394f70000 closed (org.apache.zookeeper.ZooKeeper)
[2016-10-12 00:09:11,140] INFO shut down completed (kafka.server.KafkaServer)
[2016-10-12 00:09:11,142] FATAL Fatal error during KafkaServerStartable startup. Prepare to shutdown (kafka.server.KafkaServerStartable)
kafka.common.InconsistentBrokerIdException: Configured broker.id 1 doesn't match stored broker.id 0 in meta.properties. If you moved your data, make sure your configured broker.id matches. If you intend to create a new broker, you should remove all data in your data directories (log.dirs).
	at kafka.server.KafkaServer.getBrokerId(KafkaServer.scala:648)
	at kafka.server.KafkaServer.startup(KafkaServer.scala:187)
	at kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:37)
	at kafka.Kafka$.main(Kafka.scala:67)
	at kafka.Kafka.main(Kafka.scala)
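
The mismatch behind this error can be checked before starting the broker. The following is a rough sketch under stated assumptions: get_prop and check_broker_id are hypothetical helper names, and both files are assumed to hold the setting as a plain broker.id=N line with no spaces around the equals sign.

```shell
# get_prop FILE KEY: print the value of KEY from a key=value properties file
get_prop() {
    awk -F= -v k="$2" '$1 == k { print substr($0, length(k) + 2); exit }' "$1"
}

# check_broker_id SERVER_PROPERTIES DATA_DIR   (hypothetical helper)
# Returns 0 when the IDs match or DATA_DIR has no meta.properties yet,
# and 1 when the configured and stored broker.id differ.
check_broker_id() {
    meta="$2/meta.properties"
    if [ ! -f "$meta" ]; then
        echo "no meta.properties yet, nothing to check"
        return 0
    fi
    configured=$(get_prop "$1" broker.id)
    stored=$(get_prop "$meta" broker.id)
    if [ "$configured" = "$stored" ]; then
        echo "ok: broker.id=$configured"
    else
        echo "mismatch: configured broker.id $configured, stored broker.id $stored"
        return 1
    fi
}

# Example (data directory taken from the log.dirs value used in this article):
#   check_broker_id config/server.properties /data/kafka_2.10-0.10.0.1/message-folder
```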

5. Create a topic on one of the machines:

bin/kafka-topics.sh --create --zookeeper node1:2181,node2:2181,node3:2181 --replication-factor 2 --partitions 2 --topic test-3

6. List the topics; the new topic has been created:

[hadoop@master1 kafka_2.10-0.10.0.1]$ bin/kafka-topics.sh --list --zookeeper node1:2181,node2:2181,node3:2181
test-3

Check the data directory; the partition folders now exist on both machines:

[hadoop@master2 message-folder]$ ll
total 24
-rw-rw-r-- 1 hadoop hadoop    4 Oct 12 00:51 cleaner-offset-checkpoint
-rw-rw-r-- 1 hadoop hadoop   54 Oct  9 20:55 meta.properties
-rw-rw-r-- 1 hadoop hadoop   26 Oct 12 00:52 recovery-point-offset-checkpoint
-rw-rw-r-- 1 hadoop hadoop   26 Oct 12 00:52 replication-offset-checkpoint
drwxrwxr-x 2 hadoop hadoop 4096 Oct 12 00:52 test-3-0
drwxrwxr-x 2 hadoop hadoop 4096 Oct 12 00:52 test-3-1

The Kafka cluster has been installed successfully.

V. Common server.properties settings:

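# Broker ID; must be a non-negative integer and unique within the cluster
broker.id=0
# Number of threads handling network requests
num.network.threads=2
# Number of I/O threads
num.io.threads=8
# Maximum number of requests queued for the I/O threads
queued.max.requests=500
# Socket send buffer size, in bytes
socket.send.buffer.bytes=1048576
# Socket receive buffer size, in bytes
socket.receive.buffer.bytes=1048576
# Maximum size of a socket request, in bytes
socket.request.max.bytes=104857600
# Kafka data directory; separate multiple directories with commas
log.dirs=/data/kafka_2.10-0.10.0.1/message-folder
# Default number of partitions per topic
num.partitions=2
# How long data is retained, in hours; the default is 7 days
log.retention.hours=168
# Maximum size of a log segment file, in bytes
log.segment.bytes=536870912
# How often log segments are checked against the deletion policy (log.retention.hours or log.retention.bytes), in milliseconds
log.retention.check.interval.ms=60000
# Whether to enable the log cleaner (log compaction)
log.cleaner.enable=false
# How long delete markers are kept in compacted logs, in milliseconds (86400000 = 1 day)
log.cleaner.delete.retention.ms=86400000
# ZooKeeper connection string; separate multiple hosts with commas
zookeeper.connect=localhost:2181
# ZooKeeper connection timeout, in milliseconds
zookeeper.connection.timeout.ms=1000000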

    

VI. Common commands:

(1) The kafka-topics.sh script

1. Script options

--alter                           Alter a topic's partition configuration, e.g. the number of partitions or the replica assignment
--config                          A configuration override for the topic
--create                          Create a topic
--delete                          Delete a topic
--delete-config                   Remove a configuration override from a topic
--describe                        Show topic details
--disable-rack-aware              Disable rack-aware replica assignment
--help                            Print the help text
--if-exists                       When altering or deleting a topic, act only if the topic exists
--if-not-exists                   When creating a topic, act only if the topic does not already exist
--list                            List all available topics
--partitions                      Set the number of partitions
--replica-assignment              A list of manual partition-to-broker assignments
--topic                           Set the topic name
--topics-with-overrides           When describing topics, only show topics that have overridden configs
--unavailable-partitions          When describing topics, only show partitions whose leader is not available
--under-replicated-partitions     When describing topics, only show under-replicated partitions
--zookeeper                       ZooKeeper connection string, in the form host:port,host:port

Examples:

1. Create a topic

Create a topic named test-1 with a replication factor of 1 and 1 partition.

bin/kafka-topics.sh --create --zookeeper node1:2181,node2:2181,node3:2181 --replication-factor 1 --partitions 1 --topic test-1

Note: the replication factor cannot exceed the number of brokers in the Kafka cluster, but the number of partitions can.

2. List topics

bin/kafka-topics.sh --list --zookeeper node1:2181,node2:2181,node3:2181

3. Delete a topic

bin/kafka-topics.sh --delete --zookeeper node1:2181,node2:2181,node3:2181 --topic test-3

List the topics again: the topic has not actually been deleted yet.

bin/kafka-topics.sh --list --zookeeper node1:2181,node2:2181,node3:2181

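The console shows: Topic test-3 is marked for deletion.

Solutions:

  A. Manual deletion:

  First delete the topic's data on every broker node: in the directory set by log.dirs in server.properties, remove the folders whose names start with the name of the topic to delete.

  Then delete the topic's data in ZooKeeper (run these in the ZooKeeper command-line client):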

  rmr /brokers/topics/{topic_name}
  rmr /admin/delete_topics/{topic_name}

  rmr /config/topics/{topic_name}

  B. Let Kafka delete the topic immediately by itself:

  Topic deletion must be switched on when the broker starts, i.e. add the following to server.properties:

delete.topic.enable=true
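
For the manual approach (A), matching folders purely by name prefix can catch too much: a topic named test shares its prefix with the test-3-0 folder of topic test-3. The following is a small sketch, assuming partition folders are named <topic>-<partition number> as in the directory listing earlier; topic_dirs is a hypothetical helper name. It only prints the folders, so the list can be reviewed before anything is removed.

```shell
# topic_dirs DATA_DIR TOPIC   (hypothetical helper)
# Print the partition folders of TOPIC under DATA_DIR, one per line.
# Only names of the form TOPIC-<digits> are accepted.
topic_dirs() {
    for d in "$1/$2"-*; do
        [ -d "$d" ] || continue          # skip files and an unmatched glob
        name=${d##*/}                    # folder name without the path
        suffix=${name#"$2"-}             # what follows "TOPIC-"
        case $suffix in
            ''|*[!0-9]*) continue ;;     # not a plain partition number
        esac
        printf '%s\n' "$d"
    done
}

# Example (data directory taken from the log.dirs value used in this article):
#   topic_dirs /data/kafka_2.10-0.10.0.1/message-folder test-3
```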

4. List all topics

bin/kafka-topics.sh --list --zookeeper node1:2181,node2:2181,node3:2181

5. Show the details of a topic

bin/kafka-topics.sh --describe --zookeeper node1:2181,node2:2181,node3:2181 --topic test

Result:

Topic:test	PartitionCount:2	ReplicationFactor:1	Configs:
	Topic: test	Partition: 0	Leader: 1002	Replicas: 1002	Isr: 1002
	Topic: test	Partition: 1	Leader: 1003	Replicas: 1003	Isr: 1003

6. Change the number of partitions of a topic

bin/kafka-topics.sh --zookeeper node1:2181,node2:2181,node3:2181 --alter --topic test --partitions 2

Result:

WARNING: If partitions are increased for a topic that has a key, the partition logic or ordering of the messages will be affected
Adding partitions succeeded!

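7. Add replicas

kafka-topics.sh cannot be used to add replicas.

Create a JSON file in which the partition field is the partition number and the replicas field lists the IDs of the brokers that should hold a replica. Use the describe command to look these up.

For example, given this output: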

bin/kafka-topics.sh --describe --zookeeper  node1:2181,node2:2181,node3:2181 --topic test
Topic:test	PartitionCount:2	ReplicationFactor:1	Configs:
	Topic: test	Partition: 0	Leader: 1022	Replicas: 1022	Isr: 1022
	Topic: test	Partition: 1	Leader: 1020	Replicas: 1020	Isr: 1020

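There are two partitions here, each with one replica. The following file, saved as add.json, assigns each partition to both brokers: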

{
    "version": 1,
    "partitions": [
        {
            "topic": "test",
            "partition": 0,
            "replicas": [
                1022,
                1020
            ]
        },
        {
            "topic": "test",
            "partition": 1,
            "replicas": [
                1022,
                1020
            ]
        }
    ]
}
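
For topics with many partitions, writing this file by hand gets tedious. The following is a minimal sketch of generating it; reassign_json is a hypothetical helper name, and it assumes every partition should get the same replica list, as in the file above. The output is compact JSON with the same fields.

```shell
# reassign_json TOPIC PARTITION_COUNT REPLICAS   (hypothetical helper)
# REPLICAS is a comma-separated list of broker IDs, e.g. 1022,1020.
# Prints a reassignment document covering partitions 0..PARTITION_COUNT-1.
reassign_json() {
    printf '{"version":1,"partitions":['
    p=0
    while [ "$p" -lt "$2" ]; do
        if [ "$p" -gt 0 ]; then
            printf ','                   # separator between partition entries
        fi
        printf '{"topic":"%s","partition":%d,"replicas":[%s]}' "$1" "$p" "$3"
        p=$((p + 1))
    done
    printf ']}\n'
}

# Example: the same content as the hand-written file for topic test
#   reassign_json test 2 1022,1020 > add.json
```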

Then run:

bin/kafka-reassign-partitions.sh --zookeeper node1:2181,node2:2181,node3:2181  --reassignment-json-file add.json --execute

Result:

bin/kafka-reassign-partitions.sh --zookeeper node1:2181,node2:2181,node3:2181  --reassignment-json-file add.json --execute

Current partition replica assignment

{"version":1,"partitions":[{"topic":"test","partition":1,"replicas":[1022,1020],"log_dirs":["any","any"]},{"topic":"test","partition":0,"replicas":[1022,1020],"log_dirs":["any","any"]}]}

Save this to use as the --reassignment-json-file option during rollback
Successfully started reassignment of partitions.

Check the topic details again:

 bin/kafka-topics.sh --describe --zookeeper  node1:2181,node2:2181,node3:2181 --topic test
Topic:test	PartitionCount:2	ReplicationFactor:2	Configs:
	Topic: test	Partition: 0	Leader: 1022	Replicas: 1022,1020	Isr: 1022,1020
	Topic: test	Partition: 1	Leader: 1020	Replicas: 1022,1020	Isr: 1020,1022

  

Reference: http://kafka.apache.org/quickstart

Original article: https://www.cnblogs.com/fillPv/p/5942651.html