Exploring Kafka High Availability

As is well known, a Kafka topic relies on its replication factor (--replication-factor) and its partition count (--partitions) to keep the service highly available.

 

Discovering the problem

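During recent operations work, however, we hit a surprise on a 3-broker Kafka cluster whose topics had 3 replicas and 3 partitions: a single broker went down and the whole cluster became unusable, with consumers unable to consume any messages. Why?

After a lot of digging, the culprit turned out to be Kafka's own internal topic, __consumer_offsets.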

 

Analyzing the problem

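In newer versions of Kafka, consumer offsets are stored in an internal Kafka topic named __consumer_offsets. Kafka creates this topic itself, so its replication factor comes from the broker configuration file rather than from anything passed on the command line, and that value is often 1.

Describing the topic typically shows something like this: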

./kafka-topics.sh --zookeeper localhost:2181 --describe --topic __consumer_offsets
Topic:__consumer_offsets  PartitionCount:50 ReplicationFactor:1 Configs:segment.bytes=104857600,cleanup.policy=compact,compression.type=producer
  Topic: __consumer_offsets Partition: 0  Leader: 3 Replicas: 3 Isr: 3
  Topic: __consumer_offsets Partition: 1  Leader: 1 Replicas: 1 Isr: 1
  Topic: __consumer_offsets Partition: 2  Leader: 2 Replicas: 2 Isr: 2
  Topic: __consumer_offsets Partition: 3  Leader: 3 Replicas: 3 Isr: 3
  Topic: __consumer_offsets Partition: 4  Leader: 1 Replicas: 1 Isr: 1
  Topic: __consumer_offsets Partition: 5  Leader: 2 Replicas: 2 Isr: 2
  Topic: __consumer_offsets Partition: 6  Leader: 3 Replicas: 3 Isr: 3
  Topic: __consumer_offsets Partition: 7  Leader: 1 Replicas: 1 Isr: 1
  Topic: __consumer_offsets Partition: 8  Leader: 2 Replicas: 2 Isr: 2
  Topic: __consumer_offsets Partition: 9  Leader: 3 Replicas: 3 Isr: 3
  Topic: __consumer_offsets Partition: 10 Leader: 1 Replicas: 1 Isr: 1
  Topic: __consumer_offsets Partition: 11 Leader: 2 Replicas: 2 Isr: 2
  Topic: __consumer_offsets Partition: 12 Leader: 3 Replicas: 3 Isr: 3
  Topic: __consumer_offsets Partition: 13 Leader: 1 Replicas: 1 Isr: 1
  Topic: __consumer_offsets Partition: 14 Leader: 2 Replicas: 2 Isr: 2
  Topic: __consumer_offsets Partition: 15 Leader: 3 Replicas: 3 Isr: 3
  Topic: __consumer_offsets Partition: 16 Leader: 1 Replicas: 1 Isr: 1
  Topic: __consumer_offsets Partition: 17 Leader: 2 Replicas: 2 Isr: 2
  Topic: __consumer_offsets Partition: 18 Leader: 3 Replicas: 3 Isr: 3
  Topic: __consumer_offsets Partition: 19 Leader: 1 Replicas: 1 Isr: 1
  Topic: __consumer_offsets Partition: 20 Leader: 2 Replicas: 2 Isr: 2
  Topic: __consumer_offsets Partition: 21 Leader: 3 Replicas: 3 Isr: 3
  Topic: __consumer_offsets Partition: 22 Leader: 1 Replicas: 1 Isr: 1
  Topic: __consumer_offsets Partition: 23 Leader: 2 Replicas: 2 Isr: 2
  Topic: __consumer_offsets Partition: 24 Leader: 3 Replicas: 3 Isr: 3
  Topic: __consumer_offsets Partition: 25 Leader: 1 Replicas: 1 Isr: 1
  Topic: __consumer_offsets Partition: 26 Leader: 2 Replicas: 2 Isr: 2
  Topic: __consumer_offsets Partition: 27 Leader: 3 Replicas: 3 Isr: 3
  Topic: __consumer_offsets Partition: 28 Leader: 1 Replicas: 1 Isr: 1
  Topic: __consumer_offsets Partition: 29 Leader: 2 Replicas: 2 Isr: 2
  Topic: __consumer_offsets Partition: 30 Leader: 3 Replicas: 3 Isr: 3
  Topic: __consumer_offsets Partition: 31 Leader: 1 Replicas: 1 Isr: 1
  Topic: __consumer_offsets Partition: 32 Leader: 2 Replicas: 2 Isr: 2
  Topic: __consumer_offsets Partition: 33 Leader: 3 Replicas: 3 Isr: 3
  Topic: __consumer_offsets Partition: 34 Leader: 1 Replicas: 1 Isr: 1
  Topic: __consumer_offsets Partition: 35 Leader: 2 Replicas: 2 Isr: 2
  Topic: __consumer_offsets Partition: 36 Leader: 3 Replicas: 3 Isr: 3
  Topic: __consumer_offsets Partition: 37 Leader: 1 Replicas: 1 Isr: 1
  Topic: __consumer_offsets Partition: 38 Leader: 2 Replicas: 2 Isr: 2
  Topic: __consumer_offsets Partition: 39 Leader: 3 Replicas: 3 Isr: 3
  Topic: __consumer_offsets Partition: 40 Leader: 1 Replicas: 1 Isr: 1
  Topic: __consumer_offsets Partition: 41 Leader: 2 Replicas: 2 Isr: 2
  Topic: __consumer_offsets Partition: 42 Leader: 3 Replicas: 3 Isr: 3
  Topic: __consumer_offsets Partition: 43 Leader: 1 Replicas: 1 Isr: 1
  Topic: __consumer_offsets Partition: 44 Leader: 2 Replicas: 2 Isr: 2
  Topic: __consumer_offsets Partition: 45 Leader: 3 Replicas: 3 Isr: 3
  Topic: __consumer_offsets Partition: 46 Leader: 1 Replicas: 1 Isr: 1
  Topic: __consumer_offsets Partition: 47 Leader: 2 Replicas: 2 Isr: 2
  Topic: __consumer_offsets Partition: 48 Leader: 3 Replicas: 3 Isr: 3
  Topic: __consumer_offsets Partition: 49 Leader: 1 Replicas: 1 Isr: 1

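50 partitions, each with a single replica.

The 50 partitions are spread across the 3 brokers. If any one broker goes down, every partition hosted on it becomes unavailable. Consumers whose offsets are stored in those partitions can no longer look up their committed offsets, so they cannot consume normally.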
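To make the dependency concrete, here is a minimal sketch of how a consumer group ends up tied to one partition of __consumer_offsets. It assumes the broker chooses the partition as abs(groupId.hashCode()) % 50, which is my understanding of Kafka's group coordinator and is not stated in the analysis above. The group name "order-service" is hypothetical, and broker_for only encodes the single-replica layout from the describe output shown earlier.

```python
def java_string_hash(s: str) -> int:
    """Reproduce Java's String.hashCode() as a signed 32-bit integer."""
    h = 0
    # Java hashes UTF-16 code units, so encode first to cover non-BMP characters.
    data = s.encode("utf-16-be")
    for i in range(0, len(data), 2):
        unit = (data[i] << 8) | data[i + 1]
        h = (31 * h + unit) & 0xFFFFFFFF
    return h - 0x100000000 if h >= 0x80000000 else h


def offsets_partition_for(group_id: str, num_partitions: int = 50) -> int:
    """Partition of __consumer_offsets assumed to hold this group's offsets."""
    h = java_string_hash(group_id)
    # Assumption: mirrors Kafka's Utils.abs, where Integer.MIN_VALUE maps to 0.
    positive = 0 if h == -0x80000000 else abs(h)
    return positive % num_partitions


def broker_for(partition: int) -> int:
    """Only replica of a partition in the layout above (leaders cycle 3, 1, 2)."""
    return (3, 1, 2)[partition % 3]


if __name__ == "__main__":
    group = "order-service"  # hypothetical consumer group name
    p = offsets_partition_for(group)
    print(f"group {group!r} -> partition {p} -> broker {broker_for(p)}")
```

With a replication factor of 1, losing the broker printed here is enough to stop that group from reading or committing offsets, even though its data topics are fully replicated.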

 

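Solving the problem

Method 1

Because Kafka is already up and serving traffic, the replication factor has to be changed on the live cluster:

First prepare a JSON file that describes the planned replica assignment for each partition: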

vim /data/vfan/consumer.json

{
    "version": 1, 
    "partitions": [
        {
            "topic": "__consumer_offsets", 
            "partition": 0, 
            "replicas": [
                1, 
                2, 
                3
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 1, 
            "replicas": [
                2, 
                1, 
                3
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 2, 
            "replicas": [
                3, 
                2, 
                1
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 3, 
            "replicas": [
                1, 
                2, 
                3
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 4, 
            "replicas": [
                2, 
                1, 
                3
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 5, 
            "replicas": [
                3, 
                2, 
                1
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 6, 
            "replicas": [
                1, 
                2, 
                3
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 7, 
            "replicas": [
                2, 
                1, 
                3
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 8, 
            "replicas": [
                3, 
                2, 
                1
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 9, 
            "replicas": [
                1, 
                2, 
                3
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 10, 
            "replicas": [
                2, 
                1, 
                3
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 11, 
            "replicas": [
                3, 
                1, 
                2
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 12, 
            "replicas": [
                2, 
                1, 
                3
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 13, 
            "replicas": [
                1, 
                2, 
                3
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 14, 
            "replicas": [
                1, 
                2, 
                3
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 15, 
            "replicas": [
                2, 
                1, 
                3
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 16, 
            "replicas": [
                3, 
                2, 
                1
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 17, 
            "replicas": [
                1, 
                2, 
                3
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 18, 
            "replicas": [
                2, 
                1, 
                3
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 19, 
            "replicas": [
                3, 
                2, 
                1
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 20, 
            "replicas": [
                1, 
                2, 
                3
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 21, 
            "replicas": [
                2, 
                1, 
                3
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 22, 
            "replicas": [
                3, 
                2, 
                1
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 23, 
            "replicas": [
                1, 
                2, 
                3
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 24, 
            "replicas": [
                2, 
                1, 
                3
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 25, 
            "replicas": [
                3, 
                2, 
                1
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 26, 
            "replicas": [
                1, 
                2, 
                3
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 27, 
            "replicas": [
                2, 
                1, 
                3
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 28, 
            "replicas": [
                3, 
                2, 
                1
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 29, 
            "replicas": [
                1, 
                2, 
                3
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 30, 
            "replicas": [
                2, 
                1, 
                3
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 31, 
            "replicas": [
                3, 
                2, 
                1
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 32, 
            "replicas": [
                1, 
                2, 
                3
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 33, 
            "replicas": [
                2, 
                1, 
                3
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 34, 
            "replicas": [
                3, 
                2, 
                1
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 35, 
            "replicas": [
                1, 
                2, 
                3
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 36, 
            "replicas": [
                2, 
                1, 
                3
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 37, 
            "replicas": [
                3, 
                2, 
                1
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 38, 
            "replicas": [
                1, 
                2, 
                3
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 39, 
            "replicas": [
                2, 
                1, 
                3
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 40, 
            "replicas": [
                3, 
                2, 
                1
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 41, 
            "replicas": [
                1, 
                2, 
                3
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 42, 
            "replicas": [
                2, 
                1, 
                3
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 43, 
            "replicas": [
                3, 
                2, 
                1
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 44, 
            "replicas": [
                1, 
                2, 
                3
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 45, 
            "replicas": [
                2, 
                1, 
                3
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 46, 
            "replicas": [
                3, 
                2, 
                1
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 47, 
            "replicas": [
                1, 
                2, 
                3
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 48, 
            "replicas": [
                2, 
                1, 
                3
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 49, 
            "replicas": [
                3, 
                2, 
                1
            ]
        }
    ]
}
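The broker IDs in each replicas list can be chosen freely, but the same broker must not appear twice within one list. The first broker in each list is the preferred leader, so varying the order spreads leadership across the brokers.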
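Writing 50 entries by hand is tedious and error-prone, so the file can also be generated. The following is a minimal sketch, assuming broker IDs 1, 2 and 3 as in this cluster; the function name build_reassignment is my own. It rotates the broker list per partition, which gives a slightly different (but equally valid) layout from the hand-written file above.

```python
import json


def build_reassignment(topic, num_partitions, brokers):
    """Build a reassignment plan that places every partition on all brokers."""
    n = len(brokers)
    partitions = []
    for p in range(num_partitions):
        # Rotate the broker list so the first entry (the preferred leader)
        # cycles evenly through the brokers.
        start = p % n
        replicas = brokers[start:] + brokers[:start]
        partitions.append({"topic": topic, "partition": p, "replicas": replicas})
    return {"version": 1, "partitions": partitions}


if __name__ == "__main__":
    plan = build_reassignment("__consumer_offsets", 50, [1, 2, 3])
    print(json.dumps(plan, indent=4))
```

Redirecting the output to /data/vfan/consumer.json yields a file in the same format as the one above.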

 

Execute the reassignment:

./kafka-reassign-partitions.sh --zookeeper localhost:2181 --reassignment-json-file /data/vfan/consumer.json --execute

Verify that the reassignment has completed:

./kafka-reassign-partitions.sh --zookeeper localhost:2181 --reassignment-json-file /data/vfan/consumer.json --verify

Check the result:

./kafka-topics.sh --zookeeper localhost:2181 --describe --topic __consumer_offsets
Topic:__consumer_offsets  PartitionCount:50 ReplicationFactor:3 Configs:segment.bytes=104857600,cleanup.policy=compact,compression.type=producer
  Topic: __consumer_offsets Partition: 0  Leader: 1 Replicas: 1,2,3 Isr: 3,2,1
  Topic: __consumer_offsets Partition: 1  Leader: 2 Replicas: 2,1,3 Isr: 1,2,3
  Topic: __consumer_offsets Partition: 2  Leader: 3 Replicas: 3,2,1 Isr: 2,3,1
  Topic: __consumer_offsets Partition: 3  Leader: 1 Replicas: 1,2,3 Isr: 3,1,2
  Topic: __consumer_offsets Partition: 4  Leader: 2 Replicas: 2,1,3 Isr: 1,2,3
  Topic: __consumer_offsets Partition: 5  Leader: 3 Replicas: 3,2,1 Isr: 2,3,1
  Topic: __consumer_offsets Partition: 6  Leader: 1 Replicas: 1,2,3 Isr: 3,2,1
  Topic: __consumer_offsets Partition: 7  Leader: 2 Replicas: 2,1,3 Isr: 1,2,3
  Topic: __consumer_offsets Partition: 8  Leader: 3 Replicas: 3,2,1 Isr: 2,1,3
  Topic: __consumer_offsets Partition: 9  Leader: 1 Replicas: 1,2,3 Isr: 3,1,2
  Topic: __consumer_offsets Partition: 10 Leader: 2 Replicas: 2,1,3 Isr: 1,2,3
  Topic: __consumer_offsets Partition: 11 Leader: 3 Replicas: 3,1,2 Isr: 2,3,1
  Topic: __consumer_offsets Partition: 12 Leader: 2 Replicas: 2,1,3 Isr: 3,2,1
  Topic: __consumer_offsets Partition: 13 Leader: 1 Replicas: 1,2,3 Isr: 1,2,3
  Topic: __consumer_offsets Partition: 14 Leader: 1 Replicas: 1,2,3 Isr: 2,3,1
  Topic: __consumer_offsets Partition: 15 Leader: 2 Replicas: 2,1,3 Isr: 3,1,2
  Topic: __consumer_offsets Partition: 16 Leader: 3 Replicas: 3,2,1 Isr: 1,2,3
  Topic: __consumer_offsets Partition: 17 Leader: 1 Replicas: 1,2,3 Isr: 2,3,1
  Topic: __consumer_offsets Partition: 18 Leader: 2 Replicas: 2,1,3 Isr: 3,1,2
  Topic: __consumer_offsets Partition: 19 Leader: 3 Replicas: 3,2,1 Isr: 1,3,2
  Topic: __consumer_offsets Partition: 20 Leader: 1 Replicas: 1,2,3 Isr: 2,3,1
  Topic: __consumer_offsets Partition: 21 Leader: 2 Replicas: 2,1,3 Isr: 3,1,2
  Topic: __consumer_offsets Partition: 22 Leader: 3 Replicas: 3,2,1 Isr: 1,2,3
  Topic: __consumer_offsets Partition: 23 Leader: 1 Replicas: 1,2,3 Isr: 2,1,3
  Topic: __consumer_offsets Partition: 24 Leader: 2 Replicas: 2,1,3 Isr: 3,1,2
  Topic: __consumer_offsets Partition: 25 Leader: 3 Replicas: 3,2,1 Isr: 1,2,3
  Topic: __consumer_offsets Partition: 26 Leader: 1 Replicas: 1,2,3 Isr: 2,3,1
  Topic: __consumer_offsets Partition: 27 Leader: 2 Replicas: 2,1,3 Isr: 3,1,2
  Topic: __consumer_offsets Partition: 28 Leader: 3 Replicas: 3,2,1 Isr: 1,2,3
  Topic: __consumer_offsets Partition: 29 Leader: 1 Replicas: 1,2,3 Isr: 2,3,1
  Topic: __consumer_offsets Partition: 30 Leader: 2 Replicas: 2,1,3 Isr: 3,1,2
  Topic: __consumer_offsets Partition: 31 Leader: 3 Replicas: 3,2,1 Isr: 1,2,3
  Topic: __consumer_offsets Partition: 32 Leader: 1 Replicas: 1,2,3 Isr: 2,3,1
  Topic: __consumer_offsets Partition: 33 Leader: 2 Replicas: 2,1,3 Isr: 3,1,2
  Topic: __consumer_offsets Partition: 34 Leader: 3 Replicas: 3,2,1 Isr: 1,2,3
  Topic: __consumer_offsets Partition: 35 Leader: 1 Replicas: 1,2,3 Isr: 2,1,3
  Topic: __consumer_offsets Partition: 36 Leader: 2 Replicas: 2,1,3 Isr: 3,1,2
  Topic: __consumer_offsets Partition: 37 Leader: 3 Replicas: 3,2,1 Isr: 1,3,2
  Topic: __consumer_offsets Partition: 38 Leader: 1 Replicas: 1,2,3 Isr: 2,3,1
  Topic: __consumer_offsets Partition: 39 Leader: 2 Replicas: 2,1,3 Isr: 3,2,1
  Topic: __consumer_offsets Partition: 40 Leader: 3 Replicas: 3,2,1 Isr: 1,2,3
  Topic: __consumer_offsets Partition: 41 Leader: 1 Replicas: 1,2,3 Isr: 2,1,3
  Topic: __consumer_offsets Partition: 42 Leader: 2 Replicas: 2,1,3 Isr: 3,1,2
  Topic: __consumer_offsets Partition: 43 Leader: 3 Replicas: 3,2,1 Isr: 1,2,3
  Topic: __consumer_offsets Partition: 44 Leader: 1 Replicas: 1,2,3 Isr: 2,3,1
  Topic: __consumer_offsets Partition: 45 Leader: 2 Replicas: 2,1,3 Isr: 3,2,1
  Topic: __consumer_offsets Partition: 46 Leader: 3 Replicas: 3,2,1 Isr: 1,2,3
  Topic: __consumer_offsets Partition: 47 Leader: 1 Replicas: 1,2,3 Isr: 2,1,3
  Topic: __consumer_offsets Partition: 48 Leader: 2 Replicas: 2,1,3 Isr: 3,2,1
  Topic: __consumer_offsets Partition: 49 Leader: 3 Replicas: 3,2,1 Isr: 1,2,3

Every partition now has three replicas spread across the three brokers, so the topic is highly available.
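Scanning 50 lines of describe output by eye is easy to get wrong. As a rough sketch, the check can be automated by parsing the text. The line format is taken from the output above; the function name under_replicated is my own, and the second and third partition lines in the sample are hypothetical, added only to show what a failure looks like.

```python
import re

# Matches the per-partition lines printed by kafka-topics.sh --describe.
PARTITION_LINE = re.compile(
    r"Partition:\s*(\d+)\s+Leader:\s*(\S+)\s+Replicas:\s*([\d,]+)\s+Isr:\s*([\d,]*)"
)


def under_replicated(describe_output, expected_replicas=3):
    """Return partitions with too few replicas or with replicas missing from the ISR."""
    problems = []
    for line in describe_output.splitlines():
        m = PARTITION_LINE.search(line)
        if not m:
            continue  # summary line or anything unrecognised
        partition = int(m.group(1))
        replicas = set(m.group(3).split(","))
        isr = set(filter(None, m.group(4).split(",")))
        if len(replicas) < expected_replicas or isr != replicas:
            problems.append(partition)
    return problems


if __name__ == "__main__":
    sample = """\
Topic:__consumer_offsets  PartitionCount:50 ReplicationFactor:3 Configs:segment.bytes=104857600
  Topic: __consumer_offsets Partition: 0  Leader: 1 Replicas: 1,2,3 Isr: 3,2,1
  Topic: __consumer_offsets Partition: 1  Leader: 3 Replicas: 3 Isr: 3
  Topic: __consumer_offsets Partition: 2  Leader: 3 Replicas: 3,2,1 Isr: 3,1
"""
    print(under_replicated(sample))
```

An empty list means every partition has the expected number of replicas and all of them are in sync.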

 

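Method 2

Alternatively, before the Kafka brokers are started for the first time (that is, before __consumer_offsets has been created), set the default partition and replica parameters for automatically created topics in the broker configuration file: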

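# Number of partitions for topics that are created automatically
num.partitions=3
# Replication factor for topics that are created automatically
default.replication.factor=3
# Replication factor of the internal __consumer_offsets topic (commonly set to 1 in sample configurations)
offsets.topic.replication.factor=3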
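Before starting the brokers it can be worth checking these values mechanically. The sketch below uses a deliberately simplified properties parser (plain key=value lines only) and a helper name, replication_warnings, that I made up for illustration. It flags replication factors that are missing, not plain integers, below the desired value, or larger than the number of brokers.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and full-line comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def replication_warnings(props, broker_count, wanted=3):
    """Return human-readable warnings about the replication-related settings."""
    warnings = []
    for key in ("default.replication.factor", "offsets.topic.replication.factor"):
        raw = props.get(key)
        if raw is None:
            warnings.append(f"{key}: not set, the broker default applies")
            continue
        try:
            value = int(raw)
        except ValueError:
            # Typically caused by a trailing comment on the same line.
            warnings.append(f"{key}: value {raw!r} is not a plain integer")
            continue
        if value < wanted:
            warnings.append(f"{key}: {value} is below the desired {wanted}")
        elif value > broker_count:
            warnings.append(f"{key}: {value} exceeds the {broker_count} brokers")
    return warnings


if __name__ == "__main__":
    sample = (
        "num.partitions=3\n"
        "default.replication.factor=3\n"
        "offsets.topic.replication.factor=1\n"
    )
    for warning in replication_warnings(parse_properties(sample), broker_count=3):
        print(warning)
```

The integer check matters because a properties file treats everything after the equals sign as the value, so a comment placed on the same line as a setting becomes part of that value.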

Once that is set, start ZooKeeper and Kafka, then test producing and consuming:

## Produce
./kafka-console-producer.sh --broker-list localhost:9092 --topic test

## Consume; --from-beginning reads the topic from the start
./kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test --from-beginning

 

Check the partition and replica counts of the automatically created topics:

## test
./kafka-topics.sh --zookeeper localhost:2181 --describe --topic test
Topic:test  PartitionCount:3  ReplicationFactor:3 Configs:

## __consumer_offsets
./kafka-topics.sh --zookeeper localhost:2181 --describe --topic __consumer_offsets
Topic:__consumer_offsets  PartitionCount:50 ReplicationFactor:3 Configs:segment.bytes=104857600,cleanup.policy=compact,compression.type=producer

The automatically created topics are now highly available as well.

Original article: https://www.cnblogs.com/v-fan/p/15353157.html