Kafka Consumer groups – partition assignment
Kafka provides horizontal scalability through partitions and consumers. Within a consumer group, a partition is assigned to exactly one consumer. You can add partitions to a topic to let more consumers work on it, but having more consumers than partitions is not useful because the extra consumers stay idle. Clients control how partitions are assigned to the available consumers by specifying an assignment strategy in the consumer config (partition.assignment.strategy). A few assignors ship with the Kafka client; clients can also customize the assignment process by implementing ConsumerPartitionAssignor. In this post we will see how some of these partition assignors work.
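As a rough illustration of where the strategy is set, here is a minimal sketch of a consumer configuration written as a Python dictionary. The broker address and group id are placeholder values, not taken from any real setup. The short strategy names (range, roundrobin, cooperative-sticky) are the ones accepted by librdkafka-based clients such as confluent-kafka; the Java client expects fully qualified class names instead, for example org.apache.kafka.clients.consumer.RoundRobinAssignor.

```python
# Sketch of a consumer configuration; all values are illustrative placeholders.
consumer_config = {
    "bootstrap.servers": "localhost:9092",   # assumed local broker
    "group.id": "example-group",             # hypothetical consumer group name
    # Strategy used by the group leader when it assigns partitions.
    "partition.assignment.strategy": "cooperative-sticky",
}

# With the confluent-kafka package installed, this dictionary would typically
# be passed to the consumer constructor:
#   from confluent_kafka import Consumer
#   consumer = Consumer(consumer_config)
```

All consumers in a group need a strategy in common; otherwise the group cannot agree on an assignor.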
Group Coordinator
One of the brokers in the Kafka cluster acts as the group coordinator for a consumer group. Different consumer groups may have different brokers as their group coordinator.
Group leader
The group coordinator chooses one of the consumers as the group leader (generally the first consumer to join the group). The group leader receives the list of consumers and their subscriptions and, based on this data, assigns partitions to the consumers.
Rebalancing event
The consumer group is a dynamic unit. The number of consumers may change over time (because more consumers are added or existing consumers are removed) and the number of partitions could also change. When these changes happen, a rebalance will occur to re-distribute the partitions among the available consumers (all the partitions that are assigned to dead consumers will be reclaimed and assigned to the ones that are still available in the group).
Partition assignors
- Range assignor (first entry in the default strategy list)
- Round Robin assignor
- Sticky Assignor
- Cooperative Sticky Assignor (second entry in the default strategy list)
Range Assignor
The range assignor works by breaking the partitions of a topic into (nearly) equal ranges and assigning one range to each consumer. Before assignment, partitions are sorted in numeric order and consumers in lexicographic order. Because of this sorting, the assignment is deterministic for a given set of inputs. The same process is repeated for every topic in the subscription.
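partitions_per_consumer = number_of_partitions / number_of_consumers (integer division)
If the partitions can’t be evenly distributed among the consumers, the first few consumers get one extra partition each. The number of consumers that get an extra partition is:
consumers_with_extra_partition = number_of_partitions % number_of_consumers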
If a consumer group is subscribed to multiple topics, the same-numbered partitions from each topic are colocated on the same consumer (assuming all the topics have the same number of partitions). This also means that adding consumers beyond the partition count of the largest topic does not increase parallelism. For example, consider two topics with three partitions each and six consumers, C0 to C5. Only the first three consumers get partitions; the remaining ones stay idle (C3, C4 and C5 get no assignments).
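The range logic described above can be sketched in a few lines of Python. This is a simplified model, not the Kafka client's code: it assumes every consumer subscribes to every topic, and the topic names T0 and T1 and the function name range_assign are made up for illustration.

```python
def range_assign(consumers, topics):
    """Simplified range assignment.

    consumers: list of consumer ids (all assumed subscribed to every topic)
    topics:    dict mapping topic name -> number of partitions
    """
    members = sorted(consumers)              # lexicographic order
    assignment = {c: [] for c in members}
    for topic in sorted(topics):             # same process repeated per topic
        per_consumer, extra = divmod(topics[topic], len(members))
        start = 0
        for i, member in enumerate(members):
            # The first `extra` consumers get one additional partition.
            count = per_consumer + (1 if i < extra else 0)
            assignment[member].extend(
                (topic, p) for p in range(start, start + count)
            )
            start += count
    return assignment


# Two topics with three partitions each, six consumers C0..C5.
result = range_assign([f"C{i}" for i in range(6)], {"T0": 3, "T1": 3})
for consumer in sorted(result):
    print(consumer, result[consumer])
```

In this run C0, C1 and C2 each receive the same-numbered partition of both topics, while C3, C4 and C5 receive nothing, which is the idle-consumer situation described above.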
Round robin assignor
This assignor lists all the partitions of all the topics that the consumers are subscribed to, then iterates over the partitions, assigning them to consumers one by one. If all the consumers in the group have the same subscription, the round robin assignor distributes the partitions uniformly across the consumers.
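A minimal Python sketch of this idea follows. It assumes all consumers have identical subscriptions (it does not model consumers that must be skipped because they are not subscribed to a topic), and the names round_robin_assign, T0 and T1 are illustrative only.

```python
from itertools import cycle


def round_robin_assign(consumers, topics):
    """Simplified round robin assignment for identical subscriptions.

    consumers: list of consumer ids
    topics:    dict mapping topic name -> number of partitions
    """
    members = sorted(consumers)
    assignment = {c: [] for c in members}
    # One flat list of every partition of every subscribed topic.
    all_partitions = [
        (topic, p) for topic in sorted(topics) for p in range(topics[topic])
    ]
    # Hand partitions out one by one, cycling through the consumers.
    for partition, member in zip(all_partitions, cycle(members)):
        assignment[member].append(partition)
    return assignment


# Two topics with three partitions each, six consumers C0..C5.
rr_result = round_robin_assign([f"C{i}" for i in range(6)], {"T0": 3, "T1": 3})
for consumer in sorted(rr_result):
    print(consumer, rr_result[consumer])
```

With two topics of three partitions each and six consumers, every consumer ends up with exactly one partition, so no consumer is idle in this setup.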
Sticky assignor
The sticky assignor tries to achieve the following goals, in this order:
- Distribute the partitions among the consumers as evenly as possible
- Minimize partition movement across consumers when a rebalance happens
When all the consumers are subscribed to the same topics, the sticky assignor produces a distribution similar to that of the round robin assignor. When subscriptions differ, the sticky assignor generally produces a more balanced assignment than the round robin assignor.
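To make the two goals concrete, here is a deliberately simplified Python sketch of a sticky-style reassignment for a single topic. It is not the algorithm used by the Kafka client: it only keeps existing ownership for surviving consumers, hands orphaned partitions to the least-loaded consumer, and then evens out the load. The function names sticky_reassign and moved are hypothetical, and the sketch assumes at least one consumer remains in the group.

```python
def sticky_reassign(previous, consumers, partitions):
    """Simplified sticky-style reassignment for one topic.

    previous:   dict consumer id -> partitions it owned before the rebalance
    consumers:  consumer ids currently in the group (must be non-empty)
    partitions: all partition numbers that currently exist
    """
    alive = sorted(consumers)
    valid = set(partitions)
    # Goal 2: surviving consumers keep what they already own.
    assignment = {c: [p for p in previous.get(c, []) if p in valid] for c in alive}
    owned = {p for ps in assignment.values() for p in ps}
    # Orphaned partitions go to whichever consumer has the fewest partitions.
    for p in sorted(valid - owned):
        target = min(alive, key=lambda c: (len(assignment[c]), c))
        assignment[target].append(p)
    # Goal 1: move partitions only while the load differs by more than one.
    while True:
        most = max(alive, key=lambda c: (len(assignment[c]), c))
        least = min(alive, key=lambda c: (len(assignment[c]), c))
        if len(assignment[most]) - len(assignment[least]) <= 1:
            break
        assignment[least].append(assignment[most].pop())
    return assignment


def moved(previous, new):
    """Count partitions taken away from consumers that stayed in the group."""
    return sum(
        1 for c, ps in previous.items() if c in new for p in ps if p not in new[c]
    )


before = {"C0": [0, 1], "C1": [2, 3], "C2": [4, 5]}
after = sticky_reassign(before, ["C0", "C2"], range(6))  # C1 left the group
print(after, "moved:", moved(before, after))
```

In this run only the partitions that belonged to the departed consumer C1 change owner; C0 and C2 keep everything they had before.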
The cooperative sticky assignor is a variation of the sticky assignor that supports cooperative rebalancing.