Unbalanced Partitions Assignment

Hi!
First of all, I’m quite new to Kafka.
I’m using Kafka 2.5.1 and Confluent 5.5.2.
I have 2 instances of a stream app, with 4 stream threads each.
I am using local state for performing aggregations.
There are several input topics (A, B, C). Their values are transformed into a specific “dto” and sent to topic D. Topic D is consumed and a KTable is created to aggregate the records. All topics have 24 partitions.
The problem I see is that, while the partitions of the other topics are evenly distributed among the instances and stream threads, the partitions of topic D (the one used to build the aggregation) are all assigned to the same instance and only distributed among its stream threads.
This is pretty unfortunate, as the aggregation is one of the most resource-intensive parts. I would expect topic D's partitions to be distributed among all the instances and stream threads.
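To make the expectation concrete, here is a minimal sketch of the arithmetic for 24 partitions, 2 instances and 4 stream threads each. The class and method names are made up for illustration, and the round-robin count is only a stand-in for an even spread, not the actual Kafka Streams assignor.

```java
import java.util.Arrays;

public class ExpectedTaskSpread {
    // Counts how many tasks each thread would get under a plain round-robin spread.
    // Illustrative only: this is NOT how the Kafka Streams assignor works internally.
    public static int[] tasksPerThread(int tasks, int threads) {
        int[] counts = new int[threads];
        for (int t = 0; t < tasks; t++) {
            counts[t % threads]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        int partitions = 24;        // partitions of topic D (one task per partition)
        int instances = 2;          // stream app instances
        int threadsPerInstance = 4; // stream threads per instance

        // What I would expect: tasks spread over all 8 threads.
        int[] expected = tasksPerThread(partitions, instances * threadsPerInstance);
        // What I observe: all tasks on the 4 threads of a single instance.
        int[] observed = tasksPerThread(partitions, threadsPerInstance);

        System.out.println("expected per thread: " + Arrays.toString(expected));
        System.out.println("observed per thread: " + Arrays.toString(observed));
    }
}
```

So an even spread would be 3 tasks per thread (12 per instance), whereas currently one instance runs 6 tasks per thread and the other runs none for topic D.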
I have read that the sticky assignor favors sending tasks to the same instances they were running on before, which seems quite logical. That said, I would expect it to realize that the task assignment is totally unbalanced.
All other topics' partitions are properly distributed/balanced.
Any suggestions?
Can someone point me to some reading or anything else that could help me address this?

Thanks a lot

Hi @polaco,

That is a known issue. I guess the instance to which all stateful tasks that read topic D are assigned is the one you started first. Due to the sticky assignment strategy, the tasks do not migrate away from that instance anymore. That was indeed a bit unfortunate.

I say “was” because we improved the situation in KIP-441, which is included in Apache Kafka 2.6 and CP 6.0.0. This KIP proposes to balance the load without large state migrations that delay processing. More specifically, a balanced assignment is computed, but the stateful tasks are not moved immediately to the new instances. First, we keep the stateful tasks on the instances where the state exists and build up the state on the new instances. Then, when the state on the new instances is ready, we migrate the stateful tasks to the new instances to balance the workload. This balancing is done over multiple rebalances.
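As a rough illustration, the following sketch shows the settings that, to my understanding, KIP-441 added to control this warm-up behavior. The class name, application id and bootstrap address are hypothetical, the values shown are what I believe the defaults to be, and plain string keys are used so the snippet does not depend on the Kafka libraries. Please verify the names and defaults against the documentation of the version you upgrade to.

```java
import java.util.Properties;

public class Kip441ConfigSketch {
    // Builds a Properties object that could be passed to a Kafka Streams app on 2.6+.
    // Values are assumptions for illustration, not tuning recommendations.
    public static Properties build() {
        Properties props = new Properties();
        props.setProperty("application.id", "aggregation-app");   // hypothetical name
        props.setProperty("bootstrap.servers", "localhost:9092"); // hypothetical address
        props.setProperty("num.stream.threads", "4");             // as in the question

        // Maximum lag (in offsets) at which an instance still counts as caught up
        // for a stateful task and may receive it directly.
        props.setProperty("acceptable.recovery.lag", "10000");
        // Maximum number of extra replicas that warm up state on new instances at a time.
        props.setProperty("max.warmup.replicas", "2");
        // How often a probing rebalance checks whether warm-up replicas have caught up.
        props.setProperty("probing.rebalance.interval.ms", "600000");
        return props;
    }

    public static void main(String[] args) {
        build().forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

Raising `max.warmup.replicas` should let more tasks warm up in parallel, at the cost of more restore traffic and disk usage during the transition.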

Note that this KIP improves the situation, but Kafka Streams does not give any guarantees about the task assignment distribution. Everything is done on a best-effort basis.

Hope KIP-441 solves your issue.

Best,
Bruno


Hi @Bruno,
Sorry for the late response; I had already read your reply back then. Thanks for your support.
I have read KIP-441, thanks.
Then, our best bet would be to upgrade.
Seems fine to me.

Thanks!
Fede
