I have a system that syncs multiple remote storage providers. I keep in a Kafka topic the file changes from one provider that need to be synced to another provider, for example a file creation or a file content change.
A situation can arise where one provider makes a lot of changes and floods the topic with messages, so the consumers are busy processing mostly that provider's messages and process very little from the other providers.
Ideally I would like a fair distribution: logically, one queue per provider, with consumers taking messages round-robin from each queue. That way all providers' messages would be processed fairly.
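To make the desired behaviour concrete, here is a minimal in-memory sketch of "one queue per provider, served round-robin". It is not tied to Kafka and is not a proposed solution; the names `FairScheduler`, `put` and `get` are made up for illustration.

```python
from collections import OrderedDict, deque


class FairScheduler:
    """Hypothetical helper: one FIFO queue per provider, served round-robin."""

    def __init__(self):
        # provider_id -> deque of pending messages; dict order is the turn order
        self._queues = OrderedDict()

    def put(self, provider_id, message):
        # Create the provider's queue on first use, then append the message.
        self._queues.setdefault(provider_id, deque()).append(message)

    def get(self):
        """Return (provider_id, message) for the provider whose turn it is, or None."""
        if not self._queues:
            return None
        provider_id, queue = next(iter(self._queues.items()))
        message = queue.popleft()
        if queue:
            # Provider still has work: send it to the back of the turn order.
            self._queues.move_to_end(provider_id)
        else:
            # Nothing left for this provider: drop its empty queue.
            del self._queues[provider_id]
        return provider_id, message
```

With this, a provider that enqueues thousands of messages only gets one message processed per turn, so a provider with a single pending change waits for at most one message from each of the other providers.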
Even if I choose the partition by hashing the provider, e.g. provider_id % num_partitions, a busy provider will still hold up all the other providers that land in the same partition.
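A small sketch of why the modulo scheme does not isolate providers, assuming integer provider ids; `partition_for` and `group_by_partition` are hypothetical helper names used only for this illustration.

```python
from collections import defaultdict


def partition_for(provider_id: int, num_partitions: int) -> int:
    # The partitioning rule from the question: provider_id % num_partitions.
    return provider_id % num_partitions


def group_by_partition(provider_ids, num_partitions):
    # Map each partition number to the list of providers assigned to it.
    groups = defaultdict(list)
    for provider_id in provider_ids:
        groups[partition_for(provider_id, num_partitions)].append(provider_id)
    return dict(groups)
```

For example, with 8 providers (ids 1 to 8) and 4 partitions, providers 1 and 5 share partition 1, so a burst from provider 1 sits in front of provider 5's messages in that partition.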