I have a Kafka instance with various topics. One topic in particular carries the most data and has the most consumers: about 100-200 messages per second at ~500 bytes each (so roughly 50-100 KB/s).
Once we go over a certain number of consumers (on the order of 20-30), we begin to see spikes in consumer lag: a message that was produced is not received by the consumer until several seconds later (even when producer and consumer are on the same machine).
It is a single broker, and the topic has a single partition with one producer. A couple of consumers need to process all the messages. The others are only interested in a subset of keys, but from what I’ve researched, you can’t consume a subset of messages by key; you need to know in advance which partition the key has been written to. However, I don’t know how to determine which partition a key is being written to at the time of creating the consumer, nor do I know if this would even solve the problem we are having.
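
To make the key-to-partition part concrete, here is a minimal sketch of how I understand the mapping to work. It assumes the producer is the Java client using its default partitioner with non-null keys, which as far as I know hashes the serialized key with 32-bit murmur2, clears the sign bit, and takes the result modulo the partition count. The function names `murmur2` and `partition_for_key` and the sample key are my own for illustration, and other clients or a custom partitioner may use a different hash.

```python
def murmur2(data: bytes) -> int:
    """32-bit MurmurHash2 using the seed the Kafka Java client uses (assumption)."""
    mask = 0xFFFFFFFF
    m = 0x5BD1E995
    length = len(data)
    h = (0x9747B28C ^ length) & mask

    # Mix the input four bytes at a time, read as little-endian integers.
    full = length - (length % 4)
    for i in range(0, full, 4):
        k = int.from_bytes(data[i:i + 4], "little")
        k = (k * m) & mask
        k ^= k >> 24
        k = (k * m) & mask
        h = (h * m) & mask
        h ^= k

    # Fold in the trailing 1-3 bytes, if any.
    tail = data[full:]
    if len(tail) == 3:
        h ^= tail[2] << 16
    if len(tail) >= 2:
        h ^= tail[1] << 8
    if len(tail) >= 1:
        h ^= tail[0]
        h = (h * m) & mask

    # Final avalanche.
    h ^= h >> 13
    h = (h * m) & mask
    h ^= h >> 15
    return h


def partition_for_key(key: bytes, num_partitions: int) -> int:
    """Partition a keyed record would land in under the assumed default partitioner."""
    # Clearing the sign bit mirrors the Java client's "to positive" step.
    return (murmur2(key) & 0x7FFFFFFF) % num_partitions


# Hypothetical key; with a single partition every key maps to partition 0.
print(partition_for_key(b"sensor-42", 1))
```

With the current single-partition topic this always returns 0, so every consumer still reads every message regardless of key. The mapping would only let a consumer skip data if the topic had more partitions and the consumer were assigned just the partitions its keys hash to.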