We are trying to build an accounting process that aggregates and counts messages based on certain headers. We currently download the whole message, discard the body, and keep only the headers and metadata; our analytics run on those. However, the messages are big and the client is Java, which creates a lot of load for the garbage collector and takes bandwidth from the brokers. We can see a spike in broker load when the system runs. We were running on an hourly basis and had to reduce to reading 15 minutes at a time. It would be great if the broker could send just the headers and metadata, because those are really small.
The whole record must be consumed off the wire; the Kafka protocol (which runs over TCP) has no way to fetch headers without the payload. You're not required to deserialize the key or value, though, so just use ByteArrayDeserializer for both.
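As a minimal sketch of that setup (these are standard Kafka consumer configuration property names; everything else about your pipeline is assumed), the consumer would be configured with:

```properties
# Hand the key and value through as raw byte[] with no decoding work
key.deserializer=org.apache.kafka.common.serialization.ByteArrayDeserializer
value.deserializer=org.apache.kafka.common.serialization.ByteArrayDeserializer
```

On each `ConsumerRecord` you would then read `record.headers()` for your accounting and simply never touch the raw value bytes, so nothing larger than the record batch itself is allocated on the application side.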
How big are your messages? If they're over 1 MB, it may be worth considering the Claim Check pattern.
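A rough illustration of the Claim Check idea (the blob store is simulated here with an in-memory map; in practice it would be S3, HDFS, or similar, and all names below are hypothetical): the producer parks the large body externally and publishes only a small reference, so the broker never carries the payload at all.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

public class ClaimCheckSketch {
    // Stand-in for an external blob store (S3, HDFS, a database, ...)
    static final Map<String, byte[]> blobStore = new HashMap<>();

    // Producer side: store the payload, get back a small "claim check"
    // which is what actually goes through the broker.
    static String checkIn(byte[] payload) {
        String claim = UUID.randomUUID().toString();
        blobStore.put(claim, payload);
        return claim;
    }

    // Consumer side: redeem the claim only if the body is actually needed.
    static byte[] checkOut(String claim) {
        return blobStore.get(claim);
    }

    public static void main(String[] args) {
        byte[] bigPayload = new byte[2 * 1024 * 1024]; // a 2 MB body
        String claim = checkIn(bigPayload);
        // The broker only ever sees the 36-character claim, not the 2 MB body.
        System.out.println("claim length: " + claim.length());
        System.out.println("payload size: " + checkOut(claim).length);
    }
}
```

An accounting consumer like yours would read only the claim (and any headers) and never call `checkOut` at all.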
That is what I am doing: I drop the body and keep only the headers. However, at a rate of 190K records per second that is a lot of messages, and Java makes it worse because of the GC, which uses a large chunk of the time slice.
The messages are a couple of kilobytes each, but they are coming in at a rate of 160K/sec. As mentioned in the other reply, the Java GC takes a good chunk of processing time.
What GC algorithm are you using? Have you tried changing it? Does your service need to be Java-based?
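For an allocation-heavy consumer like this, the collector choice can matter a lot. The flags below are standard HotSpot options; the heap sizes and jar name are placeholders, and which collector wins depends on your JDK version and workload, so treat this as a starting point for experiments rather than a recommendation:

```shell
# G1 (the default since JDK 9) with an explicit pause-time target
java -XX:+UseG1GC -XX:MaxGCPauseMillis=50 -Xms4g -Xmx4g -jar consumer.jar

# ZGC, a low-pause collector (production-ready since JDK 15)
java -XX:+UseZGC -Xms4g -Xmx4g -jar consumer.jar
```

Fixing `-Xms` equal to `-Xmx` avoids heap resizing during the bursty reads you described.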