Writing to Kafka at 150K messages per sec in an Azure environment

Hi All!

My team is working on a Java client that should receive tons of messages per second over a socket connection.
The issues that we have are:

  • The Kafka producer is not able to write the messages quickly enough, and after a few minutes the channel is overloaded.
  • Producer initial params are as follows:
    acks, 0
    batchSize, 128MB
    bufferMemory, 6GB
    linger.ms, 100
  • Message size is ca. 225 chars.
  • JVM: W
    exec java -XX:+UseParallelGC -XX:+AggressiveHeap -Xmx20250m -cp . -jar /deployment/app.jar
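
For anyone following along, the settings listed above map onto the standard producer configuration keys roughly as follows. This is only a sketch using plain java.util.Properties: the class name ProducerSettings is made up for illustration, and it assumes that MB and GB are meant as binary multiples. The resulting object would be handed to the KafkaProducer constructor together with the bootstrap servers and serializers, which are not given in the post.

```java
import java.util.Properties;

// Hypothetical helper, not part of the original application.
final class ProducerSettings {

    // Builds the four producer settings quoted in the post.
    static Properties build() {
        Properties props = new Properties();
        props.setProperty("acks", "0");                                               // fire and forget
        props.setProperty("batch.size", String.valueOf(128 * 1024 * 1024));           // 128 MB, assumed binary
        props.setProperty("buffer.memory", String.valueOf(6L * 1024 * 1024 * 1024));  // 6 GB, assumed binary
        props.setProperty("linger.ms", "100");
        return props;
    }

    public static void main(String[] args) {
        build().forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```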

Any suggestions how to optimize the performance?

Thanks in advance!
Vasil

My team is working on a Java client that should receive tons of messages per second over a socket connection.

You mean, your team is building a Kafka consumer application that is reading from a Kafka topic, powered by a self-managed Apache Kafka cluster you are running in Azure? Or what is the “receiving over socket connection” referring to?

  • The Kafka producer is not able to write the messages quickly enough, and after a few minutes the channel is overloaded.

Any more details? This is very little information to go on.

For example, Kafka producers can easily saturate network links in the cloud (example benchmark), so your problem might be related to the limits of your Azure environment rather than specifically to Kafka or some other software you are running.
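
To put a rough number on the network side, here is a back-of-the-envelope estimate of the payload bandwidth implied by the figures in this thread (150K messages per second, about 225 characters each). It is a sketch under assumptions: one byte per character, no protocol or record overhead, no compression, no replication traffic. The class and method names are invented for the example.

```java
// Hypothetical helper for a rough payload bandwidth estimate.
final class ThroughputEstimate {

    // Raw payload bytes per second, ignoring all protocol overhead.
    static long bytesPerSecond(long messagesPerSecond, long bytesPerMessage) {
        return messagesPerSecond * bytesPerMessage;
    }

    // Converts bytes per second to megabits per second (decimal mega).
    static double megabitsPerSecond(long bytesPerSecond) {
        return bytesPerSecond * 8 / 1_000_000.0;
    }

    public static void main(String[] args) {
        long bytes = bytesPerSecond(150_000, 225);
        System.out.println(bytes + " bytes/s, " + megabitsPerSecond(bytes) + " Mbit/s");
    }
}
```

Under those assumptions the payload alone comes to 33,750,000 bytes per second, or about 270 Mbit/s, which is worth comparing against the network limit of the Azure VM size in use.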

Also, this means that your team is not only building a consumer application, but you are also building a producer application, perhaps for performance testing? Is the producer application written in Java, too?

batchSize, 128MB

What was the reason to set such a high batch size? For reference, the default setting is 16 kB.
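
To illustrate why the batch size matters here, the sketch below compares how many 225-byte messages fit into a batch with how much data arrives during one linger.ms window. It assumes one byte per character, ignores record overhead and compression, and treats all traffic as going to a single partition (batches are per partition, so with more partitions each batch fills even more slowly). The names are made up for the example.

```java
// Hypothetical helper comparing batch capacity with data arriving per linger window.
final class BatchFillEstimate {

    // Whole messages that fit into one batch of the given size.
    static long messagesPerBatch(long batchSizeBytes, long bytesPerMessage) {
        return batchSizeBytes / bytesPerMessage;
    }

    // Payload bytes arriving within one linger.ms window.
    static long bytesDuringLinger(long messagesPerSecond, long bytesPerMessage, long lingerMs) {
        return messagesPerSecond * bytesPerMessage * lingerMs / 1000;
    }

    public static void main(String[] args) {
        System.out.println("16 kB batch:  " + messagesPerBatch(16 * 1024, 225) + " messages");
        System.out.println("128 MB batch: " + messagesPerBatch(128L * 1024 * 1024, 225) + " messages");
        System.out.println("per 100 ms:   " + bytesDuringLinger(150_000, 225, 100) + " bytes");
    }
}
```

With these numbers, only about 3.4 MB arrives in each 100 ms linger window, so a 128 MB batch would be sent because the linger time expired long before it could fill by size.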

Hello Miguno,

thanks for the prompt reply!

We are talking about a Kafka producer. The raw data is received over a different channel: a TCP socket that sends a huge number of messages per second. The Java client reads this data, builds a simple message, and writes it to the Kafka topic.

It seems that the producer is not quick enough to write all the data to Kafka as it is received from the TCP socket. Over time it gets stuck, and the server breaks the connection because our application cannot handle this throughput.
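
One way to make this kind of stall visible is to put a bounded queue between the thread that reads the socket and the thread that writes to Kafka, so that a slow writer shows up as rejected messages rather than as a silently growing backlog. The sketch below is not the application's actual code: BoundedHandoff and its methods are hypothetical, and the sink stands in for whatever calls the producer's send method.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical bounded hand-off between a socket reader and a Kafka writer.
final class BoundedHandoff {
    private final BlockingQueue<String> queue;

    BoundedHandoff(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    // Called by the socket reader. Returns false if the queue stayed full
    // for timeoutMs, which signals that the writer is falling behind.
    boolean accept(String message, long timeoutMs) {
        try {
            return queue.offer(message, timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            return false;
        }
    }

    // Called by the writer. Passes up to maxMessages queued messages to the
    // sink (a stand-in for the producer's send call) and returns the count.
    int drainTo(Consumer<String> sink, int maxMessages) {
        List<String> batch = new ArrayList<>();
        int moved = queue.drainTo(batch, maxMessages);
        batch.forEach(sink);
        return moved;
    }
}
```

Counting how often accept returns false gives a direct measure of how far the writer lags behind the socket, which helps to tell a producer-side bottleneck apart from a network limit.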

Regarding the Kafka parameters, I have tried different batch sizes, but all above 8 MB. I will try lower values to see how the app behaves. Meanwhile, I will check the Azure environment setup.

Thanks once again.

Best regards
Vasil