CPU stays high on ksqlDB even with no incoming events

Hi,
We have a ksqlDB deployment on Kubernetes with auto-scaling based on CPU.
We use Kafka Connect to listen to changes on a Postgres database, and we built a ksql pipeline to process this data and load it into Elasticsearch.

We notice that even when there is no traffic coming from Kafka Connect, i.e. nothing arriving on the Kafka topics, CPU usage on the ksqlDB instances stays very high and the auto-scaler stays at the maximum number of instances.

Two questions:

  • Why is the CPU high?
  • Is there a configuration, or are there guidelines, that let ksqlDB consume less when idle (no incoming events)?

I guess you would need to drop down to client-level configs. Cf. Kafka Clients | Confluent Documentation

Thanks Matthias for the quick reply.
I profiled the ksqlDB JVM using jstack and found that all the threads spend their time in this method: sun.nio.ch.EPoll.wait, which I presume is normal since there is no new data on the topics.

stackTrace:
java.lang.Thread.State: RUNNABLE
at sun.nio.ch.EPoll.wait(java.base@11.0.10/Native Method)
at sun.nio.ch.EPollSelectorImpl.doSelect(java.base@11.0.10/EPollSelectorImpl.java:120)
at sun.nio.ch.SelectorImpl.lockAndDoSelect(java.base@11.0.10/SelectorImpl.java:124)
- locked <0x00000007a1e33db8> (a sun.nio.ch.Util$2)
- locked <0x00000007a19f23d0> (a sun.nio.ch.EPollSelectorImpl)
at sun.nio.ch.SelectorImpl.select(java.base@11.0.10/SelectorImpl.java:136)
at org.apache.kafka.common.network.Selector.select(Selector.java:869)
at org.apache.kafka.common.network.Selector.poll(Selector.java:465)
at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:558)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:265)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:236)
at org.apache.kafka.clients.consumer.KafkaConsumer.pollForFetches(KafkaConsumer.java:1292)
at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1233)
at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1206)
at org.apache.kafka.streams.processor.internals.StreamThread.pollRequests(StreamThread.java:860)
at org.apache.kafka.streams.processor.internals.StreamThread.pollPhase(StreamThread.java:820)
at org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:657)
at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:559)
at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:539)

When you say ‘drop down to client level config’, are you referring to max.poll.interval.ms, or did you have another parameter in mind?

Regards,
Richard

If threads are in wait they should be blocked and not utilize the CPU – wait should “suspend” the thread until it’s woken up again (no busy wait).

Thus, it’s still unclear why CPU utilization is high.
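One way to narrow this down: the JVM reports a thread that sits inside a native call such as EPoll.wait as RUNNABLE, so the state shown in the jstack output above does not say whether the thread is actually burning CPU. Measuring per-thread CPU time does. The following is only a minimal sketch using the standard java.lang.management API. The class name ThreadCpuSampler and the one-second window are made up for illustration, and as written it samples the JVM it runs in; pointing it at a ksqlDB server would need a remote JMX connection, which is assumed here and not shown.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper (not part of ksqlDB or Kafka): ranks threads by CPU time
// actually consumed during a short sampling window.
public class ThreadCpuSampler {

    /** CPU time one thread consumed during the sampling window. */
    public static final class Usage {
        public final String threadName;
        public final Thread.State state;
        public final long cpuNanos;

        Usage(String threadName, Thread.State state, long cpuNanos) {
            this.threadName = threadName;
            this.state = state;
            this.cpuNanos = cpuNanos;
        }
    }

    /** Samples all live threads of this JVM and returns them, busiest first. */
    public static List<Usage> sample(long windowMillis) throws InterruptedException {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        if (!mx.isThreadCpuTimeSupported()) {
            throw new UnsupportedOperationException("thread CPU time not supported on this JVM");
        }
        if (!mx.isThreadCpuTimeEnabled()) {
            mx.setThreadCpuTimeEnabled(true);
        }

        long[] ids = mx.getAllThreadIds();
        long[] before = new long[ids.length];
        for (int i = 0; i < ids.length; i++) {
            before[i] = mx.getThreadCpuTime(ids[i]); // -1 if the thread is gone
        }

        Thread.sleep(windowMillis);

        List<Usage> result = new ArrayList<>();
        for (int i = 0; i < ids.length; i++) {
            long after = mx.getThreadCpuTime(ids[i]);
            ThreadInfo info = mx.getThreadInfo(ids[i]);
            if (before[i] < 0 || after < 0 || info == null) {
                continue; // thread ended during the window
            }
            result.add(new Usage(info.getThreadName(), info.getThreadState(), after - before[i]));
        }
        result.sort(Comparator.comparingLong((Usage u) -> u.cpuNanos).reversed());
        return result;
    }

    public static void main(String[] args) throws InterruptedException {
        for (Usage u : sample(1000)) {
            System.out.printf("%-40s %-14s %8.2f ms%n",
                    u.threadName, u.state, u.cpuNanos / 1_000_000.0);
        }
    }
}
```

If the stream threads show close to zero CPU time over the window while still being listed as RUNNABLE, they are genuinely idle, and the load is more likely coming from other threads (GC, JIT compilation, or something else running in the pod).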

And yes, I did mean consumer/producer client configs. max.poll.interval.ms itself should not impact CPU utilization, but others could. However, if threads are blocking/waiting anyway, it seems the system behaves as expected and tuning configs seems not to be necessary.
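For reference, a sketch of what "client-level configs" could look like on the ksqlDB side. The property names poll.ms (Kafka Streams) and fetch.max.wait.ms (consumer) are real settings; the ksql.streams. prefix for passing them through ksqlDB, the class name IdleTuningSketch, and the values are assumptions chosen for illustration, not recommendations. As noted above, if the threads are already blocking, changing them may make no difference.

```java
import java.util.Properties;

// Hypothetical illustration only: builds the kind of overrides one might try
// in a ksqlDB server properties file. The values are placeholders.
public class IdleTuningSketch {

    public static Properties candidateOverrides() {
        Properties p = new Properties();
        // Kafka Streams: maximum time (ms) a stream thread blocks waiting for input.
        p.setProperty("ksql.streams.poll.ms", "500");
        // Consumer: maximum time (ms) the broker may hold a fetch request
        // when fetch.min.bytes of data is not yet available.
        p.setProperty("ksql.streams.consumer.fetch.max.wait.ms", "1000");
        return p;
    }

    public static void main(String[] args) {
        // Print in key=value form, as the entries would appear in a properties file.
        candidateOverrides().forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```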
