Using Modular Topologies in Kafka Streams to Scale ksql’s Persistent Queries

Using Modular Topologies in Kafka Streams to Scale ksql’s Persistent Queries
Date : April 25, 2022
Time : 4:00 PM - 4:45 PM BST


  • Walker Carlson, Software Engineer, Confluent
  • A. Sophie Blee-Goldman, Senior Software Engineer, Confluent

ksql is a streaming database that uses Kafka Streams to execute queries against data in Apache Kafka®. Historically, each query was compiled into its own Kafka Streams program to be executed inside the ksql servers. As ksql moved to support broader and more complex use cases, this query execution strategy became the bottleneck for scaling up the number of persistent queries. This talk will examine the problems faced and how we addressed them.

Using too many Kafka Streams instances requires too many resources in both threads and consumers. One way to avoid this is using Modular Topologies, which are coming to Kafka Streams in KIP-809. Modular Topologies allow us to dynamically change the workload of a Kafka Streams application while it’s running and share resources such as consumer/producer clients and processing threads. This makes it possible to use a single Kafka Streams runtime for multiple topologies that share consumers and threads across them. We will see in detail how this makes it possible for ksqlDB to consolidate queries into a shared Kafka Streams runtime.

Kafka Streams developers will take away from this talk an understanding of how to utilize ModularTopologies, and dynamically upgrade their Kafka Streams workload effectively.