Streaming from stream vs streaming from kafka - throughput

programista4k · 22 May 2021 18:17

Hello frens,
is there a difference in efficiency between:

a) streaming from stream:

final KStream<String, String> firstStream = streamsBuilder
                .stream(myTopic, ...);

firstStream.to(myTopic2, ...);

final KStream<String, String> secondStream = firstStream
                .filter(...)
                .transformValues(...);

secondStream.to(myTopic3, ...);

b) streaming from kafka:

final KStream<String, String> firstStream = streamsBuilder
                .stream(myTopic, ...);

firstStream.to(myTopic2, ...);

final KStream<String, String> secondStream = streamsBuilder
                .stream(myTopic2, ...)
                .filter(...)
                .transformValues(...);

secondStream.to(myTopic3, ...);

The second is more convenient, because I can move it into a separate class/method.

mjsax · 22 May 2021 19:22

Your example is hard to understand. You cannot call sourceStream.to(myTopic2, ...); before you create it in your examples. Can you update your example accordingly?

programista4k · 22 May 2021 19:24

Sorry, Fren. I’ve made an amendment.

mjsax · 22 May 2021 19:46

The first example should be slightly more performant, because it saves you one read operation per record, resulting in higher throughput and lower latency.

The first program will basically “fan out” the stream (think of it like a broadcast): it writes into the topic “on the side” while also forwarding the data in-memory:

myTopic --+--> filter() --> transformValues() --> myTopic3
          |
          +--> myTopic2

For the second example the data must be read back from the topic though:

myTopic --> myTopic2 --> filter() --> transformValues() --> myTopic3
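To make the extra read concrete, here is a toy in-memory model of the two variants. It is not Kafka code: FakeBroker, keep() and transform() are hypothetical stand-ins (the real filter(...) and transformValues(...) arguments are elided in the question), and the model only counts how many records are read from "topics".

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy in-memory model (not Kafka): counts how many records are read from each "topic". */
public class TopicReadModel {

    /** Hypothetical stand-in for a broker: named append-only lists plus read counters. */
    static final class FakeBroker {
        final Map<String, List<String>> topics = new HashMap<>();
        final Map<String, Integer> reads = new HashMap<>();

        void write(String topic, String value) {
            topics.computeIfAbsent(topic, t -> new ArrayList<>()).add(value);
        }

        /** Returns a copy of the topic's records and counts each one as a read. */
        List<String> readAll(String topic) {
            List<String> records = topics.getOrDefault(topic, List.of());
            reads.merge(topic, records.size(), Integer::sum);
            return new ArrayList<>(records);
        }

        int totalReads() {
            return reads.values().stream().mapToInt(Integer::intValue).sum();
        }
    }

    // Hypothetical stand-ins for the elided filter(...) and transformValues(...) logic.
    static boolean keep(String value) { return !value.isEmpty(); }
    static String transform(String value) { return value.toUpperCase(); }

    /** Variant (a): write to myTopic2 on the side, forward the same record in memory. */
    static FakeBroker fanOut(List<String> input) {
        FakeBroker broker = new FakeBroker();
        input.forEach(v -> broker.write("myTopic", v));
        for (String v : broker.readAll("myTopic")) {
            broker.write("myTopic2", v);
            if (keep(v)) {
                broker.write("myTopic3", transform(v));
            }
        }
        return broker;
    }

    /** Variant (b): write to myTopic2, then read myTopic2 back before processing. */
    static FakeBroker readBack(List<String> input) {
        FakeBroker broker = new FakeBroker();
        input.forEach(v -> broker.write("myTopic", v));
        for (String v : broker.readAll("myTopic")) {
            broker.write("myTopic2", v);
        }
        for (String v : broker.readAll("myTopic2")) {
            if (keep(v)) {
                broker.write("myTopic3", transform(v));
            }
        }
        return broker;
    }

    public static void main(String[] args) {
        List<String> input = List.of("a", "", "b");
        System.out.println("fan-out reads:   " + fanOut(input).totalReads());
        System.out.println("read-back reads: " + readBack(input).totalReads());
    }
}
```

For the three-record input in main, the model counts 3 reads for variant (a) and 6 for variant (b), while myTopic3 ends up with the same content in both. A real deployment adds network round trips, serialization and commit overhead on top, so treat the numbers as an illustration of the shape of the cost, not a benchmark.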

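You can get information about how a topology is set up via Topology t = streamsBuilder.build(); t.describe();. describe() returns a TopologyDescription, which you can print to see the sub-topologies, sources, processors, and sinks.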
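A minimal sketch of printing both topologies for comparison, assuming kafka-streams is on the classpath (no running broker is needed just to build and describe a topology). The String serdes, the literal topic names, and the filter predicate are assumptions, and mapValues is used here only as a simple stand-in for the elided transformValues(...) call.

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.Topology;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.Produced;

public class DescribeTopologies {

    /** Variant (a): the second stream is derived from the first one in memory. */
    static Topology fanOut() {
        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> first =
                builder.stream("myTopic", Consumed.with(Serdes.String(), Serdes.String()));
        first.to("myTopic2", Produced.with(Serdes.String(), Serdes.String()));
        first.filter((key, value) -> value != null && !value.isEmpty()) // assumed predicate
             .mapValues(value -> value.toUpperCase())                   // stand-in for transformValues(...)
             .to("myTopic3", Produced.with(Serdes.String(), Serdes.String()));
        return builder.build();
    }

    /** Variant (b): the second stream reads myTopic2 back from Kafka. */
    static Topology readBack() {
        StreamsBuilder builder = new StreamsBuilder();
        builder.stream("myTopic", Consumed.with(Serdes.String(), Serdes.String()))
               .to("myTopic2", Produced.with(Serdes.String(), Serdes.String()));
        builder.stream("myTopic2", Consumed.with(Serdes.String(), Serdes.String()))
               .filter((key, value) -> value != null && !value.isEmpty()) // assumed predicate
               .mapValues(value -> value.toUpperCase())                   // stand-in for transformValues(...)
               .to("myTopic3", Produced.with(Serdes.String(), Serdes.String()));
        return builder.build();
    }

    public static void main(String[] args) {
        // TopologyDescription has a readable toString(), so printing it is enough.
        System.out.println(fanOut().describe());
        System.out.println(readBack().describe());
    }
}
```

I would expect the first description to show a single sub-topology with one source feeding two sinks, and the second to show two sub-topologies connected only through myTopic2, which is where the extra read comes from.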
