Streaming from stream vs streaming from Kafka - throughput

Hello frens,
is there a difference in efficiency between:

a) streaming from stream:

final KStream<String, String> firstStream = streamsBuilder
                .stream(myTopic, ...);

firstStream.to(myTopic2, ...);

final KStream<String, String> secondStream = firstStream
                .filter(...)
                .transformValues(...);

secondStream.to(myTopic3, ...);

b) streaming from kafka:

final KStream<String, String> firstStream = streamsBuilder
                .stream(myTopic, ...);

firstStream.to(myTopic2, ...);

final KStream<String, String> secondStream = streamsBuilder
                .stream(myTopic2, ...)
                .filter(...)
                .transformValues(...);

secondStream.to(myTopic3, ...);

The second is more convenient, because I can move it into a separate class/method.

Your example is hard to understand. You cannot call sourceStream.to(myTopic2, ...); before you create it in your examples. Can you update your example accordingly?

Sorry, Fren. I’ve made an amendment.


The first example should be slightly more performant, because it saves you one read operation: the records are not fetched back from myTopic2 before they are filtered and transformed. That results in higher throughput and lower latency.

The first program will basically “fan out” the stream (think of it like a broadcast): it writes into the topic “on the side” while also forwarding the data in memory:

myTopic --+--> filter() --> transformValues() --> myTopic3
          |
          +--> myTopic2

In the second example, though, the data must first be written to myTopic2 and then read back from it:

myTopic --> myTopic2 --> filter() --> transformValues() --> myTopic3

You can get information on how a topology is set up via Topology t = streamsBuilder.build(); System.out.println(t.describe());.
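To compare the two variants yourself, something like the following rough sketch should work. It assumes kafka-streams is on the classpath. The class and method names, the literal topic names, the String serdes, the filter predicate, and the mapValues() call standing in for your elided transformValues(...) are all placeholders I made up for illustration, not your actual code. No broker is needed, since the topology is only built and described, never started.

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.TopologyDescription;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.Produced;

// Hypothetical helper class: builds both variants and returns their descriptions.
public class TopologyComparison {

    // Variant a): one source; the stream is fanned out in memory.
    static TopologyDescription fanOut() {
        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> first =
                builder.stream("myTopic", Consumed.with(Serdes.String(), Serdes.String()));

        first.to("myTopic2", Produced.with(Serdes.String(), Serdes.String()));

        first.filter((key, value) -> value != null && !value.isEmpty()) // placeholder predicate
             .mapValues(value -> value.toUpperCase())                   // stand-in for transformValues(...)
             .to("myTopic3", Produced.with(Serdes.String(), Serdes.String()));

        return builder.build().describe();
    }

    // Variant b): myTopic2 is written and then consumed again by a second source.
    static TopologyDescription readBack() {
        StreamsBuilder builder = new StreamsBuilder();

        builder.stream("myTopic", Consumed.with(Serdes.String(), Serdes.String()))
               .to("myTopic2", Produced.with(Serdes.String(), Serdes.String()));

        builder.stream("myTopic2", Consumed.with(Serdes.String(), Serdes.String()))
               .filter((key, value) -> value != null && !value.isEmpty()) // placeholder predicate
               .mapValues(value -> value.toUpperCase())                   // stand-in for transformValues(...)
               .to("myTopic3", Produced.with(Serdes.String(), Serdes.String()));

        return builder.build().describe();
    }

    public static void main(String[] args) {
        System.out.println("a) streaming from stream:");
        System.out.println(fanOut());
        System.out.println("b) streaming from Kafka:");
        System.out.println(readBack());
    }
}
```

In the printed output, variant a) should appear as a single sub-topology with one source node feeding both sinks, while variant b) should appear as two sub-topologies connected only through myTopic2. That second source node is the extra read mentioned above.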
