Hello. I’m a beginner Kafka Streams developer.
I’m using Spring Boot and Kafka Streams.
When I check the topology, there is one input source node and two sink output nodes.
The topology is simply like below:
[hello topic] - A processing - A sink
[hello topic] - B processing - B sink
I know that the application.id property is used as the “consumer group id”.
That means the topology is executed by one consumer (if there is one partition).
So, when I think about the situation where the application shuts down and
“A processing and sink succeed, but B processing and sink fail”,
in this situation,
does Kafka Streams still work well? (No duplicates? Or does it redo correctly?)
Hi @choiwonpyo , welcome to the forum!
It is common to have multiple output topics in a Kafka Streams app, as in your example.
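For reference, here is a minimal sketch of such a topology in the Kafka Streams DSL. The topic names and the map/filter steps are assumptions standing in for your “A processing” and “B processing” — substitute your own logic:

```java
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.Topology;
import org.apache.kafka.streams.kstream.KStream;

public class TwoSinkTopology {

    public static Topology build() {
        StreamsBuilder builder = new StreamsBuilder();

        // One source node reading from the "hello" topic
        KStream<String, String> source = builder.stream("hello");

        // "A processing" -> "A sink" (placeholder transformation)
        source.mapValues(v -> v.toUpperCase()).to("a-sink");

        // "B processing" -> "B sink" (placeholder filter)
        source.filter((key, value) -> value != null).to("b-sink");

        return builder.build();
    }

    public static void main(String[] args) {
        // Describe the topology without connecting to a broker:
        // you should see one source node fanning out to two sink nodes.
        System.out.println(build().describe());
    }
}
```

Both branches read from the same source node, which is exactly the one-consumer, two-sinks shape you described.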
It is also possible that events may be written to “A sink” topic, but a failure / crash prevents the processing results from B from being written to “B sink” topic. In this case, your application should be halted, the problem fixed, and the application restarted.
Since the consumer group for that application did not commit its offsets, it will reconsume the same batch of events and process them again. This can lead to duplicates: you may end up with duplicate results in your “A sink” output topic, and possibly in “B sink” as well.
You could turn on exactly-once processing in the Kafka Streams configuration by setting processing.guarantee=exactly_once_v2 (depending on your broker version).
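In a Spring Boot app this is typically a one-line property. Shown here in `application.properties` form; the `spring.kafka.streams.properties` prefix assumes you are using Spring for Apache Kafka’s auto-configuration:

```properties
# Enable exactly-once (v2) processing; requires a compatible broker version
spring.kafka.streams.properties.processing.guarantee=exactly_once_v2
```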
Here is a blog that explains it in more detail: Enabling Exactly-Once in Kafka Streams | Confluent
Note that exactly once semantics may slow down your processing throughput.
You can also accept that duplicates may occur, and program your consumers of “A sink” and “B sink” topics to handle the data accordingly. In many cases (though not all), it is possible to write idempotent business logic that simply doesn’t care about duplicates.
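As a sketch of the idempotent-consumer idea in plain Java — the event-ID list and the in-memory “seen” set are assumptions for illustration; a real consumer would persist processed IDs, e.g. in a database:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class IdempotentSink {

    // Applies each event id at most once, so redelivered duplicates
    // (e.g. after a crash and offset replay) have no extra effect.
    public static List<String> applyOnce(List<String> eventIds) {
        Set<String> seen = new HashSet<>();
        List<String> applied = new ArrayList<>();
        for (String id : eventIds) {
            if (seen.add(id)) {      // add() returns false for duplicates
                applied.add(id);     // "apply" the event only the first time
            }
        }
        return applied;
    }

    public static void main(String[] args) {
        // Simulate a redelivered batch after a crash: e2 and e3 arrive twice
        List<String> delivered = List.of("e1", "e2", "e3", "e2", "e3", "e4");
        System.out.println(applyOnce(delivered)); // [e1, e2, e3, e4]
    }
}
```

The key property is that applying the same event twice gives the same result as applying it once, so at-least-once delivery becomes harmless.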
Hopefully this helps!