Processing guarantee when demultiplexing data to multiple stores

Hi,

I have 2 questions related to kafka streams,

  1. Is there minimum replication factor required to enable exactly_once for topic involved in transactions?

  2. At-least-once related question,
    Matthias wrote on StackOverflow:
    β€œThe contract of a Processor in Kafka Streams is, that you have fully processed an input record and forward() all corresponding output messages before process() return. – This contract implies that Kafka Streams is allowed to commit the corresponding offset after process() returns.”

Does this also include storeBackedByChangelog.put done in a processor?
Following the source code, context.forward and storeBackedByChangelog.put all endup in StreamsProducer.send. But wanted to confirm.

For example, we want to use kafka 2.8, at-least-once, on a giant β€œdemultiplexer” topology, with outputs to multiple stores: is at-least-once guarantee in such case?

Thanks!

Topologies:
Sub-topology: 0
Source: KSTREAM-SOURCE-0000000000 (topics: [events])
β†’ KSTREAM-BRANCH-0000000001
Processor: KSTREAM-BRANCH-0000000001 (stores: )
β†’ KSTREAM-BRANCHCHILD-0000000005, KSTREAM-BRANCHCHILD-0000000002, KSTREAM-BRANCHCHILD-0000000003, KSTREAM-BRANCHCHILD-0000000004, KSTREAM-BRANCHCHILD-0000000006, KSTREAM-BRANCHCHILD-0000000007, KSTREAM-BRANCHCHILD-0000000008
← KSTREAM-SOURCE-0000000000
Processor: KSTREAM-BRANCHCHILD-0000000002 (stores: ) (Branch on criteria 1)
β†’ KSTREAM-BRANCH-0000000009
← KSTREAM-BRANCH-0000000001
Processor: KSTREAM-BRANCHCHILD-0000000003 (stores: )
β†’ KSTREAM-BRANCH-0000000022
← KSTREAM-BRANCH-0000000001
Processor: KSTREAM-BRANCHCHILD-0000000004 (stores: )
β†’ KSTREAM-BRANCH-0000000035
← KSTREAM-BRANCH-0000000001
Processor: KSTREAM-BRANCHCHILD-0000000005 (stores: )
β†’ KSTREAM-BRANCH-0000000048
← KSTREAM-BRANCH-0000000001
Processor: KSTREAM-BRANCHCHILD-0000000006 (stores: )
β†’ KSTREAM-BRANCH-0000000061
← KSTREAM-BRANCH-0000000001
Processor: KSTREAM-BRANCHCHILD-0000000007 (stores: )
β†’ KSTREAM-BRANCH-0000000074
← KSTREAM-BRANCH-0000000001
Processor: KSTREAM-BRANCHCHILD-0000000008 (stores: )
β†’ KSTREAM-BRANCH-0000000087
← KSTREAM-BRANCH-0000000001
Processor: KSTREAM-BRANCH-0000000009 (stores: ) (Branch on criteria 2)
β†’ KSTREAM-BRANCHCHILD-0000000010, KSTREAM-BRANCHCHILD-0000000011, KSTREAM-BRANCHCHILD-0000000012, KSTREAM-BRANCHCHILD-0000000013, KSTREAM-BRANCHCHILD-0000000014, KSTREAM-BRANCHCHILD-0000000015
← KSTREAM-BRANCHCHILD-0000000002
Processor: KSTREAM-BRANCH-0000000022 (stores: )
β†’ KSTREAM-BRANCHCHILD-0000000023, KSTREAM-BRANCHCHILD-0000000024, KSTREAM-BRANCHCHILD-0000000025, KSTREAM-BRANCHCHILD-0000000026, KSTREAM-BRANCHCHILD-0000000027, KSTREAM-BRANCHCHILD-0000000028
← KSTREAM-BRANCHCHILD-0000000003
Processor: KSTREAM-BRANCH-0000000035 (stores: )
β†’ KSTREAM-BRANCHCHILD-0000000036, KSTREAM-BRANCHCHILD-0000000037, KSTREAM-BRANCHCHILD-0000000038, KSTREAM-BRANCHCHILD-0000000039, KSTREAM-BRANCHCHILD-0000000040, KSTREAM-BRANCHCHILD-0000000041
← KSTREAM-BRANCHCHILD-0000000004
Processor: KSTREAM-BRANCH-0000000048 (stores: )
β†’ KSTREAM-BRANCHCHILD-0000000049, KSTREAM-BRANCHCHILD-0000000050, KSTREAM-BRANCHCHILD-0000000051, KSTREAM-BRANCHCHILD-0000000052, KSTREAM-BRANCHCHILD-0000000053, KSTREAM-BRANCHCHILD-0000000054
← KSTREAM-BRANCHCHILD-0000000005
Processor: KSTREAM-BRANCH-0000000061 (stores: )
β†’ KSTREAM-BRANCHCHILD-0000000062, KSTREAM-BRANCHCHILD-0000000063, KSTREAM-BRANCHCHILD-0000000064, KSTREAM-BRANCHCHILD-0000000065, KSTREAM-BRANCHCHILD-0000000066, KSTREAM-BRANCHCHILD-0000000067
← KSTREAM-BRANCHCHILD-0000000006
Processor: KSTREAM-BRANCH-0000000074 (stores: )
β†’ KSTREAM-BRANCHCHILD-0000000075, KSTREAM-BRANCHCHILD-0000000076, KSTREAM-BRANCHCHILD-0000000077, KSTREAM-BRANCHCHILD-0000000078, KSTREAM-BRANCHCHILD-0000000079, KSTREAM-BRANCHCHILD-0000000080
← KSTREAM-BRANCHCHILD-0000000007
Processor: KSTREAM-BRANCH-0000000087 (stores: )
β†’ KSTREAM-BRANCHCHILD-0000000088, KSTREAM-BRANCHCHILD-0000000089, KSTREAM-BRANCHCHILD-0000000090, KSTREAM-BRANCHCHILD-0000000091, KSTREAM-BRANCHCHILD-0000000092, KSTREAM-BRANCHCHILD-0000000093
← KSTREAM-BRANCHCHILD-0000000008
Processor: KSTREAM-BRANCHCHILD-0000000010 (stores: )
β†’ KSTREAM-PROCESSOR-0000000016
← KSTREAM-BRANCH-0000000009
…
Processor: KSTREAM-BRANCHCHILD-0000000093 (stores: )
β†’ KSTREAM-PROCESSOR-0000000099
← KSTREAM-BRANCH-0000000087
Processor: KSTREAM-PROCESSOR-0000000016 (stores: [snapshot-1])
β†’ none
← KSTREAM-BRANCHCHILD-0000000010
…
Processor: KSTREAM-PROCESSOR-0000000099 (stores: [snapshot-20])
β†’ none
← KSTREAM-BRANCHCHILD-0000000093

Well, technically you can configure the system to use EOS with any replication factor. However, using a lower replication factor than 3 effectively voids EOS. Thus, it’s strongly recommended to use a replication factor of 3.

Yes, it also includes all state store updates. In the end, we use the same producer for writing into output topic and internal changelog and repartition topics.

Again, Thanks for the clarification.

1 Like