Does a tombstone in toTable() propagate to downstream groupBy/aggregate if the key never existed?

Hi,
I have a question about the behavior of tombstones in a Kafka Streams topology when using toTable() followed by groupBy() and aggregate().
The simplified code looks like this:

 <Some 1 partition KStream>
     .process(MyProcessor::new)
     .toTable(...) // with KeyValueStore
     .groupBy((k, v) -> new KeyValue<>(v.a(), v))
     .aggregate(...)

The situation:
In MyProcessor, I forward tombstones for keys that may have never existed in the downstream toTable() store. This is done “just to be sure” that any potential stale data is cleaned up.
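
Roughly, MyProcessor does something like the sketch below (simplified, with made-up String types and a placeholder deletion check, not the real code):

    import org.apache.kafka.streams.processor.api.Processor;
    import org.apache.kafka.streams.processor.api.ProcessorContext;
    import org.apache.kafka.streams.processor.api.Record;

    public class MyProcessor implements Processor<String, String, String, String> {

        private ProcessorContext<String, String> context;

        @Override
        public void init(final ProcessorContext<String, String> context) {
            this.context = context;
        }

        @Override
        public void process(final Record<String, String> record) {
            if (shouldDelete(record.value())) {
                // "Just to be sure" cleanup: forward a tombstone even though the key
                // may never have been written into the downstream toTable() store.
                context.forward(record.withValue(null));
            } else {
                context.forward(record);
            }
        }

        // Placeholder for the real deletion criterion.
        private boolean shouldDelete(final String value) {
            return value == null || value.isEmpty();
        }
    }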

My questions:

  1. When toTable() receives a tombstone for a key that never existed in its state store (i.e., store.get(key) returns null), does it forward this tombstone downstream to the groupBy() operator?
  2. If it does propagate, does the aggregate() have to do any meaningful work?
  3. From a performance perspective, if ~33% of my events are such “redundant” tombstones (for non-existent keys), what is the actual cost? We recently experienced a lag of over 90 million events on the internal repartition topic (the one created before toTable(), named like KSTREAM-TOTABLE-0000000XXX-repartition). It took about 1 day to process. The source KStream for this repartition topic has only 1 partition.

Thanks in advance for any clarification!

In general, toTable() would forward the tombstone blindly (there are some exceptions when you are using versioned state stores, but versioned stores are not the default, so I guess that doesn’t apply to your case).
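
If you want to see it for yourself, a rough TopologyTestDriver sketch like the one below lets you inspect what toTable() forwards for a tombstone whose key was never written (topic names, String serdes, and the trivial topology are made-up placeholders, not taken from your application):

    import java.util.Properties;
    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.StreamsConfig;
    import org.apache.kafka.streams.TestInputTopic;
    import org.apache.kafka.streams.TestOutputTopic;
    import org.apache.kafka.streams.TopologyTestDriver;
    import org.apache.kafka.streams.kstream.Consumed;
    import org.apache.kafka.streams.kstream.Materialized;
    import org.apache.kafka.streams.kstream.Produced;

    public class ToTableTombstoneCheck {
        public static void main(final String[] args) {
            final StreamsBuilder builder = new StreamsBuilder();

            // Convert the table right back to a stream so we can observe what toTable() forwards.
            builder.stream("input", Consumed.with(Serdes.String(), Serdes.String()))
                .toTable(Materialized.with(Serdes.String(), Serdes.String()))
                .toStream()
                .to("totable-output", Produced.with(Serdes.String(), Serdes.String()));

            final Properties props = new Properties();
            props.put(StreamsConfig.APPLICATION_ID_CONFIG, "totable-tombstone-check");
            props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "dummy:1234");

            try (TopologyTestDriver driver = new TopologyTestDriver(builder.build(), props)) {
                final TestInputTopic<String, String> input = driver.createInputTopic(
                    "input", Serdes.String().serializer(), Serdes.String().serializer());
                final TestOutputTopic<String, String> output = driver.createOutputTopic(
                    "totable-output", Serdes.String().deserializer(), Serdes.String().deserializer());

                // Tombstone for a key that has never been written to the table's store.
                input.pipeInput("never-existed", null);

                // Print whatever was forwarded, so you can confirm the behavior yourself.
                System.out.println(output.readKeyValuesToList());
            }
        }
    }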

Kafka Streams follows an “emit on update” policy, i.e., even if a row does not change (this could also be an idempotent put(key, myValue) that updates myValue to the same myValue), the update will be forwarded. (At some point there was a proposal to change to an “emit on change” policy, but that policy caused issues and was never implemented.)
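
You can see emit-on-update with the same sketch: inside the try block, pipe the same key and value twice and read the output again (the test driver processes records one by one; in a real deployment the record cache could conflate consecutive updates):

    // Idempotent update: the second put() does not change the table row,
    // but under emit-on-update it is still forwarded downstream.
    input.pipeInput("k1", "v1");
    input.pipeInput("k1", "v1");
    System.out.println(output.readKeyValuesToList()); // expect ("k1", "v1") twice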

The aggregation would still process the update: this means reading the current aggregation result, doing an idempotent put() to write it back (as the aggregation result didn’t change), and forwarding the result record again (following the emit-on-update policy once more).
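
For reference, this is the general shape of the groupBy()/aggregate() step with adder and subtractor (a hypothetical count-style aggregation over a KTable<String, String> called table, not your actual code); every change that reaches it means one fetch of the current aggregate, one put() back into the aggregation store, and one forwarded result record:

    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.KeyValue;
    import org.apache.kafka.streams.kstream.Grouped;
    import org.apache.kafka.streams.kstream.KTable;
    import org.apache.kafka.streams.kstream.Materialized;

    // Assuming 'table' is the KTable<String, String> returned by toTable():
    final KTable<String, Long> aggregated = table
        .groupBy((k, v) -> new KeyValue<>(v, v), Grouped.with(Serdes.String(), Serdes.String()))
        .aggregate(
            () -> 0L,                            // initializer
            (key, newValue, agg) -> agg + 1L,    // adder: applied to the new value of a change
            (key, oldValue, agg) -> agg - 1L,    // subtractor: applied to the replaced old value
            Materialized.with(Serdes.String(), Serdes.Long()));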

How much overhead this implies is hard to say. It depends on many tuning parameters (e.g., cache size, RocksDB settings, you name it) and also on how you deploy it (i.e., what hardware)… If my math is right, assuming 24h of processing time for 90M records, that is about 90,000,000 / 86,400 s ≈ 1K rec/sec. This does sound a little bit low – Kafka Streams should be able to do in the range of 10K-50K rec/sec on a single partition, depending on the operation.

So you might want to dig into state store performance, as your workload is most likely I/O bound.
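
If you want to experiment, the usual first knobs are the record cache, the commit interval, and RocksDB metrics. A sketch of the relevant configs (the values are placeholders, not recommendations for your workload):

    import java.util.Properties;
    import org.apache.kafka.streams.StreamsConfig;

    final Properties props = new Properties();
    props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-app");            // your application id
    props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092");    // your brokers

    // A larger record cache absorbs repeated updates to the same keys before they
    // hit RocksDB and the changelog (newer releases renamed this to statestore.cache.max.bytes).
    props.put(StreamsConfig.CACHE_MAX_BYTES_BUFFERING_CONFIG, 256 * 1024 * 1024L);

    // A longer commit interval gives the cache a chance to actually dedupe updates.
    props.put(StreamsConfig.COMMIT_INTERVAL_MS_CONFIG, 30_000L);

    // DEBUG recording level exposes per-store RocksDB metrics, which helps to confirm
    // whether the workload is really I/O bound.
    props.put(StreamsConfig.METRICS_RECORDING_LEVEL_CONFIG, "DEBUG");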