I have a topic T with a message expiry retention.ms set for 2 days. The topic has compaction (cleanup.policy=compact,delete). If I read that message into a KStream and then further aggregate to a KTable, will the KTable honour that 2 day expiry? When the message is no longer in the topic T, will the message also be removed from the KTable automatically? Or does some housekeeping process need to tombstone those messages?
A KStream honors the retention period only in the sense that it cannot process messages that retention has already removed from the topic, i.e. messages older than roughly 2 days. But the events are not “deleted” from the stream: there isn’t really a notion of delete in streams, because the events in a stream are immutable. Retention also does not publish any tombstones into the stream. Therefore, if you aggregate the result in a KTable, the table will not be updated to “delete” older events. The changelog topic associated with the KTable will, in the case of unwindowed aggregations, not expire records based on time. In the case of windowed aggregations, the retention time will be derived from the window size and grace period.
In general, the retention period of the input topic should be seen as an implementation detail to “not run out of space”, and not as a way to express your business logic. If you only want to aggregate results for 2 days’ worth of records, consider using a windowed aggregation.
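To make the difference concrete, here is a rough plain-Java model of the two behaviours. It is not Kafka Streams code: the class WindowedCountModel and its fields are hypothetical names made up for illustration, and it simplifies by dropping a window as soon as stream time passes window end plus grace, whereas a real window store keeps windows for its configured retention. In the Kafka Streams DSL (3.0 or later) the windowed variant would be declared with windowedBy(TimeWindows.ofSizeAndGrace(size, grace)) on the grouped stream.

```java
import java.time.Duration;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Plain-Java model (hypothetical, not the Kafka Streams API) of unwindowed vs. windowed counts. */
public class WindowedCountModel {
    // Unwindowed aggregate: one running count per key, never expired by time.
    public final Map<String, Long> totals = new HashMap<>();
    // Windowed aggregate: window start (epoch ms) -> key -> count.
    public final TreeMap<Long, Map<String, Long>> windows = new TreeMap<>();

    private final long sizeMs;
    private final long graceMs;
    private long streamTimeMs = Long.MIN_VALUE; // highest record timestamp seen so far

    public WindowedCountModel(Duration size, Duration grace) {
        this.sizeMs = size.toMillis();
        this.graceMs = grace.toMillis();
    }

    public void process(String key, long timestampMs) {
        // The unwindowed count only ever grows; topic retention does not shrink it.
        totals.merge(key, 1L, Long::sum);

        streamTimeMs = Math.max(streamTimeMs, timestampMs);

        // Tumbling windows: align the record to the start of its window.
        long windowStart = timestampMs - Math.floorMod(timestampMs, sizeMs);

        // Accept the record only while its window is still open (end + grace not yet passed).
        if (windowStart + sizeMs + graceMs > streamTimeMs) {
            windows.computeIfAbsent(windowStart, w -> new HashMap<>())
                   .merge(key, 1L, Long::sum);
        }

        // Simplification: drop every window whose end + grace is at or before stream time.
        windows.headMap(streamTimeMs - sizeMs - graceMs, true).clear();
    }
}
```

With a 2 day window and a 1 hour grace period, records at day 0 and day 1 fall into the same window; once a record at day 5 arrives, that window is gone from the windowed state, while the unwindowed total for the key still counts all three records.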
Got it. Thank you @lbrutschy