Compaction of low-traffic topics

We have a low-traffic topic with around 500k events that has been configured with a compaction strategy. In practice this topic never actually compacts, which means that rerunning from earliest replays all “versions” of each key rather than just the latest one. I realize that it’s not “wrong”, but is there really no setting in Confluent Cloud that will cause a low-traffic topic to periodically run compaction?

You can configure max.compaction.lag.ms to control this but keep in mind the minimum allowed values:

Minimum value: 21,600,000 ms (6 hours) for Dedicated clusters and 604,800,000 ms (7 days) for Basic, Standard, and Enterprise clusters.

If these time periods are too long, you may contact support with your requirements to see if this can be changed for your account. From here:

All Confluent Cloud resources have hard thresholds that cannot be exceeded, but many of the default quotas can be increased based on your changing requirements. To request an increase for a quota, contact Confluent Support.
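If you prefer to set this programmatically rather than through the Console, here’s a minimal sketch using the Python confluent-kafka AdminClient. The bootstrap server, API key/secret, and topic name are placeholders, and note that the legacy alter_configs call is non-incremental, so any other dynamic overrides on the topic may need to be re-specified in the same request:

```python
from confluent_kafka.admin import AdminClient, ConfigResource

# Placeholder Confluent Cloud connection details.
admin = AdminClient({
    "bootstrap.servers": "<BOOTSTRAP_SERVER>",
    "security.protocol": "SASL_SSL",
    "sasl.mechanisms": "PLAIN",
    "sasl.username": "<API_KEY>",
    "sasl.password": "<API_SECRET>",
})

# Make records eligible for compaction no later than 6 hours after they are
# written (the Dedicated-cluster minimum quoted above).
resource = ConfigResource(ConfigResource.Type.TOPIC, "my-compacted-topic")
resource.set_config("max.compaction.lag.ms", "21600000")

futures = admin.alter_configs([resource])
futures[resource].result()  # raises if the broker rejected the change
```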

We did already adjust this, but we still see multiple events for the same key when consuming the topic from earliest, even for events from before the 7-day limit. Is that expected?

I forgot to mention that segment.ms also matters here, because active segments don’t get cleaned:

This configuration controls the period of time after which Kafka will force the log to roll even if the segment file is not full to ensure that retention can delete or compact old data

What is this set to? You can edit this setting after topic creation in the Confluent Cloud Console by going to the topic, then Configuration, then Edit settings, then Switch to expert mode. From there change the value for segment_ms. Be aware of this:

You can set segment.ms as low as 600000 (10 minutes), but the minimum of 14400000 (4 hours) is still enforced.
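As a quick way to verify what the topic is actually running with, you could read the relevant settings back with the AdminClient (same placeholder connection details as the earlier sketch):

```python
from confluent_kafka.admin import AdminClient, ConfigResource

admin = AdminClient({
    "bootstrap.servers": "<BOOTSTRAP_SERVER>",
    "security.protocol": "SASL_SSL",
    "sasl.mechanisms": "PLAIN",
    "sasl.username": "<API_KEY>",
    "sasl.password": "<API_SECRET>",
})

# Fetch the effective topic configuration and print the compaction-related keys.
resource = ConfigResource(ConfigResource.Type.TOPIC, "my-compacted-topic")
configs = admin.describe_configs([resource])[resource].result()

for name in ("cleanup.policy", "segment.ms", "max.compaction.lag.ms"):
    entry = configs.get(name)
    if entry is not None:
        print(f"{name} = {entry.value}")
```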

It’s been set to 600000 for some time without any effect on compaction.

How about after 4 hours? Remember the caveat above: 4 hours is the enforced minimum even though you can enter lower values.

It’s been like that for some weeks without compaction running.

I got a bit tricked by the offset/lag calculation, so there is actually some compaction going on, but not every 4 hours. It looks more like a weekly schedule?
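(One way the offset math can mislead here: compaction removes records but leaves gaps in the offset sequence, so high watermark minus low watermark is only an upper bound on how many messages remain. A rough sketch of comparing the two with the Python client; connection details, topic, and partition are placeholders:)

```python
from confluent_kafka import Consumer, TopicPartition

consumer = Consumer({
    "bootstrap.servers": "<BOOTSTRAP_SERVER>",
    "security.protocol": "SASL_SSL",
    "sasl.mechanisms": "PLAIN",
    "sasl.username": "<API_KEY>",
    "sasl.password": "<API_SECRET>",
    "group.id": "compaction-check",
    "auto.offset.reset": "earliest",
    "enable.auto.commit": False,
})

partition = TopicPartition("my-compacted-topic", 0)
low, high = consumer.get_watermark_offsets(partition, timeout=10)
print(f"offset range suggests at most {high - low} messages")

# Count what is actually still in the log. Compaction leaves offset gaps,
# so this count can be much smaller than high - low.
consumer.assign([TopicPartition("my-compacted-topic", 0, low)])
count = 0
while True:
    msg = consumer.poll(5.0)
    if msg is None:
        break  # nothing more within the timeout
    if msg.error():
        continue
    count += 1
    if msg.offset() >= high - 1:
        break  # reached the end of the partition
print(f"messages actually remaining: {count}")
consumer.close()
```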

@kristoffer I tested this out on Confluent Cloud with max.compaction.lag.ms = 21600000 (6 hours) and segment.ms = 14400000 (4 hours), and I’m observing that older messages do get deleted by the compaction process as expected.

There’s an edge case to keep in mind that you might be observing. Because the compaction process only considers inactive segments, you can wind up with multiple events for the same key even after compaction: the latest event per key in the inactive segments won’t be deleted by the compaction process, and the same keys may also appear in the active segment. Example with the 6-hour max.compaction.lag.ms and the segment.ms settings from the test above:

  1. produce k:1, k:2, k:3 (k is the key, number is the value)
  2. wait a day
  3. produce k:4

The k:4 event will trigger the segment with the first three messages to roll if segment rolling wasn’t already triggered. The compaction process will delete k:1 and k:2 but leave k:3 alone since it’s the latest event with key k. You wind up with k:3 and k:4 post compaction even though k:3 is older than 6 hours.
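To make that walkthrough concrete, here’s roughly what the produce side looks like with the Python confluent-kafka client (connection details and topic name are placeholders, and the day-long wait is only indicated by a comment):

```python
from confluent_kafka import Producer

producer = Producer({
    "bootstrap.servers": "<BOOTSTRAP_SERVER>",
    "security.protocol": "SASL_SSL",
    "sasl.mechanisms": "PLAIN",
    "sasl.username": "<API_KEY>",
    "sasl.password": "<API_SECRET>",
})

# Step 1: three events for the same key, all landing in the active segment.
for value in ("1", "2", "3"):
    producer.produce("my-compacted-topic", key="k", value=value)
producer.flush()

# Step 2: wait roughly a day so the segment is old enough to roll.

# Step 3: this event lands in a new active segment and lets the old one roll.
# Compaction then removes k:1 and k:2 but keeps k:3 (the latest value for
# key "k" in the cleanable, inactive segments) and k:4 (active segment).
producer.produce("my-compacted-topic", key="k", value="4")
producer.flush()
```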

Follow-up question: if nothing else happens on this topic, will k:3 be deleted at some point? Or will the segment with k:4 stay active and never be considered for compaction?

In this case, the segment with k:4 wouldn’t roll, so it would remain this way if the cleanup policy were strictly compact. If the cleanup policy were compact,delete and retention.ms weren’t set to -1 (no time-based retention), then segment cleanup would kick in and delete the segment with k:3.
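If aging out those trailing records matters, the compact,delete route could be configured the same way as the earlier sketch; the 30-day retention.ms below is just an example value:

```python
from confluent_kafka.admin import AdminClient, ConfigResource

admin = AdminClient({
    "bootstrap.servers": "<BOOTSTRAP_SERVER>",
    "security.protocol": "SASL_SSL",
    "sasl.mechanisms": "PLAIN",
    "sasl.username": "<API_KEY>",
    "sasl.password": "<API_SECRET>",
})

# With compact,delete plus a finite retention.ms, whole inactive segments
# older than the retention window are deleted outright, so the segment
# holding k:3 would eventually go away even with no further traffic.
resource = ConfigResource(ConfigResource.Type.TOPIC, "my-compacted-topic")
resource.set_config("cleanup.policy", "compact,delete")
resource.set_config("retention.ms", str(30 * 24 * 60 * 60 * 1000))  # 30 days (example)

admin.alter_configs([resource])[resource].result()
```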
