Is ksqlDB suitable for reconsuming messages with a specific key only?

We want to use ksqlDB to aggregate various invoice data. The exact aggregation depends on user-specific configuration values. For example, one setting determines where the cut-off for a calendar day falls (e.g. 00:00 vs 01:00), which matters depending on the time grain at which a user wants to view their data.
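To make the cut-off concrete, here is a minimal sketch of how an event could be assigned to a day under a per-user cut-off hour. It is plain Python rather than ksqlDB, and `business_day` and `cutoff_hour` are names I made up for illustration.

```python
from datetime import datetime, timedelta, date

def business_day(event_time: datetime, cutoff_hour: int) -> date:
    """Return the day an event is counted under, given the hour the day starts.

    Shifting the timestamp back by the cut-off means that, with a 01:00
    cut-off, anything before 01:00 still belongs to the previous day.
    """
    return (event_time - timedelta(hours=cutoff_hour)).date()

# Hypothetical invoice timestamp shortly after midnight.
invoice_time = datetime(2024, 3, 2, 0, 30)

day_with_midnight_cutoff = business_day(invoice_time, cutoff_hour=0)
day_with_one_am_cutoff = business_day(invoice_time, cutoff_hour=1)
```

The same invoice lands on a different day depending on the setting, which is why a configuration change invalidates that user's existing aggregates.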

So, the normal workflow would be something like:

Kafka topic → partitioned by user ID → ksqlDB aggregations

A user can generally change this (and other) configuration settings, which may trigger the need for a “recount” based on a different version of the configuration.

My question is: is there a best-practice way to deal with this situation?

Ordinarily, when making global changes (e.g. bug fixes), it would be enough to:

  • Increase version id
  • Reset offsets
  • Calculate new data based off new version id
  • Switch live data to data based off new version id once caught up

But this would impact every user, not just the one requiring it. This obviously doesn’t scale.
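For reference, the global procedure can be modelled roughly as below. This is an in-memory Python sketch under my own assumptions (the `aggregate` helper, the event tuples, and the `live_version` pointer are all hypothetical), not how ksqlDB implements it.

```python
from collections import defaultdict

def aggregate(events, version):
    """Replay every event from the start and total amounts per (version, user_id)."""
    totals = defaultdict(float)
    for user_id, amount in events:
        totals[(version, user_id)] += amount
    return dict(totals)

# Hypothetical (user_id, invoice_amount) events as they sit in the topic.
events = [("u1", 10.0), ("u2", 5.0), ("u1", 2.5)]

# Current state: everything was computed under version 1.
store = aggregate(events, version=1)
live_version = 1

# Bump the version, replay from offset 0, then flip the pointer once caught up.
store.update(aggregate(events, version=2))
live_version = 2

# Readers only ever see rows belonging to the live version.
live_view = {
    user_id: total
    for (version, user_id), total in store.items()
    if version == live_version
}
```

Note that the replay touches `u2` as well, even if only `u1` needed the recount.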

One idea that seems to have potential is spinning up a separate stream for a specific user whenever they change a setting that requires re-aggregating their data. But how should the recomputed data be fed back into the original table? Ideally, all up-to-date results would always be reflected in a single KTable, without having to stage the data in a separate database and select only the latest active version (or some variation of this).
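To illustrate the result I am after, here is a rough in-memory model of a per-user recount combined with the "latest active version" selection that I would like to avoid doing by hand. It is plain Python, not ksqlDB, and `recount`, `active`, and the event layout are hypothetical names chosen for this sketch.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def recount(events, user_id, cutoff_hour, version):
    """Re-aggregate one user's events only, keyed by (user_id, version, day)."""
    totals = defaultdict(float)
    for uid, ts, amount in events:
        if uid != user_id:
            continue  # other users' events are skipped, not recomputed
        day = (ts - timedelta(hours=cutoff_hour)).date()
        totals[(uid, version, day)] += amount
    return dict(totals)

# Hypothetical (user_id, timestamp, amount) invoice events.
events = [
    ("u1", datetime(2024, 3, 2, 0, 30), 10.0),
    ("u1", datetime(2024, 3, 2, 9, 0), 5.0),
    ("u2", datetime(2024, 3, 2, 0, 30), 7.0),
]

results = {}  # all aggregate rows, across versions
active = {}   # user_id -> configuration version currently in force

# Initial state: both users use a 00:00 cut-off (version 1).
for uid in ("u1", "u2"):
    results.update(recount(events, uid, cutoff_hour=0, version=1))
    active[uid] = 1

# u1 switches to a 01:00 cut-off: recount u1 only, then activate version 2.
results.update(recount(events, "u1", cutoff_hour=1, version=2))
active["u1"] = 2

# The single "current" view: one row per (user, day) for the active version.
current = {
    (uid, day): total
    for (uid, version, day), total in results.items()
    if active[uid] == version
}
```

Here `u2` is never replayed, and `current` is the shape I would like the single KTable to have.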

P.S. For what it’s worth, I’ve x-posted this to Stack Overflow as well in hopes of garnering a bit more attention and will sync up any responses.
