More than one "active" tumbling window

I am trying to understand whether I can achieve the following behavior with Kafka Streams, with grace configuration or in any other way:

  • Tumbling window of 1 min
  • Time is taken from a field in the message, lets call it messageTime field
  • Messages are sent from multiple sources and should be aggregated for each window
  • Some messages might be occasionally delayed so an option to process a message that arrived up to 2 minutes late is needed, but it should go in the correct window based on the value of messageTime field.
  • At 10:02:15 I want to have 3 windows that can receive data: 10:00-10:01, 10:01-10:02, 10:02-10:03
  • If message with messageTime value 10:00:00 arrives at 10:02:15 I want it to go into 10:00-10:01 window
  • If data with messageTime value 10:01:00 arrives at 10:02:15 I want it to go into 10:01-10:02 window
  • If data with messageTime value 10:02:00 arrives at 10:02:14 I want it to go into 10:02-10:03 window
  • If data with messageTime value 10:00:00 arrives at 10:03:01 I want it to be discarded (since beyond 1 min window plus 2 mins grace)
  • I want window 10:00-10:01 to be closed and flushed at 10:03 (1 min + 2 mins grace)
  • I want window 10:01-10:02 to be closed and flushed at 10:04 (1 min + 2 mins grace)
  • I want window 10:02-10:03 to be closed and flushed at 10:05 (1 min + 2 mins grace)

Yes, that is exactly how it works.

You need to consider thought, that time progress is only measured based on event-time progress, never on wall-clock time. Eg, when you say

  • At 10:02:15 I want to have 3 windows that can receive data: 10:00-10:01, 10:01-10:02, 10:02-10:03

this hold true with “stream time” being 10:02:15, ie, the largest timestamp (ie, extracted from messageTime) was 10:02:15 so far. It has nothing to do with wall-clock time.

Similarly,

  • I want window 10:02-10:03 to be closed and flushed at 10:05 (1 min + 2 mins grace)

hold true if it means, the window will be closed and flushed at 10:05 “stream time”, ie, when we process the first record with an extracted timestamp of 10:05 (or higher).

1 Like

Many thanks.

So I need tumbling window with grace and custom timeExtractor, right?
Anything else I need to use to achieve this behavior?

1 Like

Yes, that should be all.

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.