Need help with Tumbling Window behaviour

I have a stream with 50 partitions and created a tumbling-window table on top of it. Suppose the window size is 1 hour and the current window runs from 1 am to 2 am.
A) What happens when incorrect data with a future timestamp (like 5 am) arrives?
B) What happens when data with a past timestamp (yesterday) arrives,

  1. when the window for that time was created before and has already finished?
  2. when no window was ever created for that past time?

What happens when incorrect data with a future timestamp (like 5am) comes?

Time advances and a new window from [5am;6am) will be created.
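As a rough illustration of which window a record lands in, here is a minimal sketch in plain Java (no Kafka dependency). It assumes windows are aligned to the epoch, which is how Kafka Streams time windows are documented to behave; the class and method names are made up for this example.

```java
import java.time.Duration;
import java.time.Instant;

class TumblingWindowAssignment {
    // Start of the epoch-aligned tumbling window that contains the timestamp.
    static long windowStartMs(long timestampMs, long sizeMs) {
        return timestampMs - Math.floorMod(timestampMs, sizeMs);
    }

    public static void main(String[] args) {
        long sizeMs = Duration.ofHours(1).toMillis();
        // A record with a "future" timestamp of 05:17 (the date is arbitrary).
        long ts = Instant.parse("2024-01-01T05:17:00Z").toEpochMilli();
        long start = windowStartMs(ts, sizeMs);
        // Prints [2024-01-01T05:00:00Z, 2024-01-01T06:00:00Z)
        System.out.println("[" + Instant.ofEpochMilli(start) + ", "
                + Instant.ofEpochMilli(start + sizeMs) + ")");
    }
}
```

The window is derived purely from the record timestamp, so a 5am record always maps to [5am,6am), regardless of which window was "current" before it arrived.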

Same for past data (yesterday)?

If the window is still open (it does not matter whether the window already has data or not), the data is put into the window it belongs to. If the window is already closed, the record is dropped as late.

In general, many windows can be open in parallel. You have two parameters to control this:

  1. grace period: how long a window stays open for out-of-order data after its end time has passed
  2. retention time: how long a window is kept (even after it was closed), i.e., for read-only access
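To make the grace period concrete, here is a simplified model in plain Java, not the actual Kafka Streams code. It assumes stream-time is the highest timestamp seen so far and that a window accepts records while window-end + grace > stream-time; `GracePeriodModel`, `offer` and `at` are hypothetical names. Retention is not modeled. In Kafka Streams itself these settings are passed via `TimeWindows.ofSizeAndGrace(...)` / `TimeWindows.ofSizeWithNoGrace(...)` and `Materialized#withRetention(...)`.

```java
import java.time.Duration;

class GracePeriodModel {
    private final long sizeMs;
    private final long graceMs;
    private long streamTimeMs = Long.MIN_VALUE; // highest timestamp seen so far

    GracePeriodModel(Duration size, Duration grace) {
        this.sizeMs = size.toMillis();
        this.graceMs = grace.toMillis();
    }

    // Returns true if the record is added to its window, false if it is dropped as late.
    boolean offer(long timestampMs) {
        streamTimeMs = Math.max(streamTimeMs, timestampMs); // stream-time never goes backwards
        long windowEnd = timestampMs - Math.floorMod(timestampMs, sizeMs) + sizeMs;
        return windowEnd + graceMs > streamTimeMs;          // assumed "still open" condition
    }

    // Helper: milliseconds for a time of day, e.g. at(1, 30) is 1:30am.
    static long at(int hour, int minute) {
        return Duration.ofHours(hour).plusMinutes(minute).toMillis();
    }

    public static void main(String[] args) {
        GracePeriodModel noGrace = new GracePeriodModel(Duration.ofHours(1), Duration.ZERO);
        System.out.println(noGrace.offer(at(1, 30))); // true: goes into [1am,2am)
        System.out.println(noGrace.offer(at(5, 10))); // true: stream-time jumps to 5:10
        System.out.println(noGrace.offer(at(1, 45))); // false: [1am,2am) is closed now

        GracePeriodModel fourHours = new GracePeriodModel(Duration.ofHours(1), Duration.ofHours(4));
        fourHours.offer(at(1, 30));
        fourHours.offer(at(5, 10));
        System.out.println(fourHours.offer(at(1, 45))); // true: still within the grace period
    }
}
```

Note that the decision depends only on stream-time and the window end, not on whether the window was ever populated before.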

Check out “The Flux Capacitor of Kafka Streams and ksqlDB” (Matthias J. Sax, Confluent), Kafka Summit 2020, for more details.

Thanks for the reply @mjsax. I have a question regarding “Time advances and a new window from [5am;6am) will be created.”

What happens to the old window, 1 am to 2 am, with minimal grace time?
a) Will it be closed? What happens when data for 1 am to 2 am keeps coming?
b) What about the windows 2 am to 3 am and 3 am to 4 am, after the 5 am to 6 am window is created?

Thanks once again

What happens to the old window 1 am to 2 am with minimal grace time?

If the grace period is minimal, i.e., zero, the [1am,2am) window will be closed. If new data for this window arrives afterwards, it will be dropped on the floor as “late”.

What about the windows 2 am to 3 am and 3 am to 4 am, after the 5 am to 6 am window is created?

With no grace period, all these windows would be closed, too, when the [5am,6am) window is created.

With no grace period, you can only have a single open window: each time a new window is created, it means a record with a timestamp larger than the current window-end time arrived, and thus stream-time > current window-end and the current window is closed.
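The "single open window" claim can be checked with a small enumeration in plain Java. This is a sketch under the same simplified assumption as before (a window is open while window-end + grace > stream-time), with times expressed as milliseconds since midnight; `OpenWindows` and `openWindowStartHours` are illustrative names, not part of any Kafka API.

```java
import java.util.ArrayList;
import java.util.List;

class OpenWindows {
    static final long HOUR_MS = 3_600_000L;

    // Start hours of the hourly windows that would still accept records
    // at the given stream-time (windows starting after stream-time are skipped).
    static List<Integer> openWindowStartHours(long streamTimeMs, long graceMs) {
        List<Integer> open = new ArrayList<>();
        for (int hour = 0; hour < 24; hour++) {
            long start = hour * HOUR_MS;
            long end = start + HOUR_MS;
            boolean started = start <= streamTimeMs;
            boolean notClosed = end + graceMs > streamTimeMs;
            if (started && notClosed) {
                open.add(hour);
            }
        }
        return open;
    }

    public static void main(String[] args) {
        long streamTime = 5 * HOUR_MS + 10 * 60_000L; // stream-time is 5:10am
        System.out.println(openWindowStartHours(streamTime, 0L));          // [5]
        System.out.println(openWindowStartHours(streamTime, 2 * HOUR_MS)); // [3, 4, 5]
    }
}
```

With zero grace only [5am,6am) remains open once stream-time reaches 5:10am; with a 2-hour grace period, [3am,4am) and [4am,5am) would still accept out-of-order records.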
