I have a stream with 50 partitions and created a tumbling-window table on it. Suppose the window size is 1 hour and the current window runs from 1 am to 2 am.
A) What happens when incorrect data with a future timestamp (like 5 am) arrives?
B) What happens when data from the past (for example, yesterday) arrives?
Thanks for the reply, @mjsax. I have a doubt regarding “Time advances and a new window from [5am;6am) will be created.”
What happens to the old window, 1 am to 2 am, with a minimal grace period?
a) Will it be closed? What happens when records for 1 am to 2 am keep coming?
b) What about the windows 2 am to 3 am and 3 am to 4 am, after the 5 am to 6 am window is created?
What happens to the old window, 1 am to 2 am, with a minimal grace period?
If the grace period is minimal, i.e., zero, the [1am,2am) window will be closed. If new data for this window arrives afterwards, it will be dropped on the floor as “late”.
What about the windows 2 am to 3 am and 3 am to 4 am, after the 5 am to 6 am window is created?
With no grace period, all these windows would be closed, too, when the [5am,6am) window is created.
With no grace period, you can only have a single open window: each time a new window is created, it means a record with a timestamp past the current window's end arrived. Stream-time has therefore advanced beyond the current window-end, and that window is closed.
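To make the rule above concrete, here is a small plain-Java sketch that models it. It does not use the Kafka Streams API; the class name ZeroGraceWindowSketch, the helper at(...), and the exact comparison (a window accepts records only while window-end plus grace is greater than stream-time) are assumptions made for illustration, so treat it as a toy model of the behaviour described in this thread rather than the library's implementation.

```java
// Toy model (not Kafka Streams code): 1-hour tumbling windows, a grace period,
// and a stream-time that only moves forward with the largest timestamp seen.
final class ZeroGraceWindowSketch {
    static final long MINUTE_MS = 60_000L;
    static final long HOUR_MS = 60 * MINUTE_MS;

    private final long graceMs;
    // Highest record timestamp observed so far; never decreases.
    private long streamTime = Long.MIN_VALUE;

    ZeroGraceWindowSketch(long graceMs) {
        this.graceMs = graceMs;
    }

    // Hypothetical helper: timestamp for "day d, hh:mm" relative to an arbitrary day 0.
    static long at(int day, int hour, int minute) {
        return day * 24 * HOUR_MS + hour * HOUR_MS + minute * MINUTE_MS;
    }

    // Returns true if the record is accepted into its window, false if it is dropped as late.
    boolean process(long timestampMs) {
        // A record with a larger timestamp advances stream-time; older records do not.
        streamTime = Math.max(streamTime, timestampMs);
        long windowStart = timestampMs - (timestampMs % HOUR_MS);
        long windowEnd = windowStart + HOUR_MS; // exclusive end, as in [1am,2am)
        // Assumption of this sketch: the window is still open only while end + grace > stream-time.
        return windowEnd + graceMs > streamTime;
    }

    public static void main(String[] args) {
        ZeroGraceWindowSketch zeroGrace = new ZeroGraceWindowSketch(0L);
        System.out.println("today 01:15     accepted=" + zeroGrace.process(at(1, 1, 15)));
        System.out.println("today 05:00     accepted=" + zeroGrace.process(at(1, 5, 0)));
        System.out.println("today 01:30     accepted=" + zeroGrace.process(at(1, 1, 30)));
        System.out.println("today 03:10     accepted=" + zeroGrace.process(at(1, 3, 10)));
        System.out.println("yesterday 01:20 accepted=" + zeroGrace.process(at(0, 1, 20)));
        System.out.println("today 05:45     accepted=" + zeroGrace.process(at(1, 5, 45)));
    }
}
```

In this model the 5 am record is accepted and moves stream-time to 5 am. After that, records for [1am,2am), [3am,4am) and yesterday are all dropped, while records for [5am,6am) are still accepted. Constructing the sketch with a larger grace value (for example 4 * HOUR_MS) keeps [1am,2am) open after the 5 am record.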