Detected out-of-order KTable update (for foreign key join)

Hi, I’m getting some out-of-order KTable updates in my Kafka Streams application. I’m ingesting about 10 tables from Kafka Connect and enriching the data by chaining KTable joins. After several days of running, I have started getting a few out-of-order KTable update warnings for some of my foreign key joins (the initial snapshot went fine). Any idea how to fix this or debug it in more detail?

2024-07-16 09:21:44.809  WARN 1 --- [-StreamThread-3] o.a.k.s.kstream.internals.KTableSource   : Detected out-of-order KTable update for my-app.logging.partial-5.0, old timestamp=[1721136104525] new timestamp=[1721136104524]. topic=[my-app-KTABLE-FK-JOIN-SUBSCRIPTION-RESPONSE-0000000154-topic] partition=[2] offset=[2422592].
2024-07-16 10:07:15.642  WARN 1 --- [-StreamThread-4] o.a.k.s.kstream.internals.KTableSource   : Detected out-of-order KTable update for my-app.logging.partial-5.0, old timestamp=[1721138835409] new timestamp=[1721138835408]. topic=[my-app-KTABLE-FK-JOIN-SUBSCRIPTION-RESPONSE-0000000154-topic] partition=[3] offset=[2418189].
2024-07-16 11:29:32.530  WARN 1 --- [-StreamThread-2] o.a.k.s.kstream.internals.KTableSource   : Detected out-of-order KTable update for my-app.logging.partial-5.0, old timestamp=[1721143772304] new timestamp=[1721143772285]. topic=[my-app-KTABLE-FK-JOIN-SUBSCRIPTION-RESPONSE-0000000154-topic] partition=[0] offset=[2430283].
2024-07-16 11:59:42.296  WARN 1 --- [-StreamThread-2] o.a.k.s.kstream.internals.KTableSource   : Detected out-of-order KTable update for my-app.logging.partial-5.0, old timestamp=[1721145582053] new timestamp=[1721145582052]. topic=[my-app-KTABLE-FK-JOIN-SUBSCRIPTION-RESPONSE-0000000154-topic] partition=[1] offset=[2422121].
2024-07-16 13:11:01.654  WARN 1 --- [-StreamThread-3] o.a.k.s.kstream.internals.KTableSource   : Detected out-of-order KTable update for my-app.logging.partial-5.0, old timestamp=[1721149861431] new timestamp=[1721149861430]. topic=[my-app-KTABLE-FK-JOIN-SUBSCRIPTION-RESPONSE-0000000154-topic] partition=[4] offset=[2426659].
2024-07-16 13:26:43.611  WARN 1 --- [-StreamThread-3] o.a.k.s.kstream.internals.KTableSource   : Detected out-of-order KTable update for my-app.logging.partial-5.0, old timestamp=[1721150803437] new timestamp=[1721150803434]. topic=[my-app-KTABLE-FK-JOIN-SUBSCRIPTION-RESPONSE-0000000154-topic] partition=[2] offset=[2429629].
2024-07-16 14:52:07.442  WARN 1 --- [-StreamThread-2] o.a.k.s.kstream.internals.KTableSource   : Detected out-of-order KTable update for my-app.logging.partial-1.0, old timestamp=[1721155927303] new timestamp=[1721155927302]. topic=[my-app-KTABLE-FK-JOIN-SUBSCRIPTION-RESPONSE-0000000132-topic] partition=[1] offset=[2168931].
2024-07-16 15:24:29.245  WARN 1 --- [-StreamThread-2] o.a.k.s.kstream.internals.KTableSource   : Detected out-of-order KTable update for my-app.logging.partial-1.0, old timestamp=[1721157869084] new timestamp=[1721157869080]. topic=[my-app-KTABLE-FK-JOIN-SUBSCRIPTION-RESPONSE-0000000132-topic] partition=[1] offset=[2169464].
2024-07-16 15:51:42.102  WARN 1 --- [-StreamThread-3] o.a.k.s.kstream.internals.KTableSource   : Detected out-of-order KTable update for my-app.logging.partial-5.0, old timestamp=[1721159501863] new timestamp=[1721159501862]. topic=[my-app-KTABLE-FK-JOIN-SUBSCRIPTION-RESPONSE-0000000154-topic] partition=[2] offset=[2434427].

I guess the question is where the out-of-order data comes from. If you have builder.table("topic") and this table reports out-of-order updates, you would need to look into the upstream app that published the data.

If there is some repartitioning happening inside the KS app, out-of-order data is something that cannot be avoided.
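To make the warning concrete, here is a toy model in plain Java of what, as far as I understand it, a regular (non-versioned) table source does with each record: it compares the incoming timestamp with the stored one, warns if the new one is older, and overwrites the value anyway. This is not the Kafka Streams source code; the class `LatestByOffsetTable` and its methods are made-up names for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of a non-versioned table: one (value, timestamp) slot per key.
class LatestByOffsetTable {
    // A stored value plus the timestamp of the record that wrote it.
    static final class Entry {
        final String value;
        final long timestamp;

        Entry(String value, long timestamp) {
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    private final Map<String, Entry> store = new HashMap<>();

    // Applies an update. Returns true if the record was out of order,
    // i.e. its timestamp is older than the one already stored for the key.
    boolean put(String key, String value, long timestamp) {
        Entry old = store.get(key);
        boolean outOfOrder = old != null && timestamp < old.timestamp;
        if (outOfOrder) {
            System.out.printf(
                "Detected out-of-order update for key=%s, old timestamp=[%d] new timestamp=[%d]%n",
                key, old.timestamp, timestamp);
        }
        // The slot is overwritten either way: the latest offset wins,
        // not the latest timestamp.
        store.put(key, new Entry(value, timestamp));
        return outOfOrder;
    }

    Entry get(String key) {
        return store.get(key);
    }

    public static void main(String[] args) {
        LatestByOffsetTable table = new LatestByOffsetTable();
        table.put("row-1", "newer", 1721136104525L);
        table.put("row-1", "older", 1721136104524L); // arrives later, 1 ms older
        System.out.println("stored value: " + table.get("row-1").value);
    }
}
```

In this model the late record ends up as the current value, which is why the warning is worth a look even when the gap is only a millisecond.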

The generic solution for handling out-of-order data in KTables is to use a “versioned KTable” (versioned state stores are available as of Kafka Streams 3.5).
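As a rough illustration of the idea, here is a small plain-Java model of a versioned table: each key keeps its values ordered by record timestamp, so a late record is filed at its own timestamp instead of replacing the newest value. The class `VersionedTableModel` and its methods are hypothetical, and the model ignores history retention entirely. In Kafka Streams itself, I believe a table becomes versioned when it is materialized with a store from `Stores.persistentVersionedKeyValueStore(name, historyRetention)`, but please check the documentation for your version.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Toy model of a versioned table: per key, all versions sorted by timestamp.
class VersionedTableModel {
    private final Map<String, TreeMap<Long, String>> store = new HashMap<>();

    // Files the value at its timestamp. Returns true if it is now the
    // latest version for the key, false if it was a late (older) record.
    boolean put(String key, String value, long timestamp) {
        TreeMap<Long, String> versions = store.computeIfAbsent(key, k -> new TreeMap<>());
        versions.put(timestamp, value);
        return versions.lastKey() == timestamp;
    }

    // Latest value by timestamp, or null if the key is unknown.
    String latest(String key) {
        TreeMap<Long, String> versions = store.get(key);
        return versions == null ? null : versions.lastEntry().getValue();
    }

    // Value that was valid at the given timestamp, or null if none existed yet.
    String asOf(String key, long timestamp) {
        TreeMap<Long, String> versions = store.get(key);
        if (versions == null) {
            return null;
        }
        Map.Entry<Long, String> entry = versions.floorEntry(timestamp);
        return entry == null ? null : entry.getValue();
    }

    public static void main(String[] args) {
        VersionedTableModel table = new VersionedTableModel();
        table.put("row-1", "newer", 1721136104525L);
        table.put("row-1", "older", 1721136104524L); // late record
        System.out.println("latest: " + table.latest("row-1"));
        System.out.println("as of ...524: " + table.asOf("row-1", 1721136104524L));
    }
}
```

The point of the sketch is only that "latest" is decided by timestamp rather than by arrival order, so the late record no longer overwrites the newer one.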

Well, it happens only for foreign key joins. I do 5 normal joins and have no errors associated with them, but the 2 foreign key joins generate errors. We can see in the log I posted that it’s on the response topic of the join: my-app-KTABLE-FK-JOIN-SUBSCRIPTION-RESPONSE-0000000132-topic. Only one app publishes events, but there are 5 stream threads (a single Kubernetes pod).

FK joins always need to do some repartitioning, while PK-joins work on the input topics as-is. Thus, if there is no out-of-order data in the input topic, unordered data cannot happen for PK-join, but it can still happen for FK-joins.

Using versioned KTables for the FK-join should still help.
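For debugging, it may help to quantify how late the records actually are. In the warnings posted above, the gap between old and new timestamp is between 1 and 19 ms. Below is a minimal sketch in plain Java that pulls the two timestamps out of a warning line and returns the difference; the class name `OutOfOrderLag` is made up, and the regular expression assumes the exact message format shown in the log excerpt. As far as I know, a versioned store's history retention also acts as its grace period for late records, so this number gives a feel for how small that setting could be.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Extracts how late an out-of-order record was from a KTableSource warning line.
class OutOfOrderLag {
    // Matches: old timestamp=[<digits>] new timestamp=[<digits>]
    private static final Pattern TIMESTAMPS =
        Pattern.compile("old timestamp=\\[(\\d+)\\] new timestamp=\\[(\\d+)\\]");

    // Returns (old - new) in milliseconds, or -1 if the line is not such a warning.
    static long lagMillis(String logLine) {
        Matcher m = TIMESTAMPS.matcher(logLine);
        if (!m.find()) {
            return -1;
        }
        return Long.parseLong(m.group(1)) - Long.parseLong(m.group(2));
    }

    public static void main(String[] args) {
        String line = "Detected out-of-order KTable update for my-app.logging.partial-5.0, "
            + "old timestamp=[1721143772304] new timestamp=[1721143772285]. "
            + "topic=[my-app-KTABLE-FK-JOIN-SUBSCRIPTION-RESPONSE-0000000154-topic] "
            + "partition=[0] offset=[2430283].";
        System.out.println("lag in ms: " + lagMillis(line));
    }
}
```

Running this over the whole application log (for example line by line from a file) would show whether the lateness stays in the millisecond range or occasionally spikes.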