Detected out-of-order KTable update (for foreign key join)

Hi, I’m getting some out-of-order KTable update warnings in my Kafka Streams application. I’m ingesting about 10 tables from Kafka Connect and enriching the data by chaining KTable joins. After several days of running, I started getting a few of these warnings for some of my foreign key joins (the initial snapshot went fine). Any idea how to fix this, or how to debug it in more detail?

2024-07-16 09:21:44.809  WARN 1 --- [-StreamThread-3] o.a.k.s.kstream.internals.KTableSource   : Detected out-of-order KTable update for my-app.logging.partial-5.0, old timestamp=[1721136104525] new timestamp=[1721136104524]. topic=[my-app-KTABLE-FK-JOIN-SUBSCRIPTION-RESPONSE-0000000154-topic] partition=[2] offset=[2422592].
2024-07-16 10:07:15.642  WARN 1 --- [-StreamThread-4] o.a.k.s.kstream.internals.KTableSource   : Detected out-of-order KTable update for my-app.logging.partial-5.0, old timestamp=[1721138835409] new timestamp=[1721138835408]. topic=[my-app-KTABLE-FK-JOIN-SUBSCRIPTION-RESPONSE-0000000154-topic] partition=[3] offset=[2418189].
2024-07-16 11:29:32.530  WARN 1 --- [-StreamThread-2] o.a.k.s.kstream.internals.KTableSource   : Detected out-of-order KTable update for my-app.logging.partial-5.0, old timestamp=[1721143772304] new timestamp=[1721143772285]. topic=[my-app-KTABLE-FK-JOIN-SUBSCRIPTION-RESPONSE-0000000154-topic] partition=[0] offset=[2430283].
2024-07-16 11:59:42.296  WARN 1 --- [-StreamThread-2] o.a.k.s.kstream.internals.KTableSource   : Detected out-of-order KTable update for my-app.logging.partial-5.0, old timestamp=[1721145582053] new timestamp=[1721145582052]. topic=[my-app-KTABLE-FK-JOIN-SUBSCRIPTION-RESPONSE-0000000154-topic] partition=[1] offset=[2422121].
2024-07-16 13:11:01.654  WARN 1 --- [-StreamThread-3] o.a.k.s.kstream.internals.KTableSource   : Detected out-of-order KTable update for my-app.logging.partial-5.0, old timestamp=[1721149861431] new timestamp=[1721149861430]. topic=[my-app-KTABLE-FK-JOIN-SUBSCRIPTION-RESPONSE-0000000154-topic] partition=[4] offset=[2426659].
2024-07-16 13:26:43.611  WARN 1 --- [-StreamThread-3] o.a.k.s.kstream.internals.KTableSource   : Detected out-of-order KTable update for my-app.logging.partial-5.0, old timestamp=[1721150803437] new timestamp=[1721150803434]. topic=[my-app-KTABLE-FK-JOIN-SUBSCRIPTION-RESPONSE-0000000154-topic] partition=[2] offset=[2429629].
2024-07-16 14:52:07.442  WARN 1 --- [-StreamThread-2] o.a.k.s.kstream.internals.KTableSource   : Detected out-of-order KTable update for my-app.logging.partial-1.0, old timestamp=[1721155927303] new timestamp=[1721155927302]. topic=[my-app-KTABLE-FK-JOIN-SUBSCRIPTION-RESPONSE-0000000132-topic] partition=[1] offset=[2168931].
2024-07-16 15:24:29.245  WARN 1 --- [-StreamThread-2] o.a.k.s.kstream.internals.KTableSource   : Detected out-of-order KTable update for my-app.logging.partial-1.0, old timestamp=[1721157869084] new timestamp=[1721157869080]. topic=[my-app-KTABLE-FK-JOIN-SUBSCRIPTION-RESPONSE-0000000132-topic] partition=[1] offset=[2169464].
2024-07-16 15:51:42.102  WARN 1 --- [-StreamThread-3] o.a.k.s.kstream.internals.KTableSource   : Detected out-of-order KTable update for my-app.logging.partial-5.0, old timestamp=[1721159501863] new timestamp=[1721159501862]. topic=[my-app-KTABLE-FK-JOIN-SUBSCRIPTION-RESPONSE-0000000154-topic] partition=[2] offset=[2434427].
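
To see how far out of order these updates are, the warning lines can be parsed and the difference between the old and new timestamps computed. The following is a minimal sketch in plain Java; the class and method names (`OutOfOrderLag`, `lagMillis`) are made up for illustration, and the regex assumes the exact message format shown in the log above.

```java
import java.util.OptionalLong;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OutOfOrderLag {
    // Captures the two timestamps from an "out-of-order KTable update" warning.
    private static final Pattern TIMESTAMPS = Pattern.compile(
        "old timestamp=\\[(\\d+)\\] new timestamp=\\[(\\d+)\\]");

    /** Returns (old - new) in milliseconds, or empty if the line has no timestamps. */
    static OptionalLong lagMillis(String line) {
        Matcher m = TIMESTAMPS.matcher(line);
        if (!m.find()) {
            return OptionalLong.empty();
        }
        long oldTs = Long.parseLong(m.group(1));
        long newTs = Long.parseLong(m.group(2));
        return OptionalLong.of(oldTs - newTs);
    }

    public static void main(String[] args) {
        // Two message bodies copied from the log above (prefix shortened).
        String[] lines = {
            "Detected out-of-order KTable update for my-app.logging.partial-5.0, "
                + "old timestamp=[1721136104525] new timestamp=[1721136104524].",
            "Detected out-of-order KTable update for my-app.logging.partial-5.0, "
                + "old timestamp=[1721143772304] new timestamp=[1721143772285]."
        };
        for (String line : lines) {
            lagMillis(line).ifPresent(lag -> System.out.println(lag + " ms behind"));
        }
    }
}
```

For the nine lines posted above, the gap is between 1 and 19 ms, so the records are only slightly out of order.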

I guess the question is where the out-of-order data comes from. If you have builder.table("topic") and this table reports out-of-order updates, you would need to look into the upstream app that published the data.

If there is some repartitioning happening inside the Kafka Streams app, out-of-order data is something that cannot be avoided.

The generic solution to handle out-of-order data for KTables is the usage of “versioned KTable”:
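
As a rough intuition for what versioning changes, here is a toy model in plain Java. It is not the Kafka Streams API and not how the state stores are implemented: `PlainTable` and `VersionedTable` are hypothetical classes, and history retention is ignored. The plain table lets the last arrival win (and counts what would be a warning), while the versioned one keeps a per-key history ordered by timestamp, so a late-arriving older record does not replace the latest value.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

class TableModels {
    /** Toy plain table: the most recently arrived record wins, whatever its timestamp. */
    static class PlainTable {
        private final Map<String, String> values = new HashMap<>();
        private final Map<String, Long> timestamps = new HashMap<>();
        int outOfOrderCount = 0;

        void put(String key, String value, long timestamp) {
            Long old = timestamps.get(key);
            if (old != null && timestamp < old) {
                outOfOrderCount++; // roughly where the WARN line would be logged
            }
            // The update is still applied: nothing is dropped.
            values.put(key, value);
            timestamps.put(key, timestamp);
        }

        String get(String key) {
            return values.get(key);
        }
    }

    /** Toy versioned table: keeps all versions per key, sorted by timestamp. */
    static class VersionedTable {
        private final Map<String, TreeMap<Long, String>> history = new HashMap<>();

        void put(String key, String value, long timestamp) {
            history.computeIfAbsent(key, k -> new TreeMap<>()).put(timestamp, value);
        }

        /** Latest value by timestamp, independent of arrival order. */
        String get(String key) {
            TreeMap<Long, String> versions = history.get(key);
            return versions == null ? null : versions.lastEntry().getValue();
        }
    }

    public static void main(String[] args) {
        PlainTable plain = new PlainTable();
        VersionedTable versioned = new VersionedTable();

        // Same arrival order as in the first warning: the newer record arrives first.
        plain.put("k", "newer", 1721136104525L);
        plain.put("k", "older", 1721136104524L);
        versioned.put("k", "newer", 1721136104525L);
        versioned.put("k", "older", 1721136104524L);

        System.out.println(plain.get("k"));         // older
        System.out.println(plain.outOfOrderCount);  // 1
        System.out.println(versioned.get("k"));     // newer
    }
}
```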

Well, it happens only for foreign key joins. I do 5 normal joins with no associated errors, but the 2 foreign key joins generate errors. We can see in the log I posted that it’s on the response topic of the join: my-app-KTABLE-FK-JOIN-SUBSCRIPTION-RESPONSE-0000000132-topic. Only one app publishes events, but it runs 5 stream threads (in a single Kubernetes pod).

FK joins always need to do some repartitioning, while PK-joins work on the input topics as-is. Thus, if there is no out-of-order data in the input topic, unordered data cannot happen for PK-join, but it can still happen for FK-joins.

Using versioned KTables for the FK-join should still help.

Hi, can you explain a few things? All of these questions are about FK joins:

  1. What happens when this WARNING appears? Do I lose some data?
  2. How can I avoid (or at least reduce) this message for an FK join with simple KTables? Does it depend on broker performance?
  3. How does a versioned KTable help? Should all my KTables be versioned? I join 2 simple KTables and the output is a third simple KTable; how do I tell the FK join to use versioned KTables internally?

I took a closer look and realized that the log lines say topic=[my-app-KTABLE-FK-JOIN-SUBSCRIPTION-RESPONSE-0000000154-topic]. So the topic is internal to Kafka Streams and not related to the input KTables you are joining.

What happens when this WARNING appears? Do I lose some data?

No data will be dropped. But it indicates that the join result might not be what you expect, i.e., it is potentially incorrect.

How can I avoid (or at least reduce) this message for an FK join with simple KTables? Does it depend on broker performance?

I don’t think you can do anything about it. I believe it’s a side effect of KAFKA-18713, “Kafka Streams Left-Join not always emitting the last value” (we are actively working on a fix for this ticket).

How does a versioned KTable help? Should all my KTables be versioned? I join 2 simple KTables and the output is a third simple KTable; how do I tell the FK join to use versioned KTables internally?

Given that it seems to be a bug in Kafka Streams, I doubt that versioned KTables would actually help…