KTable-KTable Foreign-Key LEFT JOIN: Discrepancy between documentation and behavior when FK is NULL


Summary

I’m observing a discrepancy between the documented semantics table and the actual runtime behavior of KTable-KTable foreign-key LEFT JOIN when the foreignKeyExtractor returns null.

Documentation Reference

Kafka Streams DSL - KTable-KTable Foreign-Key Join

The Discrepancy

1. The semantics table shows (row with offset=5):

Record Offset | Left KTable (K, extracted-FK) | Right KTable (FK, VR) | (INNER) JOIN | LEFT JOIN
5             | (k, null)                     |                       | (k, null)    | (k, null, null)

This suggests that when the foreign key is null, a LEFT JOIN should output the left record with null for the right-side value — consistent with standard SQL LEFT JOIN semantics.

2. However, the text states:

“Records for which the foreignKeyExtractor produces null are ignored and do not trigger a join.”

3. Runtime behavior confirms records are dropped:

WARN o.a.k.s.k.i.f.SubscriptionSendProcessorSupplier - Skipping record due to null foreign key. topic=[my-topic] partition=[12] offset=[6224]

Question

  1. Which behavior is correct? Should LEFT JOIN with null FK:
    • (A) Output the left record with null right-side value (as the semantics table suggests), or
    • (B) Drop/ignore the record entirely (as the text states and implementation does)?
  2. If (B) is the intended behavior, is the semantics table incorrect, and should it be updated?
  3. If (A) is the intended behavior, is this a bug in SubscriptionSendProcessorSupplier that should be reported?

Expected vs Actual

Scenario                            | Expected (per table)      | Actual (runtime)
Left record with FK=null, LEFT JOIN | Output: (key, null, null) | Record dropped with WARN log

Environment

  • Kafka Streams version: 4.1.1

Workaround

The documentation suggests using a sentinel value (e.g., "NULL" or -1) instead of null, but this requires ensuring no right-side record exists for the sentinel key and adds complexity.
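For reference, the sentinel workaround would look roughly like this. It is only a minimal sketch: the Order type, its fields, and the NO_CUSTOMER constant are made up for illustration, and only the idea of substituting a sentinel for null comes from the docs.

```java
import java.util.function.Function;

public class SentinelForeignKey {
    // Hypothetical left-side value; customerId is the foreign key and may be null.
    static final class Order {
        final String orderId;
        final String customerId;

        Order(String orderId, String customerId) {
            this.orderId = orderId;
            this.customerId = customerId;
        }
    }

    // Hypothetical sentinel; no right-side record may ever be stored under this key.
    static final String NO_CUSTOMER = "NULL";

    // Extractor that never returns null: a missing FK is mapped to the sentinel.
    static final Function<Order, String> FK_EXTRACTOR =
            order -> order.customerId == null ? NO_CUSTOMER : order.customerId;

    public static void main(String[] args) {
        System.out.println(FK_EXTRACTOR.apply(new Order("o1", "c42"))); // prints c42
        System.out.println(FK_EXTRACTOR.apply(new Order("o2", null)));  // prints NULL
    }
}
```

A function like this would be passed as the foreignKeyExtractor. The extra complexity is that the right table must never contain a record keyed by the sentinel, otherwise records without an FK would join against it.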


Thank you for any clarification on the intended semantics!

Since the 3.7 release, when the FK extractor returns a null key for a left join, you should get a join result (cf. KIP-962: Relax non-null key requirement in Kafka Streams).

So the table in the docs is right. The other text snippet you quote is for inner join, though, and I cannot see it in the left-join docs… So the docs seem to be correct about this, too.

However, I don’t understand, based on the code, why you see the logging. It should only happen for inner-join:

So I am wondering now whether you might accidentally be using leftTable.join(rightTable,…) instead of leftTable.leftJoin(rightTable,…) in your program?
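To make the difference easy to check in your code, here is a minimal sketch of the two calls side by side. The topic names, the "customerId|payload" value format, and the extractCustomerId helper are made up for illustration; only join(...) and leftJoin(...) with a foreign-key extractor are the actual DSL calls in question. It assumes kafka-streams is on the classpath.

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.Topology;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Produced;

public class FkJoinVariants {
    // Hypothetical extractor: left values look like "customerId|payload";
    // a missing or empty customerId part yields a null foreign key.
    static String extractCustomerId(String orderValue) {
        int sep = orderValue.indexOf('|');
        return sep <= 0 ? null : orderValue.substring(0, sep);
    }

    static Topology buildTopology() {
        StreamsBuilder builder = new StreamsBuilder();
        KTable<String, String> orders =
                builder.table("orders", Consumed.with(Serdes.String(), Serdes.String()));
        KTable<String, String> customers =
                builder.table("customers", Consumed.with(Serdes.String(), Serdes.String()));

        // Inner FK join: this is the variant where a null FK is expected to be skipped.
        KTable<String, String> inner = orders.join(
                customers,
                order -> extractCustomerId(order),
                (order, customer) -> order + " / " + customer);

        // Left FK join: since 3.7 (KIP-962) a null FK should still produce a result,
        // with a null right-side value passed to the joiner.
        KTable<String, String> left = orders.leftJoin(
                customers,
                order -> extractCustomerId(order),
                (order, customer) -> order + " / " + customer);

        inner.toStream().to("orders-inner-joined", Produced.with(Serdes.String(), Serdes.String()));
        left.toStream().to("orders-left-joined", Produced.with(Serdes.String(), Serdes.String()));
        return builder.build();
    }

    public static void main(String[] args) {
        // Printing the topology description shows which join processors were wired in.
        System.out.println(buildTopology().describe());
    }
}
```

If your program turns out to call the first variant, that would explain the WARN log you are seeing.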