Leftjoin foreign key optionality

Hi all,

When testing a leftjoin on two tables with Kafka Streams I noticed something I did not expect.

Case A: The left-hand side has a message with a foreign key present, and the right-hand side has no message present. The leftjoin is successful.

Case B: The left-hand side has a message without a foreign key present (a null), and the right-hand side has no message present. The leftjoin is unsuccessful.

In my case I sometimes need the leftjoin to succeed without having the foreign key present yet on the left-hand-side. The resulting table can then later be used for another join with other fields present.

But now I have to 'create' a foreign key on the left-hand-side (e.g. 0) for the leftjoin to be successful when it is not present.
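To make the workaround concrete, here is a minimal sketch of what I mean, written in plain Java so it runs without Kafka on the classpath. The `Order` type, its `customerId` field, and the sentinel value `0L` are hypothetical names chosen only for illustration; the assumption is that no real right-hand-side record ever uses the key 0.

```java
import java.util.function.Function;

// Hypothetical left-hand-side value type; customerId is the optional foreign key.
final class Order {
    final String orderId;
    final Long customerId; // may be null when the foreign key is not known yet

    Order(String orderId, Long customerId) {
        this.orderId = orderId;
        this.customerId = customerId;
    }
}

final class SentinelForeignKey {
    // Assumed placeholder key that never matches a real right-hand-side record.
    static final Long NO_CUSTOMER = 0L;

    // Never returns null: a missing foreign key is mapped to the sentinel instead.
    static final Function<Order, Long> EXTRACTOR =
        order -> order.customerId != null ? order.customerId : NO_CUSTOMER;
}
```

A function like `SentinelForeignKey.EXTRACTOR` would be passed as the `foreignKeyExtractor` argument of `KTable#leftJoin`. Since nothing on the right-hand side has the key 0, the joiner should then see a null right-hand value for those records.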

I would like to know the reasoning behind treating a null foreign key as an error, and why the foreign key is not optional.

Resource-wise it is cheaper to know that the leftjoin will not occur instead of searching for the 0.

Looking forward to your response.

Best regards.

It's a basic design decision in Kafka Streams for all key-dependent operations (aggregations and all joins) to treat null keys as "invalid". It's also pointed out in the FK-join JavaDocs:

@param foreignKeyExtractor a {@link Function} that extracts the key (KO) from this table's value (V). If the result is null, the update is ignored as invalid.

There is a longer history (that I will skip here) why this decision was made.

There are multiple tickets about it already, so maybe we change it in the future. However, if we change it, it should happen across the board, not just for one operator, so that behavior stays consistent.

I created a ticket to track it, and linked it to existing tickets: [KAFKA-14748] Relax non-null FK left-join requirement - ASF JIRA
