One to many join in ksqldb- kstream and ktable


I am working on DB CDC to Kafka use case. Initially table A is getting populated with multiple records. After some time (varies from days to weeks) Table B is getting populated with single record. Both table has joining columns request_id.

I am creating KTable on TableA to hold data until TableB is populated. And Planning to join TableB KStream with TableA KTable.

How can I join KStream one record with KTable multiple records?

@masayyed you’ve posted this in #ksqldb , but mention KStreams and KTables - can you just confirm if you’re looking for an answer in Kafka Streams or with ksqlDB? thanks :slight_smile:

I am looking answer in ksqlDB.

What do you mean by TableB KStream – as you do CDC, it seems you want to read the data from TableB as a table. Why do you want to read it as a stream?

Also, for a stream-table join, each stream-input record may result in at most one output records.

@masayyed Sounds like what you want to do is a normal stream to table join, where your CDC table A goes to Topic A, and your table B goes to Topic B. You would want to stream your data via CDC (Debezium?) into your two Kafka topics, then create a Table in KSQLDB over Topic A and a Stream over Topic B. Then use a stream to table join which is neatly documented over here: Join Event Streams . Take note of the partitioning requirements for your topics.

@Charlla Thanks for the reply. I Am using third party tool on Oracle database for CDC.
As I mentioned Table A has multiple records for same request_id and Table B has single record.
How can I achieve 1 (kstream-topicB) to many (ktable-topicA) join ?

I guess it depends.

If the join attribute is the key in TableB, you could mimic the join with a stream-table join by processing the changelog of TableA as STREAM, and processing TableB as TABLE.

Otherwise, you cannot do the join atm with ksqlDB, because non-key FK table-table joins are not supported yet. There is a ticket for it though: Support non-key joins. · Issue #4424 · confluentinc/ksql · GitHub

1 Like

As mentioned in given link Support non-key joins. #4424, I have fixed my issue by using collect_set.
But I think we need non-key join support to fix this issue properly.

1 Like

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.