I have the following tables in Postgres:

- `clients`: columns `client_id`, `tenant_id`
- `divisions`: columns `division_id`, `client_id`
- `contacts`: columns `contact_id`, `division_id`
I need a Kafka topic with denormalized data joining the above tables. With ksqlDB and Kafka Connect (Postgres CDC):

- I created a `clients` stream and started materializing it into a ksqlDB table `clients_table`.
- Then I created a `divisions` stream, joined it with `clients_table`, and materialized the result into a `divisions_table` (which now carries the `tenant_id` info from `clients_table`).
- Finally, I created a `contacts` stream, joined it with `divisions_table`, and loaded the result into Elasticsearch (again using Kafka Connect).
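For reference, the pipeline looks roughly like this (topic names, value format, and column types below are illustrative, not my exact DDL):

```sql
-- 1. Stream over the clients CDC topic, materialized as a table
CREATE STREAM clients_stream (client_id INT KEY, tenant_id INT)
  WITH (KAFKA_TOPIC='pg.public.clients', VALUE_FORMAT='AVRO');

CREATE TABLE clients_table AS
  SELECT client_id, LATEST_BY_OFFSET(tenant_id) AS tenant_id
  FROM clients_stream
  GROUP BY client_id;

-- 2. divisions stream joined to clients_table, materialized again
CREATE STREAM divisions_stream (division_id INT KEY, client_id INT)
  WITH (KAFKA_TOPIC='pg.public.divisions', VALUE_FORMAT='AVRO');

CREATE STREAM divisions_enriched AS
  SELECT d.division_id, d.client_id, c.tenant_id
  FROM divisions_stream d
  JOIN clients_table c ON d.client_id = c.client_id;

CREATE TABLE divisions_table AS
  SELECT division_id,
         LATEST_BY_OFFSET(client_id) AS client_id,
         LATEST_BY_OFFSET(tenant_id) AS tenant_id
  FROM divisions_enriched
  GROUP BY division_id;

-- 3. contacts stream joined to divisions_table; the output topic
--    of this query feeds the Elasticsearch sink connector
CREATE STREAM contacts_stream (contact_id INT KEY, division_id INT)
  WITH (KAFKA_TOPIC='pg.public.contacts', VALUE_FORMAT='AVRO');

CREATE STREAM contacts_enriched AS
  SELECT ct.contact_id, ct.division_id, d.client_id, d.tenant_id
  FROM contacts_stream ct
  JOIN divisions_table d ON ct.division_id = d.division_id;
```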
The above seems to satisfy my use case, but:

When the writes to the Postgres tables happen within 10-100 millisecond gaps of each other, I do not get the enriched data. I am setting `KSQL_KSQL_STREAMS_MAX_TASK_IDLE_MS` to 2,500 milliseconds; only after raising it to 25,000 ms do I see the data enriched properly, but that has slowed down the pipeline as well.
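For context, I am setting this via the server environment (assuming a Docker deployment here; the env var maps to the `ksql.streams.max.task.idle.ms` server property):

```yaml
# ksqlDB Server service in docker-compose (illustrative snippet)
environment:
  KSQL_KSQL_STREAMS_MAX_TASK_IDLE_MS: "25000"
```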
If I only consider joining the `divisions` stream with `clients_table`, I could easily get away with a 2,500 ms setting; but with another layer of materialization (i.e. with `contacts`), it requires more than 25,000 ms to guarantee the joins.
Is there a way to improve this?