Hi everybody, I’m new to ksqlDB, so I’m not sure whether this is a bug or some complex behavior I don’t understand. I’m trying to join this stream:
CREATE OR REPLACE STREAM STREAM1
WITH (KAFKA_TOPIC='db.stream1', PARTITIONS=12, REPLICAS=3) AS SELECT *
FROM DB_STREAM d
WHERE d.after IS NOT NULL
PARTITION BY d.after->fid;
with a table, using this query:
CREATE OR REPLACE STREAM STREAM2
WITH (KAFKA_TOPIC='db.stream2', PARTITIONS=12, REPLICAS=3) AS SELECT *
FROM STREAM1 r
INNER JOIN TABLE1 n ON r.fid = n.fid;
But the result stream is missing several records, even though the corresponding IDs exist in both the stream and the table.
What baffles me is that if I do not use PARTITION BY when creating STREAM1,
or if I use INNER JOIN TABLE1 n ON r.after->fid = n.fid instead of r.fid in the join query,
I get all the expected records in the result stream.
Can someone please explain what might be causing this behavior?
Isn’t it good practice to re-key a stream (using PARTITION BY) before joining?
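
A minimal sketch of checks that may help narrow this down, using only the names from the queries above. It assumes the missing records have something to do with how the two sides of the join are keyed or partitioned, which is not confirmed. A stream-table join requires both sides to be co-partitioned (same key type and same number of partitions), so comparing the two descriptions seems like a reasonable first step:

```sql
-- Show the key column, key/value formats, backing topic, and partition
-- count that ksqlDB reports for each side of the join.
DESCRIBE STREAM1 EXTENDED;
DESCRIBE TABLE1 EXTENDED;

-- Print a few raw records from the re-keyed topic, including the Kafka
-- record key, to see what PARTITION BY actually wrote as the key.
PRINT 'db.stream1' FROM BEGINNING LIMIT 5;
```

If the key type or the partition count differs between STREAM1 and TABLE1, that would be one candidate explanation to rule in or out.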