Hi,
I’m working on Kafka streams and I want to do a join with Stream and KTable. Currently, I have following Streams but I want to know if there is any way to reduce the number of streams and achieve all of the following.
Stream 1 where I’m reading data from source DB.
CREATE STREAM strm_source
WITH (KAFKA_TOPIC='DB_Data',VALUE_FORMAT='AVRO');
Stream 2 where I’m extracting the required data
CREATE STREAM strm_transform
WITH (KAFKA_TOPIC='DB_Data_Transform',KEY_FORMAT = 'AVRO', VALUE_FORMAT = 'AVRO') AS SELECT
TS_MS `eventTimestamp`,
AFTER->ID `eventKey`,
AS_VALUE(AFTER->ID) `id`,
FORMAT_TIMESTAMP(FROM_UNIXTIME(AFTER->UTCDATETIME), 'yyyy-MM-dd') `date`,
AFTER->ITEMQUANTITY `quantity`
FROM strm_source
PARTITION BY AFTER->ID;
Stream 3 with a primary key to join with the ktable
CREATE STREAM strm_keyed
(id VARCHAR KEY, date VARCHAR, quantity VARCHAR)
WITH (kafka_topic='DB_Data_Transform', KEY_FORMAT = 'AVRO', value_format='avro');
Stream 4 for to do a join with the ktable
CREATE STREAM strm_filtered
WITH (KAFKA_TOPIC='DB_Data_Filtered',KEY_FORMAT = 'AVRO',VALUE_FORMAT = 'AVRO')
AS SELECT
strm_keyed.id,
AS_VALUE(strm_keyed.id) AS `id`,
date AS `date`
quantity AS `quantity`
FROM strm_keyed
INNER JOIN KTBL_DATA ON STRM_KEYED.id = KTBL_DATA.id
WHERE KTBL_DATA.id IS NOT NULL;
I want to know if there is an efficient way of doing this by reducing the number of streams.
I’m new to Kafka and working on existing code. So any help is much appreciated.
Thanks!