Batch processing in kstreams

mjsax · 19 September 2021 04:35

As you pointed out, there is ksqlDB, and we will keep investing heavily to make it more expressive and to address current limitations. Personally, I think that ksqlDB already offers a programmatic way to execute SQL statements over STREAMs and TABLEs. Doing aggregations and joins is actually straightforward with both Kafka Streams and ksqlDB. You should check it out. Whether we will ever add a “de-duplication” operator to ksqlDB seems to be an open question, though (I guess if there is enough user demand, we might). On the other hand, you can already do it today: cf. “Tombstone message in Table when filtering duplicate events”.

For Kafka Streams, there is actually a KIP to add a “distinct” operation to the DSL, as pointed out above. There is also a KIP to add more built-in aggregation functions: KIP-747 Add support for basic aggregation APIs.
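To make the idea behind a “distinct”-style operation concrete, here is a minimal sketch in plain Java. It is not the API proposed in the KIP and does not use Kafka Streams at all: the class name `Deduplicator`, the method `isFirstOccurrence`, and the retention parameter are all hypothetical, chosen only for illustration. It assumes every record carries an ID and a timestamp, and that a duplicate only matters if it arrives within a bounded retention period.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Kafka Streams or ksqlDB.
final class Deduplicator {
    private final long retentionMs;
    // Event ID -> timestamp at which the ID was first seen.
    private final Map<String, Long> seen = new HashMap<>();

    Deduplicator(long retentionMs) {
        this.retentionMs = retentionMs;
    }

    /** Returns true if the ID has not been seen within the retention period. */
    boolean isFirstOccurrence(String eventId, long timestampMs) {
        // Forget IDs that were first seen before the retention cutoff.
        long cutoffMs = timestampMs - retentionMs;
        seen.values().removeIf(firstSeenMs -> firstSeenMs < cutoffMs);

        if (seen.containsKey(eventId)) {
            return false; // duplicate: the caller would drop this record
        }
        seen.put(eventId, timestampMs);
        return true;
    }
}
```

A caller would forward a record only when `isFirstOccurrence` returns true. In a real Kafka Streams application, the in-memory map would typically be replaced by a fault-tolerant state store so that the set of seen IDs survives restarts and rebalances.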

As you can see, there are many things in flight…

For ksqlDB, you can also follow the KLIPs if you are interested in future development: ksql/design-proposals at master · confluentinc/ksql · GitHub
