Towards creation of a Data Mesh by leveraging ksqlDB

This question is about Data Mesh, which was widely covered at the recently concluded Summit. It is also a topic of interest for us in terms of internal adoption and applicability, as it provides a practical framework for building a federated data distribution platform within an org by aggregating data from multiple source domains and making it available to downstream consumers.

Initial thoughts & queries on building a data mesh, from an implementation perspective:

  1. Is it feasible to build materialized views using ksqlDB, each representing a different view abstraction over the same underlying data, held in topics sourced internally from multiple domains of an org? This might help support the varying data transformation and aggregation requirements of downstream consumers.

    Just to cite an analogy, this would probably be similar to the Bolt CDC approach that was presented earlier this year in Confluent Devs. We are also planning to use Kafka Connect / Debezium to capture CDC data in Kafka and build views of the underlying data using ksqlDB.

  2. Can consumers read delta updates from ksqlDB tables via Kafka Connect and store them in any sink of their choice (mostly through pull queries)? For example, just as a consumer reads from an individual topic, does the same work for a ksqlDB table too, with incremental updates being captured for any of the underlying topics?

Thanks & regards,

Yes, such a thing could be part of a Data Mesh solution.
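
For the first question, here is a minimal sketch of what two such views might look like, assuming the CDC events have already been flattened into simple rows (for example with Debezium's ExtractNewRecordState transform). The topic, stream, table, and column names are all hypothetical. The Python below only assembles the ksqlDB statements and the JSON body for the server's `/ksql` REST endpoint; it does not send anything.

```python
import json

# Hypothetical names throughout: the topic, stream, tables, and columns
# are illustrative and do not come from the original post.
SOURCE_TOPIC = "orders.cdc"

# Declare a stream over the (assumed flattened) CDC topic.
CREATE_STREAM = f"""
CREATE STREAM orders_stream (
    order_id VARCHAR KEY,
    region VARCHAR,
    customer_id VARCHAR,
    amount DOUBLE
) WITH (KAFKA_TOPIC='{SOURCE_TOPIC}', VALUE_FORMAT='JSON');
""".strip()

# View 1: the same data aggregated per region.
VIEW_BY_REGION = """
CREATE TABLE orders_by_region AS
    SELECT region, COUNT(*) AS order_count, SUM(amount) AS total_amount
    FROM orders_stream
    GROUP BY region
    EMIT CHANGES;
""".strip()

# View 2: the same data aggregated per customer.
VIEW_BY_CUSTOMER = """
CREATE TABLE orders_by_customer AS
    SELECT customer_id, COUNT(*) AS order_count, SUM(amount) AS total_amount
    FROM orders_stream
    GROUP BY customer_id
    EMIT CHANGES;
""".strip()


def build_ksql_request(statements):
    """Return the JSON body for a POST to the ksqlDB server's /ksql endpoint."""
    return json.dumps({"ksql": "\n".join(statements), "streamsProperties": {}})


if __name__ == "__main__":
    print(build_ksql_request([CREATE_STREAM, VIEW_BY_REGION, VIEW_BY_CUSTOMER]))
```

Each `CREATE TABLE ... AS SELECT` is a separate persistent query, so each domain or consumer group can get its own aggregation over the same source topic.
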
Yes, any ksqlDB table is always backed by a Kafka topic, so it will have ‘update messages’ and can be used with Kafka Connect like any other topic.
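
To picture what those update messages mean for a sink, here is a small pure-Python simulation (no Kafka client involved). It assumes each record on the table's topic carries the latest value for a key, and that a null value is treated as a delete, following the usual Kafka tombstone convention. The record shape and the `apply_changelog` helper are hypothetical.

```python
from typing import Dict, Iterable, Optional, Tuple

# One changelog record: (key, value). A value of None stands for a tombstone.
ChangeRecord = Tuple[str, Optional[dict]]


def apply_changelog(
    records: Iterable[ChangeRecord],
    state: Optional[Dict[str, dict]] = None,
) -> Dict[str, dict]:
    """Apply update messages in order as upserts and return the new table state."""
    table = dict(state or {})
    for key, value in records:
        if value is None:
            # Tombstone: the row for this key no longer exists.
            table.pop(key, None)
        else:
            # Upsert: the latest value for a key replaces the previous one.
            table[key] = value
    return table


if __name__ == "__main__":
    updates = [
        ("EU", {"order_count": 1}),
        ("US", {"order_count": 1}),
        ("EU", {"order_count": 2}),  # incremental update for an existing key
        ("US", None),                # row removed
    ]
    print(apply_changelog(updates))
```

A sink that writes in an upsert style keyed on the record key would end up with the same state as the table, while a sink that simply appends would keep the full history of updates.
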
