Implementing a Data Mesh with Apache Kafka [Kafka Summit 2022]

Date: April 26, 2022
Time: 11:00 AM - 11:45 AM BST

Speakers:

  • Adam Bellemare, Staff Technologist, Confluent

Abstract:
Have you heard about Data Mesh but never really understood how you actually build one? Data mesh is a relatively recent term describing a set of principles that good modern data systems uphold. Although data mesh is not a technology-specific pattern, implementing one requires organizations to make deliberate choices and investments in specific technologies and operational policies. Establishing "paved roads" for creating, publishing, evolving, deprecating, and discovering data products is essential for bringing the benefits of the mesh to those who would use it.

In this talk, Adam covers implementing a self-service data mesh with event streams in Apache Kafka®. Event streams as data products are an essential part of a real-world data mesh, as they enable both operational and analytical workloads from a common source of truth. Event streams provide full historical data along with real-time updates, letting each data product consumer decide what to consume, how to remodel it, and where to store it to best suit their needs.
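As a rough illustration of that consumption model, here is a minimal sketch (not from the talk) of a plain Java consumer reading a hypothetical data product topic from the beginning, so it receives the full history before switching over to live updates. The topic name, consumer group, and broker address are all illustrative placeholders.

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class DataProductConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "retention-analytics");     // hypothetical group
        // Start from the earliest offset: full history first, then real-time updates.
        props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("feature-usage-events")); // hypothetical data product topic
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    // Each consumer decides how to remodel and where to store the data.
                    System.out.printf("key=%s value=%s%n", record.key(), record.value());
                }
            }
        }
    }
}
```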

Adam structures the talk around a hypothetical SaaS business question: "What is the relationship between feature usage and user retention?" The example explores each team's role in the data mesh, including the data products they would (and wouldn't) publish, how other teams could use those products, and the organizational dynamics and principles underpinning it all.
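To make the hypothetical concrete, a consuming team might answer that question by joining two upstream data products. The Kafka Streams sketch below assumes hypothetical topics feature-usage-events and user-retention-status, both keyed by user ID; it is one possible consumer-side approach, not the design prescribed in the talk.

```java
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;

public class FeatureRetentionJoin {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "feature-retention-join"); // hypothetical
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");      // placeholder
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        // Two hypothetical data products published by different teams, keyed by user ID.
        KStream<String, String> featureUsage = builder.stream("feature-usage-events");
        KTable<String, String> retention = builder.table("user-retention-status");

        // Enrich each usage event with the user's current retention status,
        // then publish the result as a new derived data product.
        featureUsage
            .join(retention, (usage, status) -> usage + "|" + status)
            .to("feature-usage-with-retention");

        new KafkaStreams(builder.build(), props).start();
    }
}
```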