🎧 Real-Time Change Data Capture and Data Integration with Apache Kafka and Qlik

There’s a new Streaming Audio episode - check it out!

Getting data from a database management system (DBMS) into Apache Kafka® in real time is a subject of ongoing innovation. John Neal (Principal Solution Architect, Qlik) and Adam Mayer (Senior Technical Product Marketing Manager, Qlik) explain how leveraging change data capture (CDC) for data ingestion into Kafka enables real-time data-driven insights.

It can be challenging to ingest data in real time, and even more so when you have multiple data sources, such as SAP and Oracle, spanning both traditional databases and mainframes. Extracting data in batches for transfer and replication is slow and often incurs significant performance penalties. Analytical queries are even more resource intensive and are prohibitively expensive to run against production transactional databases. CDC addresses this by capturing source operations as a sequence of incremental events and writing them to Kafka.
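To make the idea concrete, here is a minimal sketch of what a single CDC event might look like once a capture tool turns a source `UPDATE` into a Kafka record. The field names (`op`, `before`, `after`, `ts_ms`) follow a common Debezium-style convention and are purely illustrative, not tied to any specific Qlik or Kafka schema:

```python
import json

def make_cdc_event(op, before, after, ts_ms):
    """Build one change event: 'c' = insert, 'u' = update, 'd' = delete."""
    return {"op": op, "before": before, "after": after, "ts_ms": ts_ms}

# An UPDATE on a hypothetical customers row becomes one incremental event.
event = make_cdc_event(
    op="u",
    before={"id": 42, "status": "pending"},
    after={"id": 42, "status": "shipped"},
    ts_ms=1700000000000,
)

# Keying the record by primary key means all changes to one row land in
# order on the same Kafka partition.
key = json.dumps({"id": event["after"]["id"]})
value = json.dumps(event)
```

In a real pipeline the `key` and `value` would be handed to a Kafka producer; keying by primary key is what preserves per-row ordering downstream.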

Once this data is available in Kafka topics, it can serve both analytical and operational use cases. Data can be consumed and modeled for analytics by individual groups across your organization. Meanwhile, the same Kafka topics can power microservice applications and support data governance, all without impacting your production data source. Kafka makes it easy to integrate your CDC data into your data warehouses, data lakes, NoSQL databases, microservices, and any other downstream system.
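The fan-out described above works because each Kafka consumer group tracks its own offset, so an analytics job and a microservice can both read the full CDC topic independently. A toy, broker-free sketch of that idea (the `ConsumerGroup` class is a stand-in, not a real Kafka API):

```python
# The "topic" is just an ordered log of CDC events.
topic = [
    {"op": "c", "after": {"id": 1, "status": "pending"}},
    {"op": "u", "after": {"id": 1, "status": "shipped"}},
]

class ConsumerGroup:
    """Stands in for an independent consumer group with its own offset."""
    def __init__(self):
        self.offset = 0

    def poll(self, log):
        # Return everything not yet seen by *this* group, then advance
        # only this group's offset -- other groups are unaffected.
        events = log[self.offset:]
        self.offset = len(log)
        return events

analytics = ConsumerGroup()
microservice = ConsumerGroup()

# Both groups see every event in order; neither consumes it "away"
# from the other, and the source database is never queried.
analytics_view = analytics.poll(topic)
microservice_view = microservice.poll(topic)
```

This is the property that lets one CDC feed serve warehouses, lakes, and microservices at the same time.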

Adam and John highlight a few use cases where they see real-time Kafka data ingestion, processing, and analytics moving the needle, including real-time customer predictions, supply chain optimization, and operational reporting. Finally, they cap it off with a discussion of how capturing and tracking data changes is critical for the data quality that machine learning models depend on.


🎧 Listen to the episode