KsqlDB server restart vs materialized tables

Propster · 18 December 2023 22:19

I’ve been experimenting with larger data sets for a couple of weeks now, feeding ksqlDB from Debezium snapshots and putting some of that data in materialized tables. I’m trying to understand how it behaves in different situations:

1- Starting from scratch:

Easiest scenario. Just snapshot all the tables you want to process in the correct order (reference tables first, then streaming tables). Everything is smooth.
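
As a rough illustration of the "reference tables first, then streaming tables" ordering, here is a minimal sketch that sorts a snapshot plan. The table names and the "reference"/"streaming" labels are hypothetical, made up for this example. They are not from Debezium or ksqlDB.

```python
# Hypothetical tables, each labelled by the role it plays in the pipeline.
TABLES = {
    "orders": "streaming",
    "customers": "reference",
    "payments": "streaming",
    "countries": "reference",
}


def snapshot_order(tables):
    """Return table names with reference tables first, then streaming tables.

    Names are sorted alphabetically within each group so the plan is
    deterministic from one run to the next.
    """
    rank = {"reference": 0, "streaming": 1}
    return sorted(tables, key=lambda name: (rank[tables[name]], name))


plan = snapshot_order(TABLES)
print(plan)
```

The resulting list is only a plan. Actually triggering each snapshot depends on how the Debezium connectors are configured.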

2- Restarting the server:

ksqlDB reads from its internal (confluent-ksql-*) topics to reload materialized tables into memory.
It might write messages with null values from reference tables to its output topics, or log “Skipping record due to null join key or value.”, if it tries to process incoming messages before it has had time to reload its in-memory database.
Overall, it works pretty well once the init is finished.
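
The behaviour described in this scenario can be pictured with a toy model: a table is rebuilt by replaying a changelog, and a join that runs against a partially rebuilt table finds no match. This is a plain in-memory sketch of the idea, not ksqlDB's actual implementation. Whether ksqlDB really processes records before its state is fully restored is an assumption taken from the observation above. All names and data are hypothetical.

```python
def restore_table(changelog):
    """Rebuild a key -> value table by replaying changelog records in order.

    A value of None is treated as a delete (tombstone) for that key.
    """
    table = {}
    for key, value in changelog:
        if value is None:
            table.pop(key, None)
        else:
            table[key] = value
    return table


def left_join(event, table):
    """Enrich one stream event with the matching table row, or None if absent."""
    return {**event, "customer_name": table.get(event["customer_id"])}


# Hypothetical changelog for a customers table, oldest record first.
changelog = [(1, "Alice"), (2, "Bob"), (1, "Alicia")]
event = {"order_id": 100, "customer_id": 2}

partial = restore_table(changelog[:1])  # only the first record replayed so far
full = restore_table(changelog)         # the whole changelog replayed

print(left_join(event, partial))  # customer_name is None: key 2 not loaded yet
print(left_join(event, full))     # customer_name is "Bob"
```

In this model the null disappears once the whole changelog has been replayed, which matches things working well once the init is finished.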

3- Deploying changes to queries in an existing, running project:

On restart, ksqlDB transfers messages from its old internal (confluent-ksql) topics to new ones.
However, once it’s done with that, it doesn’t seem to remember any of the data in materialized tables, even those whose schema and query didn’t change.
I waited for it to transfer 43 million messages from one internal topic to another, but once it finished, it didn’t seem to remember any of that data.
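
One way to check whether a materialized table still holds a given row after a redeploy is a pull query by key. The sketch below only builds the HTTP request. It assumes the ksqlDB REST API accepts pull queries as a POST to `/query` with a JSON body containing `ksql` and `streamsProperties`, and that the server listens on `http://localhost:8088`. The table name `CUSTOMERS_BY_ID` and key column `ID` are hypothetical.

```python
import json
import urllib.request


def sql_literal(value):
    """Format a Python value as a SQL literal (strings quoted, quotes doubled)."""
    if isinstance(value, str):
        return "'" + value.replace("'", "''") + "'"
    return str(value)


def build_pull_query(server_url, table, key_column, key_value):
    """Build (but do not send) a pull-query request for one key of a table."""
    sql = f"SELECT * FROM {table} WHERE {key_column} = {sql_literal(key_value)};"
    body = json.dumps({"ksql": sql, "streamsProperties": {}}).encode("utf-8")
    return urllib.request.Request(
        url=server_url.rstrip("/") + "/query",
        data=body,
        headers={"Content-Type": "application/vnd.ksql.v1+json; charset=utf-8"},
        method="POST",
    )


# Assumed server address and hypothetical table/column names.
request = build_pull_query("http://localhost:8088", "CUSTOMERS_BY_ID", "ID", 42)
print(request.full_url)
print(request.data.decode("utf-8"))

# To send it against a running server:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

Running the same lookup before and after the deployment, for a key known to exist, would show whether the table's state survived the change.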

So basically, I’m trying to find the best strategies for managing ksqlDB servers and for deploying changes to production as smoothly as possible without losing any data. At the very least, I’d like to know what to expect, and for that I need to understand how ksqlDB behaves on restart in each of these scenarios.

Any ideas where I could find documentation about it?

Thanks!

