Hi Community,
I have been working with Kafka for several months now and have learned how to create a Spring Boot Kafka Streams application. Recently I added a global state store to my application.
Now that I know how to implement a Kafka Streams topology and a global state store, I would like to understand how they depend on each other.
For example, let's say I have an application with a stream and a global state store.
The stream sources from topic A, which holds around 10 million records. It transforms the data and writes the transformed data to topic B.
The global state store sources from topic B.
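To make the setup concrete, here is a minimal sketch of the topology described above. The topic names `topic-A` and `topic-B`, the store name `topic-b-global-store`, the String serdes, and the upper-casing transformation are placeholders chosen for illustration. It also assumes the global store is created through `globalTable`; an application that registers the store with `addGlobalStore` would look different.

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.common.utils.Bytes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.Topology;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.kstream.Produced;
import org.apache.kafka.streams.state.KeyValueStore;

public class TopologySketch {

    // Builds the topology only; no broker connection is needed for this step.
    static Topology build() {
        StreamsBuilder builder = new StreamsBuilder();

        // Stream: read topic A, transform each value, write the result to topic B.
        // The transformation here is a placeholder for the real logic.
        builder.stream("topic-A", Consumed.with(Serdes.String(), Serdes.String()))
               .mapValues(value -> value.toUpperCase())
               .to("topic-B", Produced.with(Serdes.String(), Serdes.String()));

        // Global state store: materialized from topic B (store name is hypothetical).
        builder.globalTable(
                "topic-B",
                Consumed.with(Serdes.String(), Serdes.String()),
                Materialized.<String, String, KeyValueStore<Bytes, byte[]>>as("topic-b-global-store"));

        return builder.build();
    }

    public static void main(String[] args) {
        // Print the topology description to inspect sources, sinks, and the global store.
        System.out.println(build().describe());
    }
}
```

Printing the topology description is a quick way to check that the stream and the global store are wired to the intended topics before deploying.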
Case one (fresh deploy):
When the application starts for the first time, the stream will process all the data from topic A, and the global state store will be built up instantly
→ the app is up and running right away while the stream is still processing the data.
When does the state store get filled? Is every new record written to topic B also stored directly in the state store?
Case two (redeploy with a different application-id):
When the application starts, it is NOT up and running right away because
the stream is reprocessing all data from topic A while the state store is being restored. How long the state store takes to restore depends on the amount of data it had built up before.
How does the restore process for the global state store work? Where does the data come from? What about the data in topic B, which the store sources from?
When does the store start sourcing from topic B: once it has been fully restored, or while it is still being restored?
Thank you for taking the time to help.
Best regards
Kafkanaut