My application keeps a large amount of data (a few hundred gigabytes) in state stores so that it can join data from our different data sources before streaming the result out to another database. Because we have so much state, restoring our KTables from the changelog topics causes significant downtime. We also use the exactly-once v2 processing guarantee, so an unclean (dirty) close of the state stores means all state has to be re-streamed from the changelog topics. We would prefer to avoid standby replicas because of the associated costs.
From what I can tell, there is no way to guarantee a clean shutdown every time, but we would like dirty shutdowns to happen as rarely as possible. So far we trigger a clean shutdown from three places: the JVM shutdown hook, the state store listener, and the uncaught exception handler. Only the JVM shutdown hook shuts down cleanly in a reliable way; the other two are flaky and often end up in deadlocks. Note that in every case we initiate the clean shutdown by calling streams.close().
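To make the pattern concrete, here is a minimal sketch of one possible shape for it, not our actual implementation. It routes every shutdown trigger into a single close call that runs on its own thread, so a callback never blocks inside close() itself. The class name ShutdownCoordinator and its methods are hypothetical, and the closer function is a stand-in for streams::close (this assumes KafkaStreams#close(Duration) returns true once all threads have stopped). The reason for the dedicated thread: as I read the Javadoc, close(Duration) must not be called from StateListener#onChange, which may explain the deadlocks if the listener in question is a KafkaStreams.StateListener.

```java
import java.time.Duration;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Function;

/** Hypothetical helper: runs the close call at most once, on a dedicated thread. */
final class ShutdownCoordinator {
    private final Function<Duration, Boolean> closer; // assumption: e.g. streams::close
    private final Duration closeTimeout;
    private final AtomicBoolean requested = new AtomicBoolean(false);
    private final CountDownLatch finished = new CountDownLatch(1);
    private volatile boolean clean = false;

    ShutdownCoordinator(Function<Duration, Boolean> closer, Duration closeTimeout) {
        this.closer = closer;
        this.closeTimeout = closeTimeout;
    }

    /** Non-blocking; safe to call from any thread, including library callbacks. */
    void requestShutdown(String reason) {
        if (!requested.compareAndSet(false, true)) {
            return; // a shutdown is already in progress
        }
        System.err.println("Shutdown requested: " + reason);
        Thread worker = new Thread(() -> {
            try {
                // The only place the close call is made.
                clean = Boolean.TRUE.equals(closer.apply(closeTimeout));
            } finally {
                finished.countDown();
            }
        }, "streams-shutdown");
        worker.start();
    }

    /**
     * Blocks until the close call has returned or maxWait elapses.
     * Intended for the JVM shutdown hook only, never for library callbacks.
     * Returns true only if the close call reported success.
     */
    boolean awaitClosed(Duration maxWait) {
        try {
            return finished.await(maxWait.toMillis(), TimeUnit.MILLISECONDS) && clean;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

With this shape, the JVM shutdown hook would call requestShutdown(...) followed by awaitClosed(...), while the listener and the exception handler would call only requestShutdown(...) and return immediately. Whether that is enough to leave the state stores clean under exactly-once is exactly what I am unsure about.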
This seems like a common problem, but I could not find anything about how to achieve this in the documentation or other online sources. Any advice or recommendations would be greatly appreciated.
Kafka Streams version: 3.6.2