We have a stateful kafka-streams topology running as a production system. It is working well but we want to make some changes and migrate to a newer topology.
Resetting the application and replaying all the data from the beginning after updating the topology is not really an option for us because we have a lot of data and it would take a significant amount of time to process all of it from the beginning - we don’t want that much downtime. Not to mention that would be against the notion of processing data “exactly once”.
One possible option would be to run the old topology, v1 and the new topology, v2 side-by-side and using the v1 topology until the v2 topology is sufficiently populated that we could shut off the v1 topology. I feel this isn’t ideal either because it introduces operational complexity and extra overhead of maintaining sets of processors, state stores, etc. It might be ok for updates that happen infrequently but we would like change the topology somewhat regularly as we iterate through our development plans.
I would like to see how others handled this situation.