Stale data from RocksDB store

We experienced a multi-broker outage on July 27th. Our application topology recovered, or so it seemed…
After it recovered, the internal state store of one of our application replicas started exposing data that had been removed a few weeks earlier, on July 6th. The RocksDB content was completely off.

Recovery for this instance apparently did only a partial restore from the changelog, yet it still managed to reach the RUNNING state.
The internal RocksDB database ended up corrupted.

Respawning the application brought the store back to its intended state.
But for a while our app was applying live updates to a store that was a few weeks out of date, so we had to replay input data from before the outage.

We are trying to identify the root cause and prevent the issue from recurring.
I have included the log of the faulty instance; if you see anything unusual, I would be happy to submit a ticket or dig up more info.

Do we have to perform some sort of cleanup within the internals of the local RocksDB database? (We are in the process of exposing RocksDB metrics to our Grafana dashboard, hoping to detect the issue sooner next time.)

That application runs a separate scheduled data store dump outside of the topology/punctuator: it just loops over the data with store.all(), using a reference to the streams instance and the store name. Is it safe to do that? (I'm trying to find a potential bad practice that may have led to this issue.)
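For context, the dump logic is roughly equivalent to the sketch below (store and variable names are placeholders, not our real ones). It goes through the read-only interactive-query API rather than a ProcessorContext store handle, and closes the iterator, since RocksDB iterators leak otherwise:

```java
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.StoreQueryParameters;
import org.apache.kafka.streams.state.KeyValueIterator;
import org.apache.kafka.streams.state.QueryableStoreTypes;
import org.apache.kafka.streams.state.ReadOnlyKeyValueStore;

// `streams` is our running KafkaStreams instance; "my-store" is a
// placeholder for the real store name.
ReadOnlyKeyValueStore<String, String> store =
    streams.store(StoreQueryParameters.fromNameAndType(
        "my-store", QueryableStoreTypes.keyValueStore()));

// try-with-resources so the underlying RocksDB iterator is released.
try (KeyValueIterator<String, String> it = store.all()) {
    while (it.hasNext()) {
        KeyValue<String, String> entry = it.next();
        // write entry.key / entry.value to the dump target
    }
}
```

The store handle is re-fetched on every scheduled run rather than cached, so it should track rebalances; I'm wondering whether anything about this pattern could interfere with restoration.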


Kafka 3.1
Kafka Streams 3.1

I put the log here