Current approach:
We have two topics (A and B); we join their records on a common field and write the result to topic C.
We load all of topic A's data into a GlobalKTable, and the stream on topic B joins against that GlobalKTable.
We also need to retain 60 days of data in topic A (millions of events), all of which is loaded into the GlobalKTable. This becomes a problem when the EKS pods restart: rebuilding the global state store from scratch can take a long time.
Is this a good solution given that topic A continuously receives new data? Will the GlobalKTable be updated immediately when new records arrive?
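For reference, the current approach can be sketched with the Kafka Streams DSL roughly as follows. The topic names, serdes, and the join-key extractor are placeholders for whatever the real application uses; this is a minimal sketch, not the actual implementation. Note that a KStream-to-GlobalKTable join does not require co-partitioning, and a GlobalKTable is kept up to date continuously as new records arrive on its source topic (updates are applied asynchronously by a dedicated global-store thread, not synchronized with the stream's processing).

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.GlobalKTable;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.Produced;

public class GlobalTableJoinSketch {

    // Hypothetical helper: derive the join key (the "common field") from a B record's value.
    static String joinKeyFromB(String bValue) {
        return bValue.split(",")[0]; // placeholder extraction logic
    }

    public static StreamsBuilder buildTopology() {
        StreamsBuilder builder = new StreamsBuilder();

        // Every instance materializes the FULL contents of topic A locally.
        GlobalKTable<String, String> tableA =
                builder.globalTable("topic-A", Consumed.with(Serdes.String(), Serdes.String()));

        KStream<String, String> streamB =
                builder.stream("topic-B", Consumed.with(Serdes.String(), Serdes.String()));

        // KStream-GlobalKTable join: no co-partitioning needed, because the
        // KeyValueMapper below maps each B record onto A's key space.
        streamB.join(
                        tableA,
                        (bKey, bValue) -> joinKeyFromB(bValue),      // map B record -> A's key
                        (bValue, aValue) -> bValue + "|" + aValue)   // combine the two values
               .to("topic-C", Produced.with(Serdes.String(), Serdes.String()));

        return builder;
    }
}
```

One practical mitigation for the slow-restart concern: if the pods keep the Kafka Streams state directory (`state.dir`) on a persistent volume, a restarted instance can reuse its on-disk RocksDB state and only replay the tail of topic A instead of the full 60 days.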
Alternatives:
Using a regular KTable for topic A is an option, but its state is co-partitioned: each pod holds only the topic A partitions assigned to it, so no single pod has the full state, which is a problem.
Moreover, the record keys of topic A and topic B are different, so there is no guarantee that related records land in the same partition, which violates the co-partitioning requirement of a KStream-KTable join.
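The different-keys problem can usually be addressed by re-keying both topics onto the common join field before the join; Kafka Streams then inserts repartition topics so that related records end up in the same partition. A hedged sketch, assuming placeholder topic names and a hypothetical `joinKey(...)` extractor:

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Produced;

public class RekeyedKTableJoinSketch {

    // Hypothetical helper: extract the common join field from a record's value.
    static String joinKey(String value) {
        return value.split(",")[0]; // placeholder extraction logic
    }

    public static StreamsBuilder buildTopology() {
        StreamsBuilder builder = new StreamsBuilder();

        // Re-key topic A onto the join field, then materialize it as a KTable.
        // selectKey marks the stream for repartitioning, so the table's
        // backing partitions are keyed by the join field.
        KTable<String, String> tableA = builder
                .stream("topic-A", Consumed.with(Serdes.String(), Serdes.String()))
                .selectKey((k, v) -> joinKey(v))
                .toTable();

        // Re-key topic B onto the same join field so both sides are co-partitioned.
        KStream<String, String> streamB = builder
                .stream("topic-B", Consumed.with(Serdes.String(), Serdes.String()))
                .selectKey((k, v) -> joinKey(v));

        streamB.join(tableA, (bValue, aValue) -> bValue + "|" + aValue)
               .to("topic-C", Produced.with(Serdes.String(), Serdes.String()));

        return builder;
    }
}
```

With this layout each pod holds only its share of topic A's state, which is exactly what makes the state manageable; interactive queries or standby replicas (`num.standby.replicas`) can help with lookups across instances and faster failover, respectively.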
What other approaches could we take here?