Hello everyone. I am trying to create an ensemble learning engine using kafka and kafka streams microservices.
assume that Each data tupple has a field streamID
The flow is like this : i got 3 input topics ( training-topic, partioned based on streamID for training data, prediction-topic also partitioned with streamID for prediction data and a control topic ( i think i should not give it partitions ) for the control commands ( e.g create microservice for that streamID ) .
Router builds a k-table from the control topic, to have a state about all microservices running.
Also it spawns 2 new topics for that corresponding new microservice, and starts the new microservice ( as a new kafka streams app ) .
i am trying to scale my app horizontally,using multiple instances of my router ( java class build as kafka stream application )
I have build a mechanism for leader election among instances,but the real problem is that the way i am handling the only partition of my state store, AlgorithmMicroserviceStore, may have migrated to an instance which is not the leader… so its full random right now, if the leader get access to that partition the app is gonna run,but most of the times a follower gets it.
i dont know if my logic is wrong, i need to find an optimal way where only the leader receives the control commands-or all control commands are forward to the leader to spawn topics and microservices dynamically.
Meanwhile i need to ensure that all the other instances have their k-table up to data , i think this is happening throught the changelog topic.
do i need a leader election or not?
do i need normal k-table or global ?