We want to add a fourth broker node to an Apache Kafka 2.6 cluster (not the Confluent Platform) and are looking for best practices for increasing the replication factor of the existing topics.
As far as I have read here ("Will topics still be available while their partitions are being reassigned?"), it should be possible to increase the replication factor without downtime for the clients. But I guess there are some best practices we should follow, and I have a few questions about this:
The replicas configuration is ordered (for example [2,3,1]). What does that order mean for leader and followers? Is "2" always the leader after the cluster starts? Or is it even always the leader whenever all replicas are available (so that it gets elected back when a failed node rejoins the cluster)?
When I add the new broker 4 at the end of the replica list, does that mean that this broker is very unlikely ever to become the leader of the partition?
Should I spread the new broker evenly across the positions in the replica lists of my partitions (about 6000)?
Do I put stress on the cluster if I insert the new broker at a particular position (for example the first one; is the new replica then unnecessarily chosen as leader right after it gets in sync)?
Is it advisable to put all partitions (6000) into one reassignment JSON file and process them together, or should I split them into smaller batches?
Any other things I should consider?
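To make the questions about position and batching concrete, here is a rough sketch of how I would generate the reassignment files. It is only an illustration of what I have in mind, not a tested procedure: the topic name "orders", the batch size and the helper names are made up, and rotating the insert position is just one possible way to spread broker 4 across the replica lists. The output follows the JSON layout that kafka-reassign-partitions.sh reads ("version", "partitions", "topic", "partition", "replicas").

```python
import json


def add_broker(replicas, new_broker, position):
    """Return a copy of the replica list with new_broker inserted.

    position is wrapped around, so passing a running counter spreads the
    new broker over all slots (first, middle, last) of the lists.
    """
    if new_broker in replicas:
        return list(replicas)  # already a replica, leave the list unchanged
    updated = list(replicas)
    updated.insert(position % (len(replicas) + 1), new_broker)
    return updated


def build_batches(current, new_broker, batch_size):
    """Build one reassignment document per batch of partitions.

    current is a list of (topic, partition, replicas) tuples describing
    the existing assignment.
    """
    entries = []
    for i, (topic, partition, replicas) in enumerate(current):
        entries.append({
            "topic": topic,
            "partition": partition,
            "replicas": add_broker(replicas, new_broker, i),
        })
    return [
        {"version": 1, "partitions": entries[start:start + batch_size]}
        for start in range(0, len(entries), batch_size)
    ]


if __name__ == "__main__":
    # Hypothetical current assignment; the real one would come from the cluster.
    current = [
        ("orders", 0, [2, 3, 1]),
        ("orders", 1, [3, 1, 2]),
        ("orders", 2, [1, 2, 3]),
    ]
    for batch in build_batches(current, new_broker=4, batch_size=2):
        print(json.dumps(batch, indent=2))
```

Each printed document would be saved to its own file and passed to kafka-reassign-partitions.sh with --reassignment-json-file and --execute, one batch at a time, probably with --throttle set. Whether rotating the position like this is a good idea is exactly what I am unsure about.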
Sorry, that is a long list of questions, but I feel I should know the answers rather than assume how the cluster behaves. Of course we will test the procedure, but this might not be possible with the full data load, so all your comments are really useful to me.
Thanks in advance