I am currently planning to deploy a Kafka Connect cluster using Kubernetes across three independent data centers.
Currently, my network policies do not allow cross-data-center connectivity except through the load balancer. Therefore, I cannot configure the workers to communicate with each other beyond their own data center, as IP addresses are not shared between data centers.
In terms of the REST interface, there is no connectivity across the different data centers, but I can connect to the specific data center where the leader is and make changes as required.
To clarify, the Kafka cluster is using the same topics for the configuration.
Would this setup work, or are there any concerns? For example, will there be more than one leader or other issues?
Also, what happens if one of the Kafka Connect data centers goes down?
Ideally, I would like to use all the power across the three data centers if possible, but I can settle for just one if that’s not possible.
Kafka clients must communicate directly with the brokers. Put another way, the Kafka protocol has load balancing built in, so adding a load balancer to your architecture has no benefit.
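To illustrate the point about direct broker communication, here is a toy model in Python. A client uses its bootstrap address only to fetch metadata, then connects to whichever broker leads each partition, using that broker's advertised address. The broker addresses, the topic name and the helper names are all hypothetical, and no real client library is involved.

```python
# Toy cluster metadata, shaped like what a client learns after bootstrapping:
# each broker's advertised address and the leader broker of each partition.
# All addresses and the topic name are made up for illustration.
METADATA = {
    "brokers": {
        1: "broker-1.dc1:9092",
        2: "broker-2.dc1:9092",
        3: "broker-3.dc1:9092",
    },
    "leaders": {
        ("orders", 0): 1,
        ("orders", 1): 2,
        ("orders", 2): 3,
    },
}


def leader_address(metadata, topic, partition):
    """Return the advertised address of the broker leading this partition."""
    broker_id = metadata["leaders"][(topic, partition)]
    return metadata["brokers"][broker_id]


def connections_needed(metadata, topic):
    """Return every broker address a client must reach to use the whole topic."""
    return sorted(
        leader_address(metadata, t, p)
        for (t, p) in metadata["leaders"]
        if t == topic
    )


if __name__ == "__main__":
    for address in connections_needed(METADATA, "orders"):
        print(address)
```

In this model the client needs a direct route to all three advertised addresses, so a single load-balanced endpoint in front of the brokers does not replace per-broker connectivity.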
"what happens if one of the Kafka Connect data centers goes down?"
Data would stop being replicated? Have you looked at the Cluster Linking feature of Confluent Platform? It moves replication into the brokers instead of running it in external Connect clusters.
All my Kafka Connect workers can communicate with the Kafka brokers, but connectivity between the workers themselves is not guaranteed. Only workers in the same data center can talk to each other, due to network policies.
Do all Kafka Connect workers need to be able to talk to each other over the network? Or does this coordination happen through the Kafka topics?
Kafka Connect workers have at least two network settings, listeners and the advertised address (rest.advertised.host.name and rest.advertised.port), similar to the brokers. Workers use them to coordinate tasks and to forward REST requests to the leader. Only once the leader has received a request should messages be produced to the internal Connect topics, at least the config and status topics.
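As a rough sketch of where those settings live, the Python below renders distributed-worker properties for one worker. The property names (group.id, the three storage topics, listeners, rest.advertised.host.name, rest.advertised.port) are standard Kafka Connect worker settings. The function names, the topic naming scheme, the addresses and the idea of one group.id per data center are assumptions for illustration only, not something this thread confirms as the right layout.

```python
def worker_properties(bootstrap_servers, group_id, advertised_host, rest_port=8083):
    """Render distributed-worker settings as a dict of property name to value.

    Hypothetical helper: the topic naming scheme is an assumption.
    """
    return {
        "bootstrap.servers": bootstrap_servers,
        # Workers sharing a group.id (and internal topics) form one Connect cluster.
        "group.id": group_id,
        "config.storage.topic": f"{group_id}-configs",
        "offset.storage.topic": f"{group_id}-offsets",
        "status.storage.topic": f"{group_id}-status",
        # Address the REST server binds to inside the pod.
        "listeners": f"http://0.0.0.0:{rest_port}",
        # Address other workers use when forwarding REST requests to this one;
        # it must be reachable from every worker in the same group.
        "rest.advertised.host.name": advertised_host,
        "rest.advertised.port": str(rest_port),
    }


def to_properties_text(props):
    """Serialize the dict in key=value form, one property per line."""
    return "\n".join(f"{key}={value}" for key, value in props.items())


if __name__ == "__main__":
    # Made-up broker address, group name and pod IP.
    props = worker_properties("kafka.example:9092", "connect-dc1", "10.0.1.15")
    print(to_properties_text(props))
```

If the advertised address of the leader is not reachable from a worker in another data center, requests that worker tries to forward to the leader would be expected to fail, which is why a separate group.id per data center is shown here as one possible arrangement.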