Kafka Connect in a cross-data-center configuration

Hi all,

I am currently planning to deploy a Kafka Connect cluster using Kubernetes across three independent data centers.

Currently, my network policies do not allow cross-data-center connectivity except through the load balancer. Therefore, I cannot configure the workers to communicate with each other beyond their own data center, as IP addresses are not shared between data centers.

In terms of the REST interface, there is no connectivity across the different data centers, but I can connect to the specific data center where the leader is and make changes as required.

To clarify, the workers in all three data centers use the same Kafka topics for their configuration.

Would this setup work, or are there any concerns? For example, will there be more than one leader or other issues?

Also, what happens if one of the Kafka Connect data centers goes down?

Ideally, I would like to use the capacity of all three data centers, but I can settle for just one if that's not possible.


Kafka clients must communicate directly with the brokers. Put another way, the Kafka protocol has load balancing built in, so adding a load balancer to your architecture has no benefit.

what happens if one of the Kafka Connect data centers goes down

Data would stop being replicated? Have you looked at the Cluster Linking feature of Confluent Platform? It moves the replication process into the brokers instead of external Connect clusters.

Thank you for the response @OneCricketeer

Perhaps my question was not clear enough.

All my Kafka Connect workers can communicate with the Kafka brokers, but connectivity between the Kafka Connect workers is not guaranteed. Only workers in the same data center can talk to each other, due to network policies.

Do all Kafka Connect workers need to be able to talk to each other over the network, or does this coordination happen through the Kafka topics?

Kafka Connect workers have at least two network configs, similar to the brokers: listeners and the rest.advertised.* settings. The workers use them to coordinate tasks and to forward REST requests to the leader, so every worker needs to be able to reach the leader's advertised address. Only once the leader receives the request should messages start being produced to the internal Connect topics, at least the config and status topics.
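
To make the two network configs concrete, here is a rough sketch in Python that renders the relevant part of a distributed worker's properties file. The property keys are the standard Kafka Connect worker settings; the helper name render_worker_properties, the hostnames, and the topic names are placeholders I made up for illustration, and a real worker file also needs converters and other settings.

```python
# Hypothetical helper (not part of Kafka): renders a fragment of a
# Kafka Connect distributed worker config. Hostnames and topic names
# are illustrative assumptions only.
def render_worker_properties(bootstrap_servers, group_id, advertised_host,
                             rest_port=8083):
    props = {
        "bootstrap.servers": bootstrap_servers,
        # Workers with the same group.id and internal topics form one cluster.
        "group.id": group_id,
        "config.storage.topic": "connect-configs",
        "offset.storage.topic": "connect-offsets",
        "status.storage.topic": "connect-status",
        # Address the REST server binds to inside the pod.
        "listeners": f"http://0.0.0.0:{rest_port}",
        # Address handed to the other workers; followers forward requests
        # to the leader at this host and port, so it must be routable
        # from every worker in the group.
        "rest.advertised.host.name": advertised_host,
        "rest.advertised.port": str(rest_port),
    }
    return "\n".join(f"{key}={value}" for key, value in props.items()) + "\n"


if __name__ == "__main__":
    print(render_worker_properties(
        "kafka.example.internal:9092",      # placeholder broker address
        "connect-cluster",                  # placeholder group id
        "connect-0.dc1.example.internal",   # placeholder pod hostname
    ))
```

If the advertised host of a worker in one data center is not reachable from workers in another, request forwarding between them would be expected to fail, which is the situation described in the question.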