Advice/General guidelines associated with cluster & connector configurations in Kafka Connect

I am looking for general guidance on best practices for defining the different aspects of a Kafka Connect deployment (e.g. number of Connect clusters, cluster isolation characteristics, connectors per cluster, etc.).

I have been exploring a simple scenario for our use case: we want to use a Debezium source connector to monitor a few tables in a MySQL source database, then use an HTTP sink connector to push that data out to some other external systems. For this particular scenario, I am planning to define two Connect workers in a single Connect cluster, running one source connector and one sink connector.
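To make the scenario concrete, here is a rough sketch of the two connector definitions as they might be submitted to the Connect REST API (`POST /connectors`) of the single cluster. Hostnames, credentials, table names, and connector names are placeholders, the HTTP sink class name is hypothetical because no specific sink plugin has been chosen, and the Debezium property names follow recent Debezium releases and may differ in older ones.

```python
import json


def connector_payload(name, config):
    """Build the JSON body expected by POST /connectors on a Connect worker."""
    return {"name": name, "config": config}


# Debezium MySQL source: all values below are placeholders for illustration.
source = connector_payload("mysql-source", {
    "connector.class": "io.debezium.connector.mysql.MySqlConnector",
    "tasks.max": "1",
    "database.hostname": "mysql.example.internal",
    "database.port": "3306",
    "database.user": "debezium",
    "database.password": "change-me",
    "database.server.id": "184054",
    "topic.prefix": "inventory",
    "table.include.list": "inventory.customers,inventory.orders",
    "schema.history.internal.kafka.bootstrap.servers": "localhost:9092",
    "schema.history.internal.kafka.topic": "schema-history.inventory",
})

# HTTP sink: the class name is hypothetical; substitute the plugin you install.
sink = connector_payload("http-sink", {
    "connector.class": "com.example.HttpSinkConnector",
    "tasks.max": "2",
    # Debezium MySQL topics are named <topic.prefix>.<database>.<table>.
    "topics": "inventory.inventory.customers,inventory.inventory.orders",
})

for payload in (source, sink):
    print(json.dumps(payload, indent=2))
```

Both payloads would go to the same cluster; the workers then distribute the resulting tasks among themselves.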

I have observed a few examples online where the configuration depicted a separate cluster for each connector type.

Are there any guidelines/best practices around configuration of the cluster and the connectors? When would it make sense to split out the different connectors into their own cluster?

Overall, as Connect versions have improved, running more connectors in one cluster has become a great idea.

Earlier, some would use single-domain clusters: for example, all Debezium source connectors in one cluster and all S3 connectors in another. That also helped depending on how often connectors were added or updated, because earlier all connectors would pause and rebalance each time a config was updated. With the awesomeness of Incremental Cooperative Rebalancing in Apache Kafka (KIP-415, available since 2.3) in place, this is no longer the case.
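As a rough illustration, the rebalance behaviour is selected per worker through the `connect.protocol` property: `eager` is the old stop-the-world behaviour, while `compatible` and `sessioned` enable incremental cooperative rebalancing on versions that support them. The sketch below just renders a fragment of a distributed worker properties file; the broker address and group id are assumptions, and a real worker file needs more settings than shown.

```python
def render_properties(props):
    """Render a dict as Java-style .properties text, one key=value per line."""
    return "\n".join(f"{key}={value}" for key, value in props.items())


# Fragment of a distributed worker config; a real file also needs converters
# and the config/offset/status storage topics.
worker_props = {
    "bootstrap.servers": "localhost:9092",   # assumption: local broker
    "group.id": "connect-cluster-a",         # hypothetical cluster name
    "connect.protocol": "sessioned",         # cooperative rebalancing
    # How long the cluster waits for a departed worker to return before
    # reassigning its connectors and tasks.
    "scheduled.rebalance.max.delay.ms": 300000,
}

print(render_properties(worker_props))
```

On recent Kafka versions the cooperative protocol is already the default, so this usually only matters when checking that an older cluster has not been pinned to `eager`.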

Another reason to separate Connect clusters per use case was worker-level configuration. Even though you may have 10 source connectors, there is one overall producer configuration per worker, so if you needed custom producer properties, separate clusters were the way to go. Note: some of this is changing, and certain worker-level client properties can now be overridden at the connector level.
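A minimal sketch of what a connector-level override looks like, assuming the worker permits it through `connector.client.config.override.policy` (for example `All`). Producer settings are then added to an individual connector's config under the `producer.override.` prefix. The helper name, the connector class, and the specific tuning values below are illustrative only.

```python
def with_producer_overrides(connector_config, overrides):
    """Return a copy of connector_config with producer.override.* keys added."""
    merged = dict(connector_config)
    for key, value in overrides.items():
        merged[f"producer.override.{key}"] = str(value)
    return merged


# Worker side: this property must allow overrides, otherwise they are rejected.
worker_fragment = {"connector.client.config.override.policy": "All"}

# Connector side: hypothetical source connector that wants its own batching.
base = {
    "connector.class": "com.example.SomeSourceConnector",  # placeholder
    "tasks.max": "1",
}
tuned = with_producer_overrides(base, {"compression.type": "lz4", "linger.ms": 50})

for key in sorted(tuned):
    print(f"{key}={tuned[key]}")
```

With this in place, only the tuned connector's producers pick up the custom settings, and the other connectors in the cluster keep the worker-level defaults.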

If you expect similar workloads across the tasks in a cluster, then definitely go for this. It helps with scaling the worker nodes up and down as the workload changes.
