JDBC sink connector - multiple workers

Hi all,
We have a replication solution using Kafka Connect. Data is read from MS SQL Server into multiple topics using the Debezium SQL Server connector, and then written to PostgreSQL using the JDBC sink connector. Each topic/table has a dedicated sink connector. We use "pk.mode": "record_value", which generates UPSERTs. Each topic has 10 partitions, and there are 2 workers in the Kafka Connect cluster.

For now we are using "tasks.max": "1" in the sink, which means that all writes for a given table are done by a single worker in a single thread.
As the JDBC connector supports multiple tasks, should we be using more than one task in this scenario? Are there any best practices or recommendations on this?
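For reference, this is roughly what our sink config looks like; raising parallelism would just mean bumping "tasks.max" (connector name, topic, table, and connection details below are placeholders, not our real values):

```json
{
  "name": "pg-sink-orders",
  "config": {
    "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
    "topics": "server1.dbo.orders",
    "connection.url": "jdbc:postgresql://postgres:5432/replica",
    "insert.mode": "upsert",
    "pk.mode": "record_value",
    "pk.fields": "id",
    "tasks.max": "1"
  }
}
```

With "tasks.max": "2", Kafka Connect would split the 10 partitions between two tasks, one per worker in our case.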

I see a couple of potential issues here:

  1. Parallel writes might cause locking on the DB side. I saw at least one issue where deadlocks occurred with multiple tasks ("Multiple tasks in kafka-connect-jdbc sink causing deadlock", confluentinc/kafka-connect-jdbc issue #385 on GitHub).
  2. Can we ensure that changes are applied in the same order? For example, suppose we have a backlog of CDC changes for the database record with id=1:
id=1, value='A'
id=1, value='B'
id=1, value='C'

Is it possible that these messages will be distributed across multiple workers, and the UPSERTs then applied in the wrong order?
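For what it's worth, my understanding is that the default producer partitioner hashes the record key, so records with equal keys always land in the same partition, and each partition is consumed by exactly one sink task. An illustrative sketch of that idea (crc32 stands in for Kafka's actual murmur2 hash, and the key bytes are made up):

```python
import zlib

def partition_for(key: bytes, num_partitions: int) -> int:
    # Kafka's default partitioner hashes the record key (murmur2 in the
    # real implementation; crc32 here purely for illustration) modulo the
    # partition count, so equal keys always map to the same partition.
    return zlib.crc32(key) % num_partitions

# All three CDC events for id=1 carry the same key, so they map to one
# partition and are consumed, in order, by a single task.
events = [(b"id=1", "A"), (b"id=1", "B"), (b"id=1", "C")]
partitions = {partition_for(key, 10) for key, _ in events}
print(len(partitions))  # prints 1
```

If that holds, ordering per key would be safe even with multiple tasks, since Debezium keys change events by primary key, but I'd like confirmation.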

Any thoughts or suggestions?
