JDBC sink connector - multiple workers

Hi all,
We have a replication solution using Kafka Connect. Data is read from MS SQL Server into multiple topics using the Debezium SQL Server connector, and then written to PostgreSQL using the JDBC sink connector. Each topic/table has a dedicated sink connector. We use "pk.mode": "record_value", which generates UPSERTs. Each topic has 10 partitions, and there are 2 workers in the Kafka Connect cluster.

For now we are using "tasks.max": "1" in the sink, which means that all writes for a given table are done by a single worker in a single thread.
As the JDBC connector supports multiple tasks, should we be using more than one task in this scenario? Are there any best practices or recommendations on this?
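For reference, this is roughly what our sink config looks like; raising parallelism would just mean bumping "tasks.max" (connector name, topic, table, and connection details below are placeholders, not our real values):

```json
{
  "name": "pg-sink-orders",
  "config": {
    "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
    "topics": "server1.dbo.orders",
    "connection.url": "jdbc:postgresql://postgres:5432/replica",
    "insert.mode": "upsert",
    "pk.mode": "record_value",
    "pk.fields": "id",
    "tasks.max": "1"
  }
}
```

With "tasks.max": "2", Kafka Connect would split the 10 partitions between two tasks, one per worker in our case.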

I see a couple of potential issues here:

  1. Parallel writes might cause locking on the DB side. I saw at least one issue where deadlocks occurred with multiple tasks ("Multiple tasks in kafka-connect-jdbc sink causing deadlock", confluentinc/kafka-connect-jdbc issue #385 on GitHub).
  2. Can we ensure that changes are applied in the same order? For example, suppose we have a backlog of CDC changes for the database record with id=1:
id=1, value='A'
id=1, value='B'
id=1, value='C'

Is it possible that these messages will be distributed across multiple workers, and the UPSERTs then applied in the wrong order?
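For what it's worth, my understanding is that the default producer partitioner hashes the record key, so records with equal keys always land in the same partition, and each partition is consumed by exactly one sink task. An illustrative sketch of that idea (crc32 stands in for Kafka's actual murmur2 hash, and the key bytes are made up):

```python
import zlib

def partition_for(key: bytes, num_partitions: int) -> int:
    # Kafka's default partitioner hashes the record key (murmur2 in the
    # real implementation; crc32 here purely for illustration) modulo the
    # partition count, so equal keys always map to the same partition.
    return zlib.crc32(key) % num_partitions

# All three CDC events for id=1 carry the same key, so they map to one
# partition and are consumed, in order, by a single task.
events = [(b"id=1", "A"), (b"id=1", "B"), (b"id=1", "C")]
partitions = {partition_for(key, 10) for key, _ in events}
print(len(partitions))  # prints 1
```

If that holds, ordering per key would be safe even with multiple tasks, since Debezium keys change events by primary key, but I'd like confirmation.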

Any thoughts or suggestions?
