Hi,
I have a JDBC Sink Connector (with four tasks) that I am using to insert into a SQL Server database. The topic I am reading from has about 54M messages. The connector successfully inserted around 6M rows into the target table and then started throwing errors similar to the following:
[2021-04-13 08:29:31,272] ERROR WorkerSinkTask{id=cdp_snk_member_coreInsert_0-3} Task threw an uncaught and unrecoverable exception. Task is being killed and will not recover until manually restarted. Error: Update count (499) did not sum up to total number of records inserted (500) (org.apache.kafka.connect.runtime.WorkerSinkTask)
org.apache.kafka.connect.errors.ConnectException: Update count (499) did not sum up to total number of records inserted (500)
at io.confluent.connect.jdbc.sink.BufferedRecords.flush(BufferedRecords.java:194)
at io.confluent.connect.jdbc.sink.JdbcDbWriter.write(JdbcDbWriter.java:79)
at io.confluent.connect.jdbc.sink.JdbcSinkTask.put(JdbcSinkTask.java:74)
at org.apache.kafka.connect.runtime.WorkerSinkTask.deliverMessages(WorkerSinkTask.java:560)
at org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:323)
at org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:226)
at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:198)
at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:185)
at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:235)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Yesterday, I could restart the failed tasks and they would continue for a while, inserting more rows (10K - 100K) before failing again. I set up a Linux script (inspired by the article "Automatically restarting failed Kafka Connect tasks") that periodically checks for failed tasks and restarts them. Today, the tasks fail immediately after a restart with the same error and do not insert any more rows.
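For context, a restart check of this kind can be sketched roughly as follows. This is a minimal Python illustration, not the actual script: it assumes the Kafka Connect REST API is reachable at `http://localhost:8083` (an assumed address), and the helper names `failed_task_ids` and `restart_failed_tasks` are hypothetical. It relies only on the `GET /connectors/{name}/status` and `POST /connectors/{name}/tasks/{id}/restart` endpoints.

```python
import json
import urllib.request

CONNECT_URL = "http://localhost:8083"  # assumed Connect REST address


def failed_task_ids(status):
    """Return the IDs of tasks whose state is FAILED in a connector status document."""
    return [t["id"] for t in status.get("tasks", []) if t.get("state") == "FAILED"]


def restart_failed_tasks(connector, base_url=CONNECT_URL):
    """Fetch the connector status and restart each failed task; return the restarted IDs."""
    with urllib.request.urlopen(f"{base_url}/connectors/{connector}/status") as resp:
        status = json.load(resp)
    restarted = []
    for task_id in failed_task_ids(status):
        # The restart endpoint takes an empty POST body.
        req = urllib.request.Request(
            f"{base_url}/connectors/{connector}/tasks/{task_id}/restart",
            data=b"",
            method="POST",
        )
        urllib.request.urlopen(req).close()
        restarted.append(task_id)
    return restarted
```

Run periodically (for example from cron), a call such as `restart_failed_tasks("cdp_snk_member_coreInsert_0")` would restart whichever of the four tasks are currently in the `FAILED` state.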
The counts in the error message always appear to be the same:
Update count (**499**) did not sum up to total number of records inserted (**500**)
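Read literally, the message suggests that the sink adds up the per-statement update counts returned by the JDBC batch and compares the total with the number of buffered records. That reading is an assumption based on the error text and the `BufferedRecords.flush` frame, not on the connector source. The arithmetic can be sketched in Python with hypothetical names, for a batch of 500 inserts in which one statement reports 0 affected rows:

```python
def check_update_counts(update_counts, record_count):
    """Raise if the summed per-statement update counts differ from the number of records."""
    total = sum(update_counts)
    if total != record_count:
        raise RuntimeError(
            f"Update count ({total}) did not sum up to "
            f"total number of records inserted ({record_count})"
        )
    return total


# Hypothetical batch: 499 statements affected one row each, one affected none.
counts = [1] * 499 + [0]
try:
    check_update_counts(counts, 500)
    message = None
except RuntimeError as exc:
    message = str(exc)
print(message)
```

If that reading is right, each failing batch of 500 contains exactly one statement that the database reports as affecting no rows.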
It is not clear to me what is causing this error. All of the messages in the source topic have unique primary keys. I tried adding a Dead Letter Queue to the connector but that did not make a difference. No messages went to the DLQ topic.
Does anybody have a suggestion?
Thanks!