Hello everyone,
I'm having a performance issue with the JDBC sink connector in upsert mode, using pk.mode record_value with a composite PK over 3 fields.
The lag grows quickly, even though the Kafka Connect workers run as a 3-node cluster, as does the broker. I use 30 partitions and 30 sink tasks on that topic. The DB itself is on SQL Server (Azure), which should have enough power to handle it.
But we are talking about 220 million Kafka records that the source connector placed into the topic from a MySQL server.
The same setup is fast in insert mode with pk.mode set to none, but that is not really an option for us.
Do you have any idea what could improve the speed?
I have also tested this with another topic of around 600k records, which likewise struggles with performance.
Please find my connector configuration below. Transformations etc. are already done in the source connector tasks, so the sink is actually pretty straightforward.
{
  "name": "attribute_values-01",
  "config": {
    "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
    "key.converter": "io.confluent.connect.avro.AvroConverter",
    "key.converter.schema.registry.url": "http://<schemaregistry>:8081",
    "value.converter": "io.confluent.connect.avro.AvroConverter",
    "value.converter.schema.registry.url": "http://<schemaregistry>:8081",
    "connection.url": "jdbc:sqlserver://<sql_server>:1433;databaseName=<DB>",
    "connection.user": "<user>",
    "connection.password": "<password>",
    "topics.regex": "input-attribute_values",
    "auto.create": "true",
    "auto.evolve": "true",
    "table.name.format": "import_input-attribute_values",
    "pk.mode": "record_value",
    "pk.fields": "tenant_id, attribute_id, attribute_code",
    "insert.mode": "upsert",
    "tasks.max": 30
  }
}
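One direction I was planning to try next is tuning the batch sizes, though I haven't verified the right values yet. A sketch of the overrides I had in mind (the numbers here are guesses, not tested, and the consumer.override.* settings only take effect if the worker allows client overrides via connector.client.config.override.policy):

```json
{
  "batch.size": 3000,
  "consumer.override.max.poll.records": 3000,
  "consumer.override.fetch.min.bytes": 1048576
}
```

The idea would be to make sure each sink task receives and writes records in larger batches instead of many small round trips to SQL Server, but I'd appreciate confirmation on whether that actually helps in upsert mode.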