Incresing the number of tasks for an S3 sink connector

Hey all,

I have a source connector (debezium) that fetch data from Postgres into Kafka. In addition, I have a S3 sink that writes that data from Kafka into S3. When I create the S3 sink connector I set the number of tasks to 1.
Now I want to allow more tasks / threads / workers, in order to lower the lag from Kafka to S3, I have modified the connector, to have 17 tasks and I can see that Kafka connect list 17 tasks, but while looking at the logs it still looks like the first task (with id=0) it doing all the work.

Any ideas on how to fix it?

currently I’m running a single Docker container on a single EC2 instance, running both connectors (debezium and S3).

Thanks.

How many partitions does the topic have? IIRC the connector can parallelise operation over multiple partitions only.

That makes sense. I have 17 topics and each one has only 1 partition, so I thought that each task can process a different topic / partition (in my case).

Yes, I would expect to see it do that. Can you share your connector config?

sorry wrong thread, will update with the connector config later. Thanks

{
"config": {
    "behavior.on.null.values": "ignore",
    "connector.class": "io.confluent.connect.s3.S3SinkConnector",
    "flush.size": "1000000",
    "format.class": "io.confluent.connect.s3.format.parquet.ParquetFormat",
    "locale": "US",
    "partition.duration.ms": "31556952000",
    "partitioner.class": "io.confluent.connect.storage.partitioner.TimeBasedPartitioner",
    "path.format": "'year'=YYYY/",
    "rotate.schedule.interval.ms": "180000",
    "s3.bucket.name": "s3_sink_data",
    "s3.region": "eu-west-1",
    "storage.class": "io.confluent.connect.s3.storage.S3Storage",
    "tasks.max": "17",
    "timezone": "UTC",
    "topics.dir": "db_data",
    "topics.regex": "db_data.public.*"
},
"name": "db-to-s3-sink"
}

could it be that since I created the connector with only 1 task all the existing partitions have been “assigned” to the first task?

This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.