I am using the Python client and committing with an offset list (see the confluent_kafka API page of the confluent-kafka 2.1.0 documentation). I have three questions:
1) In the list [TopicPartition], is the offset part the offset of the consumed message? The description says "Initial partition offset."
2) If I commit 100k messages each time and use a synchronous commit, would the long commit cause a session timeout?
3) Why should I not commit without any arguments?
Is the offset part the offset of the consumed message?
Yes, with one adjustment. When committing manually (which you are doing with this method), you instantiate the TopicPartition object with the offset of the consumed message + 1, not the message's own offset. When you commit, you are committing the offset of the next message the consumer will consume from a given topic-partition.
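To make the "+ 1" concrete, here is a minimal sketch. The helper names offset_to_commit and commit_after_processing are made up for illustration; the only client calls used are TopicPartition(topic, partition, offset) and Consumer.commit(offsets=..., asynchronous=...). The import sits inside the function only so the arithmetic helper can be tried without the client installed.

```python
def offset_to_commit(last_processed_offset: int) -> int:
    """Return the offset to commit: the position of the NEXT message to consume."""
    return last_processed_offset + 1


def commit_after_processing(consumer, msg) -> None:
    """Synchronously commit the position just past `msg` (hypothetical helper).

    `consumer` is assumed to be a confluent_kafka.Consumer and `msg` a
    message returned by consumer.poll() that has been fully processed.
    """
    # Imported lazily so offset_to_commit() works without the client installed.
    from confluent_kafka import TopicPartition

    tp = TopicPartition(msg.topic(), msg.partition(), offset_to_commit(msg.offset()))
    consumer.commit(offsets=[tp], asynchronous=False)
```

If you pass the message itself, as in consumer.commit(message=msg), the client adds the 1 for you, so the manual arithmetic is only needed when you build the TopicPartition list yourself.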
If I commit 100k messages each time and use a synchronous commit, would the long commit cause a session timeout?
When you commit, you don’t commit each individual message; rather, you commit the offset + 1 of the last message you finished processing. So if you wanted to commit 100K messages, you’d commit a single offset of 100,000 (assuming you started at offset 0, the last message processed is at offset 99,999). So barring any network issues, the commit method should return quickly.
The other question is, do you want to wait that long to commit? For example, suppose your process fails at offset 75,000 under your current strategy; after a restart you’d end up reprocessing 75K messages to get back to the point where the failure occurred.
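One way to bound that reprocessing is to commit every N processed messages instead of every 100K. The sketch below is only an illustration: CommitBatcher and run are hypothetical names, the value of N is up to you, and process stands in for your own handler. The client calls used are Consumer.poll, Consumer.commit and TopicPartition.

```python
class CommitBatcher:
    """Track the next offset per partition and signal when a commit is due."""

    def __init__(self, every: int):
        self.every = every      # commit after this many processed messages
        self.pending = 0        # processed since the last commit
        self.next_offsets = {}  # (topic, partition) -> next offset to consume

    def record(self, topic: str, partition: int, offset: int) -> bool:
        """Note one processed message; return True when a commit is due."""
        self.next_offsets[(topic, partition)] = offset + 1
        self.pending += 1
        return self.pending >= self.every

    def drain(self) -> dict:
        """Return the offsets to commit and reset the counters."""
        snapshot = dict(self.next_offsets)
        self.next_offsets.clear()
        self.pending = 0
        return snapshot


def run(consumer, process, batcher: CommitBatcher) -> None:
    """Poll loop sketch; `consumer` is assumed to be a confluent_kafka.Consumer."""
    from confluent_kafka import TopicPartition  # lazy import, see note above

    while True:
        msg = consumer.poll(1.0)
        if msg is None or msg.error():
            continue  # real code should inspect and handle msg.error()
        process(msg)
        if batcher.record(msg.topic(), msg.partition(), msg.offset()):
            offsets = [TopicPartition(t, p, o) for (t, p), o in batcher.drain().items()]
            consumer.commit(offsets=offsets, asynchronous=False)
```

With every=1000, a crash costs at most roughly 1,000 messages of rework, while each commit is still a single small request per batch.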
Why should I not commit without any arguments?
If you are fully processing all records from a poll call, then committing without any arguments will commit the last offset consumed. You’ll want to provide offsets if the application could still be working on some of the records but you want to commit what has been completed so far.
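The choice between the two styles can be sketched like this. plan_commit and apply_commit are hypothetical helpers, and the batch is represented as plain (topic, partition, offset) tuples processed in order, which is an assumption made for the example rather than anything the client requires.

```python
def plan_commit(batch, processed_count):
    """Decide how to commit after working on a batch of consumed records.

    batch: list of (topic, partition, offset) tuples, in consumption order.
    processed_count: how many records from the front of the batch are done.

    Returns None when everything is done (an argument-less commit is enough),
    otherwise a dict of (topic, partition) -> next offset covering only the
    completed records. An empty dict means there is nothing to commit yet.
    """
    if not batch or processed_count == 0:
        return {}
    if processed_count >= len(batch):
        return None
    offsets = {}
    for topic, partition, offset in batch[:processed_count]:
        offsets[(topic, partition)] = offset + 1
    return offsets


def apply_commit(consumer, plan) -> None:
    """Issue the commit; `consumer` is assumed to be a confluent_kafka.Consumer."""
    from confluent_kafka import TopicPartition  # lazy import for the sketch

    if plan is None:
        consumer.commit(asynchronous=False)  # everything consumed is processed
    elif plan:
        tps = [TopicPartition(t, p, o) for (t, p), o in plan.items()]
        consumer.commit(offsets=tps, asynchronous=False)
```

The explicit form matters when work is handed to other threads or is otherwise still in flight: an argument-less commit would mark records as done that the application has received but not yet finished.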