Does changing flush.size of the Confluent S3 sink invalidate EOS?

I have a Kafka Connect cluster running the Confluent S3 Sink Connector, configured for EOS.

Our current flush.size is set to a rather high number, which was ideal for backfilling historical data. Now that we have caught up to the upstream, this high value obviously causes high latency before records land in S3.
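For context, the only change I'm considering is lowering flush.size through the Connect REST API, roughly like the sketch below (the worker host, connector name, and new value are placeholders, not our real setup):

```python
# Sketch of the planned change: fetch the current connector config,
# lower flush.size, and push the full config back through the Connect REST API.
import requests

CONNECT_URL = "http://connect:8083"   # hypothetical Connect worker address
CONNECTOR = "s3-sink"                 # hypothetical connector name

# Current config as a flat map of string settings
config = requests.get(f"{CONNECT_URL}/connectors/{CONNECTOR}/config").json()
config["flush.size"] = "1000"         # currently a much higher value

# PUT replaces the whole config, so send everything back, not just the one key
resp = requests.put(f"{CONNECT_URL}/connectors/{CONNECTOR}/config", json=config)
resp.raise_for_status()
print(resp.json()["config"]["flush.size"])
```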

I don’t see a reason why, all other things being equal, changing flush.size should invalidate EOS. It perhaps wouldn’t be “exactly-once” in the meta sense that the files would not look exactly the same if I were to restart from the beginning (including clearing the outputs, for the sake of argument), but I don’t see why the actual record content in S3 would be affected. However, in cases like these the devil is often in the details…
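To spell out my mental model: as I understand it, with a deterministic partitioner the object key is derived only from the topic, partition, and the starting offset of the file, so a different flush.size should only move the file boundaries while every offset still ends up written exactly once. A toy sketch of that reasoning (the key format and zero-padding follow what I believe are the connector defaults; this is not the connector's actual code):

```python
# Toy illustration: committed offsets are carved into files whose keys depend only on
# topic, partition, and the first offset in the file. Changing flush.size moves the
# boundaries, but each offset still appears in exactly one object.
def object_keys(topic: str, partition: int, last_offset: int, flush_size: int):
    """Keys the sink would (roughly) produce for offsets 0..last_offset."""
    keys = []
    for start in range(0, last_offset + 1, flush_size):
        # Default-style key: topics/<topic>/partition=<p>/<topic>+<p>+<zero-padded start offset>
        keys.append(
            f"topics/{topic}/partition={partition}/{topic}+{partition}+{start:010d}.json"
        )
    return keys

print(object_keys("events", 0, 29, flush_size=10))  # three files
print(object_keys("events", 0, 29, flush_size=15))  # two files, same offsets covered
```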

I do have ways to check this, but if duplicates did appear it would mean a lot of extra work identifying and cleaning them up, since we have no automated deduplication in place at the moment (we operate under the, so far verified, assumption that the connector provides EOS).

Am I wrong? Would changing (only) flush.size invalidate EOS?

“if I were to restart from the beginning”

But that has nothing to do with the flush.size. If you reset the consumer group offset and start writing new files, of course you’ll end up with duplicate records in the bucket.

Sorry, perhaps I wasn’t clear in that line of thought: “restart from the beginning” also meant clearing the existing S3 outputs in that thought experiment. That’s why I specifically referred to the file content. Otherwise, yes, I agree.