Hello! We’re running version 5.4.x of the HDFS connector. We often see offsets being reset to the earliest.
In the connector logs:
[INFO] org.apache.kafka.clients.consumer.internals.Fetcher initializeCompletedFetch - [Consumer clientId=connector-536, groupId=connect-group] Fetch offset 11464442987 is out of range for partition connector.test.output.msgs-486, resetting offset
[INFO] org.apache.kafka.clients.consumer.internals.SubscriptionState maybeSeekUnvalidated - [Consumer clientId=connector-536, groupId=connect-group] Resetting offset for partition connector.test.output.msgs-486 to offset 11460258870.
Around the same time, in the Kafka logs:
INFO [ProducerStateManager partition=connector.test.output.msgs-] Writing producer snapshot at offset 11464442979 (kafka.log.ProducerStateManager)
INFO [Log partition=connector.test.output.msgs-, dir=/data/raid/sdb/kafka] Rolled new log segment at offset 11464442979 in 1 ms. (kafka.log.Log)
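To put numbers on the log lines above, here is a small sketch of the arithmetic. It uses only the offsets copied from those logs; the variable names are ours and purely illustrative.

```python
# Offsets copied verbatim from the connector and Kafka log lines above.
fetch_offset = 11464442987         # offset the consumer tried to fetch
reset_offset = 11460258870         # offset the consumer was reset to
segment_roll_offset = 11464442979  # offset at which Kafka rolled a new log segment

# How far back the consumer was moved by the reset.
rewound = fetch_offset - reset_offset

# Distance between the rejected fetch offset and the segment roll offset.
gap_to_roll = fetch_offset - segment_roll_offset

print(f"records rewound by the reset: {rewound}")
print(f"fetch offset minus segment roll offset: {gap_to_roll}")
```

So the reset moved the consumer back by roughly 4.18 million records, and the rejected fetch offset sits only 8 past the offset at which the new segment was rolled.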
We have a retention of 48 hours, so we have definitely not lost records, and the HDFS files are present, so the connector can fetch the next offset from the filename. Still, we see this offset reset every few days, and only for some partitions (more often just a single partition). Why does this happen, and how can we avoid it?
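For reference, this is roughly how we understand recovering the next offset from the HDFS filenames. It is a minimal sketch, not the connector's actual code: it assumes committed files are named `<topic>+<partition>+<startOffset>+<endOffset>.<ext>`, and the function name, the `.avro` extension, and the sample filenames are hypothetical.

```python
import re

# Assumed filename layout: <topic>+<partition>+<startOffset>+<endOffset>.<ext>
FILENAME_RE = re.compile(
    r"^(?P<topic>.+)\+(?P<partition>\d+)\+(?P<start>\d+)\+(?P<end>\d+)\.\w+$"
)

def next_offset_from_filenames(filenames, topic, partition):
    """Return the highest end offset + 1 for a topic partition, or None if no file matches."""
    end_offsets = []
    for name in filenames:
        m = FILENAME_RE.match(name)
        if m and m.group("topic") == topic and int(m.group("partition")) == partition:
            end_offsets.append(int(m.group("end")))
    return max(end_offsets) + 1 if end_offsets else None

# Hypothetical filenames, for illustration only (not taken from our cluster).
files = [
    "connector.test.output.msgs+486+0000000100+0000000199.avro",
    "connector.test.output.msgs+486+0000000200+0000000299.avro",
    "connector.test.output.msgs+485+0000000500+0000000599.avro",
]

next_offset = next_offset_from_filenames(files, "connector.test.output.msgs", 486)
print(next_offset)  # 300
```

Given that the files for the affected partition exist, we would expect the connector to resume from an offset derived this way rather than from the earliest available offset.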