I am using a Spring Boot Java project in which I have configured one consumer group and created a single consumer instance. As I understand it, polling from this consumer should by default read from all partitions and return the data for the entire topic. My topic contains 10 Kafka messages, but each call to consumer.poll returns a different number of records: sometimes 4, sometimes 6, and sometimes all 10.
Consumer<String, String> consumer = consumerFactory.createConsumer();
consumer.subscribe(Collections.singletonList(topic));
consumer.poll(Duration.ofSeconds(1));
Set<TopicPartition> partitions = consumer.assignment();
consumer.seekToBeginning(partitions);
ConsumerRecords<String,String> records = consumer.poll(Duration.ofSeconds(10));
Here, a different number of records is retrieved each time. How can I ensure that all 10 messages are read every time this read-from-topic functionality is called? Since I have only one consumer instance, I expected it to return data from all partitions, but different runs of the application return different subsets of messages.
I also set AUTO_OFFSET_RESET_CONFIG to earliest and enable-auto-commit=false so that I can read the same 10 messages again and again, since my functionality always does a full load of the data.
How can I solve this so that I get the whole data set each time instead of different subsets? Please correct me if my understanding is wrong.
Execution observed:
The first poll is there to ensure that the consumer is subscribed and that partition assignment is complete; otherwise the partition count comes back as 0. I found that in my topic, partition-0 has 4 records and partition-2 has 6 records. My task is to return all 10 records (all data from the start of every partition), so I also call seekToBeginning(partitions). Sometimes the poll on the last line of code returns 10 records, and sometimes it returns only 6 or 4. The initial poll also sometimes returns 0 records, after which the second poll returns either 6 or 4. Is there some code that will always read the entire data set, all 10 records?
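To make the question concrete, here is a minimal sketch of the kind of loop being asked about. A single poll returns only whatever has been fetched so far, so the sketch keeps polling until every partition's position reaches the end offset captured at the start. It is plain Java with no Kafka dependency so it can be run on its own: `Source`, `FakeSource`, `readAll`, and `maxPolls` are hypothetical names invented for illustration, and the fake simply assumes the 4-record and 6-record partitions described above, with an empty first poll.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class DrainSketch {

    /** Hypothetical stand-in for the three consumer calls the loop relies on. */
    interface Source {
        List<String> poll();               // plays the role of consumer.poll(timeout)
        long position(int partition);      // plays the role of consumer.position(tp)
        Map<Integer, Long> endOffsets();   // plays the role of consumer.endOffsets(assignment)
    }

    /** Poll repeatedly until every partition has reached its end offset (or maxPolls is hit). */
    static List<String> readAll(Source source, int maxPolls) {
        Map<Integer, Long> end = source.endOffsets(); // snapshot of "all data" taken once, up front
        List<String> all = new ArrayList<>();
        for (int i = 0; i < maxPolls && !caughtUp(source, end); i++) {
            all.addAll(source.poll()); // each poll may return none, some, or all of the records
        }
        return all;
    }

    /** True once no partition is still behind its end offset. */
    static boolean caughtUp(Source source, Map<Integer, Long> end) {
        for (Map.Entry<Integer, Long> e : end.entrySet()) {
            if (source.position(e.getKey()) < e.getValue()) return false;
        }
        return true;
    }

    /** Fake source: first poll is empty, then each poll returns one partition's records. */
    static class FakeSource implements Source {
        private final Map<Integer, List<String>> data = new TreeMap<>();
        private final Map<Integer, Integer> read = new TreeMap<>();
        private int polls = 0;

        FakeSource() {
            add(0, 4); // partition-0 with 4 records, as in the question
            add(2, 6); // partition-2 with 6 records, as in the question
        }

        private void add(int partition, int count) {
            List<String> records = new ArrayList<>();
            for (int i = 0; i < count; i++) records.add("p" + partition + "-" + i);
            data.put(partition, records);
            read.put(partition, 0);
        }

        public List<String> poll() {
            if (polls++ == 0) return new ArrayList<>(); // nothing fetched yet on the first poll
            for (Map.Entry<Integer, List<String>> e : data.entrySet()) {
                int done = read.get(e.getKey());
                int size = e.getValue().size();
                if (done < size) {
                    read.put(e.getKey(), size); // advance this partition's position to its end
                    return new ArrayList<>(e.getValue().subList(done, size));
                }
            }
            return new ArrayList<>();
        }

        public long position(int partition) {
            return read.get(partition);
        }

        public Map<Integer, Long> endOffsets() {
            Map<Integer, Long> end = new TreeMap<>();
            for (Map.Entry<Integer, List<String>> e : data.entrySet()) {
                end.put(e.getKey(), (long) e.getValue().size());
            }
            return end;
        }
    }

    public static void main(String[] args) {
        List<String> all = readAll(new FakeSource(), 100);
        System.out.println(all.size() + " records: " + all);
    }
}
```

With a real consumer, the same loop would presumably use consumer.endOffsets(consumer.assignment()), consumer.position(partition), and consumer.poll(Duration) in place of the stand-ins, with some upper bound on polls or elapsed time so the loop cannot spin forever.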