Can we get data in order?

It’s difficult to pull data in order out of Kafka due to its high scalability and parallelism. But on the sink client side, if we reverse the partitioner's algorithm after fetching data from the Kafka brokers, is it possible to get the data in order?

It’s difficult to pull data in order out of Kafka due to its high scalability and parallelism.

Data stored in Kafka is strictly ordered within each topic-partition. I suppose what you are referring to is that Kafka does not provide total ordering across the partitions of a topic?
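To make the per-partition guarantee concrete, here is a minimal simulation in plain Python (no Kafka client involved). The `pick_partition` function, the CRC32 hash, the three-partition topic, and the sample events are all assumptions for illustration; Kafka's default partitioner uses a different hash. The point is only that events sharing a key land in the same partition and keep their relative order there, while nothing relates the order of events in different partitions.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 3  # hypothetical topic with three partitions


def pick_partition(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    # Stand-in for a key-hashing partitioner (assumption: CRC32 instead of Kafka's hash).
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def produce(events):
    """Append each (key, value) to its partition log; the list index acts as the offset."""
    logs = defaultdict(list)
    for key, value in events:
        logs[pick_partition(key)].append((key, value))
    return logs


# Sample events in the order the producer sent them (made-up data).
events = [
    ("user-a", 1), ("user-b", 1), ("user-a", 2),
    ("user-c", 1), ("user-b", 2), ("user-a", 3),
]
logs = produce(events)

for partition, log in sorted(logs.items()):
    print(partition, log)
```

Within each printed partition log, the values for any one key appear in the order they were sent. Reading the partitions one after another does not, in general, reproduce the original send order across keys.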

But on the sink client side, if we reverse the partitioner's algorithm after fetching data from the Kafka brokers, is it possible to get the data in order?

If you indeed mean what I said above, then yes, you can of course reorder (and/or filter, drop, modify, etc.) received events in a consuming application in whatever way you need.
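As a rough sketch of such consumer-side reordering, the snippet below merges already-fetched per-partition batches into one global order. It assumes every record carries a producer-assigned sequence number (the `seq` field is hypothetical; Kafka offsets are only meaningful within a single partition, so they cannot serve as a global sort key). It also assumes each partition's records are already sorted by `seq`, which is what the per-partition ordering gives you when a single producer writes them in sequence.

```python
import heapq
from typing import Dict, Iterator, List, NamedTuple


class Record(NamedTuple):
    seq: int        # hypothetical producer-assigned global sequence number
    partition: int  # partition the record was read from
    value: str


def merge_partitions(partitions: Dict[int, List[Record]]) -> Iterator[Record]:
    """K-way merge of per-partition batches, each already sorted by seq."""
    return heapq.merge(*partitions.values(), key=lambda record: record.seq)


# Made-up batches as a consumer might have fetched them from three partitions.
partitions = {
    0: [Record(0, 0, "a"), Record(3, 0, "d")],
    1: [Record(1, 1, "b"), Record(4, 1, "e")],
    2: [Record(2, 2, "c")],
}

ordered = [record.value for record in merge_partitions(partitions)]
print(ordered)  # ['a', 'b', 'c', 'd', 'e']
```

A merge like this only works on data you have already fetched. On a live stream you also need to decide how long to wait for a slow partition before emitting, which is where the out-of-order and late data handling comes in.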

Be aware that you also have to account for what we call out-of-order and late data. This is data that arrives out of order or late for a different reason, e.g. because producing applications lost network connectivity for a period of time (like a mobile device losing Internet connectivity while aboard an airplane), or because the local clocks of producing applications are not properly synced with NTP. This out-of-orderness has nothing to do with how Kafka works; it is a general issue you have to deal with in any streaming system.
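One common way to cope with this is to buffer events for a grace period and release them in event-time order once they are old enough. The sketch below is a generic illustration, not a Kafka or Kafka Streams API: the `ReorderBuffer` class, the `allowed_lateness` parameter, and the integer timestamps are all assumptions. Events that arrive after the buffer has already moved past their timestamp are set aside as late rather than emitted out of order.

```python
import heapq
from typing import List, Tuple

Event = Tuple[int, str]  # (event-time timestamp, payload); both hypothetical


class ReorderBuffer:
    """Releases events in timestamp order after a grace period."""

    def __init__(self, allowed_lateness: int):
        self.allowed_lateness = allowed_lateness
        self._heap: List[Event] = []
        self._watermark = float("-inf")  # events at or below this may be released
        self.late: List[Event] = []      # events that arrived too late to reorder

    def add(self, ts: int, value: str) -> List[Event]:
        """Accept one event and return any events that are now safe to release."""
        if ts < self._watermark:
            # Events up to the watermark may already have been released.
            self.late.append((ts, value))
            return []
        heapq.heappush(self._heap, (ts, value))
        self._watermark = max(self._watermark, ts - self.allowed_lateness)
        released = []
        while self._heap and self._heap[0][0] <= self._watermark:
            released.append(heapq.heappop(self._heap))
        return released

    def flush(self) -> List[Event]:
        """Release everything still buffered, in timestamp order."""
        released = []
        while self._heap:
            released.append(heapq.heappop(self._heap))
        return released


buffer = ReorderBuffer(allowed_lateness=5)
arrivals = [(10, "a"), (12, "b"), (11, "c"), (20, "d"), (9, "e"), (21, "f")]

emitted: List[Event] = []
for ts, value in arrivals:
    emitted.extend(buffer.add(ts, value))
emitted.extend(buffer.flush())

print(emitted)      # [(10, 'a'), (11, 'c'), (12, 'b'), (20, 'd'), (21, 'f')]
print(buffer.late)  # [(9, 'e')]
```

The grace period is a trade-off: a larger `allowed_lateness` tolerates more disorder but delays output and holds more events in memory. What to do with late events (drop them, log them, or route them elsewhere) depends on the application.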

Thank you so much @miguno. It is clear to me now.
