Ordering of events

iampatelrp · 8 December 2021 01:13

Would anyone know if there is any way to guarantee ordering of messages across partitions? We need to implement “at least once” delivery with message ordering, but everything I read points to an ordering guarantee within a partition, not across partitions…

Also, all the articles are relatively old, so I don’t know whether the most recent versions provide this capability or whether there is a good design pattern that can be used.

We are getting messages/events from a mainframe, and they come at us from batch jobs… so a drop of 1 million messages within a few milliseconds. When that happens across multiple threaded jobs, we struggle to maintain the order in which they were placed.

One obvious option is to add a timestamp at the source, but we would prefer to avoid that if possible.

Any thoughts?

rmoff · 8 December 2021 08:35

Hi @iampatelrp, welcome to the forum!

Can you expand on why you need strict ordering across partitions? What’s the process/requirement that this is supporting?

Usually strict ordering is needed to support particular business logic, and it can be enforced by setting the key on messages correctly so that all related messages are written to the same partition.

iampatelrp · 8 December 2021 12:51

It is indeed required for business logic. Could you please elaborate further on setting keys on messages?

riferrei · 8 December 2021 14:24

Robin meant that if you use keys on your records, you will be able to control which partitions your records end up in. Though I’m afraid this is only partially true.

Kafka distributes records across partitions based on the criteria specified via an entity called a partitioner. Every producer has one. The default one is called DefaultPartitioner, and it uses a simple algorithm that prioritizes partition affinity over distribution. When the affinity cache is empty (which is usually the case the first time a partition is used), the partitioner relies on logic to route the record to a specific partition based on the key specified in the record. Under the same logic, if you don’t set a key on the record, the partitioner will pick a random partition to use.
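
To make the key-based routing concrete, here is a minimal sketch of the "hash the key, take it modulo the partition count" idea. It is not the real DefaultPartitioner: Kafka's Java client hashes the serialized key with murmur2, and CRC32 is used here only as a stand-in, so actual partition numbers will differ. The function name `partition_for` and the key `customer-123` are made up for illustration.

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    """Pick a partition for a keyed record: hash(key) mod partition count.

    Assumption: CRC32 stands in for the murmur2 hash that Kafka's Java
    client applies to the serialized key.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


# The same key with the same partition count always maps to the same partition.
first = partition_for("customer-123", 6)
again = partition_for("customer-123", 6)
print(first == again)  # True

# The result depends on the partition count, so adding partitions to a topic
# may send the same key to a different partition from then on.
print(partition_for("customer-123", 6), partition_for("customer-123", 12))
```

The point of the sketch is only that the mapping is deterministic for a fixed key and a fixed number of partitions.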

Pragmatically speaking, you have to configure your mainframe, or the middleware that pulls from the mainframe and sends data to Kafka, to set these keys on the records so that ordering within a partition can be obtained.

To better illustrate what I am trying to say, I recommend you read this blog that I wrote a while ago that explains how partitioning/assignment works at a record level in Kafka. It uses a buckets pattern to implement a reasonable level of ordering across partitions.

— @riferrei

rmoff · 8 December 2021 14:51

Right - can you elaborate on that? Can you give an example of the processing that relies on strict ordering of all messages?

iampatelrp · 8 December 2021 16:22

Thank you for the detailed explanation. I am still reading through the blog post but it is beginning to make sense now.

@rmoff The key on the messages is the same. For example, Customer 123 spent $34 at Amazon, a minute later spent $40 at Walmart, and a second later received a credit of $50 from Target.

Then, while we are processing the messages between Walmart and Target, he has reached his credit limit, so he should get a payment declined at Kohls. But if we have processed the credit back from Target, the payment should go through.

Now, as crazy as it sounds, we don’t have a timestamp on these events, and the key is the customer ID. My question/concern is that I want to keep the order of the events identical when we ship them to another consumer.

You are right that since the key is identical, they should end up going to the same partition, but I am not sure whether that is guaranteed. I also feel like this is probably a fairly common issue and that there should be a design pattern to handle it which I am not aware of.

Also, our consumer is DocumentDB, and we apply upserts due to the nature of the middleware that’s producing the events. If we don’t process events in the exact same order as they are produced, we run the risk of data inconsistency: the middleware shows the current record as Kohls, but on the Kafka consumer side, if ordering isn’t guaranteed, we may show Target or Walmart as the latest transaction.
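
A minimal sketch of the upsert concern above, assuming a last-write-wins upsert keyed by customer ID. A plain dict stands in for the document store, and `apply_upserts` is a hypothetical helper, not part of any Kafka or database API.

```python
def apply_upserts(events):
    """Apply (customer_id, merchant) events as last-write-wins upserts."""
    store = {}
    for customer_id, merchant in events:
        store[customer_id] = merchant  # upsert: the latest event overwrites
    return store


# Order in which the events were produced for customer 123.
produced = [("123", "Amazon"), ("123", "Walmart"), ("123", "Target"), ("123", "Kohls")]

# Same events, but the last two arrive swapped.
reordered = [produced[0], produced[1], produced[3], produced[2]]

print(apply_upserts(produced))   # {'123': 'Kohls'}
print(apply_upserts(reordered))  # {'123': 'Target'}
```

With the original order the store ends on Kohls; with two events swapped it ends on Target, which is the inconsistency described.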

rmoff · 8 December 2021 18:53

So long as your message key is the customer ID, and within the constraints of what @riferrei describes, then yes you do have a guarantee of ordering.
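
As a rough illustration of that guarantee, the sketch below simulates partitions as append-only lists and routes each event by a hash of its key. Assumptions: CRC32 stands in for Kafka's murmur2 key hash, the partition count is 3, and customers 456 and 789 and their merchants are invented to interleave with customer 123's events.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 3  # hypothetical partition count


def partition_for(key: str) -> int:
    # Stand-in hash; Kafka's Java client uses murmur2 on the serialized key.
    return zlib.crc32(key.encode("utf-8")) % NUM_PARTITIONS


def per_key_order(log, key):
    """Return the merchants for one key in the order they appear in a log."""
    return [merchant for k, merchant in log if k == key]


# Events in production order, interleaved across three customers.
events = [
    ("123", "Amazon"), ("456", "Costco"), ("123", "Walmart"),
    ("789", "Ikea"), ("123", "Target"), ("456", "Aldi"), ("123", "Kohls"),
]

# Each partition is an append-only log; a record goes where its key hashes.
partitions = defaultdict(list)
for key, merchant in events:
    partitions[partition_for(key)].append((key, merchant))

# All of customer 123's events sit in one partition, in production order.
log = partitions[partition_for("123")]
print(per_key_order(log, "123"))  # ['Amazon', 'Walmart', 'Target', 'Kohls']
```

Events for other customers may share that partition or land elsewhere, but the relative order of records with the same key is preserved. Nothing here says anything about ordering between different keys.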

system · 15 December 2021 18:54

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.
