Kafka Strict Ordering via SINGLE PARTITION and MULTIPLE PARTITION Strategies



I am working for a retail client. Each day we receive thousands of messages about different ORDERs from an SAP database. Each ORDER goes through several states, and the messages for those states need to be processed in strict order.

For example:

A user places an ORDER, say “ORDER1” (O1). In Kafka, this order passes through different states.

Step 1: First the order is created, which produces the message “ORDER1 CREATE1”. Call this state O1C1.

Step 2: Next the order is received by another department, so the next message is created, for example “ORDER1 RECEIVED”. Call this state O1R1.

Step 3: Finally, the order is processed and completed by another department, so the final message is created, for example “ORDER1 FINISHED”. Call this state O1F1.

Think of placing an AMAZON order: the order must first be created, then received, then completed. We are implementing the same logic for many orders.

For ORDER1 (O1), the order states are O1C1 --> O1R1 --> O1F1

Kafka receives many orders like this, say O2, O3, ... O100.

A VERY important caveat: sometimes the messages arrive from the SAP database in the wrong order and are written to Kafka that way. We would still like to process them in order, by appending some KEY or TIMESTAMP to those messages in Kafka. All I want to achieve is strict ordering of each message state, as shown above. I would like to know the different options in Kafka. Here are my thoughts; I will wait for your further input.

Question: I think that with the above option I can achieve strict ordering in the sequence in which I receive the messages from SAP. But if I receive the messages from SAP in the wrong order (as below), how can I retrieve them from Kafka in the right order?

Wrong order: O1F1 --> O1C1 --> O1R1
Right order: O1C1 --> O1R1 --> O1F1

Option 2: ONE TOPIC with MULTIPLE PARTITIONS, consumed by MULTIPLE CONSUMERS, with a KEY appended to each message (K1, K2, ... are the keys below).

“O1C1+K1” --> “O1S1+K2” --> “O1R1+K3” --> “O1F1+K4”
“O2C2+K1” --> “O2S2+K2” --> “O2R2+K3” --> “O2F2+K4”

Question: With multiple consumers, how do I make sure that the messages for each order are consumed with the states in strict order?

Kafka only guarantees to preserve order within a partition. Thus, you can only read ordered data if you write ordered data. If your upstream system does not provide ordered data, you need to order it before writing it into Kafka, or re-order it on read.
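As a rough illustration of re-ordering on read, here is a minimal Python sketch. It assumes every order passes through the three states from the question (C, R, F) exactly once. The class name `OrderResequencer` and the in-memory buffers are hypothetical and not part of any Kafka API; a real consumer would also need to bound or persist the buffer and decide what to do when a state never arrives.

```python
from collections import defaultdict

# Assumed lifecycle, taken from the example in the question:
# Create, Received, Finished.
EXPECTED_STATES = ["C", "R", "F"]


class OrderResequencer:
    """Buffers out-of-order state messages and releases them in lifecycle order."""

    def __init__(self, expected_states=EXPECTED_STATES):
        self.expected = list(expected_states)
        self.next_index = defaultdict(int)  # order id -> index of next state to release
        self.pending = defaultdict(dict)    # order id -> {state: payload}

    def accept(self, order_id, state, payload):
        """Store one message and return the messages that are now safe to process."""
        self.pending[order_id][state] = payload
        released = []
        while self.next_index[order_id] < len(self.expected):
            wanted = self.expected[self.next_index[order_id]]
            if wanted not in self.pending[order_id]:
                break  # the next state in the lifecycle has not arrived yet
            released.append((order_id, wanted, self.pending[order_id].pop(wanted)))
            self.next_index[order_id] += 1
        return released


# The "wrong order" case from the question: O1F1, then O1C1, then O1R1.
resequencer = OrderResequencer()
arrivals = [("O1", "F", "O1F1"), ("O1", "C", "O1C1"), ("O1", "R", "O1R1")]
processed = []
for order_id, state, payload in arrivals:
    processed.extend(resequencer.accept(order_id, state, payload))

print([payload for _, _, payload in processed])
```

The FINISHED message is held back until CREATE and RECEIVED have both been released, so downstream processing sees O1C1, O1R1, O1F1.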

The ordering guarantee is on a per-partition basis. Thus, for your example, you can use a topic with multiple partitions, as long as you guarantee that all records for the same order are written into the same partition. The easiest way to achieve this would be to use the order-id as message key.
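To make the keying idea concrete, here is a small simulation in plain Python rather than real producer code. It assumes a hypothetical topic with 4 partitions and uses `zlib.crc32` purely as a stand-in hash; Kafka clients use their own partitioning hash, so the actual partition numbers would differ. The point is only that the same key always maps to the same partition, so all states of one order stay together in write order.

```python
import zlib

NUM_PARTITIONS = 4  # hypothetical partition count for the topic


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a message key to a partition (stand-in hash, not Kafka's own)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


# (key, payload) pairs in the order they are written; the key is the order-id.
messages = [
    ("O1", "O1C1"), ("O2", "O2C2"),
    ("O1", "O1R1"), ("O2", "O2R2"),
    ("O1", "O1F1"), ("O2", "O2F2"),
]

# Each partition is an append-only log; list position plays the role of the offset.
partitions = {p: [] for p in range(NUM_PARTITIONS)}
for order_id, payload in messages:
    partitions[partition_for(order_id)].append((order_id, payload))

for order_id in ("O1", "O2"):
    log = partitions[partition_for(order_id)]
    states = [payload for key, payload in log if key == order_id]
    print(order_id, "-> partition", partition_for(order_id), states)
```

Note that the key here is the order-id alone. If the key also contained the state (or a per-message suffix), the states of one order could hash to different partitions and the per-order ordering would be lost.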

Thank you.
We are getting the data into the Kafka topics in the right order, but the consumers are not able to consume it in the right order (Order Create, Order Receive and Order Finished).
We will implement the solution you suggested by appending a key with the order status. Do we need to do anything else on the consumer side?

Consumers always read topic partitions in offset order. So if you write ordered data, there is nothing you need to do on the consumer side.
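This also holds with multiple consumers, because within a consumer group each partition is read by only one consumer at a time. The following plain-Python sketch simulates that. The partition contents, the consumer names, and the round-robin assignment are made up for illustration; a real group assignment is decided by the configured assignor.

```python
# Hypothetical partition logs, already written in order with the order-id as key.
partition_logs = {
    0: ["O1C1", "O3C3", "O1R1", "O1F1", "O3R3", "O3F3"],
    1: ["O2C2", "O2R2", "O2F2"],
    2: ["O4C4", "O4R4", "O4F4"],
}
consumers = ["consumer-a", "consumer-b"]  # hypothetical members of one group

# Each partition is owned by exactly one consumer of the group
# (simple round-robin here, as an assumption).
assignment = {
    partition: consumers[i % len(consumers)]
    for i, partition in enumerate(sorted(partition_logs))
}

# Every consumer reads each of its partitions from the lowest offset upward.
seen = {consumer: [] for consumer in consumers}
for partition, owner in assignment.items():
    for offset, payload in enumerate(partition_logs[partition]):
        seen[owner].append(payload)


def states_for(order_id, stream):
    """Return the messages of one order in the sequence a consumer saw them."""
    return [message for message in stream if message.startswith(order_id)]


for consumer, stream in seen.items():
    print(consumer, stream)
```

Messages of different orders interleave within a partition, but each order's states reach a single consumer in the order they were written.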

Thanks. Just to summarize my understanding:
As we already have the ordered data available in a Kafka topic with one partition, all I need to do is pause the consumers, update the producer config to send the data with keys, and restart the consumers.
Alternatively, for better parallelism, we can use a topic with multiple partitions, send the payloads with keys, and let multiple consumers read in parallel, which should be faster than the solution above.

Yes, sounds about right.


Awesome. Appreciate your time and responses.

Thank you.
