Does Kafka provide an at-least-once guarantee across clusters?

maithilicharan · 14 June 2024 16:51

I have a requirement to write a KafkaClusterBridge app that reads from a topic ‘TopicA’ in Cluster A and publishes to another topic ‘TopicB’ located in Cluster B. In this scenario, can I rely on Kafka guarantees such as at-least-once delivery and ordering across the clusters?

Should I just use Mirror Maker 2 to replicate TopicA into Cluster B and start my data pipeline processing there, so that I get Kafka's guarantees and can stop worrying about failover scenarios, instead of writing my own KafkaClusterBridge application?

TL;DR: Does Kafka provide offset management and delivery guarantees across clusters without Mirror Maker?

dtroiano · 14 June 2024 17:50

Mirror Maker 2 replicates data asynchronously, so in general data can be lost during a failover scenario. For example:

1. producer to primary cluster with acks=all has successfully written 1000 records
2. primary fails
3. failover to DR cluster

At #3, the DR cluster can have anywhere from 0 to 1000 of those records because MM2 replicates async.

Also Mirror Maker 2 doesn’t preserve offsets.

This Kafka Internals course module discusses MM2 plus other geo-replication options and their properties. If you need at-least-once delivery with respect to the primary and DR clusters, you would need a synchronous replication solution: either a single Kafka stretch cluster that spans regions and uses Kafka’s internal replication and delivery guarantees, or Confluent Platform’s Multi-Region Clusters feature (with synchronous replication, not observers).

maithilicharan · 17 June 2024 12:36

Thanks for your reply, Dave. I appreciate it. Your answer addresses the multi-region DR scenario, but my scenario is a bit different; perhaps I should have explained it more clearly.

As part of a data pipeline migration, I have a requirement to mirror a high-volume topic from Confluent Cloud to an On Prem cluster. To avoid duplicating the data and the throughput, could I consume from the Confluent Cloud topic, apply the business logic, and finally publish to a downstream topic in the On Prem cluster, with manual commits and with the producer settings (acks=all, retries=3, enable.idempotence=true)?

Could I expect seamless failover synchronisation across the two clusters with this approach? Can I get an at-least-once guarantee to work across Confluent Cloud and the On Prem cluster? If not, do you see any other potential scenarios where I could lose messages?
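
For reference, the consume, process, publish, then manually commit flow described above could be sketched roughly as below. This is only an illustration under stated assumptions, not a vetted implementation: `bridge_batch`, `transform`, and `dest_topic` are hypothetical names, and the `consumer` and `producer` arguments are assumed to behave like the confluent-kafka-python `Consumer` and `Producer` (only `consume`, `produce`, `flush`, and `commit` are used). The point it shows is the ordering: source offsets on the first cluster are committed only after the second cluster has acknowledged every record in the batch.

```python
def bridge_batch(consumer, producer, dest_topic, transform,
                 batch_size=100, timeout=10.0):
    """Copy one batch from the source cluster to the destination cluster.

    Assumption: `consumer` and `producer` expose the same methods as
    confluent_kafka.Consumer / confluent_kafka.Producer, and the consumer
    was created with auto-commit disabled.
    Returns the number of records bridged.
    """
    msgs = consumer.consume(num_messages=batch_size, timeout=timeout)
    if not msgs:
        return 0

    failed = []

    def on_delivery(err, _msg):
        # Called once per record when the destination acks or rejects it.
        if err is not None:
            failed.append(err)

    for msg in msgs:
        if msg.error():
            raise RuntimeError(f"consume error: {msg.error()}")
        producer.produce(dest_topic,
                         key=msg.key(),
                         value=transform(msg.value()),  # business logic
                         on_delivery=on_delivery)

    # Block until every record is acked or the timeout expires.
    still_queued = producer.flush(timeout)
    if failed or still_queued:
        # Do NOT commit: after a restart the batch is re-read and re-sent,
        # which can produce duplicates but does not lose records.
        raise RuntimeError("destination did not acknowledge every record; "
                           "source offsets not committed")

    # Only now is it safe to advance the source offsets.
    consumer.commit(asynchronous=False)
    return len(msgs)
```

With a sketch like this, a crash between `flush` and `commit` replays the batch on restart, so the result is at-least-once rather than exactly-once. As far as I understand it, `enable.idempotence=true` only de-duplicates the producer's own retries within a session, not records re-sent after such a replay, and Kafka transactions are scoped to a single cluster, so the source offsets cannot be committed atomically with the writes to the destination. Per-key ordering would also depend on the keys being passed through unchanged, as they are here.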
