How to avoid cycles with MirrorMaker

I am using Kafka MirrorMaker 2.0 with 2 local Kafka clusters. From what I have read, getting rid of cycles without aliases is not possible at the moment.
I am trying to tune mm2.properties so that MM2 replicates topics under the same names, without aliases and without messages cycling between the topics.
Could you please tell me how to set up such replication?

Hi @EShabakhov

welcome to the forum :slight_smile:

Just to be sure:
you are trying to replicate a topic between your 2 Kafka clusters, correct?
I assume the following:
kafka-a → replicates topic test → kafka-b
kafka-b → replicates topic test → kafka-a

it would be nice if you could give some details on what you would like to achieve

best,
michael

Hi @mmuehlbeyer

Yes, I am trying to replicate a topic with the same name.

For example:
Cluster A replicates topic test to cluster B.
Cluster B replicates topic test to cluster A.

But MirrorMaker’s configuration does not allow replicating the same topic without aliases, and if I do not specify aliases in the configuration files, I get a message loop.

Can this be solved without modifying the source code?
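
To make the loop concrete, here is a toy simulation in plain Python. No Kafka is involved: the `mirror` helper and the list-based "topics" are made up for illustration, and the sketch assumes each flow simply copies every new record in `test` to the other cluster's `test`, with nothing marking where a record came from.

```python
def mirror(source, target, offset):
    """Copy the records this flow has not seen yet; return the new read offset."""
    target.extend(source[offset:])
    return len(source)


# One topic named "test" on each cluster, modelled as a plain list of records.
test_on_a = ["m1"]  # a client produced a single record to cluster A
test_on_b = []

offset_a_to_b = 0  # read position of the A -> B flow in A's topic
offset_b_to_a = 0  # read position of the B -> A flow in B's topic

# Three passes of both flows, with no new records produced by any client.
for _ in range(3):
    offset_a_to_b = mirror(test_on_a, test_on_b, offset_a_to_b)
    offset_b_to_a = mirror(test_on_b, test_on_a, offset_b_to_a)

print(len(test_on_a), len(test_on_b))  # prints "4 3"
```

The single record is copied back and forth on every pass, so both topics keep growing even though nobody produces anything new.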

Hi,

basically the alias prefix is a feature to prevent cycles

https://cwiki.apache.org/confluence/display/KAFKA/KIP-382%3A+MirrorMaker+2.0#KIP382:MirrorMaker2.0-RemoteTopics,Partitions

https://cwiki.apache.org/confluence/display/KAFKA/KIP-382%3A+MirrorMaker+2.0#KIP382:MirrorMaker2.0-Cycledetection
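
As a rough illustration of the idea behind those two KIP sections, here is a simplified Python model of alias-prefixed remote topics and the cycle check. The function names are Python stand-ins chosen for this sketch, not the Java API, and the model assumes the default `.` separator and topic names that do not themselves contain a dot.

```python
SEPARATOR = "."  # default separator between cluster alias and topic name


def format_remote_topic(source_alias, topic):
    """Name a mirrored topic gets on the target cluster, e.g. "A.test"."""
    return source_alias + SEPARATOR + topic


def topic_source(topic):
    """Alias of the cluster a topic was mirrored from, or None for a local topic."""
    if SEPARATOR not in topic:
        return None
    return topic.split(SEPARATOR, 1)[0]


def is_cycle(topic, target_alias):
    """True if mirroring `topic` to `target_alias` would send data back to its origin."""
    source = topic_source(topic)
    if source is None:
        return False  # local topic, safe to mirror
    if source == target_alias:
        return True  # this data already came from the target cluster
    # Strip one prefix and check the next hop, e.g. "B.A.test" -> "A.test".
    upstream = topic.split(SEPARATOR, 1)[1]
    return is_cycle(upstream, target_alias)


# Cluster B holds its own "test" plus "A.test" mirrored from cluster A.
topics_on_b = ["test", format_remote_topic("A", "test")]
to_mirror_b_to_a = [t for t in topics_on_b if not is_cycle(t, "A")]
print(to_mirror_b_to_a)  # prints "['test']"
```

In this model the B -> A flow skips `A.test`, so records from A never travel back to A. Drop the prefix and the check has nothing left to look at.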

what would you like to achieve with the topics mirrored in that way?

best,
michael


I’m not well versed in Kafka, but as far as I know there are connectors that read from and write to Kafka, and these connectors must read specific topics.

MirrorMaker can only provide replication if the alias settings are specified in the configuration files. If one Kafka cluster fails, some of the connectors will not be able to read the same topics in the other cluster, because the topics have different names there.
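
One possible way around the naming difference, sketched here under assumptions: Kafka consumers can subscribe by regular expression, and Kafka Connect sink connectors accept a `topics.regex` setting, so a client could match both the local topic and its alias-prefixed copy. Whether the connectors in question support this is an assumption, and the `failover_pattern` helper below is made up for illustration (it uses Python's `re`; the same simple pattern should carry over to Java regex syntax).

```python
import re


def failover_pattern(topic):
    """Regex matching `topic` itself and a single-alias-prefixed copy such as "A.test"."""
    return re.compile(r"^(?:[A-Za-z0-9_-]+\.)?" + re.escape(topic) + r"$")


pattern = failover_pattern("test")

# Topics a client might see on cluster B when A -> B mirroring uses alias prefixes.
topics_on_b = ["test", "A.test", "A.orders", "test2"]
matched = [t for t in topics_on_b if pattern.match(t)]
print(matched)  # prints "['test', 'A.test']"
```

With a subscription like this, a client that fails over to the other cluster picks up both the local `test` and the mirrored copy without being reconfigured.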

basically yes

but again it’s worth thinking about your use case and what you would like to achieve with an active/active MirrorMaker environment

see also

there is also a Stack Overflow discussion about the topic and how to avoid topic renaming

I guess that’s the config you’re looking for.

best,
michael


I would like to achieve a fault-tolerant system, in terms of both uptime and data, so that users can access either of the Kafka clusters without problems, even if one of them goes down.

I had already found these solutions on Stack Overflow, but I thought they would not solve my problem. I will try them again and leave feedback here.

Thanks for help! :slight_smile:


Hi

I tried to follow the recommendations in the answer to the Stack Overflow question, but the suggested solution uses MigrationReplicationPolicy, and MirrorMaker fails with an error on this class because it cannot find it.

hi,

which Kafka version are you running?

best,
michael

I am running Kafka v2.6

if my understanding is correct you have to create the class yourself

e.g. Cloudera has an example in their docs

as well as marcusportmann

hth,
michael

At the 1st link I found this:
“This replication policy is only supported with a unidirectional data replication setup where replication happens from a single source cluster to a single target cluster.
Configuring additional hops or bi-directional replication is not supported and can lead to severe replication issues.”

The 2nd link has code with the methods formatRemoteTopic() and configure(), and it also uses sourceClusterAlias.

I tried to change formatRemoteTopic as suggested on Stack Overflow, but a lot of tests fail when building MirrorMaker.
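
For reference, this is roughly what I understand such a "no renaming" policy to do, written as a toy Python model rather than the real Java class. The class name and the `source.cluster.alias` key are assumptions made for this sketch, not copied from either link.

```python
class IdentityPolicySketch:
    """Toy model of a replication policy that keeps topic names unchanged."""

    def configure(self, props):
        # Hypothetical config key: the single source cluster this policy serves.
        self.source_cluster_alias = props["source.cluster.alias"]

    def format_remote_topic(self, source_cluster_alias, topic):
        # Identity naming: the mirrored topic keeps its original name.
        return topic

    def topic_source(self, topic):
        # The name carries no origin information, so every topic is
        # attributed to the one configured source cluster.
        return self.source_cluster_alias


policy = IdentityPolicySketch()
policy.configure({"source.cluster.alias": "A"})

print(policy.format_remote_topic("A", "test"))  # prints "test"
print(policy.topic_source("test"))              # prints "A"
```

Because the name no longer says where a record came from, a B -> A flow cannot tell the mirrored `test` on B apart from data produced on B, which fits the warning quoted above that this kind of policy is only meant for one-way replication.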

hmm I see
I need to think about it, but maybe it makes sense to take one step back
and look at the options from a more architectural side:

who should consume/produce from which cluster and topic?
what topics should be replicated?
where are the clients located?
same datacenter as the cluster?

Hi

I need to build active-active replication in Kafka where the datacenters (as clusters) are located in different cities, for example. Clients can consume from and produce to the cluster closest to them, and they work with topics that have the same names, so that if one of the clusters goes down they do not have to rewrite connectors to connect to the other Kafka cluster.

ok, understood

one thing to consider:
who decides what is the “closest cluster”?

if you’d like to stick with MirrorMaker 2 I would go with a setup like this to prevent infinite cycles
[image not preserved]
source and credits to Instaclustr
(Apache Kafka MirrorMaker 2 (MM2) Part 2: Practice - Instaclustr)

but maybe some other options fit your needs as well

best,
michael

Thank you very much for the answers :slight_smile:
Do I understand correctly that the last link, with Replicator, is not a free solution from Confluent?

you’re welcome :slight_smile:

yes, Replicator is part of the commercial features

best,
michael

Hi

Thanks again for your help in resolving this issue. I work for a company that is located in Russia, and I have heard that Confluent is not really able to sell licenses to Russia.

Maybe there are other replicators that solve this problem?

yep

not sure whether it fits your needs but it might be worth a try


and here’s an interesting blog post
