We’re currently utilizing the Community Confluent Schema Registry to host our existing schemas.
Our Kafka Clusters run on Amazon MSK. Our clusters are replicated via MSK Replicator.
We’re trying to architect our disaster recovery process as well as perform a data migration to a
new set of clusters.
However, I’m running into an issue when trying to migrate the data in our schema registry.
As a reference point, I set up a test environment following the recommended configuration outlined
in the Multi-Datacenter Setup.
I’ve noticed however that during the failover event, the existing schema registry data is not
preserved. I followed the Runbook details outlined here, but
I can’t seem to preserve the existing data (something that is crucial for us).
I’ve had success accessing the existing data by changing the kafkastore.topic setting to match
the replicated __schema topic. That is, setting kafkastore.topic to <original-prefix>.__schema
preserves all the schema data. The problem is that the replicated copy gains another prefix on
each failover event, so the topic name keeps growing.
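To make the naming problem concrete, here is a minimal sketch of how the topic name would grow. It assumes the replicator prepends the source cluster's alias to the topic on every hop, and that after each failover the registry is re-pointed at the newly prefixed copy. The aliases "primary" and "secondary" and both function names are hypothetical, chosen only for illustration.

```python
from typing import Iterable


def replicated_topic(topic: str, source_alias: str) -> str:
    """Name a prefix-based replicator would give `topic` on the target cluster.

    Assumption: the replicated name is "<source alias>.<original topic name>".
    """
    return f"{source_alias}.{topic}"


def kafkastore_topic_after_failovers(base_topic: str, source_aliases: Iterable[str]) -> str:
    """Topic the registry would have to read after a series of failovers.

    `source_aliases` lists, in order, the alias of the cluster being failed
    away from at each failover event (hypothetical values).
    """
    topic = base_topic
    for alias in source_aliases:
        topic = replicated_topic(topic, alias)
    return topic


if __name__ == "__main__":
    # Fail over primary -> secondary, back to primary, then to secondary again.
    hops = ["primary", "secondary", "primary"]
    for n in range(len(hops) + 1):
        print(n, kafkastore_topic_after_failovers("__schema", hops[:n]))
```

Under these assumptions, one failover yields primary.__schema, and failing back yields secondary.primary.__schema, which is the growth I would like to avoid.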
I haven’t been able to find a simple solution to essentially mirror the __schema topic in the primary
cluster in the secondary cluster during the failover event.
Am I missing a step in this setup?
It seems a bit odd that the recommended procedure is to have both schema registries point to a single
cluster, then during failover point them to the secondary cluster and lose all the schema data.
Unfortunately, I cannot swap out our vendors, so I was wondering whether there is a vendor-agnostic method
to accomplish this.