Debezium SQL Server v2: Best practices for handling Schema Namespace fragmentation across multiple DB shards (SpecificRecord issue)

229178 · 28 January 2026 15:44

Hi everyone,

I am looking for advice on how to handle Avro schema namespaces when ingesting data from multiple identical SQL Server databases (sharding scenario) using the fully managed SQL Server Connector v2.

The Scenario: We have N databases with identical schemas (e.g., ShardedDB_01, ShardedDB_02…) writing to the same Kafka topics. In our legacy pipeline (Connector v1), the default behavior produced a uniform namespace for all records, regardless of the physical source database. This allowed our Java consumers to use SpecificRecord with a single class generated from one .avsc file for all incoming messages, without any custom configuration.

The Problem with v2: The Debezium v2 connector now generates schemas where the database name is hardcoded into the namespace by default:

Source A: my.prefix.ShardedDB_01.dbo.MyTable
Source B: my.prefix.ShardedDB_02.dbo.MyTable

Even though the fields are identical, Schema Registry treats these as completely different schemas/namespaces.

The Impact: Because of this fragmentation, we cannot use the standard SpecificDeserializer anymore. The consumer expects a specific class (e.g., com.mycompany.avro.MyTable), but receives records whose schemas point to dynamic, DB-specific namespaces. We are forced to fall back to GenericRecord, losing type safety, which is a significant regression from our v1 experience.

My Questions:

Source Side (Normalization): Is there a native Debezium v2 configuration or a standard SMT pattern to exclude the Database Name from the namespace (restoring the v1-like uniform behavior: my.prefix.dbo.MyTable)?

Note: We attempted a recursive custom SMT to rewrite the namespace, but traversing Debezium’s complex, deep structures (before/after structs) proved to be too memory-intensive, causing OOM errors on high-throughput workers.
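
For reference, one way to keep such a rewrite cheap is to compute the normalized name once per distinct schema name instead of once per record. The sketch below only shows that naming-and-caching step in plain Java. It assumes the shard databases follow the ShardedDB_<digits> pattern from the examples above and that nested schemas use Debezium's usual <namespace>.Envelope / <namespace>.Value names. NamespaceNormalizer is a hypothetical helper, not part of Debezium or Kafka Connect, and rebuilding the Connect schemas with the new names is not shown.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.regex.Pattern;

/** Hypothetical helper; not part of Debezium or Kafka Connect. */
final class NamespaceNormalizer {
    // Assumption: shard databases are named ShardedDB_<digits>, as in the examples above.
    private static final Pattern SHARD_SEGMENT = Pattern.compile("\\.ShardedDB_\\d+(?=\\.)");

    // One entry per distinct schema name, so the regex runs once per name, not once per record.
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    /** Drops the database segment from a fully qualified schema name. */
    String normalize(String schemaName) {
        return cache.computeIfAbsent(schemaName,
                name -> SHARD_SEGMENT.matcher(name).replaceFirst(""));
    }

    /** Number of distinct names seen so far (stays small when schemas are stable). */
    int cachedNames() {
        return cache.size();
    }

    public static void main(String[] args) {
        NamespaceNormalizer normalizer = new NamespaceNormalizer();
        // prints my.prefix.dbo.MyTable.Envelope
        System.out.println(normalizer.normalize("my.prefix.ShardedDB_01.dbo.MyTable.Envelope"));
        // prints my.prefix.dbo.MyTable.Value
        System.out.println(normalizer.normalize("my.prefix.ShardedDB_02.dbo.MyTable.Value"));
    }
}
```

The cache grows with the number of distinct schema names (tables times shards times nested types), not with throughput, which is the property the recursive per-record approach lacked.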

Consumer Side: If normalization at the source is not possible, how do you handle SpecificRecord deserialization in a sharded v2 scenario? Is there a standard way to map multiple writer schemas to a single reader schema without maintaining N duplicate POJOs?
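
To make the consumer-side idea concrete: one possible approach is to normalize the writer schema text before it is parsed, so that every shard yields the same fully qualified names. The sketch below is plain Java and makes several assumptions: the schema JSON is obtained by registry ID through a caller-supplied lookup function, the shard databases match ShardedDB_<digits>, and the target is the v1-style namespace my.prefix.dbo.MyTable. WriterSchemaNormalizer is a hypothetical name, not part of Avro or the Schema Registry client, and wiring the result into a deserializer is not shown.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.IntFunction;

/** Hypothetical consumer-side helper; not part of Avro or the Schema Registry client. */
final class WriterSchemaNormalizer {
    // Assumption: returns the writer schema JSON for a given registry schema ID.
    private final IntFunction<String> fetchSchemaJson;

    // Normalized schema text per schema ID, so each ID is fetched and rewritten once.
    private final Map<Integer, String> byId = new ConcurrentHashMap<>();

    WriterSchemaNormalizer(IntFunction<String> fetchSchemaJson) {
        this.fetchSchemaJson = fetchSchemaJson;
    }

    /** Returns the schema JSON with the ShardedDB_<digits> segment removed from every name. */
    String normalizedSchemaJson(int schemaId) {
        return byId.computeIfAbsent(schemaId, id ->
                fetchSchemaJson.apply(id).replaceAll("\\.ShardedDB_\\d+(?=\\.)", ""));
    }
}
```

With this, N shard-specific writer schemas collapse to one schema text per table. If the generated class lives in a different namespace (such as com.mycompany.avro), the replacement would have to target that namespace instead.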

Any help to restore the “uniform schema” capability would be greatly appreciated.
Thanks!

mmuehlbeyer · 29 January 2026 09:35

Hi @229178

are you looking for something like this:

hth,

michael

229178 · 29 January 2026 19:16

Hi Michael, thanks for the link.

However, SetSchemaMetadata appears to support only static schema name definitions.

Since I am ingesting N different tables from X different databases, I cannot hardcode a single static value; I need the schema names to be generated dynamically.
Additionally, SetSchemaMetadata is shallow: it does not fix the namespaces of the nested before/after structs, so the internal structures remain incompatible with SpecificRecord.
