Hello!
We use Confluent and Schema Registry, with Protobuf schemas.
There is an upstream team working in .NET, which drives the schema evolution.
I work in the downstream BI team, working in Java. We consume from their topics, and also have our own topics. Our Java project contains copies of the protos, from which we autogenerate Java classes, and those copies always lag behind the upstream versions.
I’m now starting to use Kafka Streams in a new microservice. I’m hitting this snag:
We allow Kafka Streams (K.S.) to create topics, so that it can create the ‘repartition’ and ‘changelog’ topics it needs for the KTables and the operations on them.
We also allow K.S. to POST to the Schema Registry (S.R.), so that it can register protos for those topics.
props.put("auto.register.schemas", true);
K.S. consumes topics from the upstream team whose messages contain nested protos, so K.S. needs to look those schemas up in the S.R.
While debugging, I found that K.S. believes the protos for some subjects have not been registered, although they are actually there. According to the LLMs I asked, K.S. derives the proto schema from the autogenerated Java classes, calculates a hash, and compares it with what is in the S.R.
It seems this fails because the upstream team has for instance registered:
syntax = "proto3";
package mic;
option csharp_namespace = "Example.Common";
enum XYZ {
X = 0;
Y = 1;
}
But when we (the Java BI team) add protos to our project to autogenerate Java classes, we add Java options:
syntax = "proto3";
package mic;
option csharp_namespace = "Example.Common";
option java_package = "com.example.protobuf.common";
option java_outer_classname = "XyzOuterClass";
option java_multiple_files = false;
enum XYZ {
X = 0;
Y = 1;
}
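To illustrate why a comparison based on the schema text could treat these two files as different schemas, here is a minimal Java sketch. It does not reproduce what the serializer or the S.R. actually does internally; the class `SchemaTextCompare` and its helper `stripJavaOptions` are hypothetical names, and the regex-based stripping is only an illustration of the difference, not a proposed fix.

```java
import java.util.regex.Pattern;

public class SchemaTextCompare {
    // Matches one whole line such as: option java_package = "com.example";
    private static final Pattern JAVA_OPTION_LINE =
            Pattern.compile("(?m)^[ \\t]*option[ \\t]+java_\\w+[ \\t]*=.*;[ \\t]*\\R?");

    // Schema text as registered by the upstream team.
    static final String UPSTREAM = String.join("\n",
            "syntax = \"proto3\";",
            "package mic;",
            "option csharp_namespace = \"Example.Common\";",
            "enum XYZ {",
            "  X = 0;",
            "  Y = 1;",
            "}") + "\n";

    // Same schema as kept in the Java project, with the extra java_* options.
    static final String LOCAL = String.join("\n",
            "syntax = \"proto3\";",
            "package mic;",
            "option csharp_namespace = \"Example.Common\";",
            "option java_package = \"com.example.protobuf.common\";",
            "option java_outer_classname = \"XyzOuterClass\";",
            "option java_multiple_files = false;",
            "enum XYZ {",
            "  X = 0;",
            "  Y = 1;",
            "}") + "\n";

    // Removes every "option java_... = ...;" line from the schema text.
    static String stripJavaOptions(String proto) {
        return JAVA_OPTION_LINE.matcher(proto).replaceAll("");
    }

    public static void main(String[] args) {
        // The raw texts differ because of the three java_* option lines.
        System.out.println(UPSTREAM.equals(LOCAL));                   // false
        // Without those lines the two texts are identical.
        System.out.println(UPSTREAM.equals(stripJavaOptions(LOCAL))); // true
    }
}
```

The point of the sketch is only that the enum itself is identical in both files; the mismatch comes entirely from the language-specific options.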
While debugging I found that K.S. then registered such protos itself, thinking they did not exist yet. What’s worse, we lag behind, and the upstream team had already created a version 2 with:
enum XYZ {
X = 0;
Y = 1;
Z = 2;
}
K.S. has now created a version 3, based on the old one in our project, with:
enum XYZ {
X = 0;
Y = 1;
}
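The regression from version 2 to version 3 can be stated as a set difference over the enum value names. The sketch below uses a hypothetical class `EnumValueDiff`; it only detects dropped names and makes no claim about which compatibility rules the S.R. actually enforces for Protobuf enums.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class EnumValueDiff {
    // Returns the value names present in the older version but missing in the newer one.
    static Set<String> removedValues(List<String> older, List<String> newer) {
        Set<String> removed = new LinkedHashSet<>(older);
        removed.removeAll(newer);
        return removed;
    }

    public static void main(String[] args) {
        List<String> v2 = Arrays.asList("X", "Y", "Z"); // upstream version 2
        List<String> v3 = Arrays.asList("X", "Y");      // version 3 registered from the stale copy
        System.out.println(removedValues(v2, v3));      // prints [Z]
    }
}
```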
I think this would normally not be possible to do by hand via REST, since it violates the schema evolution rules, but K.S. was somehow able to do it.
So that’s my dilemma:
we have to keep: props.put("auto.register.schemas", true);
for those places where we want K.S. to register protos for its own autogenerated topics,
but at the same time we have to get it to find and accept the protos that are already registered, instead of doing that hash-based comparison, which will always fail.
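To make the two requirements concrete, here is a rough sketch of them as two separate serde configurations instead of one global setting. It only builds the config maps. `auto.register.schemas` and `use.latest.version` are Confluent serializer settings, but whether splitting the configuration like this actually resolves the problem above is an assumption I have not verified. The class name, method names and registry URL are placeholders.

```java
import java.util.HashMap;
import java.util.Map;

public class SerdeConfigSketch {
    // Placeholder URL, not our real registry.
    static final String REGISTRY_URL = "http://localhost:8081";

    // For K.S.-owned topics (repartition, changelog): let the serializer register schemas.
    static Map<String, Object> internalTopicSerdeConfig() {
        Map<String, Object> config = new HashMap<>();
        config.put("schema.registry.url", REGISTRY_URL);
        config.put("auto.register.schemas", true);
        return config;
    }

    // For schemas owned by the upstream team: never register,
    // and use the latest version already in the registry (assumed to be the desired behaviour).
    static Map<String, Object> upstreamSchemaSerdeConfig() {
        Map<String, Object> config = new HashMap<>();
        config.put("schema.registry.url", REGISTRY_URL);
        config.put("auto.register.schemas", false);
        config.put("use.latest.version", true);
        return config;
    }

    public static void main(String[] args) {
        System.out.println(internalTopicSerdeConfig());
        System.out.println(upstreamSchemaSerdeConfig());
    }
}
```

Each map would then be handed to the `configure(config, isKey)` method of the serde used for the corresponding topics, rather than being put into the global Streams properties.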