Protobuf wire compatibility

Hi,

I have just started to try the registry with Protobuf (I have used it for a long time with Avro).

Does the registry allow any way to only check wire compatibility, and not consider .proto file structure or message names? I was trying a simple refactor of the proto definitions, and found it broke compatibility.

To explain with some examples:

Version 1 of the protobuf definitions, in a single file
thing.proto:

syntax = "proto3";
package brasslock.proto.simple;
option java_multiple_files = true;
message Part {
  int32 value = 2;
}
message Thing {
  int32 id = 1;
  string stuff = 2;
  Part part = 3;
}

Version 2a of the protobuf definitions, split into 2 files:
part.proto

syntax = "proto3";
package brasslock.proto.simple;
option java_multiple_files = true;
message Part {
  int32 value = 2;
}

thing.proto

syntax = "proto3";
package brasslock.proto.simple;
import "part.proto";
option java_multiple_files = true;
message Thing {
  int32 id = 1;
  string stuff = 2;
  Part part = 3;
}

Version 2b of the protobuf definitions, in a single file with a type rename:
thing.proto:

syntax = "proto3";
package brasslock.proto.simple;
option java_multiple_files = true;
message Component {
  int32 value = 2;
}
message Thing {
  int32 id = 1;
  string stuff = 2;
  Component part = 3;
}

All three versions have the same wire format, but the registry rejects v2a and v2b.
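
To make the wire-format claim concrete, here is a minimal hand-rolled sketch of the Protobuf binary encoding for Thing, written in Python. The helper names (varint, key, encode_thing and so on) and the sample values are made up for this illustration; it is not how the registry or the protobuf library is implemented, and it only covers non-negative integers and non-default field values. It shows that only field numbers and wire types are written, so neither the file layout nor the name Part/Component reaches the bytes.

```python
def varint(n: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def key(field_number: int, wire_type: int) -> bytes:
    """Field key: (field_number << 3) | wire_type, itself a varint."""
    return varint((field_number << 3) | wire_type)


def encode_int32(field_number: int, value: int) -> bytes:
    # Wire type 0 (varint). Assumption: value is non-negative.
    return key(field_number, 0) + varint(value)


def encode_length_delimited(field_number: int, payload: bytes) -> bytes:
    # Wire type 2: used for strings and for embedded messages alike.
    return key(field_number, 2) + varint(len(payload)) + payload


def encode_part(value: int) -> bytes:
    # Same bytes whether the message is called Part or Component:
    # only field number 2 and the value are written.
    return encode_int32(2, value)


def encode_thing(id_: int, stuff: str, part_value: int) -> bytes:
    return (
        encode_int32(1, id_)
        + encode_length_delimited(2, stuff.encode("utf-8"))
        + encode_length_delimited(3, encode_part(part_value))
    )


encoded = encode_thing(7, "abc", 5)
print(encoded.hex())  # 080712036162631a021005
```

The type name of field 3 never appears in the output, which is why v1, v2a and v2b produce the same bytes for the same data.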

Is there a way to allow these kinds of changes?

Thanks,
Ross

Version compatibility is different for each format supported by Schema Registry. This may explain why v2b is being rejected. In Protocol Buffers, the field tags must be unique for all fields defined in schemas. In your v1, the field value was set to use the field tag 2, and in the later version you renamed the message from Part to Component while reusing the same field tag. In its binary format, these field tags are written along with the message payload to teach the downstream parser (likely the consumer) how to decode the message, and it will fail because the parser will see an existing field tag being used for “another” field. It doesn’t matter if your intention was just to rename the message. The parser will complain about field tag reuse.

As for the reason why v2a is being rejected, that may be related to whether Schema Registry supports schemas spread over multiple files and, consequently, schemas defined with different IDs. I don’t think Schema Registry can perform multiple lookups to retrieve the schemas just because you have used the option java_multiple_files = true in the schema. But I reserve the right to be wrong on this one.

Apropos, this type of discussion was precisely the focus of my session at the Current conference this year in Austin, TX. Whenever the recording becomes available, I would suggest taking a look at it for a better understanding of how these schema evolution attempts can be problematic.

Cheers,

— @riferrei
