No golang support for schema serializer (protobuf)?

Hey all, we are blocked by the lack of Go support in Schema Registry. Is there a workaround for this? This is specifically for Protobuf (not Avro).

Please help, this is a big blocker for us…

  • Jake

@riferrei wrote this, perhaps it will help?


@rmoff unfortunately it does not; it is incomplete, so not functional. Given that Confluent supports Go (as I understand it), is there an official library for this? This is a key feature that we want to use in the product, but it’s not functional using Go… that seems hard to believe…

@jacobpeebles,

Which specific features are you looking for in a Go client for Schema Registry?

Confluent support their Golang client. There isn’t currently an official Schema Registry library.

@rmoff We tried this solution and the code only partially works:

https://github.com/riferrei/srclient/issues/17#issuecomment-896525213

Writing works, but the Confluent UI doesn’t render simple fields (int32).

When we tried the consumer (deserializing), it simply did not work. Are you aware of any serialization code that works with Protobuf for Go?

Thanks in advance, Jake

Are you using the Confluent web UI topic viewer? There’s a known issue where it doesn’t deserialise and render Avro/Protobuf messages. But the underlying data in the topic is still correct.

Can you expand on “does not work”? Do you get an error? Can you share your code here that you’re using?

I take it you’ve seen this reference doc on the wire format that Schema Registry uses?

I think I am getting a better picture of the problem now. @jacobpeebles, the library that @rmoff mentioned is a client for SR (Schema Registry), which helps in reading and writing schemas from/to SR. It is not a serialization framework. When writing data into Kafka, you are still responsible for arranging the bytes so that consumers will understand them. The Java client from Confluent does a terrific job of encapsulating both the access to SR and how bytes are written and read. However, to do so, it uses an internal byte layout that is not trivial to understand unless you look at the code under the hood.

I blogged about this a while ago, as you can see in the post below:

https://riferrei.com/data-sharing-between-java-go-using-kafka-and-protobuf/

TL;DR: if you want both your consumers and the Confluent web UI to read the data and schema from the topics, you must write the bytes in the order that Confluent expects. The blog post should give you an idea of how to do that. But again, the Go client for SR is just a way for you to read and write schemas from/to SR.
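To spell out the byte order, here is a minimal Go sketch of the framing described in Confluent's wire format documentation: a zero magic byte, the schema ID as a 4-byte big-endian integer, the Protobuf message indexes, and then the serialized message. The function name `frame`, the schema ID 42, and the placeholder payload are assumptions for illustration only. The message-index bytes are passed in already encoded; a single 0 byte is the documented shorthand for the first top-level message.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// frame prepends the Schema Registry header to an already-serialized
// Protobuf payload. msgIndexes must already be in its encoded form.
// (frame is a hypothetical helper, not part of any Confluent library.)
func frame(schemaID uint32, msgIndexes, payload []byte) []byte {
	var id [4]byte
	binary.BigEndian.PutUint32(id[:], schemaID)

	out := make([]byte, 0, 5+len(msgIndexes)+len(payload))
	out = append(out, 0)             // magic byte, always 0
	out = append(out, id[:]...)      // schema ID, big-endian
	out = append(out, msgIndexes...) // Protobuf message indexes
	return append(out, payload...)   // the Protobuf bytes themselves
}

func main() {
	// Assumed values: schema ID 42, first top-level message ([]byte{0}),
	// and two placeholder bytes standing in for proto.Marshal output.
	record := frame(42, []byte{0}, []byte{0x08, 0x01})
	fmt.Printf("% x\n", record) // prints: 00 00 00 00 2a 00 08 01
}
```

The resulting slice is what would go into the Kafka record value; the schema ID would normally come from registering or looking up the schema through the SR client.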

@riferrei


@rmoff since the Go client itself is managed by Confluent, and so are the Schema Registry server and the Java client, would it make sense for Confluent to also properly support using Schema Registry with Go?

It doesn’t seem like it should be that hard, and I kind of know, since I built something very similar to the Java implementation in Rust.


@rmoff @riferrei Thanks. It’s good to have a working sample. At the same time, without proper serializers and deserializers, one needs to figure out how to express the message index in complex cases, both for reading and writing. For example, in our case, certain resources need a few Proto files to represent the full structure. So, going with what we have access to, in the UI you use references. The Proto file validates, all good.

But then, coming up with the message index bytes manually becomes impractical and error-prone. To a degree, it defeats the purpose of the registry.

Thinking out loud about how a serializer could work, it sounds like:

  • Read the format from the schema registry
  • Based on it, determine the inline messages and figure out the byte-array representation
  • If the message is remote (in a different file), maybe pull that schema from the registry? Or maybe the references are enough info to do so?
  • Then figure out the byte-array representation
  • Along with all that, you need the values inside the message index to be in the proper order.
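As a rough illustration of the "determine the position of each message" part of these steps, the Go sketch below walks a hand-built tree of message names and returns the index path for a given message. The `msg` type, the `indexPath` function, and the Customer/Order schema are all hypothetical; a real serializer would more likely walk parsed Protobuf descriptors than a tree like this, and handling of referenced files is not covered.

```go
package main

import "fmt"

// msg is a hypothetical, simplified view of a message declaration:
// its short name plus its nested messages in declaration order.
type msg struct {
	name   string
	nested []msg
}

// indexPath returns the position of each name at its nesting level,
// outermost first, or false if some name is not found.
func indexPath(top []msg, names ...string) ([]int, bool) {
	var path []int
	level := top
	for _, want := range names {
		found := false
		for i, m := range level {
			if m.name == want {
				path = append(path, i)
				level = m.nested
				found = true
				break
			}
		}
		if !found {
			return nil, false
		}
	}
	return path, true
}

func main() {
	// Assumed schema: Customer, then Order { Item, Shipment { Tracking } }.
	schema := []msg{
		{name: "Customer"},
		{name: "Order", nested: []msg{
			{name: "Item"},
			{name: "Shipment", nested: []msg{{name: "Tracking"}}},
		}},
	}
	path, ok := indexPath(schema, "Order", "Shipment", "Tracking")
	fmt.Println(path, ok) // prints: [1 1 0] true
}
```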

From the side of the consumer, you mostly care about determining where your Proto bytes start if we go by the code.
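On the consumer side, finding where the Protobuf bytes start could look roughly like the Go sketch below. It assumes the documented layout (magic byte, 4-byte big-endian schema ID, zig-zag varint message indexes, then the payload) and the documented shorthand where a single 0 byte stands for the index array [0]. The function name `splitRecord` and the sample bytes are made up for illustration.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// splitRecord parses the Schema Registry header and returns the schema ID,
// the message indexes, and the remaining bytes (the Protobuf payload).
// (splitRecord is a hypothetical helper, not a library function.)
func splitRecord(rec []byte) (uint32, []int, []byte, error) {
	if len(rec) < 6 || rec[0] != 0 {
		return 0, nil, nil, errors.New("not a Schema Registry framed record")
	}
	schemaID := binary.BigEndian.Uint32(rec[1:5])
	rest := rec[5:]

	// binary.Varint decodes zig-zag encoded signed varints.
	count, n := binary.Varint(rest)
	if n <= 0 || count < 0 {
		return 0, nil, nil, errors.New("bad message index count")
	}
	rest = rest[n:]
	if count == 0 {
		// Shorthand: a lone 0 means the first top-level message.
		return schemaID, []int{0}, rest, nil
	}

	var indexes []int
	for i := int64(0); i < count; i++ {
		v, n := binary.Varint(rest)
		if n <= 0 {
			return 0, nil, nil, errors.New("truncated message indexes")
		}
		indexes = append(indexes, int(v))
		rest = rest[n:]
	}
	return schemaID, indexes, rest, nil
}

func main() {
	// Assumed sample: schema ID 42, indexes [0, 2, 1], two payload bytes.
	rec := []byte{0, 0, 0, 0, 42, 6, 0, 4, 2, 0x08, 0x01}
	id, idx, payload, err := splitRecord(rec)
	if err != nil {
		panic(err)
	}
	fmt.Printf("schema=%d indexes=%v payload=% x\n", id, idx, payload)
	// prints: schema=42 indexes=[0 2 1] payload=08 01
}
```

The returned payload is what would be handed to the Protobuf unmarshaller for the message type that the indexes point to.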

My suggestion would be that, in the very short term, Confluent release a diagram showing how the bytes come together for the message index. Then a few people could potentially write the code in whatever language is needed. Don’t get me wrong, the hints are there, but really it’s just a matter of documenting it more and exposing it to the public.

  • Jake

@rmoff I completely agree with @gklijs. It’s inherently easier (and more sensible) for Confluent to make this easier than to have companies that are trying hard to adopt the technology use their own resources to unravel this.

Or are we missing something?

  • Jake

In terms of better explaining the message indexes, we’re looking at improving the example in the documentation with the following text (suggested by a customer):

(start of new text:)

"A single Schema Registry Protobuf entry may contain multiple Protobuf messages, some of which may have nested messages. The role of message-indexes is to identify which Protobuf message in the Schema Registry entry to use. For example, given a Schema Registry entry with the following definition:

    package test.package;

    message MessageA {
        message MessageB {
            message MessageC {
                …
            }
        }
        message MessageD {
            …
        }
        message MessageE {
            message MessageF {
                …
            }
            message MessageG {
                …
            }
            …
        }
        …
    }

    message MessageH {
        message MessageI {
        }
    }

The array [1, 0] is (reading the array backwards) the first nested message type of the second top-level message type, corresponding to test.package.MessageH.MessageI. Similarly, [0, 2, 1] is the second message type of the third message type of the first top-level message type, corresponding to test.package.MessageA.MessageE.MessageG."

(end of new text)

The message-index array above is then prefixed with its length. All integers, including the length, are written as zig-zag encoded variable-length integers.
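For Go users, the encoding can be sketched with the standard library, since `binary.PutVarint` writes signed integers as zig-zag varints. The function name `encodeIndexes` is an assumption, and the special case for [0] follows the shorthand described in the Confluent documentation (a single 0 byte instead of length 1 followed by 0).

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// encodeIndexes writes the array length followed by each index,
// all as zig-zag varints. (Hypothetical helper, stdlib only.)
func encodeIndexes(indexes []int) []byte {
	// Documented shorthand: the common case [0] is a single 0 byte.
	if len(indexes) == 1 && indexes[0] == 0 {
		return []byte{0}
	}
	tmp := make([]byte, binary.MaxVarintLen64)
	n := binary.PutVarint(tmp, int64(len(indexes)))
	out := append([]byte{}, tmp[:n]...)
	for _, idx := range indexes {
		n = binary.PutVarint(tmp, int64(idx))
		out = append(out, tmp[:n]...)
	}
	return out
}

func main() {
	fmt.Printf("% x\n", encodeIndexes([]int{1, 0}))    // prints: 04 02 00
	fmt.Printf("% x\n", encodeIndexes([]int{0, 2, 1})) // prints: 06 00 04 02
	fmt.Printf("% x\n", encodeIndexes([]int{0}))       // prints: 00
}
```

Note how zig-zag encoding doubles small non-negative values: the length 2 becomes 0x04 and the index 1 becomes 0x02.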

Please let me know if you need further clarifications on the example.


The documentation for message indexes has been updated: Formats, Serializers, and Deserializers | Confluent Documentation
