I apologize if this is against the rules, but I posted a question on GitHub and am trying to understand the general process of using a producer (Go, confluent-kafka-go) with Schema Registry for a Protobuf use case.
I haven't gotten a response from anyone on GitHub yet, so I'm trying here to see if anyone can help. Here is the GitHub issue with all the details:
Opened 01:13 AM, 09 Dec 2022 UTC
Description
===========
Hi,
I am new to Kafka Connect, Protobuf serialization, and this entire process in general. My current task is to evaluate the data flow from Kafka topics into TimescaleDB using the JDBC sink connector together with the Kafka Schema Registry. I have almost everything up and running and am now trying to test the end-to-end flow with a sample producer.
However, I have a few general questions about the Kafka Protobuf producer; an example is provided in this repo here: https://github.com/confluentinc/confluent-kafka-go/blob/5ba3caae52b04aaf7b4e22d5e30737b42ca948cd/schemaregistry/serde/protobuf/protobuf.go
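To make my mental model concrete, here is roughly how I understand the setup from that example. This is a sketch only: the `localhost` URLs, the topic name, and the generated `pb` package are placeholders I made up, not something taken from the repo.

```go
package main

import (
	"github.com/confluentinc/confluent-kafka-go/kafka"
	"github.com/confluentinc/confluent-kafka-go/schemaregistry"
	"github.com/confluentinc/confluent-kafka-go/schemaregistry/serde"
	"github.com/confluentinc/confluent-kafka-go/schemaregistry/serde/protobuf"

	// Hypothetical package generated via protoc-gen-go from the .proto files shown below.
	pb "example.com/myproject/gen/proto"
)

func main() {
	// Schema Registry client -- note that only the registry URL is configured here.
	client, err := schemaregistry.NewClient(schemaregistry.NewConfig("http://localhost:8081"))
	if err != nil {
		panic(err)
	}

	// Protobuf serializer for message values, using the default serializer config.
	ser, err := protobuf.NewSerializer(client, serde.ValueSerde, protobuf.NewSerializerConfig())
	if err != nil {
		panic(err)
	}

	producer, err := kafka.NewProducer(&kafka.ConfigMap{"bootstrap.servers": "localhost:9092"})
	if err != nil {
		panic(err)
	}
	defer producer.Close()

	topic := "testrecord" // placeholder topic name

	// Serialize talks to the Schema Registry using a subject it derives from the topic.
	payload, err := ser.Serialize(topic, &pb.TestRecord{ClusterName: "c1", Hostname: "host-1"})
	if err != nil {
		panic(err)
	}

	err = producer.Produce(&kafka.Message{
		TopicPartition: kafka.TopicPartition{Topic: &topic, Partition: kafka.PartitionAny},
		Value:          payload,
	}, nil)
	if err != nil {
		panic(err)
	}
	producer.Flush(15 * 1000)
}
```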
I was hoping someone could explain to me a few things here:
In the referenced example above, only the Schema Registry URL is given. But let's say you have multiple Protobuf schema subjects in the Schema Registry (for example, `proto.testrecord` and `proto.anotherrecord`). In this scenario, the two schemas are:
`proto.testrecord`
```proto
message TestRecord {
  string cluster_name = 1;
  string id = 2;
  string hostname = 3;
  string metric = 4;
  int64 value = 5;
  string value_text = 6;
  int64 timestamp = 7;
}
```
`proto.anotherrecord`
```proto
message AnotherRecord {
  string source_name = 1;
  string map_id = 2;
  string hostname = 3;
  string metric_group = 4;
  int64 value = 5;
  string value_text = 6;
  int64 timestamp = 7;
}
```
Let's say you also have two topics: producer 1 should use the first schema subject and producer 2 should use the second, to validate/conform the data for inserts. Now let's say you are creating the producer for the first topic (producer 1), and it should use the `proto.testrecord` subject from the Schema Registry for serialization. How would you configure/tell the producer to use the correct schema subject, and also an exact version (if multiple versions existed)? Or are you not supposed to specify those because of the way the process works?
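For what it's worth, my current assumption (which I'd like confirmed) is that the serializer never takes a subject name directly: it derives the subject from the topic via a subject-name strategy, which defaults to the topic-name strategy, i.e. `<topic>-value` for a value serializer. Under that assumption, and continuing the sketch above (same `ser` and hypothetical `pb` package), the two producers would each resolve their own subject simply by serializing for their own topic:

```go
// Assumption: with the default topic-name strategy, the subject is derived
// from the topic, not chosen explicitly in code.

// Producer 1 writes to topic "testrecord"; the serializer would resolve the
// subject "testrecord-value" (not "proto.testrecord").
payload1, err := ser.Serialize("testrecord", &pb.TestRecord{Hostname: "host-1"})
if err != nil {
	panic(err)
}

// Producer 2 writes to topic "anotherrecord"; the subject resolved would be
// "anotherrecord-value".
payload2, err := ser.Serialize("anotherrecord", &pb.AnotherRecord{Hostname: "host-2"})
if err != nil {
	panic(err)
}

_, _ = payload1, payload2
```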
I noticed the repo example provided doesn't specify any of that information and I am trying to understand exactly how it knows or defaults to a certain subject and version.
According to this: https://github.com/confluentinc/confluent-kafka-go/blob/5ba3caae52b04aaf7b4e22d5e30737b42ca948cd/schemaregistry/serde/config.go#L36, it looks like you can specify a schema ID (via the `UseSchemaID` field), which tells the serializer which registered schema to use, but I'm unsure how to specify a version.
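To make that part of the question concrete, here is how I would guess those options are meant to be used, based only on the field names in that config file. Again this is a sketch continuing the setup above (same `client`), not something I have confirmed:

```go
// Assumption: instead of auto-registering whatever schema the generated type
// carries, the serializer can be told to reuse an already-registered schema.
serCfg := protobuf.NewSerializerConfig()
serCfg.AutoRegisterSchemas = false // don't register a new schema on produce
serCfg.UseLatestVersion = true     // use the latest version registered under the resolved subject
// serCfg.UseSchemaID = 42         // or pin a specific schema ID -- a registry-wide ID,
//                                 // not a (subject, version) pair

ser2, err := protobuf.NewSerializer(client, serde.ValueSerde, serCfg)
if err != nil {
	panic(err)
}
_ = ser2
```

If that reading is right, pinning an exact version would presumably mean looking up the schema ID for that subject and version via the registry client and passing it as `UseSchemaID`, but that is exactly the sort of thing I am hoping someone can confirm.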
I have probably missed something in my reading and am not understanding this correctly, so I could use some help/discussion.
Any help is appreciated, thanks!
For the benefit of future readers of this post, an answer was posted on the GitHub issue.