I am trying to get ksqlDB to deserialize protobuf messages, so far without much success, I am afraid.
More specifically, I have Jaeger spans published to a Kafka broker and have successfully registered the Jaeger protobuf schema model with Schema Registry under the /subjects/jaeger-spans-value/ route; the schema string is properly escaped.
ksqlDB always returns a deserialization error when I try to register any stream against that topic:
ERROR {"type":0,"deserializationError":{"target":"value","errorMessage":"Error deserializing message from topic: jaeger-spans","recordB64":null,"cause":["Failed to deserialize data for topic jaeger-spans to Protobuf: ","Error deserializing Protobuf message for id -1","Unknown magic byte!"],"topic":"jaeger-spans"},"recordProcessingError":null,"productionError":null,"serializationError":null,"kafkaStreamsThreadError":null} (processing.transient_STR_SPANS_MODEL_8922992258591230941.KsqlTopic.Source.deserializer:44)
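For reference, the statement I am running is along these lines. The sketch below uses the ksqlDB Java client purely for concreteness; the host, port and stream name are placeholders, and no columns are declared so that ksqlDB infers the value schema from Schema Registry:

```java
import io.confluent.ksql.api.client.Client;
import io.confluent.ksql.api.client.ClientOptions;

public class CreateSpansStream {
  public static void main(String[] args) throws Exception {
    // Placeholder host/port for the ksqlDB server running in Docker.
    ClientOptions options = ClientOptions.create()
        .setHost("ksqldb-server")
        .setPort(8088);
    Client client = Client.create(options);

    // No column list: with VALUE_FORMAT='PROTOBUF' the value schema should be
    // pulled from Schema Registry for the topic's subject (jaeger-spans-value).
    client.executeStatement(
        "CREATE STREAM jaeger_spans WITH ("
            + "KAFKA_TOPIC='jaeger-spans', VALUE_FORMAT='PROTOBUF');"
    ).get();

    client.close();
  }
}
```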
It is worth noting that all of this is running inside Docker, but I have tested the same setup with JSON as the Jaeger span encoding format and it works like a dream; ksqlDB successfully communicates with Schema Registry.
Any ideas or feedback would be highly appreciated.
~GS
The question is: what serializer did you use to write the messages into the topic? ksqlDB assumes that the Schema Registry serialization format is used, which contains a 5-byte header before the actual protobuf payload data, so the record value is header + payload.
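For illustration, this is roughly what a producer that writes in that format looks like when using Confluent's KafkaProtobufSerializer; the broker and Schema Registry addresses are placeholders, and Span stands in for whatever protobuf-generated message class the producer actually uses:

```java
import io.confluent.kafka.serializers.protobuf.KafkaProtobufSerializer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

import java.util.Properties;

public class SchemaRegistryProtobufProducer {
  public static void main(String[] args) {
    Properties props = new Properties();
    props.put("bootstrap.servers", "broker:9092");                    // placeholder
    props.put("schema.registry.url", "http://schema-registry:8081");  // placeholder
    props.put("key.serializer", StringSerializer.class.getName());
    // The Confluent serializer registers/looks up the schema and prepends the
    // magic byte + schema id header that ksqlDB expects to find on each record.
    props.put("value.serializer", KafkaProtobufSerializer.class.getName());

    // Span is a stand-in for a protobuf-generated message class (hypothetical here).
    try (KafkaProducer<String, Span> producer = new KafkaProducer<>(props)) {
      producer.send(new ProducerRecord<>("jaeger-spans", Span.newBuilder().build()));
    }
  }
}
```

If the messages are written as plain protobuf bytes with no such header, ksqlDB's Schema-Registry-backed PROTOBUF deserializer will typically fail with exactly the "Unknown magic byte!" error you are seeing.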
Thanks for taking the time, I really appreciate it. The messages are encoded in protobuf by the producer.
The producer is a part of Jaeger, an OSS distributed tracing ecosystem that allows for using Kafka as a buffer before writing to some persistent storage.
Are you suggesting that the producer implementation is somehow not properly serializing the messages?
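In the meantime I can dump a few raw record values from the topic and check for the header you described; something along these lines should tell me whether the magic byte is there (broker address and group id below are placeholders):

```java
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.ByteArrayDeserializer;

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

public class MagicByteCheck {
  public static void main(String[] args) {
    Properties props = new Properties();
    props.put("bootstrap.servers", "broker:9092");  // placeholder
    props.put("group.id", "magic-byte-check");      // arbitrary
    props.put("auto.offset.reset", "earliest");
    // Byte-array deserializers for both key and value so nothing gets decoded.
    props.put("key.deserializer", ByteArrayDeserializer.class.getName());
    props.put("value.deserializer", ByteArrayDeserializer.class.getName());

    try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props)) {
      consumer.subscribe(Collections.singletonList("jaeger-spans"));
      ConsumerRecords<byte[], byte[]> records = consumer.poll(Duration.ofSeconds(5));
      for (ConsumerRecord<byte[], byte[]> record : records) {
        byte[] value = record.value();
        // Schema-Registry-framed values start with magic byte 0x00 followed by
        // a 4-byte schema id; raw protobuf output will not have this prefix.
        boolean framed = value != null && value.length > 5 && value[0] == 0x00;
        System.out.printf("offset %d: framed=%b%n", record.offset(), framed);
      }
    }
  }
}
```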