ksqlDB fails to deserialize Jaeger spans from protobuf

Hi all,

I am trying to get ksqlDB to deserialize protobuf messages, without much success I am afraid.
More specifically, I have Jaeger spans published to a Kafka broker, and I have successfully registered the Jaeger protobuf schema model with Schema Registry under the /subjects/jaeger-spans-value/ route; the schema string is properly escaped.

ksqlDB always returns a deserialization error when I try to register any stream against that topic:

 ERROR {"type":0,"deserializationError":{"target":"value","errorMessage":"Error deserializing message from topic: jaeger-spans","recordB64":null,"cause":["Failed to deserialize data for topic jaeger-spans to Protobuf: ","Error deserializing Protobuf message for id -1","Unknown magic byte!"],"topic":"jaeger-spans"},"recordProcessingError":null,"productionError":null,"serializationError":null,"kafkaStreamsThreadError":null} (processing.transient_STR_SPANS_MODEL_8922992258591230941.KsqlTopic.Source.deserializer:44)

It is worth noting that all of this is running inside Docker, but I have tested the same setup with JSON as the Jaeger span encoding format and it works like a dream. ksqlDB successfully communicates with the Schema Registry.

Any ideas or feedback would be highly appreciated.

I have no idea what “Jaeger spans” is.

The question is, what serializer did you use to write the messages into the topic? ksqlDB assumes that the SR serialization format is used, which contains a 5-byte header before the actual protobuf payload data, so the record value is header+payload.

The header encodes a magic byte plus the 4-byte integer schema id (cf. the confluentinc/schema-registry repository on GitHub).
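To make the "Unknown magic byte!" error concrete, here is a minimal sketch of how a consumer can check for that 5-byte wire-format header. The layout (magic byte 0x00 followed by a 4-byte big-endian schema id) is taken from the SR documentation referenced above; the function name and the sample bytes are illustrative, not part of any real API:

```python
import struct

def parse_sr_header(value: bytes):
    """Split a Schema Registry wire-format record value into (schema_id, payload).

    Assumed layout (per the Confluent wire format):
      byte 0     : magic byte, always 0x00
      bytes 1..4 : 4-byte big-endian schema id
      bytes 5..  : the serialized payload (for Protobuf, message-index
                   varints precede the payload proper)
    """
    if len(value) < 5 or value[0] != 0x00:
        # This is the condition behind ksqlDB's "Unknown magic byte!" error:
        # a plain Protobuf record has no such header, so byte 0 is arbitrary.
        raise ValueError("Unknown magic byte!")
    (schema_id,) = struct.unpack(">I", value[1:5])
    return schema_id, value[5:]

# A record written by the SR serializer (hypothetical schema id 42):
ok = b"\x00\x00\x00\x00\x2a" + b"payload"
print(parse_sr_header(ok))  # (42, b'payload')

# A record written by a plain Protobuf serializer, as Jaeger's producer does,
# would start with the raw Protobuf bytes and fail this check.
```

A Jaeger ingester producing raw Protobuf would write no such header, which matches the error in the log above.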

Hi @mjsax

Thanks for taking the time I really appreciate it. The message is encoded in protobuf by the producer.

The producer is a part of Jaeger, an OSS distributed tracing ecosystem that allows for using Kafka as a buffer before writing to some persistent storage.

Are you suggesting that the producer implementation is somehow not properly serializing the messages?

Are you suggesting that the producer implementation is somehow not properly serializing the messages?

Could be. How are the producer serializers configured? Does it use the SR-provided serializers (cf. the serializers under schema-registry/avro-serializer/src/main/java/io/confluent/kafka/serializers in the confluentinc/schema-registry repository on GitHub)?
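For reference, a producer that ksqlDB can read out of the box would be configured with Confluent's Schema Registry Protobuf serializer, roughly like this (a sketch; the registry URL is a placeholder, and Jaeger's built-in Kafka producer does not expose these settings, which is likely the root of the mismatch):

```
key.serializer=org.apache.kafka.common.serialization.StringSerializer
value.serializer=io.confluent.kafka.serializers.protobuf.KafkaProtobufSerializer
schema.registry.url=http://schema-registry:8081
```

With that serializer in place, each record value gets the magic byte and schema id prepended automatically, and ksqlDB's VALUE_FORMAT='PROTOBUF' can resolve the schema from the registry.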

This topic was automatically closed after 30 days. New replies are no longer allowed.