I have a data source that generates Avro events. I have it writing to Kafka and integrated with Schema Registry, and we use the S3 Sink Connector to save the data from Kafka to S3. The problem we have hit is that the Avro data saved to S3 does not carry all of the schema information that the schema in Schema Registry has. In particular, custom properties added to the schema by the source are being stripped away, but those properties are needed to correctly interpret the data.
An example of the schema is:
{
  "type": "record",
  "name": "ExampleCustomEvent",
  "namespace": "com.connect.avro",
  "fields": [
    {
      "name": "numPoints",
      "type": {
        "type": "string",
        "java-class": "java.math.BigDecimal"
      }
    }
  ]
}
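For reference, the schema parses cleanly with the plain Apache Avro API, and the custom property sits on the field's type schema rather than on the record. A minimal sketch to confirm that (the schema JSON above is inlined as a string for brevity):

import org.apache.avro.Schema;

public class InspectCustomProp {
    public static void main(String[] args) {
        // The schema shown above, inlined as a string for brevity.
        String schemaJson = "{"
            + "\"type\": \"record\", \"name\": \"ExampleCustomEvent\","
            + "\"namespace\": \"com.connect.avro\", \"fields\": ["
            + "  {\"name\": \"numPoints\","
            + "   \"type\": {\"type\": \"string\", \"java-class\": \"java.math.BigDecimal\"}}]}";

        Schema record = new Schema.Parser().parse(schemaJson);
        Schema numPointsType = record.getField("numPoints").schema();

        // The custom property is attached to the field's type, not the record.
        System.out.println(numPointsType.getType());             // STRING
        System.out.println(numPointsType.getProp("java-class")); // java.math.BigDecimal
    }
}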
In the Avro files saved by the S3 Sink, the "java-class": "java.math.BigDecimal" property is dropped.
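The drop is easy to verify by pulling one of the objects the connector wrote and dumping the writer schema embedded in the file. A sketch, assuming the object has been copied locally as output.avro (the actual S3 key depends on the connector's partitioner settings):

import java.io.File;
import org.apache.avro.file.DataFileReader;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericRecord;

public class DumpWriterSchema {
    public static void main(String[] args) throws Exception {
        File avroFile = new File("output.avro");
        try (DataFileReader<GenericRecord> reader =
                 new DataFileReader<>(avroFile, new GenericDatumReader<GenericRecord>())) {
            // Prints the writer schema embedded in the file, i.e. the schema after
            // the Avro -> Connect -> Avro round trip; "java-class" is missing here.
            System.out.println(reader.getSchema().toString(true));
        }
    }
}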
I have found the root cause of this in the code, but I am trying to understand the reason for it and what I should do to get the properties preserved. In the AvroData.toConnectSchema method, which processes the Avro schema coming from Schema Registry, only known properties or properties with an "avro" prefix are retained. Why is that? How should this be fixed? I do not control the source for this data.
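For context on the Connect side: as far as I can tell, the only generic place a Connect Schema can carry this kind of string metadata is its parameters map, so any property that toConnectSchema does not copy into parameters is already gone by the time the S3 sink formats the output file. A minimal sketch of that API, showing what the field's Connect schema would hypothetically need to contain for the property to survive through to the sink:

import org.apache.kafka.connect.data.Schema;
import org.apache.kafka.connect.data.SchemaBuilder;

public class ConnectSchemaParameters {
    public static void main(String[] args) {
        // Hypothetical: a Connect schema carrying the custom Avro property
        // in its parameters map, the generic string-metadata carrier in Connect.
        Schema numPoints = SchemaBuilder.string()
                .parameter("java-class", "java.math.BigDecimal")
                .build();

        System.out.println(numPoints.parameters()); // {java-class=java.math.BigDecimal}
    }
}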