Issues with optional fields when using Avro schema

I’m trying to create a schema with optional fields. Here’s my setup:

$> cat mytest.avsc
{
  "type": "record",
  "namespace": "com.example",
  "name": "MyTest",
  "fields" : [
    {"name": "field1", "type": "string"},
    {"name": "field2", "type": ["null", "string"], "default": null}
  ]
}

$> cat sample1
{"field1": "value1"}

$> cat sample2
{"field1": "value1", "field2": "value2"}

I’ve also created the topic with the same schema. When I try to ingest sample1 or sample2, I get the following errors.

$> kafka-avro-console-producer --topic mytest --bootstrap-server localhost:9092 --property value.schema="$(< mytest.avsc)" < sample1
org.apache.kafka.common.errors.SerializationException: Error deserializing json {"field1": "value1"} to Avro of schema {"type":"record","name":"MyTest","namespace":"com.example","fields":[{"name":"field1","type":"string"},{"name":"field2","type":["null","string"],"default":null}]}
Caused by: org.apache.avro.AvroTypeException: Expected start-union. Got END_OBJECT
        at org.apache.avro.io.JsonDecoder.error(JsonDecoder.java:514)
        at org.apache.avro.io.JsonDecoder.readIndex(JsonDecoder.java:433)
        at org.apache.avro.io.ResolvingDecoder.readIndex(ResolvingDecoder.java:283)
        at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:187)
        at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:160)
        at org.apache.avro.generic.GenericDatumReader.readField(GenericDatumReader.java:259)
        at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:247)
        at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:179)
        at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:160)
        at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:153)
        at io.confluent.kafka.schemaregistry.avro.AvroSchemaUtils.toObject(AvroSchemaUtils.java:142)
        at io.confluent.kafka.formatter.AvroMessageReader.readFrom(AvroMessageReader.java:121)
        at io.confluent.kafka.formatter.SchemaMessageReader.readMessage(SchemaMessageReader.java:316)
        at kafka.tools.ConsoleProducer$.main(ConsoleProducer.scala:51)
        at kafka.tools.ConsoleProducer.main(ConsoleProducer.scala)

$> kafka-avro-console-producer --topic mytest --bootstrap-server localhost:9092 --property value.schema="$(< mytest.avsc)" < sample2
org.apache.kafka.common.errors.SerializationException: Error deserializing json {"field1": "value1", "field2": "value2"} to Avro of schema {"type":"record","name":"MyTest","namespace":"com.example","fields":[{"name":"field1","type":"string"},{"name":"field2","type":["null","string"],"default":null}]}
Caused by: org.apache.avro.AvroTypeException: Expected start-union. Got VALUE_STRING
        at org.apache.avro.io.JsonDecoder.error(JsonDecoder.java:514)
...

Is it possible to make some fields optional in an Avro schema? I’ve gone through several Stack Overflow and Google Groups threads with no luck. I’d appreciate any feedback or guidance.


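Hi, try it like this. In Avro’s JSON encoding, a null union value is written as null, and a non-null value is wrapped in an object keyed by its type name: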

{"field1": "value1", "field2": null}
{"field1": "value1", "field2": {"string": "value2"}}

Isn’t there a way to make {"field1": "value1"} work?


Thank you for the responses. I’m just a bit confused about having an optional field and still having to specify a null value when producing events. My assumption was similar to what @sponda mentioned.

In any case, I’m guessing this behaves as documented. Since what I found online left me a bit confused, I tried both mytest.avsc and mytestoptional.avsc against the mytest topic, and I’d like to share the results.

Here’s the content I used:

$> cat mytest.avsc
{
  "type": "record",
  "namespace": "com.example",
  "name": "MyTest",
  "fields" : [
    {"name": "field1", "type": "string"},
    {"name": "field2", "type": "string"}
  ]
}

$> cat mytestoptional.avsc
{
  "type": "record",
  "namespace": "com.example",
  "name": "MyTest",
  "fields" : [
    {"name": "field1", "type": "string"},
    {"name": "field2", "type": ["null", "string"], "default": null}
  ]
}

$> cat sample1
{"field1": "value1"}

$> cat sample2
{"field1": "value1", "field2": "value2"}

$> cat sample1optional
{"field1": "value1 optional", "field2":null}

$> cat sample2optional
{"field1": "value1 optional2", "field2": {"string":"value2"}}

TEST1: Changed topic schema to mytest.avsc and tried to produce all sample* files.

WORKING (sample2):

kafka-avro-console-producer --topic mytest --bootstrap-server localhost:9092 --property value.schema="$(< mytest.avsc)" < sample2

ERROR (sample1, sample1optional & sample2optional):

kafka-avro-console-producer --topic mytest --bootstrap-server localhost:9092 --property value.schema="$(< mytest.avsc)" < sample1
kafka-avro-console-producer --topic mytest --bootstrap-server localhost:9092 --property value.schema="$(< mytest.avsc)" < sample1optional
kafka-avro-console-producer --topic mytest --bootstrap-server localhost:9092 --property value.schema="$(< mytest.avsc)" < sample2optional

TEST2: Changed topic schema to mytestoptional.avsc and tried to produce all sample* files.

WORKING (sample1optional & sample2optional):

kafka-avro-console-producer --topic mytest --bootstrap-server localhost:9092 --property value.schema="$(< mytestoptional.avsc)" < sample1optional
kafka-avro-console-producer --topic mytest --bootstrap-server localhost:9092 --property value.schema="$(< mytestoptional.avsc)" < sample2optional

ERROR (sample1 & sample2):

kafka-avro-console-producer --topic mytest --bootstrap-server localhost:9092 --property value.schema="$(< mytestoptional.avsc)" < sample1
kafka-avro-console-producer --topic mytest --bootstrap-server localhost:9092 --property value.schema="$(< mytestoptional.avsc)" < sample2
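
For what it’s worth, these results line up with how Avro’s JSON encoding treats unions: a non-null union value has to be wrapped in an object keyed by its type name, and a field’s default is meant for readers resolving older data, not for filling in fields that are missing at write time. If the input has to stay as plain JSON, one option is to convert it before producing. The following is only a minimal sketch under stated assumptions: the function name to_avro_json is made up for illustration, it handles only records whose unions are of the form ["null", <primitive type>], and filling in defaults for missing fields is a choice made by this sketch, not something the console producer does.

```python
import json

# Same content as mytestoptional.avsc above, inlined to keep the sketch self-contained.
SCHEMA = json.loads("""
{
  "type": "record",
  "namespace": "com.example",
  "name": "MyTest",
  "fields": [
    {"name": "field1", "type": "string"},
    {"name": "field2", "type": ["null", "string"], "default": null}
  ]
}
""")


def to_avro_json(value, schema):
    """Convert a plain JSON value into Avro's JSON encoding for `schema`.

    Hypothetical helper: supports records and ["null", <primitive>] unions only.
    """
    # A union is written in the schema as a JSON list of branches.
    if isinstance(schema, list):
        if value is None:
            if "null" not in schema:
                raise ValueError("null is not allowed by this union")
            return None
        branches = [b for b in schema if b != "null"]
        # Assumption: exactly one non-null branch, named by a plain string.
        if len(branches) != 1 or not isinstance(branches[0], str):
            raise ValueError("only [null, <primitive>] unions are handled here")
        # Non-null union values are wrapped as {"<type name>": value}.
        return {branches[0]: value}

    # A record: convert each declared field in schema order.
    if isinstance(schema, dict) and schema.get("type") == "record":
        out = {}
        for field in schema["fields"]:
            name = field["name"]
            if name in value:
                raw = value[name]
            elif "default" in field:
                # Sketch-specific choice: use the schema default when the field is absent.
                raw = field["default"]
            else:
                raise ValueError(f"missing required field {name!r}")
            out[name] = to_avro_json(raw, field["type"])
        return out

    # Primitive types pass through unchanged.
    return value


# The contents of sample1 and sample2 from above.
print(json.dumps(to_avro_json({"field1": "value1"}, SCHEMA)))
# prints {"field1": "value1", "field2": null}
print(json.dumps(to_avro_json({"field1": "value1", "field2": "value2"}, SCHEMA)))
# prints {"field1": "value1", "field2": {"string": "value2"}}
```

The converted lines have the same shape as sample1optional and sample2optional, which are the inputs that worked with mytestoptional.avsc in TEST2.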
