Null field in CSV source connector

This is originally from a Slack thread. Copied here to make it available permanently.
You can join the community Slack here.

Vasco Ferraz

Hello there :slightly_smiling_face:

I am using the SpoolDirCsvSourceConnector in order to ingest some CSV files.

In the connector I am defining the key.schema and value.schema.

Some fields must be defined as INT64, but some rows in the CSV leave them empty, and I get the following error: Could not parse '' to 'Long', which makes sense.

I tried adding something like "type": ["null", "INT64"] to the value.schema, but it seems that null is not a valid type, and I get this error:

Cannot deserialize value of type$Type from String \"null\": not one of the values accepted for Enum class: [STRING, INT16, STRUCT, BOOLEAN, ARRAY, FLOAT64, BYTES, MAP, INT64, INT32, INT8, FLOAT32]

Any ideas how to solve this issue?


Neil Buesing

The “null” type is an Avro construct that the Connect API treats as optional.

Connect’s Schema interface defines it like this:

Schema OPTIONAL_INT64_SCHEMA = SchemaBuilder.int64().optional().build();
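
To illustrate the point about Avro's "null" versus Connect's optional flag, here is a minimal sketch in Python. The helper to_connect_field is hypothetical and not part of any library; it only shows how an Avro-style union such as ["null", "long"] corresponds to a single Connect type plus an isOptional flag. The Connect type names are the ones listed in the error message above.

```python
# Hypothetical helper, for illustration only: it is not part of Kafka Connect
# or of the SpoolDir connector.

# Avro primitive type name -> Connect type name (as listed in the error above).
AVRO_TO_CONNECT = {
    "boolean": "BOOLEAN",
    "int": "INT32",
    "long": "INT64",
    "float": "FLOAT32",
    "double": "FLOAT64",
    "string": "STRING",
    "bytes": "BYTES",
}


def to_connect_field(avro_type):
    """Turn an Avro primitive or a ["null", X] union into a Connect-style field spec."""
    branches = [avro_type] if isinstance(avro_type, str) else list(avro_type)
    # In Avro, nullability is a union branch; in Connect it is a flag on the schema.
    optional = "null" in branches
    non_null = [b for b in branches if b != "null"]
    if len(non_null) != 1:
        raise ValueError("expected exactly one non-null type, got %r" % (non_null,))
    return {"type": AVRO_TO_CONNECT[non_null[0]], "isOptional": optional}


print(to_connect_field(["null", "long"]))
print(to_connect_field("long"))
```

The first call yields an optional INT64 field and the second a required one, which is the distinction the connector's schema JSON expects instead of a "null" type.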

I did a GitHub search and found this example that might help:

Spring_Batch_and_kafka/ at fbf3e754ff22a3819c246c5e07eb38ea5f035ba1 · jenisonleo/Spring_Batch_and_kafka · GitHub

My GitHub Search if you want to look for others

extension:java SpoolDirCsvSourceConnector OPTIONAL

So, something like the following, but properly escaped so that it ends up as a string in your Connect config (which is why I pointed you to what I found on GitHub):

"file_name": {"type": "INT64", "isOptional":true}

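The escaping step can be sketched as follows, in Python. This is only an illustration under stated assumptions: the connector name, the schema name, and the "id" field are made up, the enclosing STRUCT with "fieldSchemas" reflects my understanding of the connector's schema format, and the other settings the connector requires (input paths, topic, and so on) are omitted. Serializing the schema with json.dumps before placing it in the config is what produces the escaped string.

```python
import json

# Schema for the record value. "com.example.Record" and the "id" field are
# illustrative assumptions; "file_name" is the optional INT64 field from the thread.
value_schema = {
    "name": "com.example.Record",
    "type": "STRUCT",
    "isOptional": False,
    "fieldSchemas": {
        "id": {"type": "INT64", "isOptional": False},
        "file_name": {"type": "INT64", "isOptional": True},
    },
}

# Connector definition in the shape accepted by the Kafka Connect REST API.
# Only the settings relevant to this thread are shown.
connector_config = {
    "name": "csv-source",  # hypothetical connector name
    "config": {
        "connector.class": "com.github.jcustenborder.kafka.connect.spooldir.SpoolDirCsvSourceConnector",
        # The schema must be a string, so serialize it first; the outer
        # json.dumps below then escapes the inner quotes.
        "value.schema": json.dumps(value_schema),
    },
}

payload = json.dumps(connector_config, indent=2)
print(payload)
```

In the printed payload, value.schema is a single string whose inner quotes appear as \", which is the form the Connect config expects.
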
Vasco Ferraz

Thanks for the reply :slight_smile:

Solution found. I need to add the following property into the connector:


Neil Buesing

Well, I am glad that the connector has a solution for that; good find.

Connectors are so much in control of what they need to do, and so much of that is unique to each connector, that it is hard to track down all the nuances. Thanks for sharing the answer/fix.