I’m attempting to consume messages from a Confluent Cloud Kafka topic with Spark, and I’d like to use the built-in Schema Registry, where the schema is already created/registered, to define the schema on the consumer side. I’m running into a few hiccups trying to make this work. (It should be noted that I’ve spent 6+ hours searching for answers, including time spent in this forum, before deciding to post.)
Schema Registry Used: Yes, Confluent Cloud
Topic Schema Format: JSON schema
Consumer: Spark readStream (in Databricks)
Databricks Spark with Schema Registry
The above article assumes the topic messages are Avro-serialized with an Avro schema. That isn’t my case: my messages are JSON-serialized with a JSON schema. Can I still use the Schema Registry to define the schema in Spark, or will I need to build a custom process for retrieving the schema?
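One detail I believe matters here (please correct me if I’ve misread the docs): Confluent’s serializers, including the JSON Schema one, prepend a 5-byte header to every message value: a magic byte `0x00` followed by the 4-byte big-endian schema ID. So even if I fetch the JSON schema myself and use `from_json`, I assume I’d first have to strip that header in a UDF or a `substring` on the value column. A minimal sketch of what I mean, with a made-up schema ID and payload for illustration:

```python
import struct

MAGIC_BYTE = 0  # Confluent wire-format marker

def split_confluent_header(value: bytes):
    """Split a Confluent-framed message value into (schema_id, json_payload).

    The framing is: 1 magic byte (0x00) + 4-byte big-endian schema ID +
    the serialized payload. For JSON Schema messages the payload is plain
    UTF-8 JSON, ready to hand to Spark's from_json().
    """
    if len(value) < 5 or value[0] != MAGIC_BYTE:
        raise ValueError("not a Confluent Schema Registry framed message")
    schema_id = struct.unpack(">I", value[1:5])[0]
    return schema_id, value[5:]

# Hypothetical example: a message produced under schema ID 100042
framed = bytes([MAGIC_BYTE]) + struct.pack(">I", 100042) + b'{"id": 7, "name": "widget"}'
schema_id, payload = split_confluent_header(framed)
```

My rough plan would then be to look up the schema by ID (or by subject) via the Schema Registry REST API, translate it into a Spark `StructType`, and apply `from_json` to the stripped payload — but that’s exactly the "custom process" I was hoping to avoid, hence the question.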
I’m happy to provide more context as needed in order to find an optimal solution.