Hi there,
I’m trying to build a Spark-based Kafka producer and consumer in Databricks that sends and consumes Avro data, and I want to use the Schema Registry URL for mapping the data schema.
Unfortunately I haven’t been able to find a good example of this and would appreciate any tested Scala/Python code with Spark.
Thanks for your time.
Thanks for your response and for sharing the latest blog on this. It will really help the user community.
In our Kafka cluster, the Schema Registry is behind an HTTPS URL. I have the following queries.
The user has read/write permission on the Kafka topic. To use the Schema Registry, do we need to explicitly grant the user read/write permission there as well?
Hi @abhietc31 , unfortunately the Databricks implementation doesn’t support a secured schema registry yet, so you need to use the Schema Registry client manually to decode the messages. This is described in the blog post I linked above (see the getSchema(id) function), but here is a sketch of the idea for reference:
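A minimal Python sketch of the same idea, assuming the confluent-kafka client is installed on the cluster; the registry URL and credentials below are placeholders, not values from the blog:

```python
from confluent_kafka.schema_registry import SchemaRegistryClient

# Placeholders: point these at your secured (HTTPS) Schema Registry.
client = SchemaRegistryClient({
    "url": "https://schema-registry.example.com:8081",
    "basic.auth.user.info": "SR_USER:SR_PASSWORD",
})

def get_schema(schema_id: int) -> str:
    """Fetch the writer schema (an Avro JSON string) by its registry ID."""
    return client.get_schema(schema_id).schema_str
```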
That’s a very nice explanation. I really loved it.
The blog is written very thoughtfully.
I’m getting stuck on the Spark-Kafka producer side, which is missing from the blog, so I can’t test just the consumer.
Could you please suggest how to use the Schema Registry with to_avro to produce Avro data to a Kafka topic, and how to map the schema through the Schema Registry while producing?
Thanks,
Abhishek
Hi @abhietc31 @gianlucanatali … Has anyone tried a producer code example for the above request? That is exactly what we are looking for: an example producer from Databricks using PySpark, in either JSON or Avro format.
Could someone help me out here!
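A minimal PySpark producer sketch, under these assumptions: the confluent-kafka Python client is available for the registry lookup, and the registry URL, subject, broker, topic, and the toy input columns (id, name) are all placeholders. Spark’s built-in to_avro does not write the Confluent wire-format header, so the sketch prepends the 5-byte header (magic 0x00 plus the 4-byte big-endian schema ID) manually so that Confluent-aware consumers can decode the records:

```python
import struct
from confluent_kafka.schema_registry import SchemaRegistryClient
from pyspark.sql.functions import concat, lit, struct as sql_struct
from pyspark.sql.avro.functions import to_avro

# Placeholder registry URL and subject.
client = SchemaRegistryClient({"url": "https://schema-registry.example.com:8081"})
registered = client.get_latest_version("my-topic-value")
schema_str = registered.schema.schema_str
schema_id = registered.schema_id

# Confluent wire format: magic byte 0x00 + 4-byte big-endian schema ID.
header = lit(bytearray(struct.pack(">bI", 0, schema_id)))

# Toy input; `spark` is the session Databricks provides in notebooks.
# Assumes the registered schema has fields `id` and `name`.
df = spark.createDataFrame([(1, "alice"), (2, "bob")], ["id", "name"])

(df.select(concat(header,
                  to_avro(sql_struct("id", "name"), schema_str)).alias("value"))
   .write
   .format("kafka")
   .option("kafka.bootstrap.servers", "broker:9092")  # placeholder
   .option("topic", "my-topic")                       # placeholder
   .save())
```

If I recall correctly, the schema-registry-aware to_avro variant on Databricks is exposed in Scala only; in PySpark, to_avro takes a schema JSON string, hence the manual lookup and framing above.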
I tried various things, but I am unable to get my app to consume the topic correctly. I am using PySpark Structured Streaming, and for some reason from_avro() doesn’t deserialize the stream correctly. For example, this is how the output looks
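A common cause, assuming the producer used a Confluent serializer: each record carries a 5-byte wire-format header (a magic 0x00 byte plus a 4-byte big-endian schema ID) that plain from_avro tries to decode as Avro data. A sketch of the usual fix, with placeholder names and a placeholder schema (in practice fetch the writer schema from the registry, as in the get_schema sketch earlier in the thread):

```python
from pyspark.sql.avro.functions import from_avro
from pyspark.sql.functions import expr

# Placeholder writer schema (Avro JSON).
value_schema = """{"type":"record","name":"Event","fields":[
  {"name":"id","type":"long"},{"name":"name","type":"string"}]}"""

raw = (spark.readStream                                   # Databricks session
       .format("kafka")
       .option("kafka.bootstrap.servers", "broker:9092")  # placeholder
       .option("subscribe", "my-topic")                   # placeholder
       .load())

# Skip the 5-byte Confluent header so from_avro sees only the Avro body.
decoded = raw.select(
    from_avro(expr("substring(value, 6, length(value) - 5)"),
              value_schema).alias("event"))
```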