I just started using ksqlDB and am pretty impressed with it. But I have run into an issue that I would think would be possible to solve, but it appears it isn’t, or I am thinking about it the wrong way. I have a topic that receives different message types, and I want to route each message to a specific topic based on its type.
All messages are JSON and follow the same high-level format, similar to an envelope concept: the top-level fields are metadata, and one field contains the payload. The payload is different for each record type; the schema is identical until you get into the Payload. At this point I don’t care about the payload’s contents, and I would like to ignore the Payload for the “routing” stream.
Example fields:
Payload: JSON,
LastModifiedDate: VARCHAR,
Key: VARCHAR,
type: VARCHAR,
Name: VARCHAR,
CreatedDate: VARCHAR,
Id: VARCHAR,
Status: VARCHAR
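
For the routing-only case, I imagine something like this, declaring only the envelope fields so ksqlDB ignores Payload on read (topic and stream names here are made up):

```sql
-- Hypothetical topic/stream names; only the envelope fields are declared,
-- so the Payload field is simply not read.
CREATE STREAM events_envelope (
  Id VARCHAR,
  `type` VARCHAR,
  Name VARCHAR,
  Status VARCHAR,
  LastModifiedDate VARCHAR
) WITH (KAFKA_TOPIC='events', VALUE_FORMAT='JSON');

-- Route by type. The catch: the routed messages contain only these
-- declared fields, so the original Payload is lost on the way out.
CREATE STREAM order_events AS
  SELECT * FROM events_envelope
  WHERE `type` = 'order';
```

That works for deciding where a message goes, but the routed topic no longer carries the Payload the downstream consumers need.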
See this article for a good explanation of why you might do this: Should You Put Several Event Types in the Same Kafka Topic? | Confluent
I found a couple of examples that didn’t work. One suggested solution is to declare the Payload attribute as VARCHAR, but that converts the payload to a string in the output message and escapes all the double quotes. That would not work for the consumers.
This article shows a similar solution, but it only works by explicitly extracting individual fields: Working with heterogenous JSON records using ksqlDB
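
One idea I’m considering, to keep the message bytes untouched: read the whole value as a single raw string with `VALUE_FORMAT='KAFKA'` and route on `EXTRACTJSONFIELD`, so the written value is byte-for-byte the original JSON and nothing gets re-escaped. A rough sketch, with made-up topic and stream names:

```sql
-- Read the entire message value as one raw string; the KAFKA format does
-- no JSON (de)serialization, so the original bytes pass through untouched.
CREATE STREAM events_raw (msg VARCHAR)
  WITH (KAFKA_TOPIC='events', VALUE_FORMAT='KAFKA');

-- Route on the type field parsed out of the raw string; the output value
-- is the original JSON, double quotes and Payload intact.
CREATE STREAM order_events
  WITH (KAFKA_TOPIC='order-events', VALUE_FORMAT='KAFKA') AS
  SELECT msg FROM events_raw
  WHERE EXTRACTJSONFIELD(msg, '$.type') = 'order';
```

I have not verified this end to end, so I’d be curious whether others have tried it, or whether there is a more idiomatic approach.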
How would others approach this problem with ksqlDB? My alternative would be to write a producer/consumer service that does the routing manually, but we already have ksqlDB set up, and it seems like the better solution if I can make it work. Also, related to this problem: how do people handle rapid green-field development where the schema is in flux early on?
Thanks
