@vxia, your suggestion was essentially the solution to my problem. It came down to two things:
- setting the wrap_single_value property to false
- defining the otherwise anonymous array message as a k/v pair
With this, coupled with what @mjsax pointed out about defining the primary key, I was able to come up with a proper schema and begin querying the topic. Note that the partitioning key in the raw message, although it looks like an integer, actually needed to be defined as a string. So my final schema definition looks like this:
create table t (
  k string primary key,
  v array<struct<...>>
) with (
  kafka_topic='...',
  key_format='KAFKA',
  value_format='JSON',
  wrap_single_value=false
);
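For anyone landing here later, here is a rough Python sketch of the payload shapes involved, as I understand them. The sample field names (id, name), the key value 42, and the casing of the wrapper name "V" are all made up for illustration; the real struct fields are elided above. The key part is only a guess at why the string type was needed: Kafka's string serializer writes the digits as UTF-8 text, while its integer serializer writes 4 big-endian bytes.

```python
import json
import struct

# Hypothetical sample records: "id" and "name" are illustrative field names,
# not the real struct fields (those are elided in the schema above).
records = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]

# Unwrapped value: the topic value is the bare (anonymous) JSON array.
# This is the shape that wrap_single_value=false tells ksqlDB to expect.
unwrapped = json.dumps(records)

# Wrapped value: the single value column nested in an object under its name.
# This is the shape expected when single values are wrapped.
# The name "V" is an assumption about how the column name would appear.
wrapped = json.dumps({"V": records})

# Key: the same number as text digits versus a 4-byte big-endian integer.
# A key written as text has to be declared as a string to be read back.
key_as_text = "42".encode("utf-8")
key_as_int = struct.pack(">i", 42)

print(unwrapped)
print(wrapped)
print(key_as_text, key_as_int)
```

If the raw key bytes on the topic look like the first form, declaring the key column as a string is the matching choice.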
Very nice! Thanks for all the guidance and suggestions.
-bruiser-