Pros and cons of using JSON objects as message keys with the schema maintained in a schema registry


I am wondering whether there are any advantages or disadvantages to using JSON objects as message keys.

Assume I have two event streams, each fed from a table in a source system. Each of those tables has its own business key:

  1. The Product table has product_number and base_product_id as its business key in the source system (key: product_number, base_product_id).

  2. The Invoice table has invoice_id as its business key, and prod_id and prod_num as foreign keys pointing to the Product table in the source system.

I’d like to enrich my Invoice stream with records from a GlobalKTable built on top of my Product event stream in a Kafka Streams application, by applying an inner join between the two.

I can think of three ways to configure my data producers to assign keys:

  1. Concatenate the values of the key fields, e.g. key=valueOf(prod_id).concat(prod_num)

  2. Define the key as a JSON object with its schema maintained in the schema registry, e.g. key={"prod_id": "AAM64", "prod_num": "334"}

  3. Use a hash function to construct the key, e.g. key=hashFunction(valueOf(prod_id).concat(prod_num))
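For illustration, the three options above could be sketched as plain key-builder functions. This is only a sketch: the method names, the `|` separator, and the choice of SHA-256 are my own assumptions, and in practice a registry-aware serde would produce the key bytes for option 2.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class KeyStrategies {

    // Option 1: concatenate the key fields. A separator guards against
    // ambiguity (without one, "AB"+"C" and "A"+"BC" would collide).
    static String concatKey(String prodId, String prodNum) {
        return prodId + "|" + prodNum;
    }

    // Option 2: a JSON object as the key. Built by hand here for brevity;
    // a schema-registry-backed serializer would normally produce these bytes.
    static String jsonKey(String prodId, String prodNum) {
        return String.format("{\"prod_id\": \"%s\", \"prod_num\": \"%s\"}", prodId, prodNum);
    }

    // Option 3: a hash of the concatenated fields (SHA-256 as an example),
    // rendered as a fixed-length hex string.
    static String hashKey(String prodId, String prodNum) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256")
                    .digest(concatKey(prodId, prodNum).getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) hex.append(String.format("%02x", b));
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(concatKey("AAM64", "334")); // AAM64|334
        System.out.println(jsonKey("AAM64", "334"));
        System.out.println(hashKey("AAM64", "334"));   // 64 hex chars, stable per input
    }
}
```

All three produce a deterministic key from the same fields, which is the property the join depends on; they differ in readability (options 1 and 2) versus fixed key size (option 3).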

Option 2 enforces the same structure and field order for the key in both my Product and Invoice event streams, since the field names are part of the key; the join condition will fail to match if the field names (or their serialized order) differ between the two streams.
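To make that concrete: keys are matched on their serialized bytes, so two JSON keys that are logically equal but serialized with fields in a different order are different keys. A minimal sketch, using strings as a stand-in for the serialized key bytes:

```java
public class JsonKeyOrder {

    // Stand-in for a byte-level key comparison: Kafka matches keys on
    // their serialized form, not on JSON semantics.
    static boolean sameKeyBytes(String keyA, String keyB) {
        return keyA.equals(keyB);
    }

    public static void main(String[] args) {
        // The same logical key, serialized with fields in a different order:
        String productSideKey = "{\"prod_id\": \"AAM64\", \"prod_num\": \"334\"}";
        String invoiceSideKey = "{\"prod_num\": \"334\", \"prod_id\": \"AAM64\"}";

        // These do not match, so a join on these keys would find no record.
        System.out.println(sameKeyBytes(productSideKey, invoiceSideKey)); // false
    }
}
```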

Any recommendation as to which approach makes the most sense would be highly appreciated.

Thank you!