Message keys and primary keys for messages

Hi all.
Excuse the noob question. I’m following some online examples at the moment and trying to relate them back to some code (a producer) I have.

My producer (a Golang app) creates fake sales baskets and posts them onto the topic salesbasket.
I then have a 2nd producer that posts payments onto the topic salespayment.
I’ve keyed the payments on store.name so that they can be consumed store by store.

I’ve used an online tool to create a JSON schema, which I added via the CC UI to both topics.

Json to Schema

<Need to figure out how to manage the dates in my schema and how to define them properly, example: > - done

I’m following some of Robin’s examples. He has a CSV-to-Kafka spool example where he tells the connector that the orderid field from the schema will be the key, so that when he does a describe it shows orderid as the PK, which is then used to enforce uniqueness and also to join on, via Streams and ksqlDB…
For my messages, InvoiceNumber is the unique value on which the basket and payment will be joined.
I need help, noob level here.

How do I join the messages from the 2 topics onto a new 3rd topic using either Streams or ksqlDB, and how do I define or improve the schemas, if needed?
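
For reference, a minimal, untested sketch of the kind of ksqlDB join being asked about. It assumes two streams named salesbasket and salespayment have already been declared with InvoiceNumber as an ordinary value column, and that both topics have the same number of partitions. The output name salescompleted and the 1 HOUR window are placeholders, not values from my setup.

```sql
-- Sketch: join each basket to its payment on InvoiceNumber and write the
-- result to a new topic. Stream-stream joins need a WITHIN window.
CREATE STREAM salescompleted
  WITH (KAFKA_TOPIC='salescompleted', VALUE_FORMAT='JSON') AS
SELECT
    b.InvoiceNumber    AS InvoiceNumber,   -- join column, becomes the key
    b.SaleDateTime     AS SaleDateTime,
    b.Total            AS Total,
    p.PayDateTime      AS PayDateTime,
    p.Paid             AS Paid,
    p.FinTransactionID AS FinTransactionID
FROM salesbasket b
INNER JOIN salespayment p
    WITHIN 1 HOUR                          -- placeholder window
    ON b.InvoiceNumber = p.InvoiceNumber
EMIT CHANGES;
```

The window matters here: a payment that arrives later than the window after its basket would not be matched, so the value would need to reflect how far apart the two events can realistically be.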

I’m attaching below the 2 payloads and the generated schemas, which I assume need improvement, i.e. the dates…

NOTE: I changed the schema… SalesDate and PaymentDate were originally specified as strings and are now specified as dates.

I realise this might fit better in the stream processing section; I posted here because the schema definition is the first thing that needs to be defined correctly.

At this stage, if I try the following ksqlDB code I get this error:

ksql> CREATE STREAM SALESBASKET WITH (KAFKA_TOPIC='salesbasket',VALUE_FORMAT='JSON');
No columns supplied.
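
As far as I understand it, ksqlDB only infers columns from Schema Registry for the schema-aware formats (JSON_SR, AVRO, PROTOBUF); with plain JSON the columns have to be listed in the statement, which would explain the error above. A sketch of the schema-inferring variant, assuming the records on the topic were written with the Schema Registry JSON serializer rather than as plain JSON bytes (which may not be true for my Go producer):

```sql
-- Sketch: let ksqlDB read the column list from the schema registered
-- for the topic. Only applies if the records use the Schema Registry
-- JSON wire format; plain JSON records would fail to deserialize.
CREATE STREAM SALESBASKET WITH (
    KAFKA_TOPIC='salesbasket',
    VALUE_FORMAT='JSON_SR'
);
```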

salesbasket

{
 	"InvoiceNumber": "1341243123341232",
	"SaleDateTime": "2023-12-12-T13:22:37+02:00",
	"Store" : {
		"Id": "2143412",
		"Name": "sdfgsjdjndnjdfgs"
		},
	"Clerk": {
		"Id": "231",
		"Name": "grfvnowifgbvuwe"
		},
	"TerminalPoint": "124",
	"BasketItems":[
		{
			"Id": "234123412",
			"Name": "",
			"Brand": "fgtwruyergfd",
			"Catergory": "",
			"Price":12412.00,
			"Quantity":3
		}
	],
	"Net": 442.23,
	"VAT":10.00,
	"Total":452.23 
}

schema

{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "properties": {
    "FinTransactionID": {
      "type": "string"
    },
    "InvoiceNumber": {
      "type": "string"
    },
    "Paid": {
      "type": "number"
    },
    "PayDateTime": {
      "type": "string", 
      "format": "date"
    }
  },
  "required": [
    "InvoiceNumber",
    "PayDateTime",
    "Paid",
    "FinTransactionID"
  ],
  "type": "object"
}

salespayment

{
 	"InvoiceNumber": "13412431233412322",
	"PayDateTime": "2023-12-12-T13:30:37+02:00",
	"Paid": 452.23,
	"FinTransactionID": "42dfgt245wsdg34231rfwfg234234"
}

Schema

{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "properties": {
    "FinTransactionID": {
      "type": "string"
    },
    "InvoiceNumber": {
      "type": "string"
    },
    "Paid": {
      "type": "number"
    },
    "PayDateTime": {
      "type": "string", 
      "format": "date"
    }
  },
  "required": [
    "InvoiceNumber",
    "PayDateTime",
    "Paid",
    "FinTransactionID"
  ],
  "type": "object"
}

Working on my solution… curious why the above did not piggyback on the schema already registered for the topic.
I’ve created a stream manually using the following. The first time I selected from the stream it returned records; subsequent selects return nothing, even with auto.offset.reset = true, even after dropping and recreating the stream, and even after adding new records to the topic…

salespayments

CREATE STREAM salespayments (
    InvoiceNumber VARCHAR KEY,
    FinTransactionID VARCHAR,
    PayDateTime DATE,
    Paid INTEGER
) WITH (
    KAFKA_TOPIC='salespayments',
    VALUE_FORMAT='JSON',
    PARTITIONS=1
);

Any ideas why? I also have a second question: this stream was simple, but my sales basket has an array of objects. How do I represent that in a CREATE STREAM? It’s not a struct, it’s an array of structs?
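
For what it’s worth, a minimal, untested sketch of how an array of structs might be declared in ksqlDB, with column names copied from the salesbasket payload above. The numeric types (DOUBLE, INT) and keeping SaleDateTime as text are my assumptions, and I have not checked whether any of these names collide with ksqlDB reserved words.

```sql
-- Sketch only: nested objects become STRUCT<...>, and the list of items
-- becomes ARRAY<STRUCT<...>>.
CREATE STREAM salesbasket (
    InvoiceNumber VARCHAR,
    SaleDateTime  VARCHAR,    -- kept as text here; parsing is a separate step
    Store         STRUCT<Id VARCHAR, Name VARCHAR>,
    Clerk         STRUCT<Id VARCHAR, Name VARCHAR>,
    TerminalPoint VARCHAR,
    BasketItems   ARRAY<STRUCT<
                      Id        VARCHAR,
                      Name      VARCHAR,
                      Brand     VARCHAR,
                      Catergory VARCHAR,   -- spelled as in the payload
                      Price     DOUBLE,
                      Quantity  INT>>,
    Net           DOUBLE,
    VAT           DOUBLE,
    Total         DOUBLE
) WITH (
    KAFKA_TOPIC='salesbasket',
    VALUE_FORMAT='JSON'
);

-- Reading nested values: -> for struct fields, [n] for array elements
-- (ksqlDB array indexes start at 1).
SELECT InvoiceNumber,
       Store->Name          AS StoreName,
       BasketItems[1]->Name AS FirstItemName
FROM salesbasket
EMIT CHANGES;
```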

G