Using ksqlDB on Confluent Cloud, is it possible to join an existing stream with the results of a pull query against a table created by a persistent query?
My use case:
- A user sends a prompt message to a prompt topic asking a question about a user, such as ‘which department does Sam work in?’.
- The message goes into a prompt stream.
- In order for ChatGPT to answer this question, it needs context (the latest data about users) appended to the original prompt … while the prompt is in motion. The user data exists in a persistent query table.
- I want to create a new stream that contains the original prompt string concatenated with a string holding the latest user data, as returned by a pull query on the table.
- After augmenting the prompt, the new stream goes to an Azure Function Sink that calls the ChatGPT API.
I need help with step 4 (creating the augmented stream), if it is possible.
I am aware there are other ways to augment a ChatGPT prompt, such as using embeddings and a vector database. However, I am trying to simply leverage data in a persistent table as another option for my clients.
I am also aware that I could use the ksqlDB API outside of Confluent Cloud to perform the pull query and augment the prompt there. That is my backup plan. My preference is to do it while the data is in motion, using stream processing. This use case is for small datasets that do not require a vector database.
What would a CREATE STREAM statement look like that grabs a snapshot of the latest user values and converts it to a string in the new stream's output?
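
To make the question concrete, here is a rough sketch of the closest thing I can picture: a stream-table join rather than a pull query. All names are hypothetical (`prompts`, `users_table`, `augmented_prompts`, and every column), and it assumes each prompt message carries a `user_id` field matching the table's primary key, which my free-text prompts do not necessarily have. It also only attaches the single matching user row, not a snapshot of all users, so it is not quite what I am after.

```sql
-- Hypothetical prompt stream; the topic name, format, and columns are assumptions.
CREATE STREAM prompts (
  prompt_id VARCHAR KEY,
  user_id   VARCHAR,
  question  VARCHAR
) WITH (KAFKA_TOPIC = 'prompt', VALUE_FORMAT = 'JSON');

-- Assumes users_table is the existing table, keyed by user_id,
-- with (hypothetical) value columns name and department.
-- Each prompt is enriched with the table row that is current when the prompt arrives.
CREATE STREAM augmented_prompts AS
  SELECT
    p.user_id   AS user_id,      -- join key, must appear in the projection
    p.prompt_id AS prompt_id,
    CONCAT(p.question,
           ' Context: ', u.name,
           ' works in ', u.department) AS augmented_prompt
  FROM prompts p
  JOIN users_table u ON p.user_id = u.user_id
  EMIT CHANGES;
```

What I do not know is how to get the equivalent of a full pull-query result (all current rows of the table) into that concatenated string for every prompt.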