Cache/Query latest value of each key in a topic?

Hi all,

I’m very new to Kafka and am looking at it as a potential data pipeline for a system. I’ve been very interested in what ksqldb can do but I haven’t managed to figure this out:

Can you create a table based on a particular topic and use it to cache the latest value of each key, such that you can perform ad hoc (pull) queries on it to get the value of a key. A simple example would be to listen to a stock quote stream and store the most recent price for a ticker symbol and then look that up by symbol with a ksql query.

From what I’ve tried so far, it seems like you can only query a materialized view, which needs to be based on an aggregate function. Am I misunderstanding this, as that seems a bit limited. How should I go about doing this?

Thanks in advance!

Your observation is correct, and we are working on improvements already.

The “old” workaround is to issue an aggregation query using LATEST_BY_OFFSET aggregation function, so you can query the result table of the aggregation.

In 0.17 (and 0.18) release, it is actually simpler now, and instead of doing an aggregation query, you can do a CREATE TABLE materialized AS SELECT * FROM input_table; and you should be able to issue pull queries against materialized (cf Announcing ksqlDB 0.17.0 - New Features and Updates and ksqlDB 0.18: Newest Features and Updates).

We are actually also working on a direct materialization of source tables to avoid the need to write a query to make the data available for pull queries: https://github.com/confluentinc/ksql/pull/7474

This topic was automatically closed after 30 days. New replies are no longer allowed.