Questions about non-materialized tables (CREATE TABLE from existing topic)

Ben · 23 September 2021 17:15

I understand that CREATE TABLE (from an existing topic) currently always results in a non-materialized table (meaning it won’t be backed by RocksDB), which, as such, is not eligible for pull queries.

I’m curious what a non-materialized table amounts to in practice:

Is the table’s state held only in memory?
Are there different upper bounds on table size when dealing with a non-materialized table instead of a materialized one?
If a KSQL cluster spins up a new server, does that server need to consume all messages in the topic in order to re-build its in-memory state for the non-materialized table?

The use case I have in mind is doing a stream-table join:

a stream of apartments doing a left join on
a topic containing building records (as a CREATE TABLE from the existing topic, thus non-materialized)

If that table were to have hundreds of thousands of keys, would I be better off managing its contents via a materialized table (using CREATE TABLE AS SELECT)? I couldn’t find much documentation on how non-materialized tables are implemented and their practical implications. Feedback welcome!

mjsax · 23 September 2021 22:53

Is the table’s state held only in memory?

No. The data will only be in the topic. A CREATE TABLE (CT) statement is just a metadata operation that registers the schema and topic for the table in the catalog.

Are there different upper bounds on table size when dealing with a non-materialized table instead of a materialized one?

No. When you use a non-materialized table in a persistent query, the data will be pulled into the ksqlDB server.

If a KSQL cluster spins up a new server, does that server need to consume all messages in the topic in order to re-build its in-memory state for the non-materialized table?

As pointed out above, non-materialized means that the data is only in the topic, and no data is in the ksqlDB server (not in memory either). Data will be pulled in at query time only.

If that table were to have hundreds of thousands of keys, would I be better off managing its contents via a materialized table (using CREATE TABLE AS SELECT)? I couldn’t find much documentation on how non-materialized tables are implemented and their practical implications

No need to do anything special. When you create the join query, ksqlDB will pull in the data from the table topic and materialize it in RocksDB (as internal operator state) to compute the join.
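
For illustration, the scenario from the question might look roughly like the sketch below. The topic names, column names, and JSON value format are all assumptions made up for this example, not details from the thread. The two CREATE statements only register metadata; it is the final persistent query that causes ksqlDB to read the buildings topic and keep the table state for the join.

```sql
-- Hypothetical schema: registers the existing 'buildings' topic as a table.
-- This is a metadata-only operation; no data is read yet.
CREATE TABLE buildings (
  building_id VARCHAR PRIMARY KEY,
  address VARCHAR
) WITH (
  KAFKA_TOPIC = 'buildings',
  VALUE_FORMAT = 'JSON'
);

-- Hypothetical schema: registers the existing 'apartments' topic as a stream.
CREATE STREAM apartments (
  apartment_id VARCHAR KEY,
  building_id VARCHAR,
  unit_number VARCHAR
) WITH (
  KAFKA_TOPIC = 'apartments',
  VALUE_FORMAT = 'JSON'
);

-- Persistent stream-table join. Starting this query is what pulls the
-- table data from the topic into the server for the join.
CREATE STREAM apartments_enriched AS
  SELECT
    a.building_id AS building_id,
    a.apartment_id AS apartment_id,
    a.unit_number AS unit_number,
    b.address AS address
  FROM apartments a
  LEFT JOIN buildings b ON a.building_id = b.building_id
  EMIT CHANGES;
```

Note that the join condition uses the table's primary key, which ksqlDB requires for stream-table joins.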

Ben · 24 September 2021 05:49

Thanks for clarifying, much appreciated!

