I understand that CREATE TABLE (from an existing topic) currently always results in a non-materialized table (meaning it won’t be backed by RocksDB) and is therefore not eligible for pull queries.
I’m curious what a non-materialized table amounts to in practice:
Is the table’s state held only in memory?
Are there different upper bounds on table size when dealing with a non-materialized table instead of a materialized one?
If a KSQL cluster spins up a new server, does that server need to consume all messages in the topic in order to re-build its in-memory state for the non-materialized table?
The use case I have in mind is doing a stream-table join:
a stream of apartments doing a left join on
a topic containing building records (as a CREATE TABLE from the existing topic, thus non-materialized)
If that table were to have hundreds of thousands of keys, would I be better off managing its contents via a materialized table (using CREATE TABLE AS SELECT)? I couldn’t find much documentation on how non-materialized tables are implemented and their practical implications. Feedback welcome!
Is the table’s state held only in memory?
No. The data will only be in the topic. A CREATE TABLE (CT) statement is just a metadata operation that registers the schema and topic for the table in the catalog.
Are there different upper bounds on table size when dealing with a non-materialized table instead of a materialized one?
No. When you use a non-materialized table in a persistent query, the data will be pulled into the ksqlDB server.
If a KSQL cluster spins up a new server, does that server need to consume all messages in the topic in order to re-build its in-memory state for the non-materialized table?
As pointed out above, non-materialized means that the data is only in the topic; no data is held in the ksqlDB server (not in memory either). Data is pulled in only when a query uses the table.
If that table were to have hundreds of thousands of keys, would I be better off managing its contents via a materialized table (using CREATE TABLE AS SELECT)? I couldn’t find much documentation on how non-materialized tables are implemented and their practical implications
No need to do anything special. When you create the join query, ksqlDB will pull in the data from the table topic and materialize it in RocksDB (as internal operator state) to compute the join.
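As a rough sketch of the use case from the question, the statements could look something like the following. The topic names (`buildings`, `apartments`), the column names, and the JSON value format are all assumptions made up for illustration; adjust them to match your actual topics.

```sql
-- Metadata only: registers the schema and topic in the catalog.
-- Topic and column names here are hypothetical.
CREATE TABLE buildings (
  building_id VARCHAR PRIMARY KEY,
  name VARCHAR
) WITH (
  KAFKA_TOPIC = 'buildings',
  VALUE_FORMAT = 'JSON'
);

-- Likewise metadata only: declares a stream over the (hypothetical) apartments topic.
CREATE STREAM apartments (
  apartment_id VARCHAR KEY,
  building_id VARCHAR,
  unit VARCHAR
) WITH (
  KAFKA_TOPIC = 'apartments',
  VALUE_FORMAT = 'JSON'
);

-- Persistent query: this is the point at which ksqlDB reads the buildings
-- topic and keeps the table side of the join as internal state.
CREATE STREAM apartments_enriched AS
  SELECT a.building_id,
         a.apartment_id,
         a.unit,
         b.name AS building_name
  FROM apartments a
  LEFT JOIN buildings b ON a.building_id = b.building_id
  EMIT CHANGES;
```

Only the last statement starts any processing; the first two just describe existing topics, which is why no intermediate CREATE TABLE AS SELECT should be needed for the join itself.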