Create a KSQLDB Cluster and split workload

marcel.fenerich · 3 December 2021 18:27

Hi, everyone. I’m creating some microservices and I’d like to use KSQLDB.

I plan to have more than one instance of each service where I will materialize some views.

I’ve seen that it is possible to split/partition the contents of these materialized views across the instances, as described in this Confluent article:

To allow users to GET any order, the Orders Service creates a queryable materialized view (‘Orders View’), using a state store in each instance of the service, so any Order can be requested historically. Note also that the Orders Service is partitioned over three nodes, so GET requests must be routed to the correct node to get a certain key. This is handled automatically using Interactive Queries to expose the HTTP endpoint. (Alternatively, we could also implement this view with an external database, via Kafka Connect.)

I’ve done some tests and created a KSQLDB cluster by setting the same ‘KSQL_KSQL_SERVICE_ID’ on both servers (I’m using Docker), also making sure each one can talk to the other (this is important for performing pull queries, for example). I also made sure my topic has more than one partition.

The problem is: it seems that when materializing a view, each instance holds a full copy of the dataset, because I can pull-query any record from any instance.

I’d appreciate any help.

Cheers,

mjsax · 3 December 2021 22:36

You can issue a pull query against any ksqlDB server. If that server does not host the requested record, it automatically forwards the request to the server that does and returns the answer.

The quote refers to Kafka Streams, not ksqlDB. Kafka Streams does not have such a routing layer.

Cf. Highly Available, Fault-Tolerant Pull Queries in ksqlDB
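
To make the forwarding behaviour concrete, here is a minimal toy model in Python. It is not ksqlDB code: the `Server` class, the server names, the partition count, and the CRC32 hash are all assumptions chosen for illustration (Kafka's default partitioner uses murmur2). It only sketches why every record can be read from every instance even though each instance stores just its own partitions.

```python
import zlib

NUM_PARTITIONS = 4  # hypothetical partition count of the source topic


def partition_for(key: str) -> int:
    # Stand-in hash for illustration; Kafka's default partitioner uses murmur2.
    return zlib.crc32(key.encode("utf-8")) % NUM_PARTITIONS


class Server:
    """Toy stand-in for one server of a cluster (not a real ksqlDB API)."""

    def __init__(self, name, partitions):
        self.name = name
        self.partitions = set(partitions)  # partitions this server owns
        self.store = {}                    # local state: owned rows only
        self.peers = []

    def put(self, key, value):
        # A server only materializes rows from its own partitions.
        assert partition_for(key) in self.partitions
        self.store[key] = value

    def pull(self, key):
        """Return (value, name of the server that actually held the row)."""
        if partition_for(key) in self.partitions:
            return self.store.get(key), self.name
        # Not hosted locally: forward to the peer that owns the partition.
        for peer in self.peers:
            if partition_for(key) in peer.partitions:
                return peer.pull(key)
        return None, None


def build_cluster():
    # Two hypothetical servers, each owning half of the partitions.
    a = Server("ksqldb-a", [0, 1])
    b = Server("ksqldb-b", [2, 3])
    a.peers, b.peers = [b], [a]
    return a, b


def load(cluster, rows):
    # Route each row to the server that owns its partition.
    for key, value in rows.items():
        owner = next(s for s in cluster if partition_for(key) in s.partitions)
        owner.put(key, value)


if __name__ == "__main__":
    a, b = build_cluster()
    load((a, b), {f"order-{i}": {"id": i} for i in range(8)})
    for key in ("order-0", "order-5"):
        # Both servers answer, but the row comes from a single owner.
        print(key, a.pull(key), b.pull(key))
```

In this model both servers return the same value for any key, while the second element of the result shows that only one of them actually stores the row. This matches the observation in the question: being able to pull any record from any instance does not imply that each instance holds a full copy.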

