Hi,
We are currently running an evaluation of the Confluent Community Platform and have run into a snag: a pull query against a partitioned materialised table fails to return results. Even this is inconsistent, though, as the query does return results when the partitioned materialised table is built from a single-partition stream. Here are some steps that should allow the issue to be replicated:
kafka-topics --create --topic source-topic-1 --bootstrap-server localhost:9092 --config cleanup.policy=compact --config segment.ms=86400000 --partitions 1 --replication-factor 3
kafka-topics --create --topic source-topic-6 --bootstrap-server localhost:9092 --config cleanup.policy=compact --config segment.ms=86400000 --partitions 6 --replication-factor 3
curl -X POST -H "Content-Type: application/vnd.schemaregistry.v1+json" --data '{"schema": "{\"type\":\"record\",\"name\":\"sourceConfig\",\"namespace\":\"example\",\"fields\":[{\"name\":\"value\",\"type\":\"string\"}]}"}' http://localhost:8081/subjects/source-topic-1-value/versions
curl -X POST -H "Content-Type: application/vnd.schemaregistry.v1+json" --data '{"schema": "{\"type\":\"record\",\"name\":\"sourceConfig\",\"namespace\":\"example\",\"fields\":[{\"name\":\"value\",\"type\":\"string\"}]}"}' http://localhost:8081/subjects/source-topic-6-value/versions
CREATE STREAM `source_topic_1_stream`(
> `key` string KEY,
> `value` string
>) WITH (
> kafka_topic='source-topic-1',
> value_format='avro'
>);
CREATE STREAM `source_topic_6_stream`(
> `key` string KEY,
> `value` string
>) WITH (
> kafka_topic='source-topic-6',
> value_format='avro'
>);
CREATE TABLE `source_pull_topic_1_from_1` WITH (KAFKA_TOPIC='source-pull-topic-1-from-1', PARTITIONS=1, REPLICAS=3, VALUE_FORMAT='avro') AS SELECT `key`, LATEST_BY_OFFSET(`value`) `value` FROM `source_topic_1_stream` GROUP BY `key` EMIT CHANGES;
CREATE TABLE `source_pull_topic_6_from_1` WITH (KAFKA_TOPIC='source-pull-topic-6-from-1', PARTITIONS=6, REPLICAS=3, VALUE_FORMAT='avro') AS SELECT `key`, LATEST_BY_OFFSET(`value`) `value` FROM `source_topic_1_stream` GROUP BY `key` EMIT CHANGES;
CREATE TABLE `source_pull_topic_1_from_6` WITH (KAFKA_TOPIC='source-pull-topic-1-from-6', PARTITIONS=1, REPLICAS=3, VALUE_FORMAT='avro') AS SELECT `key`, LATEST_BY_OFFSET(`value`) `value` FROM `source_topic_6_stream` GROUP BY `key` EMIT CHANGES;
CREATE TABLE `source_pull_topic_6_from_6` WITH (KAFKA_TOPIC='source-pull-topic-6-from-6', PARTITIONS=6, REPLICAS=3, VALUE_FORMAT='avro') AS SELECT `key`, LATEST_BY_OFFSET(`value`) `value` FROM `source_topic_6_stream` GROUP BY `key` EMIT CHANGES;
As best as I can tell there is no plain CLI producer for inserting into an Avro topic; however, I suspect this can be done directly in KSQL. We populate the topics with a Python script, which I am not in a position to share. I would recommend inserting at least 6 records into each source topic so that every partition receives data.
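For anyone reproducing this without a script, the streams should be populatable straight from the KSQL CLI with INSERT INTO (the keys and values below are purely illustrative; repeat with further keys so that all six partitions of source-topic-6 receive data):

```sql
-- Illustrative records; vary the key to spread across partitions
INSERT INTO `source_topic_1_stream` (`key`, `value`) VALUES ('key-1', 'value-1');
INSERT INTO `source_topic_6_stream` (`key`, `value`) VALUES ('key-1', 'value-1');
INSERT INTO `source_topic_6_stream` (`key`, `value`) VALUES ('key-2', 'value-2');
```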
A push query against any of the materialised tables returns all data as expected. A pull query, either in the KSQL CLI or over the KSQL REST API, also works as expected against either of the materialised tables that use source-topic-1 as the object they are materialising from. However, a pull query against the materialised tables built on top of source-topic-6 is where things stop working.
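Concretely, the failing queries look like this (the key is illustrative; the same shape of query succeeds against the tables built from source-topic-1):

```sql
-- Pull query against a table materialised from the 6-partition topic;
-- returns nothing or errors depending on which node holds the key
SELECT * FROM `source_pull_topic_6_from_6` WHERE `key` = 'key-1';
```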
If you select a key stored on a partition that is not hosted on the same node as the KSQL server you are querying, you just get an empty response; it fails to find the record. If you query for a key that is on the same node as the KSQL server, you get the following message in KSQL:
ERROR Exhausted standby hosts to try. (io.confluent.ksql.cli.console.Console:345)
Exhausted standby hosts to try.
Query terminated
This may be misleading, as the following exception is recorded in the KSQL log; it will follow in the comments as I have exceeded the character count.
We are running a three-node cluster on the current (7.0.0) version of the Confluent Community Platform. The VMs running the cluster have been hardened, so we have set various environment variables to repoint the /tmp directories for the likes of Snappy etc.
Any pointers as to what could be causing the problem would be really helpful in moving us forward, unless we have hit an actual bug in the platform? If anything is unclear, please let me know and I will try to clarify.
Thanks in advance,
Leander