Issues with messages >= 4096 characters

Hi all, I have been working with ksqlDB for a few weeks and am running into what looks like a message size issue. I created a stream that moves a message from one topic to another based on a value. It works fine with messages of up to 4095 characters. One more character and the stream doesn’t process the message; one fewer and it works fine. The message is JSON. Messages of 4096 characters (an interesting binary number) or larger don’t work. This is far under the Kafka max message size, and I don’t see any message size constraints associated with ksqlDB. Any thoughts?

Hi @tparker, this sounds like an interesting issue. Can you share steps to reproduce and details of what version you’re running please?

Hi, thanks for the quick reply. I am using AWS MSK for Kafka, and we have set up ksqlDB as a standalone service as outlined in "ksqlDB: The database purpose-built for stream processing applications".

I have a main topic with a stream associated with it, and two other streams that route the messages to two other topics based on a field value.

Version: CLI v0.19.0, Server v0.19.0

Here are simplified commands:

CREATE STREAM test_split (TableName__c VARCHAR, Target_Table__c VARCHAR)
WITH (KAFKA_TOPIC='srctotrgtmappingTopic', PARTITIONS=1, REPLICAS=1, VALUE_FORMAT='JSON');

CREATE STREAM test_contact
WITH (KAFKA_TOPIC='test_contact', PARTITIONS=1, REPLICAS=1, VALUE_FORMAT='JSON')
AS SELECT * FROM test_split WHERE LCASE(Target_Table__c) = 'contact';

CREATE STREAM test_account
WITH (KAFKA_TOPIC='test_account', PARTITIONS=1, REPLICAS=1, VALUE_FORMAT='JSON')
AS SELECT * FROM test_split WHERE LCASE(Target_Table__c) = 'account';

show topics;
 Kafka Topic                 | Partitions | Partition Replicas
---------------------------------------------------------------
 test_account                | 1          | 1
 test_contact                | 1          | 1
 test_split                  | 1          | 1

show streams;
 Stream Name               | Kafka Topic                 | Key Format | Value Format | Windowed
------------------------------------------------------------------------------------------------
 TEST_ACCOUNT              | test_account                | KAFKA      | JSON         | false
 TEST_CONTACT              | test_contact                | KAFKA      | JSON         | false
 TEST_SPLIT                | test_split                  | KAFKA      | JSON         | false

Do you see an error in the ksqlDB server log when this happens? Or it’s just silently dropped? And is srctotrgtmappingTopic the source topic with the large value?

And is the data definitely in the topic in the first place? How is it being populated?

Ha, you're a genius. You found the problem. LOL. You asked how it is being populated. It was being populated via the kafka-console-producer, which apparently has an issue when data of 4096 characters or more is pasted into it. When I checked the topic, it had the data minus the tail end of the message.

FYI for anyone else running into a similar issue: I am running on Windows with Ubuntu running under WSL. This is new to me as I usually run on a Mac. I am not sure whether any of these contribute to the situation.
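
For anyone who wants to check for the same symptom, here is a minimal Python sketch. It reports the length of a message value and whether it still parses as JSON, and it serializes a record as a single line so it can be fed to the producer from a file instead of being pasted. The 4096 figure is an assumption on my part (Linux terminals in canonical mode cap one input line at 4096 bytes including the newline); it matches the behaviour described above but has not been confirmed as the cause in this thread. The names `check_payload` and `to_single_line` are hypothetical helpers, not part of any Kafka tooling.

```python
import json

# Assumption: a canonical-mode terminal line is capped at 4096 bytes
# including the newline, so at most 4095 characters of a paste survive.
TTY_LINE_LIMIT = 4096


def check_payload(raw: str) -> dict:
    """Report the length of a message value and whether it parses as JSON."""
    try:
        json.loads(raw)
        valid = True
    except json.JSONDecodeError:
        valid = False
    return {
        "chars": len(raw),
        "valid_json": valid,
        # Invalid JSON sitting at or beyond the limit was probably cut off.
        "likely_truncated": (not valid) and len(raw) >= TTY_LINE_LIMIT - 1,
    }


def to_single_line(record: dict) -> str:
    """Serialize a record as compact one-line JSON (one message per line)."""
    return json.dumps(record, separators=(",", ":"))


if __name__ == "__main__":
    # Made-up record for illustration only; the field value is filler.
    record = {"Target_Table__c": "contact", "payload": "x" * 5000}
    line = to_single_line(record)
    print(check_payload(line))         # the complete message
    print(check_payload(line[:4095]))  # the same message cut at 4095 characters
```

Writing each `to_single_line` result to a file and redirecting that file into the producer's standard input avoids the terminal paste altogether, since kafka-console-producer reads its messages from stdin.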

Thanks for your help!

Awesome - glad it’s working!

For consideration, kafkacat is my go-to tool for getting data in (and out) of Kafka from the command line. It is more flexible than kafka-console-producer.

Thank you. I will check out that tool…
