Best way to consume, transform and produce between 2 confluent Kafka topics without ksqldb

vradhik · 4 December 2023 07:17

Hello,
If using ksqlDB is not an option, what is Confluent's recommended pattern for consuming events from a Kafka topic, transforming both the event data and its structure, and publishing the result to another Kafka topic in the same cluster?

We have a Confluent cluster deployed on-prem. At present, we have a custom connector deployed on Kafka Connect that does this. Are there any known issues or limitations with this approach?

Let me know if you need any further details. Thanks for your inputs.

Xuanzi · 6 December 2023 00:13

You could start by using the Kafka producer/consumer libraries directly. If stream processing is really a priority for your company, consider adopting Flink (though it comes with a learning curve, environment setup, etc.).

vradhik · 7 December 2023 14:17

Thanks Xuanzi. Do you mean standalone code that does this instead of a custom Kafka connector?
If so, are there any known limitations of using a custom connector?

OneCricketeer · 9 December 2023 02:00

MirrorMaker2 is the only connector that can consume and produce to Kafka. It’s not meant to be used for transforming data, only transferring bytes (mirroring topics).

Kafka Streams is the answer you’re looking for; it is what ksqlDB is an abstraction over.

The answer is not specific to Confluent.

vradhik · 9 December 2023 10:30

Thanks for your reply. I understand that ksqlDB or Kafka Streams is the recommended way and is an out-of-the-box feature of the product. MirrorMaker is not an option, as some transformation is required.

Apart from the maintainability of the custom connector code, what issues might we face if we use a custom Kafka Connect connector to do this?

jdarrahSNT · 11 December 2023 05:21

I have not tried this, but you could look at MirrorMaker2 with Single Message Transforms (SMTs). If you need to operate over more than one record at a time, you will need Kafka Streams, ksqlDB, or your own consumer/producer code.

OneCricketeer · 12 December 2023 03:54

Flink or Spark would be more flexible options, but Kafka Streams solves your use case exactly. Kafka Connect SMT code can be very rigid, and you actually need to run a Connect cluster for it. A Kafka Streams application is just a JAR file that you run.

As mentioned, since you want all the data to stay in the same cluster, that’s not a good use case for Kafka Connect.
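
To make the Kafka Streams suggestion a bit more concrete, here is a rough sketch of how the per-record transformation could be kept as a plain Java function, separate from any Kafka wiring. Everything specific in it is an assumption for illustration only: the class name `OrderEventReshaper`, the CSV input format, the JSON output shape, and the topic names in the comment do not come from this thread. The block itself uses only the standard library; the comment at the top shows where such a function would plug into a Kafka Streams topology.

```java
import java.util.Locale;

// Hypothetical example: input value is "orderId,amountCents" (CSV),
// output value is a small JSON document. Neither format is from the thread.
//
// In a Kafka Streams app with String serdes configured as defaults, this
// function could be wired between two topics roughly like this
// (topic names are placeholders):
//
//   StreamsBuilder builder = new StreamsBuilder();
//   builder.<String, String>stream("input-topic")
//          .mapValues(OrderEventReshaper::reshape)
//          .to("output-topic");
final class OrderEventReshaper {

    private OrderEventReshaper() {}

    // Pure per-record transformation: changes both the data (trimmed,
    // upper-cased id) and the structure (CSV to JSON). It has no Kafka
    // dependencies, so it can be unit tested on its own.
    // Note: the id is not JSON-escaped; a real app would use a JSON library.
    static String reshape(String csvValue) {
        String[] parts = csvValue.split(",", -1);
        if (parts.length != 2) {
            throw new IllegalArgumentException(
                    "expected 2 fields, got " + parts.length);
        }
        String orderId = parts[0].trim().toUpperCase(Locale.ROOT);
        long amountCents = Long.parseLong(parts[1].trim());
        return String.format(Locale.ROOT,
                "{\"orderId\":\"%s\",\"amountCents\":%d}", orderId, amountCents);
    }

    public static void main(String[] args) {
        System.out.println(reshape(" a1 , 1234 "));
    }
}
```

Keeping the transformation free of Kafka types means the same function could also be called from a plain consumer/producer loop, so the choice between the options discussed above does not have to be made up front.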
