How to add headers to messages with the Datagen connector in Confluent Control Center?

I’m trying to add headers to messages with the Datagen connector in Confluent Control Center. Is this possible?

I have this configuration:
{
  "name": "AvroSender",
  "connector.class": "io.confluent.kafka.connect.datagen.DatagenConnector",
  "kafka.topic": "topic.name",
  "max.interval": "1",
  "schema.string": "{\"fields\":[{\"name\":\"systemId\",\"type\":{\"avro.java.string\":\"String\",\"type\":\"string\"}}],\"name\":\"avro\",\"namespace\":\"com.example.avro\",\"type\":\"record\"}",
  "schema.keyfield": "key"
}

The messages sent by the connector have these headers:
task.generation
task.id
current.iteration
But I need other headers.

Thanks in advance

Hi @davidleongz, can you explain a bit more about what kind of information you need in the headers, and why you would not add this data to the key or value? Kafka didn’t even have headers until 0.11, but it has had key/value pairs since birth :slight_smile:

If the information you are storing in the headers is also needed when you move or transform the data, it is probably better to model it in a schema. That way, if the data goes into a database or object store, it is not lost (as it would be if it were implemented in a message header). You can also use it in transformations (for example in ksqlDB).

Many folks use some sort of “envelope schema” instead, i.e. they nest the original message one level deep inside a schema that also has a top-level place for whatever metadata you want to add. You can find some information about this approach here: Spring for Apache Kafka and Protobuf Part 1: Event Data Modeling
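To make the envelope idea concrete, here is a minimal Python sketch that wraps the schema from the question one level deep and adds top-level metadata fields. The record name Envelope and the metadata field names eventType and sourceSystem are hypothetical, chosen only for illustration, and this is not tested against the Datagen connector itself. It only shows how the nested schema and the escaped string for "schema.string" could be produced.

```python
import json

# Payload schema taken from the connector config in the question.
payload_schema = {
    "type": "record",
    "name": "avro",
    "namespace": "com.example.avro",
    "fields": [
        {"name": "systemId",
         "type": {"type": "string", "avro.java.string": "String"}},
    ],
}


def build_envelope_schema(payload, metadata_fields):
    """Nest `payload` under a 'payload' field and add string metadata fields
    at the top level of a new record."""
    fields = [{"name": name, "type": "string"} for name in metadata_fields]
    fields.append({"name": "payload", "type": payload})
    return {
        "type": "record",
        "name": "Envelope",  # hypothetical record name
        "namespace": payload["namespace"],
        "fields": fields,
    }


# Hypothetical metadata that would otherwise have gone into headers.
envelope = build_envelope_schema(payload_schema, ["eventType", "sourceSystem"])

# json.dumps yields the schema as a single string; embedding that string in a
# JSON config file is what produces the escaped quotes seen in "schema.string".
schema_string = json.dumps(envelope)
print(json.dumps({"schema.string": schema_string}, indent=2))
```

With this shape the metadata travels inside the value, so it survives sinks and transformations that ignore headers.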

Let us know if this helps, feel free to add more context so we can give you more focused guidance

So is using headers in Kafka a bad practice? I have a consumer that handles a high volume of messages and I need to filter them. Is it possible that filtering by header is good for performance? Thanks!

@davidleongz here you can find an example built in ksqlDB. If your use case has many output topics and you need to optimize, you will probably be better off using Kafka Streams, as it allows you to read once from the source topic and route each message to a different topic depending on its content. There are benefits to using headers for some use cases; you can find additional guidance in this blog post (look at Tip #5). If you end up using Kafka Streams, here you can find how to read the headers in code.

In short:

Is it possible that filtering by header is good for performance?

Yes, it is possible in some use cases (for example, when you don’t need to read the content of the key or value AT ALL during processing and are only filtering or routing messages), but it is not always the best approach. For example, performance in your case might be “good enough” with other approaches (like ksqlDB, or using information in the key/value), and you may get more value out of making the “logic” accessible to a wider audience (for example, by not requiring a Kafka or Kafka Streams developer).
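As a rough illustration of the header-only filtering case, here is a small Python sketch with no Kafka client involved. Records are modeled as (key, value, headers) tuples with headers as a list of (name, bytes) pairs, which is an assumption made for this example, and the header name eventType is hypothetical. Note that the consumer still receives every record from the broker; what header filtering can save is the cost of deserializing payloads you are going to discard.

```python
import json


def header_matches(headers, name, expected):
    """Return True if the last header called `name` has the value `expected`."""
    values = [value for key, value in (headers or []) if key == name]
    return bool(values) and values[-1] == expected


def process(records, name, expected):
    """Yield decoded values only for records whose header matches.

    Non-matching records are skipped before the payload is parsed.
    """
    for _key, value, headers in records:
        if not header_matches(headers, name, expected):
            continue  # payload is never deserialized for this record
        yield json.loads(value)


# Hypothetical sample records: (key, value, headers).
records = [
    (b"k1", b'{"systemId": "a"}', [("eventType", b"order")]),
    (b"k2", b"not valid JSON", [("eventType", b"audit")]),
    (b"k3", b'{"systemId": "c"}', None),
]

result = list(process(records, "eventType", b"order"))
print(result)
```

The second record carries a payload that would fail to parse, but it is filtered out on its header first, which is the situation where this approach pays off.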

I hope this gives you some ideas. Let us know what you end up using, and if you want to blog about it, be our guest and share it with us :slight_smile:

@gianlucanatali I removed the headers and moved the data into the payload. Thank you for your answers.
