Record Headers in Stream Aggregate

Records are produced to a topic with a header that helps identify the type of message and is useful in some business processing:

    ProducerRecord<String, String> producerRecord = new ProducerRecord<>(topic, key, msg);
    producerRecord.headers().add("headerkey", type.getBytes(StandardCharsets.UTF_8));
    producer.send(producerRecord);

With the help of the Processor API I am able to read the headers, but my application uses streams that do the aggregation and other processing:

    streamsBuilder.stream("topic", Consumed.with(Serdes.String(), Serdes.String()))
            .process(() -> new Processor(), "stateStore"); // able to read the headers inside the Processor class

Is there a way to read the record headers in the stream.aggregate() function of Kafka Streams?

The requirement is to read the headers for further processing:

    stream.aggregate(String::new, (key, value, aggregated) -> aggregateValues(value, aggregated),
            Materialized.with(Serdes.String(), Serdes.String())) // **header value is required here**

Hi @ArthanarisamyA ,

There’s no way to read headers in the DSL aggregate() method. I suggest either moving your aggregation into a custom processor, or using process() to map each record to a new object that contains its headers. With the latter approach, if I understood your requirements correctly, you could then use a filter to drop records you don’t want to aggregate; a sketch of that idea follows.
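
Here’s a minimal sketch of the second option, assuming Kafka Streams 3.3+ (where process() returns a KStream). The "type|payload" value layout is just an illustration of carrying the header into the value, and aggregateValues() is your existing aggregation logic:

    import java.nio.charset.StandardCharsets;
    import org.apache.kafka.common.header.Header;
    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.kstream.Consumed;
    import org.apache.kafka.streams.kstream.Grouped;
    import org.apache.kafka.streams.kstream.Materialized;
    import org.apache.kafka.streams.processor.api.ContextualProcessor;
    import org.apache.kafka.streams.processor.api.Record;

    StreamsBuilder streamsBuilder = new StreamsBuilder();

    streamsBuilder.stream("topic", Consumed.with(Serdes.String(), Serdes.String()))
            // copy the "headerkey" header into the value so it survives into the DSL
            .process(() -> new ContextualProcessor<String, String, String, String>() {
                @Override
                public void process(Record<String, String> record) {
                    Header typeHeader = record.headers().lastHeader("headerkey");
                    String type = typeHeader == null
                            ? "" : new String(typeHeader.value(), StandardCharsets.UTF_8);
                    context().forward(record.withValue(type + "|" + record.value()));
                }
            })
            // drop records whose header marks them as not participating in the aggregation
            .filter((key, value) -> !value.startsWith("|"))
            .groupByKey(Grouped.with(Serdes.String(), Serdes.String()))
            .aggregate(String::new,
                    (key, value, aggregated) -> aggregateValues(value, aggregated),
                    Materialized.with(Serdes.String(), Serdes.String()));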

HTH,
Bill


Hi @bbejeck,

Thanks for your time and the reply, and yes, your understanding is right. There may be other records that should not participate in the consolidation, and those records need to be filtered out.

Considering your suggestion to use a custom processor for the aggregation, could you please help me understand it better? My understanding is that the custom processor’s process(Record<String, String> record) method gives only the current record for processing, whereas the aggregate(key, currVal, aggregatedVal) method provides the key, the current record, and the previously aggregated value once the stream is grouped by key.

Hi @ArthanarisamyA ,

Great question - here are the steps you’ll need to take:

  1. Add a state store to the topology and connect it to the processor.
  2. Look up the running aggregate from the store by key and combine it with the current value.
  3. Put the updated aggregation back into the store.

This is more or less the same procedure used by the DSL; a rough sketch is below.
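
The following is only an outline of those three steps, assuming Kafka Streams 3.3+; the store name "aggStore" is illustrative, and aggregateValues() refers to your existing aggregation logic from the DSL version:

    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.kstream.Consumed;
    import org.apache.kafka.streams.processor.api.ContextualProcessor;
    import org.apache.kafka.streams.processor.api.ProcessorContext;
    import org.apache.kafka.streams.processor.api.Record;
    import org.apache.kafka.streams.state.KeyValueStore;
    import org.apache.kafka.streams.state.Stores;

    StreamsBuilder streamsBuilder = new StreamsBuilder();

    // step 1: add a state store to the topology and connect it to the processor
    streamsBuilder.addStateStore(Stores.keyValueStoreBuilder(
            Stores.persistentKeyValueStore("aggStore"),
            Serdes.String(), Serdes.String()));

    streamsBuilder.stream("topic", Consumed.with(Serdes.String(), Serdes.String()))
            .process(() -> new ContextualProcessor<String, String, String, String>() {
                private KeyValueStore<String, String> store;

                @Override
                public void init(ProcessorContext<String, String> context) {
                    super.init(context);
                    store = context.getStateStore("aggStore");
                }

                @Override
                public void process(Record<String, String> record) {
                    // the headers are available here via record.headers()
                    // step 2: look up the running aggregate by key and combine it with the current value
                    String aggregated = store.get(record.key());
                    String updated = aggregateValues(record.value(),
                            aggregated == null ? "" : aggregated);
                    // step 3: put the updated aggregation back into the store
                    store.put(record.key(), updated);
                    context().forward(record.withValue(updated));
                }
            }, "aggStore");
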
-Bill

Thanks @bbejeck

I had been thinking of the same approach; I was just curious whether there were any better alternatives for handling it.