I have a scenario where there is a stream of KafkaMessage objects (a Java class/DTO) that share the same request id, each carrying a hashKey. I want to calculate how many hashKeys have a count of exactly 3.
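To make the goal concrete, here is a minimal plain-Java sketch (no Kafka Streams involved) of the result I am after for the messages of one window. The `Msg` class, the `hashKeysSeenExactly` method and the sample data are hypothetical stand-ins for illustration only; my real `KafkaMessage` class has more fields.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class ExactCountSketch {

    // Hypothetical stand-in for KafkaMessage: only the two fields that matter here.
    static final class Msg {
        private final String requestId;
        private final String hashKey;

        Msg(String requestId, String hashKey) {
            this.requestId = requestId;
            this.hashKey = hashKey;
        }

        String getRequestId() { return requestId; }
        String getHashKey() { return hashKey; }
    }

    // For each request id, returns how many distinct hashKeys occur exactly `target` times.
    static Map<String, Long> hashKeysSeenExactly(List<Msg> window, long target) {
        // Step 1: occurrences per hashKey, nested under the request id.
        Map<String, Map<String, Long>> perRequest = window.stream()
                .collect(Collectors.groupingBy(Msg::getRequestId,
                        Collectors.groupingBy(Msg::getHashKey, Collectors.counting())));

        // Step 2: per request id, count the hashKeys whose occurrence count equals target.
        Map<String, Long> result = new HashMap<>();
        perRequest.forEach((requestId, counts) ->
                result.put(requestId,
                        counts.values().stream().filter(c -> c == target).count()));
        return result;
    }

    public static void main(String[] args) {
        List<Msg> window = new ArrayList<>();
        // req-1: "a" three times, "b" three times, "c" twice
        for (int i = 0; i < 3; i++) window.add(new Msg("req-1", "a"));
        for (int i = 0; i < 3; i++) window.add(new Msg("req-1", "b"));
        for (int i = 0; i < 2; i++) window.add(new Msg("req-1", "c"));

        System.out.println(hashKeysSeenExactly(window, 3).get("req-1")); // prints 2
    }
}
```

So the computation has two steps: a count per (request id, hashKey) pair, then a second count per request id over those results. That second step is what I am trying to express with Kafka Streams.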
I can group the stream by its existing key and window it:

    var requestIdStream = dataStream.groupByKey().windowedBy(timeWindow);

However, it cannot be used together with:

    var windowedStream = dataStream
        // Group by requestId and hashKey
        .groupBy((key, kafkaMessage) -> kafkaMessage.getRequestId() + "/" + kafkaMessage.getHashKey(),
                 Grouped.with(Serdes.String(), SerdesFactory.kafkaMessageSerde()))
        .windowedBy(timeWindow)
        // Count occurrences of each hashKey in each window
        .count(Materialized.as("count-store"));

Then, to find the hashKeys whose count is exactly 3, I stored them in a list:

    List<String> hashKeyList = new ArrayList<>();
    windowedStream.toStream().foreach((windowedKey, count) -> {
        String hashKey = windowedKey.key();
        if (count == 3) {
            hashKeyList.add(hashKey);
            log.info("HashKey {} has appeared 3 times in the window. Added to ArrayList.", hashKey);
        }
    });

But I cannot find a way to use groupByKey() and groupBy() together.