ProductionExceptionHandler in infinite loop for Kafka stream with exactly once enabled

I have an application written in Kotlin, and I use one quite straightforward Kafka stream inside it.

I provided a ProductionExceptionHandler and an UncaughtExceptionHandler because I do not want the service to crash. Inside both of them I log the exception; from the first handler I then return ProductionExceptionHandlerResponse.CONTINUE, and from the second one StreamThreadExceptionResponse.REPLACE_THREAD.

I have a test where I force the stream to throw a RecordTooLargeException. It is thrown correctly, and the ProductionExceptionHandler is invoked to handle it.

The problem is that the service goes into an infinite loop and tries to send the problematic message again and again.

Can I somehow skip that message? The transaction is, of course, never committed; the service just keeps trying to end it.

I believe you are hitting KAFKA-15259, "Kafka Streams does not continue processing due to rollback despite ProductionExceptionHandlerResponse.CONTINUE if using exactly_once" (ASF JIRA).

It’s actually a change in how the producer works that “breaks” Kafka Streams. Details are on the ticket.

The only workaround I can think of (for RecordTooLargeException) is to add an explicit step to your KS program that serializes the data, checks its size, and drops the record if it is too big before it hits the sink. Of course, for any other exception this workaround would not be sufficient.
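
A minimal sketch of that size check in Kotlin, under some assumptions: the helper name fitsWithinLimit and the constant ASSUMED_MAX_VALUE_BYTES are made up for illustration, and the limit is simply a value chosen below the producer's default max.request.size of 1048576 bytes to leave headroom for the key, headers, and record overhead. The serializer is passed in as a plain function so the check itself does not depend on any particular serde.

```kotlin
// Hypothetical helper; names and the limit below are illustrative assumptions,
// not part of the Kafka Streams API.

// Assumed budget for the serialized value. The producer's default
// max.request.size is 1048576 bytes; this stays below it on purpose.
const val ASSUMED_MAX_VALUE_BYTES = 900_000

// Returns true if the serialized value is small enough to be forwarded.
fun <V> fitsWithinLimit(
    value: V?,
    maxBytes: Int = ASSUMED_MAX_VALUE_BYTES,
    serialize: (V) -> ByteArray?,
): Boolean {
    if (value == null) return true               // null values carry no payload
    val bytes = serialize(value) ?: return true  // serializer produced nothing
    return bytes.size <= maxBytes
}

fun main() {
    val toBytes: (String) -> ByteArray? = { it.toByteArray(Charsets.UTF_8) }

    println(fitsWithinLimit("small", 10, toBytes))         // true
    println(fitsWithinLimit("x".repeat(11), 10, toBytes))  // false

    // In a topology, the same check could sit just before the sink, e.g.
    // (valueSerde and topic stand for whatever your application uses):
    //   stream
    //       .filter { _, v -> fitsWithinLimit(v) { valueSerde.serializer().serialize(topic, it) } }
    //       .to(topic)
}
```

Note that this serializes each value twice (once for the check, once in the sink), and that it only looks at the value, so the budget should be tightened further if your keys or headers are large.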
