Context:
I’m working with Apache Kafka using the following setup:
- `replication.factor = 3`
- `min.insync.replicas = 2`
- Producer is configured with `acks=all`
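For reference, the producer side looks roughly like this (a minimal sketch only; the bootstrap address, topic name, key, and serializers are placeholders, not my exact setup):

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class DurableProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        // acks=all, as described in the setup above
        props.put(ProducerConfig.ACKS_CONFIG, "all");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Callback fires once the broker has acknowledged (or rejected) the write
            producer.send(new ProducerRecord<>("my-topic", "key", "M1"), (metadata, exception) -> {
                if (exception != null) {
                    System.err.println("send failed: " + exception);
                } else {
                    System.out.printf("acked: partition=%d offset=%d%n",
                            metadata.partition(), metadata.offset());
                }
            });
            producer.flush();
        }
    }
}
```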
Let’s assume:
- Broker 101 is the leader, and brokers 102 and 103 are followers.
- The ISR at time `T0` is `[101, 102, 103]`.
- A new message `M1` is written to Kafka. It gets replicated to 101 (leader) and 102 (follower) quickly.
- However, 103 has not fetched the message yet, but it is still in the ISR because `replica.lag.time.max.ms` hasn't expired.
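To make that ISR state concrete, this is roughly how ISR membership can be inspected with the AdminClient (a sketch; the topic name and bootstrap address are placeholders, and `allTopicNames()` assumes Kafka clients 3.1+):

```java
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.TopicDescription;

public class IsrCheck {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
        try (Admin admin = Admin.create(props)) {
            TopicDescription desc = admin.describeTopics(Collections.singletonList("my-topic"))
                    .allTopicNames().get().get("my-topic");
            // Print the current leader and ISR for each partition
            desc.partitions().forEach(p ->
                    System.out.printf("partition=%d leader=%s isr=%s%n",
                            p.partition(), p.leader(), p.isr()));
        }
    }
}
```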
Now, suppose:
- Broker 101 crashes suddenly.
- Kafka elects a new leader from the current ISR → picks 103.
- Since 103 never fetched message `M1`, it becomes leader and truncates its log to the last known high watermark (HW), resulting in the loss of `M1`, even though the producer already received a success ack!
Problem:
This seems to violate the durability guarantee of `acks=all`. The message was acknowledged but lost because a stale ISR member became leader.
My Questions:
- Is this behavior expected in Kafka’s current replication model?
- What’s the recommended way to prevent this type of data loss?
  - Tuning `replica.lag.time.max.ms`?
  - Matching `min.insync.replicas` to the `replication.factor`? (see the config sketch after this list)
  - Any newer improvements in KRaft mode or Raft-based replication?
- Are there any known trade-offs between availability and durability when enforcing tighter ISR behavior?
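For concreteness, the "match `min.insync.replicas` to the `replication.factor`" option would amount to something like the following topic-level override (a hypothetical sketch; the topic name and bootstrap address are placeholders, and I'm not claiming this is the recommended fix, that's part of the question):

```java
import java.util.Collections;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.AlterConfigOp;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;

public class TightenIsr {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
        try (Admin admin = Admin.create(props)) {
            ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "my-topic");
            // Raise min.insync.replicas to match replication.factor = 3
            AlterConfigOp raiseMinIsr = new AlterConfigOp(
                    new ConfigEntry("min.insync.replicas", "3"), AlterConfigOp.OpType.SET);
            admin.incrementalAlterConfigs(Map.of(topic, Collections.singletonList(raiseMinIsr)))
                    .all().get();
        }
    }
}
```

My understanding is that with `min.insync.replicas = 3` and `acks=all`, produce requests are rejected whenever any replica drops out of the ISR, so a single broker outage blocks writes entirely, which is why I'm asking about the availability-vs-durability trade-off above.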