I am working with a Kafka 3.6.1 cluster in KRaft mode and would like some guidance on scaling brokers and controllers. Below are the details of my setup, the steps I followed, and the problems I ran into. Before going to production, I tested the scaling process in a 2-node test environment (brokers and KRaft controllers run on the same nodes).
Test Cluster Setup:
- Initial Configuration:
  - Nodes: 2
  - Controller quorum configuration on nodes: controller.quorum.voters=0@172.26.1.103:9093,1@172.26.1.189:9093
- Scaling Process:
  - Added a new node (172.26.1.81).
  - Configured controller.quorum.voters on the new node as: controller.quorum.voters=0@172.26.1.103:9093,1@172.26.1.189:9093,2@172.26.1.81:9093
  - Started Kafka on the new node, which connected successfully as an observer in the KRaft quorum.
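To make the before/after voter sets easier to compare, here is a minimal Python sketch that parses the two `controller.quorum.voters` values quoted above and reports which node ID was added and how many voters form a majority. The helper names (`parse_quorum_voters`, `majority`) are my own and not part of any Kafka tooling.

```python
from typing import NamedTuple


class Voter(NamedTuple):
    node_id: int
    host: str
    port: int


def parse_quorum_voters(value: str) -> list[Voter]:
    """Parse a controller.quorum.voters value of the form id@host:port,..."""
    voters = []
    for entry in value.split(","):
        node_id, _, endpoint = entry.strip().partition("@")
        host, _, port = endpoint.rpartition(":")
        voters.append(Voter(int(node_id), host, int(port)))
    return voters


def majority(voter_count: int) -> int:
    """Smallest number of voters that forms a strict majority."""
    return voter_count // 2 + 1


old = parse_quorum_voters("0@172.26.1.103:9093,1@172.26.1.189:9093")
new = parse_quorum_voters(
    "0@172.26.1.103:9093,1@172.26.1.189:9093,2@172.26.1.81:9093"
)

added = sorted({v.node_id for v in new} - {v.node_id for v in old})
print("added voter ids:", added)
print("majority before:", majority(len(old)), "after:", majority(len(new)))
```

With two voters the majority is also two, so the original test quorum cannot make progress if either controller is down.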
Issues Encountered:
- After starting, the new node was listed as an observer instead of a voter:
ClusterId: mXMb-Ah9Q8uNFoMtqGrBag
LeaderId: 0
LeaderEpoch: 7
HighWatermark: 33068
MaxFollowerLag: 0
MaxFollowerLagTimeMs: 0
CurrentVoters: [0,1]
CurrentObservers: [2]
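The status block above has the shape printed by `kafka-metadata-quorum.sh ... describe --status`. As a rough sketch, the following Python turns that text into a dictionary and reports the role of a given node ID. The function names are hypothetical, and the parser only assumes the `Key: value` layout shown above.

```python
import re


def parse_quorum_status(text: str) -> dict:
    """Parse 'Key: value' lines; bracketed values become lists of ints."""
    status = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if value.startswith("["):
            status[key] = [int(x) for x in re.findall(r"-?\d+", value)]
        elif re.fullmatch(r"-?\d+", value):
            status[key] = int(value)
        else:
            status[key] = value
    return status


def role_of(status: dict, node_id: int) -> str:
    """Return 'voter', 'observer', or 'unknown' for the given node ID."""
    if node_id in status.get("CurrentVoters", []):
        return "voter"
    if node_id in status.get("CurrentObservers", []):
        return "observer"
    return "unknown"


sample = """ClusterId: mXMb-Ah9Q8uNFoMtqGrBag
LeaderId: 0
LeaderEpoch: 7
HighWatermark: 33068
MaxFollowerLag: 0
MaxFollowerLagTimeMs: 0
CurrentVoters: [0,1]
CurrentObservers: [2]"""

status = parse_quorum_status(sample)
print(role_of(status, 2))
```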
- Updating controller.quorum.voters on the old nodes caused an error:
[2024-12-02 12:04:11,314] ERROR [SharedServer id=0] Got exception while starting SharedServer (kafka.server.SharedServer)
java.lang.IllegalStateException: Configured voter set: [0, 1, 2] is different from the voter set read from the state file: [0, 1]. Check if the quorum configuration is up to date, or wipe out the local state file if necessary
at org.apache.kafka.raft.QuorumState.initialize(QuorumState.java:132)
at org.apache.kafka.raft.KafkaRaftClient.initialize(KafkaRaftClient.java:375)
at kafka.raft.KafkaRaftManager.buildRaftClient(RaftManager.scala:248)
at kafka.raft.KafkaRaftManager.<init>(RaftManager.scala:174)
at kafka.server.SharedServer.start(SharedServer.scala:260)
at kafka.server.SharedServer.startForController(SharedServer.scala:132)
at kafka.server.ControllerServer.startup(ControllerServer.scala:192)
at kafka.server.KafkaRaftServer.$anonfun$startup$1(KafkaRaftServer.scala:95)
at kafka.server.KafkaRaftServer.$anonfun$startup$1$adapted(KafkaRaftServer.scala:95)
at scala.Option.foreach(Option.scala:437)
at kafka.server.KafkaRaftServer.startup(KafkaRaftServer.scala:95)
at kafka.Kafka$.main(Kafka.scala:113)
at kafka.Kafka.main(Kafka.scala)
[2024-12-02 12:04:11,325] INFO [ControllerServer id=0] Waiting for controller quorum voters future (kafka.server.ControllerServer)
[2024-12-02 12:04:11,328] INFO [ControllerServer id=0] Finished waiting for controller quorum voters future (kafka.server.ControllerServer)
[2024-12-02 12:04:11,331] ERROR Encountered fatal fault: caught exception (org.apache.kafka.server.fault.ProcessTerminatingFaultHandler)
java.lang.NullPointerException: Cannot invoke "kafka.raft.KafkaRaftManager.apiVersions()" because the return value of "kafka.server.SharedServer.raftManager()" is null
at kafka.server.ControllerServer.startup(ControllerServer.scala:205)
at kafka.server.KafkaRaftServer.$anonfun$startup$1(KafkaRaftServer.scala:95)
at kafka.server.KafkaRaftServer.$anonfun$startup$1$adapted(KafkaRaftServer.scala:95)
at scala.Option.foreach(Option.scala:437)
at kafka.server.KafkaRaftServer.startup(KafkaRaftServer.scala:95)
at kafka.Kafka$.main(Kafka.scala:113)
at kafka.Kafka.main(Kafka.scala)
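The first exception says the configured voter set differs from the one stored in the local state file. A small pre-flight check along these lines could catch that before a restart. This is only a sketch: it assumes the quorum-state file is JSON with a `currentVoters` array of `{"voterId": N}` objects, and the sample content below is hypothetical, modelled on the error message rather than copied from a real file.

```python
import json


def voters_from_config(value: str) -> set[int]:
    """Extract node IDs from a controller.quorum.voters value."""
    return {int(entry.split("@", 1)[0]) for entry in value.split(",")}


def voters_from_state(state_json: str) -> set[int]:
    """Extract voter IDs from quorum-state content (assumed JSON layout)."""
    data = json.loads(state_json)
    return {voter["voterId"] for voter in data["currentVoters"]}


# Hypothetical quorum-state content; the field layout is an assumption.
state_json = (
    '{"clusterId":"","leaderId":0,"leaderEpoch":7,"votedId":-1,'
    '"appliedOffset":0,"currentVoters":[{"voterId":0},{"voterId":1}],'
    '"data_version":0}'
)
config_value = "0@172.26.1.103:9093,1@172.26.1.189:9093,2@172.26.1.81:9093"

configured = voters_from_config(config_value)
stored = voters_from_state(state_json)

if configured != stored:
    print(f"mismatch: configured={sorted(configured)} stored={sorted(stored)}")
else:
    print("voter sets match")
```

In practice the state content would be read from the quorum-state file on disk instead of the inline string used here.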
According to the logs, I need to “wipe out the local state file.” The only file with “state” in its name is quorum-state, which sits in the following directory under the Kafka data dir:
/var/lib/kafka/data/__cluster-metadata-0
I deleted that file on the old broker (103) and restarted Kafka, which started successfully. I then queried node 103 for the KRaft quorum status and got:
ClusterId: mXMb-Ah9Q8uNFoMtqGrBag
LeaderId: 2
LeaderEpoch: 125
HighWatermark: 84616
MaxFollowerLag: 84617
MaxFollowerLagTimeMs: -1
CurrentVoters: [0,1,2]
CurrentObservers: []
LeaderId is 2? That was unexpected :)
Running the same query against the other old node (189) returned:
ClusterId: mXMb-Ah9Q8uNFoMtqGrBag
LeaderId: 1
LeaderEpoch: 8
HighWatermark: -1
MaxFollowerLag: 74376
MaxFollowerLagTimeMs: -1
CurrentVoters: [0,1]
CurrentObservers: []
Running the same query against the new node (81) returned:
ClusterId: mXMb-Ah9Q8uNFoMtqGrBag
LeaderId: 2
LeaderEpoch: 125
HighWatermark: 84813
MaxFollowerLag: 84814
MaxFollowerLagTimeMs: -1
CurrentVoters: [0,1,2]
CurrentObservers: []
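The three status outputs above disagree with each other. A short sketch like the following groups nodes by the (leader, epoch, voter set) they report, which makes the split easier to see. The values are typed in by hand from the outputs above, and `group_by_view` is a hypothetical helper name.

```python
from collections import defaultdict

# Quorum view reported by each node ID, copied from the outputs above.
views = {
    0: {"LeaderId": 2, "LeaderEpoch": 125, "CurrentVoters": [0, 1, 2]},  # 172.26.1.103
    1: {"LeaderId": 1, "LeaderEpoch": 8, "CurrentVoters": [0, 1]},       # 172.26.1.189
    2: {"LeaderId": 2, "LeaderEpoch": 125, "CurrentVoters": [0, 1, 2]},  # 172.26.1.81
}


def group_by_view(views: dict) -> dict:
    """Group node IDs by the (leader, epoch, voters) tuple they report."""
    groups = defaultdict(list)
    for node_id, view in sorted(views.items()):
        key = (view["LeaderId"], view["LeaderEpoch"], tuple(view["CurrentVoters"]))
        groups[key].append(node_id)
    return dict(groups)


groups = group_by_view(views)
for (leader, epoch, voters), nodes in groups.items():
    print(f"nodes {nodes} see leader={leader} epoch={epoch} voters={list(voters)}")

if len(groups) > 1:
    print("nodes disagree about the quorum")
```

Here nodes 0 and 2 share one view while node 1 still believes in the original two-voter quorum with itself as leader.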
So the old node 189 is out of sync with the other two nodes. I deleted the quorum-state file on node 189 as well, and after restarting it I encountered the following error:
[2024-12-02 12:16:33,310] ERROR Encountered fatal fault: Unexpected error in raft IO thread (org.apache.kafka.server.fault.ProcessTerminatingFaultHandler)
java.lang.IllegalStateException: Cannot transition to Follower with leaderId=2 and epoch=125 since it is not one of the voters [0, 1]
at org.apache.kafka.raft.QuorumState.transitionToFollower(QuorumState.java:382)
at org.apache.kafka.raft.KafkaRaftClient.transitionToFollower(KafkaRaftClient.java:522)
That was unexpected too, so I deleted the quorum-state file on the newest node (81) as well. Deleting the quorum-state file on all affected nodes resolved the issue, but the process felt risky and unstructured.
Rebalancing Partitions:
After adding the new node, I rebalanced the partitions using the following commands:
/opt/kafka/bin/kafka-reassign-partitions.sh --bootstrap-server 172.26.1.103:9092 --command-config /etc/kafka/admin.properties --generate --topics-to-move-json-file topics.json --broker-list "0,1,2" > reassignment_plan.json
/opt/kafka/bin/kafka-reassign-partitions.sh --bootstrap-server 172.26.1.103:9092 --command-config /etc/kafka/admin.properties --execute --reassignment-json-file reassignment_plan.json
The partitions balanced well across the brokers after waiting for 3–4 minutes.
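One detail worth noting about the two commands: as far as I know, `--generate` prints both the current assignment and the proposed one, each preceded by a heading line, so the redirected output is not pure JSON. The sketch below builds a `topics.json` payload and pulls only the proposed plan out of such output. The topic name `demo` and the sample output are made up for illustration, and the heading text is my assumption about what the tool prints.

```python
import json

PROPOSED_MARKER = "Proposed partition reassignment configuration"


def topics_to_move(topic_names: list[str]) -> dict:
    """Build the payload expected by --topics-to-move-json-file."""
    return {"version": 1, "topics": [{"topic": name} for name in topic_names]}


def extract_proposed_plan(generate_output: str) -> dict:
    """Return the JSON object that follows the 'Proposed ...' heading."""
    _, found, tail = generate_output.partition(PROPOSED_MARKER)
    if not found:
        raise ValueError("proposed plan heading not found")
    for line in tail.splitlines():
        line = line.strip()
        if line.startswith("{"):
            return json.loads(line)
    raise ValueError("no JSON found after the proposed plan heading")


# Hypothetical --generate output for a single-partition topic named "demo".
sample_output = """Current partition replica assignment
{"version":1,"partitions":[{"topic":"demo","partition":0,"replicas":[0,1],"log_dirs":["any","any"]}]}

Proposed partition reassignment configuration
{"version":1,"partitions":[{"topic":"demo","partition":0,"replicas":[1,2],"log_dirs":["any","any"]}]}
"""

print(json.dumps(topics_to_move(["demo"])))
plan = extract_proposed_plan(sample_output)
print(json.dumps(plan))
```

Writing `plan` to `reassignment_plan.json` would give `--execute` a file that contains only the proposed assignment, and the "current" block can be kept separately as a rollback plan.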
Questions:
- What are the recommended steps to safely add new brokers and KRaft controllers to an existing Kafka cluster?
- Is it normal to require quorum-state file deletion during the scaling process?
- Are there tools or documentation specifically for scaling KRaft-based Kafka clusters that I might have missed?
Any advice or feedback on my approach would be greatly appreciated!