Apache Kafka 3.0 and KRaft quorum

Hi,
I set up a small local lab with Docker to try out Apache Kafka 3.0 and KRaft mode.
I was able to run it with 1 broker and with 3 brokers. For the latter I modified server.properties, changing the list of voters like this:
controller.quorum.voters=1@broker01:9093,2@broker02:9093,3@broker03:9093
and it worked without problems. I was able to kill one broker and the cluster reestablished the quorum.
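
For anyone who wants to reproduce the lab: a minimal combined-mode server.properties for broker01 looks roughly like this (it is essentially the config/kraft/server.properties example shipped with 3.0; the host names are the ones from my Docker network, and the other brokers only differ in node.id and advertised.listeners):

process.roles=broker,controller
node.id=1
controller.quorum.voters=1@broker01:9093,2@broker02:9093,3@broker03:9093
listeners=PLAINTEXT://:9092,CONTROLLER://:9093
inter.broker.listener.name=PLAINTEXT
advertised.listeners=PLAINTEXT://broker01:9092
controller.listener.names=CONTROLLER
listener.security.protocol.map=CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT
log.dirs=/tmp/kraft-combined-logs
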
Anyway, I was puzzled when I discovered that it is possible to specify an even number of brokers in controller.quorum.voters: when I set up a cluster with only two brokers it started, but killing one broker left the cluster, of course, in an unusable state.
Is this considered a bug? I mean, from a configuration perspective, should it be forbidden to specify an even number of voters?

Hi,

Any errors/warnings in the logs?
There is a similar/related issue which is still open and targeted for the next release:
https://issues.apache.org/jira/browse/KAFKA-12712

Nevertheless, it might be worth logging an issue/bug.

Hi,
there isn’t any particular warning. The cluster just starts with an even number of nodes.
I’ve created a two-broker configuration in Docker with hosts broker01 and broker02.

broker01 has in its server.properties

node.id=1
controller.quorum.voters=1@broker01:9093,2@broker02:9093

while broker02 has

node.id=2
controller.quorum.voters=1@broker01:9093,2@broker02:9093
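
Before the first start I formatted both log directories with the same cluster id, roughly the standard KRaft bootstrap (the script names and the config path below are from my packaging and may differ, e.g. kafka-storage.sh in the plain Apache distribution):

kafka-storage random-uuid
kafka-storage format -t <cluster-id> -c /etc/kafka/server.properties
kafka-server-start /etc/kafka/server.properties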

When the cluster starts I can see it reach a quorum.

broker01    | [2021-09-22 14:21:05,462] INFO [BrokerLifecycleManager id=1] The broker has been unfenced. Transitioning from RECOVERY to RUNNING. (kafka.server.BrokerLifecycleManager)
broker01    | [2021-09-22 14:21:05,965] INFO [Controller 1] Unfenced broker: UnfenceBrokerRecord(id=1, epoch=1) (org.apache.kafka.controller.ClusterControlManager)
broker02    | [2021-09-22 14:21:04,555] INFO [Controller 2] Unfenced broker: UnfenceBrokerRecord(id=2, epoch=0) (org.apache.kafka.controller.ClusterControlManager)
broker02    | [2021-09-22 14:21:04,718] INFO [BrokerLifecycleManager id=2] The broker has been unfenced. Transitioning from RECOVERY to RUNNING. (kafka.server.BrokerLifecycleManager)
broker02    | [2021-09-22 14:21:05,305] INFO [Controller 2] The request from broker 1 to unfence has been granted because it has caught up with the last committed metadata offset 3. (org.apache.kafka.controller.BrokerHeartbeatManager)
broker02    | [2021-09-22 14:21:05,305] INFO [Controller 2] Unfenced broker: UnfenceBrokerRecord(id=1, epoch=1) (org.apache.kafka.controller.ClusterControlManager)

Checking the status with kafka-metadata-shell gives me:

root@broker01:~# kafka-metadata-shell --snapshot /tmp/kraft-combined-logs/__cluster_metadata-0/00000000000000000000.log
Loading...
Starting...
[ Kafka Metadata Shell ]
>> cat /brokers/1/isFenced
false
>> cat /brokers/2/isFenced
false
>> cat /metadataQuorum/leader
LeaderAndEpoch(leaderId=OptionalInt[2], epoch=2)
>>

Looking at the file /tmp/kraft-combined-logs/__cluster_metadata-0/quorum-state I see:

{
  "clusterId": "",
  "leaderId": 2,
  "leaderEpoch": 2,
  "votedId": -1,
  "appliedOffset": 0,
  "currentVoters": [
    {
      "voterId": 1
    },
    {
      "voterId": 2
    }
  ],
  "data_version": 0
}

If I destroy one node the cluster stops working completely (that’s expected, since a quorum can’t be reached). On the remaining node I see messages like:

broker01    | [2021-09-22 14:30:29,090] INFO [RaftManager nodeId=1] Completed transition to CandidateState(localId=1, epoch=17, retries=15, electionTimeoutMs=1273) (org.apache.kafka.raft.QuorumState)
broker01    | [2021-09-22 14:30:30,010] INFO [BrokerLifecycleManager id=1] Unable to send a heartbeat because the RPC got timed out before it could be sent. (kafka.server.BrokerLifecycleManager)
broker01    | [2021-09-22 14:30:31,227] INFO [RaftManager nodeId=1] Re-elect as candidate after election backoff has completed (org.apache.kafka.raft.KafkaRaftClient)
broker01    | [2021-09-22 14:30:31,766] INFO [RaftManager nodeId=1] Completed transition to CandidateState(localId=1, epoch=18, retries=16, electionTimeoutMs=1765) (org.apache.kafka.raft.QuorumState)
broker01    | [2021-09-22 14:30:31,770] WARN [RaftManager nodeId=1] Error connecting to node broker02:9093 (id: 2 rack: null) (org.apache.kafka.clients.NetworkClient)

and in /tmp/kraft-combined-logs/__cluster_metadata-0/quorum-state I see:

{
  "clusterId": "",
  "leaderId": -1,
  "leaderEpoch": 56,
  "votedId": 1,
  "appliedOffset": 0,
  "currentVoters": [
    {
      "voterId": 1
    },
    {
      "voterId": 2
    }
  ],
  "data_version": 0
}

I don’t know much about the Raft protocol, and I don’t understand why it is possible to reach an initial quorum with 2 nodes (or any even number; I also tried with 4).
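
From what I’ve read since, the reason seems to be that Raft only requires a strict majority of the configured voters, i.e. floor(N/2) + 1, so an even voter count is accepted but buys no extra fault tolerance:

2 voters -> majority 2 -> tolerates 0 failures
3 voters -> majority 2 -> tolerates 1 failure
4 voters -> majority 3 -> tolerates 1 failure
5 voters -> majority 3 -> tolerates 2 failures

That would match what I see: the two-node cluster can elect a leader while both nodes are up, but loses the quorum as soon as one dies.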

Regarding Kafka, in my opinion it should be an error to specify an even number of voters, and the cluster should refuse to start in that case.

Hi,

Same behavior in my local test environment.

I’m just reading through the various docs to get an idea of whether it is planned to implement a check that prevents running with only 2 controller nodes.

For reference:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-500%3A+Replace+ZooKeeper+with+a+Self-Managed+Metadata+Quorum

https://cwiki.apache.org/confluence/display/KAFKA/KIP-631%3A+The+Quorum-based+Kafka+Controller

https://cwiki.apache.org/confluence/display/KAFKA/KIP-595%3A+A+Raft+Protocol+for+the+Metadata+Quorum

Not sure whether such a check is planned (it would be a nice feature to prevent wrong configs, data loss and so on), though IMHO everyone running a production Kafka cluster should be aware that it needs an odd number of controller nodes to form a proper quorum.
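
For production the general recommendation (see KIP-631) is an odd number of dedicated controller nodes, typically 3 or 5. A controller-only server.properties would look roughly like this (the host names and the log directory are just placeholders):

process.roles=controller
node.id=1
controller.quorum.voters=1@controller01:9093,2@controller02:9093,3@controller03:9093
listeners=CONTROLLER://:9093
controller.listener.names=CONTROLLER
listener.security.protocol.map=CONTROLLER:PLAINTEXT
log.dirs=/var/lib/kafka/kraft-controller-logs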

I agree. About the possibility of having an even number of voters, I also found KIP-642 (currently a draft), which explains why this may be necessary during a quorum reassignment. Anyway, it doesn’t make much sense in normal cluster operation to form a quorum with 2 or 4 voters.
