While creating the cluster, we are encountering issues with the 3 controllers

Dear Team,

While creating the KRaft cluster with 3 controllers and 3 brokers using the Confluent Operator, we created the controller resource definition file as shown below. However, upon startup, we are continuously encountering the exceptions mentioned below.

Could you please review and suggest what might be wrong?

Controller file:

apiVersion: platform.confluent.io/v1beta1
kind: KRaftController
metadata:
  name: kraftcontroller
  namespace: kafka
spec:
  oneReplicaPerNode: false
  dataVolumeCapacity: 10G
  image:
    application: docker.io/confluentinc/cp-server:7.9.0
    init: confluentinc/confluent-init-container:2.11.0
  replicas: 3
  configOverrides:
    server:
      - broker.id=1
      - node.id=1
      - cluster.id=ZbD3VucSSRupm5dJlDOwYQ
      - confluent.balancer.enable=true
      - confluent.operator.managed=true
      - controller.listener.names=CONTROLLER
      - controller.quorum.voters=1@kraftcontroller-0.kraftcontroller.kafka.svc.cluster.local:9074,2@kraftcontroller-1.kraftcontroller.kafka.svc.cluster.local:9074,3@kraftcontroller-2.kraftcontroller.kafka.svc.cluster.local:9074
      - inter.broker.listener.name=REPLICATION
      - listener.security.protocol.map=CONTROLLER:PLAINTEXT,REPLICATION:PLAINTEXT
      - listeners=CONTROLLER://:9074
      - log.dirs=/tmp/datalogs
      - log.message.format.version=3.4
      - num.network.threads=4
      - process.roles=controller
      - advertised.listeners=CONTROLLER://kraftcontroller-0:9074

Exception logs

[INFO] 2025-04-06 05:50:36,913 [kafka-1-raft-io-thread] org.apache.kafka.raft.KafkaRaftClient handleVoteRequest - [RaftManager id=1] Candidate sent a voter key (Optional[ReplicaKey(id=2, directoryId=Optional.empty)]) in the VOTE request that doesn’t match the local key (OptionalInt[1], 5pVL1Rivio2nNjWN6andVw); rejecting the vote
(this message repeats several times per second)
[INFO] 2025-04-06 05:50:37,007 [kafka-1-metadata-loader-event-handler] org.apache.kafka.image.loader.MetadataLoader stillNeedToCatchUp - [MetadataLoader id=1] initializeNewPublishers: the loader is still catching up because we still don’t know the high water mark yet.

[INFO] 2025-04-06 05:51:17,768 [kafka-1-metadata-loader-event-handler] org.apache.kafka.image.loader.MetadataLoader stillNeedToCatchUp - [MetadataLoader id=1] initializeNewPublishers: the loader is still catching up because we still don’t know the high water mark yet.
[INFO] 2025-04-06 05:51:17,799 [kafka-1-raft-io-thread] org.apache.kafka.raft.KafkaRaftClient handleVoteRequest - [RaftManager id=1] Candidate sent a voter key (Optional[ReplicaKey(id=3, directoryId=Optional.empty)]) in the VOTE request that doesn’t match the local key (OptionalInt[1], KmAWIt-XQzC4HhtAEvsNFA); rejecting the vote
(this message repeats several times per second)

hey @naveen

welcome to the forum 🙂

(moving the thread to Ops)

question regarding your config:

there are a lot of overrides, and some of them are not necessary.
Is there any special reason for using them?

best,
michael
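Michael's point can be made concrete: when CFK manages the controllers, node IDs, the cluster ID, quorum voters, and listeners are all generated by the operator, so a minimal spec needs none of those overrides. A sketch, reusing the names and images from the original file (an illustration, not a verified manifest):

```yaml
# Minimal KRaftController sketch: CFK derives node IDs, cluster ID,
# controller.quorum.voters, and listeners itself.
apiVersion: platform.confluent.io/v1beta1
kind: KRaftController
metadata:
  name: kraftcontroller
  namespace: kafka
spec:
  dataVolumeCapacity: 10G
  image:
    application: docker.io/confluentinc/cp-server:7.9.0
    init: confluentinc/confluent-init-container:2.11.0
  replicas: 3
```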

Hi @mmuehlbeyer,

Thank you for the reply. There’s no particular reason to retain all those properties; we added them while experimenting with the 3-node KRaft controller setup.

However, the cluster is failing to elect a leader and form a quorum. Do you have any suggestions on how we can successfully form the quorum in this 3-node controller setup?

I tried creating three separate controller YAML files, but I’m facing a cluster inconsistency issue due to each pod having a different cluster.id in the meta.properties file after a restart. How can I ensure that all pods in the Kubernetes cluster use the same cluster.id in their meta.properties file?
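One way to sanity-check the mismatch, assuming you have copied each pod's meta.properties out of its volume (for example with kubectl cp), is a small script that parses the files and verifies every pod reports the same cluster.id. The pod names and file contents below are hypothetical samples, not output from a real cluster:

```python
# Sketch: check that all controller pods agree on cluster.id in meta.properties.

def parse_meta_properties(text: str) -> dict:
    """Parse Kafka's simple key=value meta.properties format."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key] = value
    return props

def consistent_cluster_id(pod_files: dict) -> bool:
    """Return True if every pod's meta.properties carries the same cluster.id."""
    ids = {pod: parse_meta_properties(text).get("cluster.id")
           for pod, text in pod_files.items()}
    return len(set(ids.values())) == 1

# Hypothetical contents copied from three controller pods:
pods = {
    "kraftcontroller-0": "version=1\nnode.id=1\ncluster.id=AAA",
    "kraftcontroller-1": "version=1\nnode.id=2\ncluster.id=AAA",
    "kraftcontroller-2": "version=1\nnode.id=3\ncluster.id=BBB",  # mismatch
}
print(consistent_cluster_id(pods))  # → False
```

If this prints False, the pods were formatted with different cluster IDs and the quorum will reject votes with INCONSISTENT_CLUSTER_ID, as in the error above.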

[ERROR] 2025-04-08 05:26:57,503 [kafka-1-raft-io-thread] org.apache.kafka.raft.KafkaRaftClient handleUnexpectedError - [RaftManager id=1] Unexpected error INCONSISTENT_CLUSTER_ID in VOTE response: InboundResponse(correlationId=3980, data=VoteResponseData(errorCode=104, topics=, nodeEndpoints=), source=kraftcontroller-1-0.kraftcontroller-1.kafka.svc.cluster.local:9074 (id: 2 rack: null))

CFK should take care of this.

Did you try with the following quickstart,

just to have a setup up and running?

best,
michael

We have successfully tested a single-node setup with all components running. However, while setting up a multi-node cluster with 3 controllers and 3 brokers, we are encountering issues.

I see.
So what about this example?

Thank you for your response. The cluster is now working as expected.

However, I am facing an issue while mounting external storage. The controller was successfully bound to the external PVC, but the broker is unable to mount the volume. I’m getting the following exception — any suggestions?

[ERROR] 2025-04-14 10:59:18,163 [kafka-0-metadata-loader-event-handler] org.apache.kafka.server.fault.ProcessTerminatingFaultHandler handleFault - Encountered fatal fault: Error starting LogManager
org.apache.kafka.common.KafkaException: Found directory /mnt/data/data0, ‘data0’ is not in the form of topic-partition or topic-partition.uniqueId-delete (if marked for deletion).
Kafka’s log directories (and children) should only contain Kafka topic data.

[ERROR] 2025-04-14 09:22:37,252 [kafka-0-metadata-loader-event-handler] org.apache.kafka.server.fault.ProcessTerminatingFaultHandler handleFault - Encountered fatal fault: Error starting LogManager
org.apache.kafka.common.KafkaException: Found directory /mnt/data/data0/logs/lost+found, ‘lost+found’ is not in the form of topic-partition or topic-partition.uniqueId-delete (if marked for deletion).
Kafka’s log directories (and children) should only contain Kafka topic data

Obviously it’s complaining about the directories in your PVC, i.e. the mounted disks.

How does your server.properties look?
How did you attach the PVC?
Could you share the respective config?

Thanks for your response.
There was a mismatch between the volume mount paths and the log directories that Kafka was expecting. I’ve fixed the issue.

Now, after restarting or deleting the pods, all the data persists on our volumes as expected.
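For anyone hitting the same error: ext4-formatted volumes create a lost+found directory at the mount root, and Kafka refuses to start if log.dirs contains anything that is not topic data. The fix is to point log.dirs at a subdirectory below the mount point rather than at the mount root itself. A sketch under that assumption (the paths are illustrative, not taken from the thread's actual config):

```yaml
# Illustrative only: keep the filesystem's lost+found out of Kafka's log dirs.
# The volume is mounted at /mnt/data, so lost+found lives at /mnt/data/lost+found,
# while Kafka writes under a subdirectory and never scans the mount root.
spec:
  dataVolumeCapacity: 10G
  configOverrides:
    server:
      - log.dirs=/mnt/data/data0/logs   # subdirectory, not the mount point
```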

Once again, thank you very much!
