Kafka-broker-api-versions

I’m having trouble getting three Kafka brokers to form a cluster. Two of them join, but the third can’t (it gets rejected).

They are named:

  • kafka01
  • kafka03
  • kafka04

The brokers are containers running in Docker Swarm (each on a node) on Ubuntu 22.04.

I logged into kafka01 and ran kafka-broker-api-versions with the --version option, passing each server’s IP in turn. I got the following:

  • kafka01 → kafka01: 7.3.0-ccs
  • kafka01 → kafka03: 7.3.0-ccs
  • kafka01 → kafka04: 7.3.0-ccs

Everything is matching up thus far.

I ran each command again, this time without the --version flag. kafka01 and kafka04 looked fine. However, kafka03 had the following output:

(
	Produce(0): UNSUPPORTED,
	Fetch(1): 0 to 13 [usable: 13],
	ListOffsets(2): UNSUPPORTED,
	Metadata(3): UNSUPPORTED,
	LeaderAndIsr(4): UNSUPPORTED,
	StopReplica(5): UNSUPPORTED,
	UpdateMetadata(6): UNSUPPORTED,
	ControlledShutdown(7): 0 to 3 [usable: 3],
	OffsetCommit(8): UNSUPPORTED,
	OffsetFetch(9): UNSUPPORTED,
	FindCoordinator(10): UNSUPPORTED,
	JoinGroup(11): UNSUPPORTED,
	Heartbeat(12): UNSUPPORTED,
	LeaveGroup(13): UNSUPPORTED,
	SyncGroup(14): UNSUPPORTED,
	DescribeGroups(15): UNSUPPORTED,
	ListGroups(16): UNSUPPORTED,
	SaslHandshake(17): 0 to 1 [usable: 1],
	ApiVersions(18): 0 to 3 [usable: 3],
	CreateTopics(19): 0 to 7 [usable: 7],
	DeleteTopics(20): 0 to 6 [usable: 6],
	DeleteRecords(21): UNSUPPORTED,
	InitProducerId(22): UNSUPPORTED,
	OffsetForLeaderEpoch(23): UNSUPPORTED,
	AddPartitionsToTxn(24): UNSUPPORTED,
	AddOffsetsToTxn(25): UNSUPPORTED,
	EndTxn(26): UNSUPPORTED,
	WriteTxnMarkers(27): UNSUPPORTED,
	TxnOffsetCommit(28): UNSUPPORTED,
	DescribeAcls(29): 0 to 3 [usable: 3],
	CreateAcls(30): 0 to 3 [usable: 3],
	DeleteAcls(31): 0 to 3 [usable: 3],
	DescribeConfigs(32): UNSUPPORTED,
	AlterConfigs(33): 0 to 2 [usable: 2],
	AlterReplicaLogDirs(34): UNSUPPORTED,
	DescribeLogDirs(35): UNSUPPORTED,
	SaslAuthenticate(36): 0 to 2 [usable: 2],
	CreatePartitions(37): 0 to 3 [usable: 3],
	CreateDelegationToken(38): UNSUPPORTED,
	RenewDelegationToken(39): UNSUPPORTED,
	ExpireDelegationToken(40): UNSUPPORTED,
	DescribeDelegationToken(41): UNSUPPORTED,
	DeleteGroups(42): UNSUPPORTED,
	ElectLeaders(43): 0 to 2 [usable: 2],
	IncrementalAlterConfigs(44): 0 to 1 [usable: 1],
	AlterPartitionReassignments(45): 0 [usable: 0],
	ListPartitionReassignments(46): 0 [usable: 0],
	OffsetDelete(47): UNSUPPORTED,
	DescribeClientQuotas(48): UNSUPPORTED,
	AlterClientQuotas(49): 0 to 1 [usable: 1],
	DescribeUserScramCredentials(50): UNSUPPORTED,
	AlterUserScramCredentials(51): UNSUPPORTED,
	Vote(52): 0 [usable: 0],
	BeginQuorumEpoch(53): 0 [usable: 0],
	EndQuorumEpoch(54): 0 [usable: 0],
	DescribeQuorum(55): 0 to 1 [usable: 1],
	AlterPartition(56): 0 to 2 [usable: 2],
	UpdateFeatures(57): 0 to 1 [usable: 1],
	Envelope(58): 0 [usable: 0],
	FetchSnapshot(59): 0 [usable: 0],
	DescribeCluster(60): UNSUPPORTED,
	DescribeProducers(61): UNSUPPORTED,
	BrokerRegistration(62): 0 [usable: 0],
	BrokerHeartbeat(63): 0 [usable: 0],
	UnregisterBroker(64): 0 [usable: 0],
	DescribeTransactions(65): UNSUPPORTED,
	ListTransactions(66): UNSUPPORTED,
	AllocateProducerIds(67): 0 [usable: 0]
)
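
To make a long listing like this easier to read, the output can be split into supported and unsupported requests. The sketch below is only an illustration: the function name `split_api_support` and the sample variable are made up here, and the sample is a handful of lines copied from the kafka03 output above. One possible reading of the result, not confirmed anywhere in this thread, is that the requests still supported (Vote, BeginQuorumEpoch, BrokerRegistration, BrokerHeartbeat) are the KRaft quorum and registration requests, so the address used for kafka03 may have reached its CONTROLLER listener rather than a client-facing one.

```python
import re

# A few lines copied from the kafka03 output above; paste the full output here.
SAMPLE_OUTPUT = """\
(
	Produce(0): UNSUPPORTED,
	Fetch(1): 0 to 13 [usable: 13],
	Metadata(3): UNSUPPORTED,
	JoinGroup(11): UNSUPPORTED,
	Vote(52): 0 [usable: 0],
	BrokerRegistration(62): 0 [usable: 0],
	AllocateProducerIds(67): 0 [usable: 0]
)
"""

# Matches "Name(key): status" with an optional trailing comma.
LINE_RE = re.compile(r"^\s*(\w+)\((\d+)\):\s*(.+?),?\s*$")


def split_api_support(output):
    """Return (supported, unsupported) lists of API names (hypothetical helper)."""
    supported, unsupported = [], []
    for line in output.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue  # skips the "(" and ")" wrapper lines and blank lines
        name, _key, status = match.groups()
        if status == "UNSUPPORTED":
            unsupported.append(name)
        else:
            supported.append(name)
    return supported, unsupported


supported, unsupported = split_api_support(SAMPLE_OUTPUT)
print("supported:  ", ", ".join(supported))
print("unsupported:", ", ".join(unsupported))
```

Running the same function on the output from kafka01 and kafka04 and comparing the two lists would show exactly which requests differ between the endpoints.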

When I logged into kafka03 and ran the kafka-broker-api-versions command against itself (just as a sanity check), the output looked fine (as expected).

Based on my situation, does anyone know what could be preventing kafka03 from joining the cluster? And does anyone know why APIs like Produce, JoinGroup, etc. are marked as UNSUPPORTED (those are really important)?

hey @dmtrs
welcome :)

could you share some logs or configs?

best,
michael

Glad to.

config

version: '3.7'

services:

  kafka01:
    image: confluentinc/cp-kafka:7.3.0
    hostname: kafka01
    ports:
      - "19092:19092"
      - "19093:19093"
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: 'PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT,CONTROLLER:PLAINTEXT'
      KAFKA_ADVERTISED_LISTENERS: 'PLAINTEXT://kafka01:9092,PLAINTEXT_HOST://<host1_ip_address>:19092'
      KAFKA_JMX_PORT: 19101
      KAFKA_AUTO_CREATE_TOPICS_ENABLED: 'False'
      KAFKA_PROCESS_ROLES: 'broker,controller'
      KAFKA_NODE_ID: 1
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS: 0
      KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1
      KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1
      KAFKA_CONTROLLER_QUORUM_VOTERS: '1@kafka01:19093,4@kafka04:49093,5@kafka03:39093'
      KAFKA_LISTENERS: 'PLAINTEXT://0.0.0.0:9092,PLAINTEXT_HOST://0.0.0.0:19092,CONTROLLER://0.0.0.0:19093'
      KAFKA_INTER_BROKER_LISTENER_NAME: 'PLAINTEXT'
      KAFKA_CONTROLLER_LISTENER_NAMES: 'CONTROLLER'
      KAFKA_SOCKET_CONNECTION_SETUP_TIMEOUT_MAX_MS: 720000
      KAFKA_SOCKET_CONNECTION_SETUP_TIMEOUT_MS: 360000
      KAFKA_LOG_DIRS: '/tmp/logs'
    networks:
      - kafka_network
    deploy:
      replicas: 1
      placement:
        constraints: [node.labels.kafka == 1]
    volumes:
      - /home/kafka/logs:/tmp/logs
      - /home/kafka/data/update_run.sh:/tmp/update_run.sh
    command: "bash -c 'if [ ! -f /tmp/update_run.sh ]; then echo \"ERROR: Did you forget the update_run.sh file that came with this docker-compose.yml file?\" && exit 1 ; else /tmp/update_run.sh && /etc/confluent/docker/run ; fi'"


  kafka04:
    image: confluentinc/cp-kafka:7.3.0
    hostname: kafka04
    ports:
      - "49092:49092"
      - "49093:49093"
    environment:
      KAFKA_BROKER_ID: 4
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: 'PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT,CONTROLLER:PLAINTEXT'
      KAFKA_ADVERTISED_LISTENERS: 'PLAINTEXT://kafka04:9092,PLAINTEXT_HOST://<host4_ip_address>:49092'
      KAFKA_JMX_PORT: 19101
      KAFKA_AUTO_CREATE_TOPICS_ENABLED: 'False'
      KAFKA_PROCESS_ROLES: 'broker,controller'
      KAFKA_NODE_ID: 4
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS: 0
      KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1
      KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1
      KAFKA_CONTROLLER_QUORUM_VOTERS: '1@kafka01:19093,4@kafka04:49093,5@kafka03:39093'
      KAFKA_LISTENERS: 'PLAINTEXT://0.0.0.0:9092,PLAINTEXT_HOST://0.0.0.0:49092,CONTROLLER://0.0.0.0:49093'
      KAFKA_INTER_BROKER_LISTENER_NAME: 'PLAINTEXT'
      KAFKA_CONTROLLER_LISTENER_NAMES: 'CONTROLLER'
      KAFKA_SOCKET_CONNECTION_SETUP_TIMEOUT_MAX_MS: 720000
      KAFKA_SOCKET_CONNECTION_SETUP_TIMEOUT_MS: 360000
      KAFKA_LOG_DIRS: '/tmp/logs'
    networks:
      - kafka_network
    deploy:
      replicas: 1
      placement:
        constraints: [node.labels.kafka == 4]
    volumes:
      - /home/kafka/logs:/tmp/logs
      - /home/kafka/data/update_run.sh:/tmp/update_run.sh
    command: "bash -c 'if [ ! -f /tmp/update_run.sh ]; then echo \"ERROR: Did you forget the update_run.sh file that came with this docker-compose.yml file?\" && exit 1 ; else /tmp/update_run.sh && /etc/confluent/docker/run ; fi'"


  kafka03:
    image: confluentinc/cp-kafka:7.3.0
    hostname: kafka03
    ports:
      - "39092:39092"
      - "39093:39093"
    environment:
      KAFKA_BROKER_ID: 5
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: 'CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT'
      KAFKA_ADVERTISED_LISTENERS: 'PLAINTEXT://kafka03:9092,PLAINTEXT_HOST://<host3_ip_address>:39093'
      KAFKA_JMX_PORT: 19101
      KAFKA_AUTO_CREATE_TOPICS_ENABLED: 'False'
      KAFKA_PROCESS_ROLES: 'broker,controller'
      KAFKA_NODE_ID: 5
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS: 0
      KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1
      KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1
      KAFKA_CONTROLLER_QUORUM_VOTERS: '1@kafka01:19093,4@kafka04:49093,5@kafka03:39093'
      KAFKA_LISTENERS: 'PLAINTEXT://0.0.0.0:9092,CONTROLLER://0.0.0.0:39093,PLAINTEXT_HOST://0.0.0.0:39092'
      KAFKA_INTER_BROKER_LISTENER_NAME: 'PLAINTEXT'
      KAFKA_CONTROLLER_LISTENER_NAMES: 'CONTROLLER'
      KAFKA_SOCKET_CONNECTION_SETUP_TIMEOUT_MAX_MS: 720000
      KAFKA_SOCKET_CONNECTION_SETUP_TIMEOUT_MS: 360000
      KAFKA_INITIAL_BROKER_REGISTRATION_TIMEOUT_MS: 360000
      KAFKA_CONTROLLER_QUORUM_REQUEST_TIMEOUT_MS: 120000
      KAFKA_LOG_DIRS: '/tmp/logs'
    networks:
      - kafka_network
    deploy:
      replicas: 1
      placement:
        constraints: [node.labels.kafka == 2]
    volumes:
      - /home/kafka/logs:/tmp/logs
      - /home/kafka/data/update_run.sh:/tmp/update_run.sh
    command: "bash -c 'if [ ! -f /tmp/update_run.sh ]; then echo \"ERROR: Did you forget the update_run.sh file that came with this docker-compose.yml file?\" && exit 1 ; else /tmp/update_run.sh && /etc/confluent/docker/run ; fi'"


networks:
  kafka_network:
    driver: overlay
    attachable: true
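
With three near-identical service blocks, a small script can help spot per-listener differences between KAFKA_LISTENERS and KAFKA_ADVERTISED_LISTENERS. The following is a minimal sketch, not an official tool: `parse_endpoints` and `port_mismatches` are names invented for this example, and the strings are copied from the kafka03 service above. It flags that kafka03 binds PLAINTEXT_HOST on 39092 but advertises it on 39093. Whether that plays any part in the join failure is not established in this thread.

```python
def parse_endpoints(value):
    """Turn 'NAME://host:port,NAME2://host:port' into {name: port}."""
    ports = {}
    for entry in value.split(","):
        name, _, address = entry.partition("://")
        # Take the text after the last colon as the port number.
        ports[name.strip()] = int(address.rsplit(":", 1)[1])
    return ports


def port_mismatches(listeners, advertised):
    """Return {name: (bound_port, advertised_port)} where the two ports differ."""
    bound = parse_endpoints(listeners)
    announced = parse_endpoints(advertised)
    return {
        name: (bound[name], port)
        for name, port in announced.items()
        if name in bound and bound[name] != port
    }


# Values copied from the kafka03 service definition above.
KAFKA03_LISTENERS = (
    "PLAINTEXT://0.0.0.0:9092,CONTROLLER://0.0.0.0:39093,"
    "PLAINTEXT_HOST://0.0.0.0:39092"
)
KAFKA03_ADVERTISED = (
    "PLAINTEXT://kafka03:9092,PLAINTEXT_HOST://<host3_ip_address>:39093"
)

print(port_mismatches(KAFKA03_LISTENERS, KAFKA03_ADVERTISED))
```

An advertised port can legitimately differ from the bound port when a port mapping rewrites it, so a reported mismatch is a prompt to look closer rather than proof of an error.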

The logs below are from kafka03 trying to join the cluster with kafka01 and kafka04.

At the end of the log, I saw kafka03 starting back up and continuing to increment the epoch.

I also removed repeated sections (marked with ". . .") to stay under the post’s character limit.

 ===> User
 uid=1000(appuser) gid=1000(appuser) groups=1000(appuser)
 ===> Configuring ...
 ===> Running preflight checks ...
 ===> Check if /var/lib/kafka/data is writable ...
 ===> Check if Zookeeper is healthy ...
 ignore zk-ready  40
 All of the log directories are already formatted.
 ===> Launching ...
 ===> Launching kafka ...
 INFO Registered kafka:type=kafka.Log4jController MBean (kafka.utils.Log4jControllerRegistration$)
 INFO Setting -D jdk.tls.rejectClientInitiatedRenegotiation=true to disable client-initiated TLS renegotiation (org.apache.zookeeper.common.X509Util)
 INFO [LogLoader partition=__cluster_metadata-0, dir=/tmp/logs] Recovering unflushed segment 0. 0/1 recovered for __cluster_metadata-0. (kafka.log.LogLoader)
 INFO [LogLoader partition=__cluster_metadata-0, dir=/tmp/logs] Loading producer state till offset 0 with message format version 2 (kafka.log.UnifiedLog$)
 INFO [LogLoader partition=__cluster_metadata-0, dir=/tmp/logs] Reloading from producer snapshot and rebuilding producer state from offset 0 (kafka.log.UnifiedLog$)
 INFO Deleted producer state snapshot /tmp/logs/__cluster_metadata-0/00000000000000052963.snapshot (kafka.log.SnapshotFile)
 INFO [LogLoader partition=__cluster_metadata-0, dir=/tmp/logs] Producer state recovery took 3ms for snapshot load and 0ms for segment recovery from offset 0 (kafka.log.UnifiedLog$)
 INFO Wrote producer snapshot at offset 52963 with 0 producer ids in 9 ms. (kafka.log.ProducerStateManager)
 INFO [LogLoader partition=__cluster_metadata-0, dir=/tmp/logs] Loading producer state till offset 52963 with message format version 2 (kafka.log.UnifiedLog$)
 INFO [LogLoader partition=__cluster_metadata-0, dir=/tmp/logs] Reloading from producer snapshot and rebuilding producer state from offset 52963 (kafka.log.UnifiedLog$)
 INFO [ProducerStateManager partition=__cluster_metadata-0] Loading producer state from snapshot file 'SnapshotFile(/tmp/logs/__cluster_metadata-0/00000000000000052963.snapshot,52963)' (kafka.log.ProducerStateManager)
 INFO [LogLoader partition=__cluster_metadata-0, dir=/tmp/logs] Producer state recovery took 4ms for snapshot load and 0ms for segment recovery from offset 52963 (kafka.log.UnifiedLog$)
 INFO Initialized snapshots with IDs SortedSet() from /tmp/logs/__cluster_metadata-0 (kafka.raft.KafkaMetadataLog$)
 WARN Epoch from quorum-state file is 0, which is smaller than last written epoch 3 in the log (org.apache.kafka.raft.QuorumState)
 INFO [raft-expiration-reaper]: Starting (kafka.raft.TimingWheelExpirationService$ExpiredOperationReaper)
 INFO [RaftManager nodeId=5] Completed transition to Unattached(epoch=3, voters=[1, 4, 5], electionTimeoutMs=1457) (org.apache.kafka.raft.QuorumState)
 INFO Registered signal handlers for TERM, INT, HUP (org.apache.kafka.common.utils.LoggingSignalHandler)
 INFO Starting controller (kafka.server.ControllerServer)
 INFO [kafka-raft-io-thread]: Starting (kafka.raft.KafkaRaftManager$RaftIoThread)
 INFO [kafka-raft-outbound-request-thread]: Starting (kafka.raft.RaftSendThread)
 INFO Updated connection-accept-rate max connection creation rate to 2147483647 (kafka.network.ConnectionQuotas)
 INFO Awaiting socket connections on 0.0.0.0:39093. (kafka.network.DataPlaneAcceptor)
 INFO [SocketServer listenerType=CONTROLLER, nodeId=5] Created data-plane acceptor and processors for endpoint : ListenerName(CONTROLLER) (kafka.network.SocketServer)
 INFO [Controller 5] Creating new QuorumController with clusterId HADv-E4QSniQ3ms7IkpErg, authorizer Optional.empty. (org.apache.kafka.controller.QuorumController)
 INFO [RaftManager nodeId=5] Registered the listener org.apache.kafka.controller.QuorumController$QuorumMetaLogListener@434402506 (org.apache.kafka.raft.KafkaRaftClient)
 INFO [ThrottledChannelReaper-Fetch]: Starting (kafka.server.ClientQuotaManager$ThrottledChannelReaper)
 INFO [ThrottledChannelReaper-ControllerMutation]: Starting (kafka.server.ClientQuotaManager$ThrottledChannelReaper)
 INFO [ThrottledChannelReaper-Request]: Starting (kafka.server.ClientQuotaManager$ThrottledChannelReaper)
 INFO [ThrottledChannelReaper-Produce]: Starting (kafka.server.ClientQuotaManager$ThrottledChannelReaper)
 INFO [ExpirationReaper-5-AlterAcls]: Starting (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)
 INFO Enabling request processing. (kafka.network.SocketServer)
 INFO [BrokerServer id=5] Transition from SHUTDOWN to STARTING (kafka.server.BrokerServer)
 INFO [BrokerServer id=5] Starting broker (kafka.server.BrokerServer)
 INFO [ThrottledChannelReaper-Fetch]: Starting (kafka.server.ClientQuotaManager$ThrottledChannelReaper)
 INFO [ThrottledChannelReaper-Produce]: Starting (kafka.server.ClientQuotaManager$ThrottledChannelReaper)
 INFO [ThrottledChannelReaper-Request]: Starting (kafka.server.ClientQuotaManager$ThrottledChannelReaper)
 INFO [ThrottledChannelReaper-ControllerMutation]: Starting (kafka.server.ClientQuotaManager$ThrottledChannelReaper)
 INFO [BrokerToControllerChannelManager broker=5 name=forwarding]: Starting (kafka.server.BrokerToControllerRequestThread)
 INFO Updated connection-accept-rate max connection creation rate to 2147483647 (kafka.network.ConnectionQuotas)
 INFO Awaiting socket connections on 0.0.0.0:9092. (kafka.network.DataPlaneAcceptor)
 INFO [SocketServer listenerType=BROKER, nodeId=5] Created data-plane acceptor and processors for endpoint : ListenerName(PLAINTEXT) (kafka.network.SocketServer)
 INFO Updated connection-accept-rate max connection creation rate to 2147483647 (kafka.network.ConnectionQuotas)
 INFO Awaiting socket connections on 0.0.0.0:39092. (kafka.network.DataPlaneAcceptor)
 INFO [SocketServer listenerType=BROKER, nodeId=5] Created data-plane acceptor and processors for endpoint : ListenerName(PLAINTEXT_HOST) (kafka.network.SocketServer)
 INFO [BrokerToControllerChannelManager broker=5 name=alterPartition]: Starting (kafka.server.BrokerToControllerRequestThread)
 INFO [ExpirationReaper-5-Produce]: Starting (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)
 INFO [ExpirationReaper-5-Fetch]: Starting (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)
 INFO [ExpirationReaper-5-DeleteRecords]: Starting (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)
 INFO [ExpirationReaper-5-ElectLeader]: Starting (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)
 INFO [ExpirationReaper-5-Heartbeat]: Starting (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)
 INFO [ExpirationReaper-5-Rebalance]: Starting (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)
 INFO [RaftManager nodeId=5] Registered the listener kafka.server.metadata.BrokerMetadataListener@665551216 (org.apache.kafka.raft.KafkaRaftClient)
 INFO [BrokerToControllerChannelManager broker=5 name=heartbeat]: Starting (kafka.server.BrokerToControllerRequestThread)
 Incarnation bVWcVPHGRo6kWg_3thWT4w of broker 5 in cluster HADv-E4QSniQ3ms7IkpErg is now STARTING. (kafka.server.BrokerLifecycleManager)
 INFO [ExpirationReaper-5-AlterAcls]: Starting (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)
 INFO [RaftManager nodeId=5] Completed transition to CandidateState(localId=5, epoch=4, retries=1, electionTimeoutMs=1205) (org.apache.kafka.raft.QuorumState)
 In the new epoch 4, the leader is (none). (org.apache.kafka.controller.QuorumController)
 INFO [RaftManager nodeId=5] Re-elect as candidate after election backoff has completed (org.apache.kafka.raft.KafkaRaftClient)
 INFO [RaftManager nodeId=5] Completed transition to CandidateState(localId=5, epoch=5, retries=2, electionTimeoutMs=1432) (org.apache.kafka.raft.QuorumState)
 In the new epoch 5, the leader is (none). (org.apache.kafka.controller.QuorumController)
 INFO [RaftManager nodeId=5] Re-elect as candidate after election backoff has completed (org.apache.kafka.raft.KafkaRaftClient)
 INFO [RaftManager nodeId=5] Completed transition to CandidateState(localId=5, epoch=6, retries=3, electionTimeoutMs=1394) (org.apache.kafka.raft.QuorumState)
 In the new epoch 6, the leader is (none). (org.apache.kafka.controller.QuorumController)
 INFO [RaftManager nodeId=5] Re-elect as candidate after election backoff has completed (org.apache.kafka.raft.KafkaRaftClient)
 INFO [RaftManager nodeId=5] Completed transition to CandidateState(localId=5, epoch=7, retries=4, electionTimeoutMs=1941) (org.apache.kafka.raft.QuorumState)
 In the new epoch 7, the leader is (none). (org.apache.kafka.controller.QuorumController)
 INFO [RaftManager nodeId=5] Re-elect as candidate after election backoff has completed (org.apache.kafka.raft.KafkaRaftClient)
 INFO [RaftManager nodeId=5] Completed transition to CandidateState(localId=5, epoch=8, retries=5, electionTimeoutMs=1290) (org.apache.kafka.raft.QuorumState)
 In the new epoch 8, the leader is (none). (org.apache.kafka.controller.QuorumController)
 INFO [BrokerLifecycleManager id=5] Unable to register the broker because the RPC got timed out before it could be sent. (kafka.server.BrokerLifecycleManager)
 INFO [RaftManager nodeId=5] Re-elect as candidate after election backoff has completed (org.apache.kafka.raft.KafkaRaftClient)
 INFO [RaftManager nodeId=5] Completed transition to CandidateState(localId=5, epoch=9, retries=6, electionTimeoutMs=1129) (org.apache.kafka.raft.QuorumState)
 In the new epoch 9, the leader is (none). (org.apache.kafka.controller.QuorumController)
 INFO [RaftManager nodeId=5] Re-elect as candidate after election backoff has completed (org.apache.kafka.raft.KafkaRaftClient)
 INFO [RaftManager nodeId=5] Completed transition to CandidateState(localId=5, epoch=10, retries=7, electionTimeoutMs=1998) (org.apache.kafka.raft.QuorumState)

. . . 

 INFO [BrokerLifecycleManager id=5] Unable to register the broker because the RPC got timed out before it could be sent. (kafka.server.BrokerLifecycleManager)
 INFO [RaftManager nodeId=5] Re-elect as candidate after election backoff has completed (org.apache.kafka.raft.KafkaRaftClient)
 INFO [RaftManager nodeId=5] Completed transition to CandidateState(localId=5, epoch=21, retries=18, electionTimeoutMs=1897) (org.apache.kafka.raft.QuorumState)

. . .

 INFO [RaftManager nodeId=5] Completed transition to CandidateState(localId=5, epoch=57, retries=54, electionTimeoutMs=1467) (org.apache.kafka.raft.QuorumState)
 In the new epoch 57, the leader is (none). (org.apache.kafka.controller.QuorumController)
 INFO [RaftManager nodeId=5] Node 1 disconnected. (org.apache.kafka.clients.NetworkClient)
 WARN [RaftManager nodeId=5] Connection to node 1 (kafka01/<host1_ip_address>:19093) could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
 INFO [RaftManager nodeId=5] Node 4 disconnected. (org.apache.kafka.clients.NetworkClient)
 WARN [RaftManager nodeId=5] Connection to node 4 (kafka04/<host4_ip_address>:49093) could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
 INFO [RaftManager nodeId=5] Re-elect as candidate after election backoff has completed (org.apache.kafka.raft.KafkaRaftClient)

. . .

 INFO [RaftManager nodeId=5] Re-elect as candidate after election backoff has completed (org.apache.kafka.raft.KafkaRaftClient)
 INFO [RaftManager nodeId=5] Completed transition to CandidateState(localId=5, epoch=148, retries=145, electionTimeoutMs=1637) (org.apache.kafka.raft.QuorumState)
 In the new epoch 148, the leader is (none). (org.apache.kafka.controller.QuorumController)
 ERROR [BrokerLifecycleManager id=5] Shutting down because we were unable to register with the controller quorum. (kafka.server.BrokerLifecycleManager)
 INFO [BrokerLifecycleManager id=5] registrationTimeout: shutting down event queue. (org.apache.kafka.queue.KafkaEventQueue)
 INFO [BrokerLifecycleManager id=5] Transitioning from STARTING to SHUTTING_DOWN. (kafka.server.BrokerLifecycleManager)
 INFO [BrokerServer id=5] Transition from STARTING to STARTED (kafka.server.BrokerServer)
 INFO [BrokerToControllerChannelManager broker=5 name=heartbeat]: Shutting down (kafka.server.BrokerToControllerRequestThread)
 INFO [BrokerToControllerChannelManager broker=5 name=heartbeat]: Shutdown completed (kafka.server.BrokerToControllerRequestThread)
 INFO [BrokerToControllerChannelManager broker=5 name=heartbeat]: Stopped (kafka.server.BrokerToControllerRequestThread)
 INFO Broker to controller channel manager for heartbeat shutdown (kafka.server.BrokerToControllerChannelManagerImpl)
 ERROR [BrokerServer id=5] Fatal error during broker startup. Prepare to shutdown (kafka.server.BrokerServer)
 java.util.concurrent.CancellationException
 	at java.base/java.util.concurrent.CompletableFuture.cancel(CompletableFuture.java:2396)
 	at kafka.server.BrokerLifecycleManager$ShutdownEvent.run(BrokerLifecycleManager.scala:485)
 	at org.apache.kafka.queue.KafkaEventQueue$EventHandler.run(KafkaEventQueue.java:174)
 	at java.base/java.lang.Thread.run(Thread.java:829)
 INFO [BrokerServer id=5] Transition from STARTED to SHUTTING_DOWN (kafka.server.BrokerServer)
 INFO [BrokerServer id=5] shutting down (kafka.server.BrokerServer)
 INFO [BrokerMetadataListener id=5] beginShutdown: shutting down event queue. (org.apache.kafka.queue.KafkaEventQueue)
 INFO [SocketServer listenerType=BROKER, nodeId=5] Stopping socket server request processors (kafka.network.SocketServer)
 INFO [SocketServer listenerType=BROKER, nodeId=5] Stopped socket server request processors (kafka.network.SocketServer)
 INFO [data-plane Kafka Request Handler on Broker 5], shutting down (kafka.server.KafkaRequestHandlerPool)
 INFO [data-plane Kafka Request Handler on Broker 5], shut down completely (kafka.server.KafkaRequestHandlerPool)
 INFO [ExpirationReaper-5-AlterAcls]: Shutting down (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)
 INFO [ExpirationReaper-5-AlterAcls]: Stopped (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)
 INFO [ExpirationReaper-5-AlterAcls]: Shutdown completed (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)
 INFO [KafkaApi-5] Shutdown complete. (kafka.server.KafkaApis)
 INFO [BrokerMetadataListener id=5] closed event queue. (org.apache.kafka.queue.KafkaEventQueue)
 INFO [TransactionCoordinator id=5] Shutting down. (kafka.coordinator.transaction.TransactionCoordinator)
 INFO [Transaction State Manager 5]: Shutdown complete (kafka.coordinator.transaction.TransactionStateManager)
 INFO [Transaction Marker Channel Manager 5]: Shutting down (kafka.coordinator.transaction.TransactionMarkerChannelManager)
 INFO [Transaction Marker Channel Manager 5]: Shutdown completed (kafka.coordinator.transaction.TransactionMarkerChannelManager)
 INFO [TransactionCoordinator id=5] Shutdown complete. (kafka.coordinator.transaction.TransactionCoordinator)
 INFO [GroupCoordinator 5]: Shutting down. (kafka.coordinator.group.GroupCoordinator)
 INFO [ExpirationReaper-5-Heartbeat]: Shutting down (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)
 INFO [ExpirationReaper-5-Heartbeat]: Stopped (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)
 INFO [ExpirationReaper-5-Heartbeat]: Shutdown completed (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)
 INFO [ExpirationReaper-5-Rebalance]: Shutting down (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)
 INFO [ExpirationReaper-5-Rebalance]: Stopped (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)
 INFO [ExpirationReaper-5-Rebalance]: Shutdown completed (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)
 INFO [GroupCoordinator 5]: Shutdown complete. (kafka.coordinator.group.GroupCoordinator)
 INFO [ReplicaManager broker=5] Shutting down (kafka.server.ReplicaManager)
 INFO [ReplicaFetcherManager on broker 5] shutting down (kafka.server.ReplicaFetcherManager)
 INFO [ReplicaFetcherManager on broker 5] shutdown completed (kafka.server.ReplicaFetcherManager)
 INFO [ReplicaAlterLogDirsManager on broker 5] shutting down (kafka.server.ReplicaAlterLogDirsManager)
 INFO [ReplicaAlterLogDirsManager on broker 5] shutdown completed (kafka.server.ReplicaAlterLogDirsManager)
 INFO [ExpirationReaper-5-Fetch]: Shutting down (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)
 INFO [ExpirationReaper-5-Fetch]: Stopped (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)
 INFO [ExpirationReaper-5-Fetch]: Shutdown completed (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)
 INFO [ExpirationReaper-5-Produce]: Shutting down (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)
 INFO [ExpirationReaper-5-Produce]: Stopped (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)
 INFO [ExpirationReaper-5-Produce]: Shutdown completed (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)
 INFO [ExpirationReaper-5-DeleteRecords]: Shutting down (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)
 INFO [ExpirationReaper-5-DeleteRecords]: Stopped (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)
 INFO [ExpirationReaper-5-DeleteRecords]: Shutdown completed (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)
 INFO [ExpirationReaper-5-ElectLeader]: Shutting down (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)
 INFO [ExpirationReaper-5-ElectLeader]: Stopped (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)
 INFO [ExpirationReaper-5-ElectLeader]: Shutdown completed (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)
 INFO [ReplicaManager broker=5] Shut down completely (kafka.server.ReplicaManager)
 INFO [BrokerToControllerChannelManager broker=5 name=alterPartition]: Shutting down (kafka.server.BrokerToControllerRequestThread)
 INFO [BrokerToControllerChannelManager broker=5 name=alterPartition]: Stopped (kafka.server.BrokerToControllerRequestThread)
 INFO [BrokerToControllerChannelManager broker=5 name=alterPartition]: Shutdown completed (kafka.server.BrokerToControllerRequestThread)
 INFO Broker to controller channel manager for alterPartition shutdown (kafka.server.BrokerToControllerChannelManagerImpl)
 INFO [BrokerToControllerChannelManager broker=5 name=forwarding]: Shutting down (kafka.server.BrokerToControllerRequestThread)
 INFO [BrokerToControllerChannelManager broker=5 name=forwarding]: Stopped (kafka.server.BrokerToControllerRequestThread)
 INFO [BrokerToControllerChannelManager broker=5 name=forwarding]: Shutdown completed (kafka.server.BrokerToControllerRequestThread)
 INFO Broker to controller channel manager for forwarding shutdown (kafka.server.BrokerToControllerChannelManagerImpl)
 INFO Shutting down. (kafka.log.LogManager)
 INFO Shutdown complete. (kafka.log.LogManager)
 INFO [ThrottledChannelReaper-Fetch]: Shutting down (kafka.server.ClientQuotaManager$ThrottledChannelReaper)
 INFO [ThrottledChannelReaper-Fetch]: Stopped (kafka.server.ClientQuotaManager$ThrottledChannelReaper)
 INFO [ThrottledChannelReaper-Fetch]: Shutdown completed (kafka.server.ClientQuotaManager$ThrottledChannelReaper)
 INFO [ThrottledChannelReaper-Produce]: Shutting down (kafka.server.ClientQuotaManager$ThrottledChannelReaper)
 INFO [ThrottledChannelReaper-Produce]: Stopped (kafka.server.ClientQuotaManager$ThrottledChannelReaper)
 INFO [ThrottledChannelReaper-Produce]: Shutdown completed (kafka.server.ClientQuotaManager$ThrottledChannelReaper)
 INFO [ThrottledChannelReaper-Request]: Shutting down (kafka.server.ClientQuotaManager$ThrottledChannelReaper)
 INFO [ThrottledChannelReaper-Request]: Stopped (kafka.server.ClientQuotaManager$ThrottledChannelReaper)
 INFO [ThrottledChannelReaper-Request]: Shutdown completed (kafka.server.ClientQuotaManager$ThrottledChannelReaper)
 INFO [ThrottledChannelReaper-ControllerMutation]: Shutting down (kafka.server.ClientQuotaManager$ThrottledChannelReaper)
 INFO [ThrottledChannelReaper-ControllerMutation]: Stopped (kafka.server.ClientQuotaManager$ThrottledChannelReaper)
 INFO [ThrottledChannelReaper-ControllerMutation]: Shutdown completed (kafka.server.ClientQuotaManager$ThrottledChannelReaper)
 INFO [SocketServer listenerType=BROKER, nodeId=5] Shutting down socket server (kafka.network.SocketServer)
 INFO [SocketServer listenerType=BROKER, nodeId=5] Shutdown completed (kafka.network.SocketServer)
 INFO Metrics scheduler closed (org.apache.kafka.common.metrics.Metrics)
 INFO Closing reporter org.apache.kafka.common.metrics.JmxReporter (org.apache.kafka.common.metrics.Metrics)
 INFO Metrics reporters closed (org.apache.kafka.common.metrics.Metrics)
 INFO Broker and topic stats closed (kafka.server.BrokerTopicStats)
 INFO [BrokerLifecycleManager id=5] closed event queue. (org.apache.kafka.queue.KafkaEventQueue)
 INFO App info kafka.server for 5 unregistered (org.apache.kafka.common.utils.AppInfoParser)
 INFO [BrokerServer id=5] shut down completed (kafka.server.BrokerServer)
 INFO [BrokerServer id=5] Transition from SHUTTING_DOWN to SHUTDOWN (kafka.server.BrokerServer)
 ERROR Exiting Kafka due to fatal exception during startup. (kafka.Kafka$)
 java.util.concurrent.CancellationException
 	at java.base/java.util.concurrent.CompletableFuture.cancel(CompletableFuture.java:2396)
 	at kafka.server.BrokerLifecycleManager$ShutdownEvent.run(BrokerLifecycleManager.scala:485)
 	at org.apache.kafka.queue.KafkaEventQueue$EventHandler.run(KafkaEventQueue.java:174)
 	at java.base/java.lang.Thread.run(Thread.java:829)
 INFO [raft-expiration-reaper]: Shutting down (kafka.raft.TimingWheelExpirationService$ExpiredOperationReaper)
 INFO [raft-expiration-reaper]: Stopped (kafka.raft.TimingWheelExpirationService$ExpiredOperationReaper)
 INFO [raft-expiration-reaper]: Shutdown completed (kafka.raft.TimingWheelExpirationService$ExpiredOperationReaper)
 INFO [kafka-raft-io-thread]: Shutting down (kafka.raft.KafkaRaftManager$RaftIoThread)
 INFO [RaftManager nodeId=5] Beginning graceful shutdown (org.apache.kafka.raft.KafkaRaftClient)
 WARN [RaftManager nodeId=5] Graceful shutdown timed out after 5000ms (org.apache.kafka.raft.KafkaRaftClient)
 ERROR [kafka-raft-io-thread]: Graceful shutdown of RaftClient failed (kafka.raft.KafkaRaftManager$RaftIoThread)
 java.util.concurrent.TimeoutException: Timeout expired before graceful shutdown completed
 	at org.apache.kafka.raft.KafkaRaftClient$GracefulShutdown.failWithTimeout(KafkaRaftClient.java:2408)
 	at org.apache.kafka.raft.KafkaRaftClient.maybeCompleteShutdown(KafkaRaftClient.java:2163)
 	at org.apache.kafka.raft.KafkaRaftClient.poll(KafkaRaftClient.java:2230)
 	at kafka.raft.KafkaRaftManager$RaftIoThread.doWork(RaftManager.scala:52)
 	at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:96)
 INFO [kafka-raft-io-thread]: Stopped (kafka.raft.KafkaRaftManager$RaftIoThread)
 INFO [kafka-raft-io-thread]: Shutdown completed (kafka.raft.KafkaRaftManager$RaftIoThread)
 INFO [kafka-raft-outbound-request-thread]: Shutting down (kafka.raft.RaftSendThread)
 INFO [kafka-raft-outbound-request-thread]: Stopped (kafka.raft.RaftSendThread)
 INFO [kafka-raft-outbound-request-thread]: Shutdown completed (kafka.raft.RaftSendThread)
 INFO [ControllerServer id=5] shutting down (kafka.server.ControllerServer)
 INFO [SocketServer listenerType=CONTROLLER, nodeId=5] Stopping socket server request processors (kafka.network.SocketServer)
 INFO [SocketServer listenerType=CONTROLLER, nodeId=5] Stopped socket server request processors (kafka.network.SocketServer)
 INFO [Controller 5] QuorumController#beginShutdown: shutting down event queue. (org.apache.kafka.queue.KafkaEventQueue)
 INFO [SocketServer listenerType=CONTROLLER, nodeId=5] Shutting down socket server (kafka.network.SocketServer)
 INFO [SocketServer listenerType=CONTROLLER, nodeId=5] Shutdown completed (kafka.network.SocketServer)
 INFO [data-plane Kafka Request Handler on Broker 5], shutting down (kafka.server.KafkaRequestHandlerPool)
 INFO [data-plane Kafka Request Handler on Broker 5], shut down completely (kafka.server.KafkaRequestHandlerPool)
 INFO [ExpirationReaper-5-AlterAcls]: Shutting down (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)
 INFO [ExpirationReaper-5-AlterAcls]: Stopped (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)
 INFO [ExpirationReaper-5-AlterAcls]: Shutdown completed (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)
 INFO [ThrottledChannelReaper-Fetch]: Shutting down (kafka.server.ClientQuotaManager$ThrottledChannelReaper)
 INFO [ThrottledChannelReaper-Fetch]: Stopped (kafka.server.ClientQuotaManager$ThrottledChannelReaper)
 INFO [ThrottledChannelReaper-Fetch]: Shutdown completed (kafka.server.ClientQuotaManager$ThrottledChannelReaper)
 INFO [ThrottledChannelReaper-Produce]: Shutting down (kafka.server.ClientQuotaManager$ThrottledChannelReaper)
 INFO [ThrottledChannelReaper-Produce]: Stopped (kafka.server.ClientQuotaManager$ThrottledChannelReaper)
 INFO [ThrottledChannelReaper-Produce]: Shutdown completed (kafka.server.ClientQuotaManager$ThrottledChannelReaper)
 INFO [ThrottledChannelReaper-Request]: Shutting down (kafka.server.ClientQuotaManager$ThrottledChannelReaper)
 INFO [ThrottledChannelReaper-Request]: Stopped (kafka.server.ClientQuotaManager$ThrottledChannelReaper)
 INFO [ThrottledChannelReaper-Request]: Shutdown completed (kafka.server.ClientQuotaManager$ThrottledChannelReaper)
 INFO [ThrottledChannelReaper-ControllerMutation]: Shutting down (kafka.server.ClientQuotaManager$ThrottledChannelReaper)
 INFO [ThrottledChannelReaper-ControllerMutation]: Stopped (kafka.server.ClientQuotaManager$ThrottledChannelReaper)
 INFO [ThrottledChannelReaper-ControllerMutation]: Shutdown completed (kafka.server.ClientQuotaManager$ThrottledChannelReaper)
 INFO [Controller 5] closed event queue. (org.apache.kafka.queue.KafkaEventQueue)
 INFO App info kafka.server for 5 unregistered (org.apache.kafka.common.utils.AppInfoParser)
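
A log this long is easier to reason about when reduced to a few numbers. The sketch below is illustrative only: the `summarize` function and its regular expressions are written for this example and match the exact message wording seen in the kafka03 log above, and the sample is a small subset of those lines. It counts the candidate transitions, reports the first and last epoch, and lists the peers that node 5 could not reach.

```python
import re

# A small subset of lines copied from the kafka03 log above.
SAMPLE_LOG = """\
 INFO [RaftManager nodeId=5] Completed transition to CandidateState(localId=5, epoch=4, retries=1, electionTimeoutMs=1205) (org.apache.kafka.raft.QuorumState)
 In the new epoch 4, the leader is (none). (org.apache.kafka.controller.QuorumController)
 INFO [RaftManager nodeId=5] Completed transition to CandidateState(localId=5, epoch=57, retries=54, electionTimeoutMs=1467) (org.apache.kafka.raft.QuorumState)
 WARN [RaftManager nodeId=5] Connection to node 1 (kafka01/<host1_ip_address>:19093) could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
 WARN [RaftManager nodeId=5] Connection to node 4 (kafka04/<host4_ip_address>:49093) could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
 INFO [RaftManager nodeId=5] Completed transition to CandidateState(localId=5, epoch=148, retries=145, electionTimeoutMs=1637) (org.apache.kafka.raft.QuorumState)
"""

CANDIDATE_RE = re.compile(r"CandidateState\(localId=\d+, epoch=(\d+), retries=(\d+)")
UNREACHABLE_RE = re.compile(
    r"Connection to node (\d+) \(([^)]+)\) could not be established"
)


def summarize(log_text):
    """Summarize election attempts and unreachable peers (hypothetical helper)."""
    epochs = [int(m.group(1)) for m in CANDIDATE_RE.finditer(log_text)]
    unreachable = sorted(
        {(int(m.group(1)), m.group(2)) for m in UNREACHABLE_RE.finditer(log_text)}
    )
    return {
        "candidate_transitions": len(epochs),
        "first_epoch": min(epochs) if epochs else None,
        "last_epoch": max(epochs) if epochs else None,
        "unreachable_peers": unreachable,
    }


for key, value in summarize(SAMPLE_LOG).items():
    print(f"{key}: {value}")
```

In the log above, the epoch climbs from 4 to 148 with the leader always reported as (none), and both other voters appear as unreachable. That pattern fits a node that never hears back from its peers.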

thanks for providing configs and logs

anything in the logs of the other brokers?
any firewall in place which may block traffic?
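
One way to check for blocked traffic is a plain TCP probe of each controller endpoint, run from inside each container. This is a rough sketch under assumptions: `parse_voters`, `can_connect`, and `check_quorum` are names made up for this example, and the voter string is copied from KAFKA_CONTROLLER_QUORUM_VOTERS in the config above. It only tests that a TCP connection opens, not that Kafka answers correctly.

```python
import socket


def parse_voters(voters):
    """Turn '1@kafka01:19093,4@kafka04:49093' into [(node_id, host, port), ...]."""
    result = []
    for entry in voters.split(","):
        node_id, _, address = entry.strip().partition("@")
        host, _, port = address.rpartition(":")
        result.append((int(node_id), host, int(port)))
    return result


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers refused connections, timeouts, and DNS failures
        return False


def check_quorum(voters):
    """Probe every voter endpoint and print one line per node."""
    for node_id, host, port in parse_voters(voters):
        state = "reachable" if can_connect(host, port) else "NOT reachable"
        print(f"node {node_id} at {host}:{port}: {state}")


# Copied from KAFKA_CONTROLLER_QUORUM_VOTERS in the config above.
QUORUM_VOTERS = "1@kafka01:19093,4@kafka04:49093,5@kafka03:39093"

# Parsing needs no network access; call check_quorum(QUORUM_VOTERS) from
# inside a broker container to run the actual probes.
print(parse_voters(QUORUM_VOTERS))
```

Running `check_quorum(QUORUM_VOTERS)` in each of the three containers gives a reachability matrix. If only the rows or columns involving kafka03 fail, that points at the network path to that node rather than at Kafka itself.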

It was an environment configuration issue, and the admins fixed it. It had something to do with getting Swarm’s and VMware’s networking stacks to work together.
