Broker restart loop after hard shutdown

Hi,

(Edit: added some more info below the log)

We had an unfortunate issue last weekend that caused a hard power-off of our Kafka cluster.

Since then the (thankfully test) cluster is not working properly.
The setup is a 3-controller, 3-broker cluster (one of each on each host).

The controllers seem to work ok, no errors in the log.
The brokers, however, are in a boot loop, all three of them. They live about a minute before restarting.

Log at info level

| main | INFO | io.prometheus.jmx.JavaAgent | Starting …
2025-06-17 08:17:42.484 | main | INFO | io.prometheus.jmx.JavaAgent | Running …
[2025-06-17 08:17:42,842] INFO Registered kafka:type=kafka.Log4jController MBean (kafka.utils.Log4jControllerRegistration$)
[2025-06-17 08:17:43,122] INFO [SharedServer id=4] Starting SharedServer (kafka.server.SharedServer)
[2025-06-17 08:17:43,187] INFO [LogLoader partition=__cluster_metadata-0, dir=/data/cpkafka-data] Producer state recovery took 0ms for snapshot load and 13ms for segment recovery from offset 20558244 (org.apache.kafka.storage.internals.log.UnifiedLog)
[2025-06-17 08:17:44,382] INFO Initialized snapshots with IDs SortedSet(OffsetAndEpoch(offset=20564285, epoch=1218), OffsetAndEpoch(offset=20571443, epoch=1231), OffsetAndEpoch(offset=20578607, epoch=1236), OffsetAndEpoch(offset=20585775, epoch=1242), … OffsetAndEpoch(offset=21096311, epoch=1311), OffsetAndEpoch(offset=21103509, epoch=1311), OffsetAndEpoch(offset=21110707, epoch=1311), OffsetAndEpoch(offset=21117905, … OffsetAndEpoch(offset=21240271, epoch=1311), OffsetAndEpoch(offset=21247469, epoch=1311), OffsetAndEpoch(offset=21254667, epoch=1311), OffsetAndEpoch(offset=21261865, epoch=1311), OffsetAndEpoch(offset=21269063, epoch=1311), OffsetAndEpoch(offset=21276261, epoch=1311), OffsetAndEpoch(offset=21283459, epoch=1311), OffsetAndEpoch(offset=21290657, epoch=1311), OffsetAndEpoch(offset=21297986, epoch=1313)) from /data/cpkafka-data/__cluster_metadata-0 (kafka.raft.KafkaMetadataLog$)
[2025-06-17 08:17:44,860] INFO [BrokerServer id=4] Finished waiting for controller quorum voters future (kafka.server.BrokerServer)
[2025-06-17 08:17:45,128] INFO Updated connection-accept-rate max connection creation rate to 2147483647 (kafka.network.ConnectionQuotas)
[2025-06-17 08:17:45,332] INFO [MetadataLoader id=4] initializeNewPublishers: the loader is still catching up because we still don’t know the high water mark yet. (org.apache.kafka.image.loader.MetadataLoader)

[2025-06-17 08:17:46,629] INFO [broker-4-to-controller-directory-assignments-channel-manager]: Starting (kafka.server.NodeToControllerRequestThread)
[2025-06-17 08:17:47,154] INFO [broker-4-to-controller-heartbeat-channel-manager]: Recorded new KRaft controller, from now on will use node host80:9092 (id: 1 rack: null isFenced: false) (kafka.server.NodeToControllerRequestThread)
[2025-06-17 08:17:47,399] INFO [MetadataLoader id=4] initializeNewPublishers: the loader is still catching up because we still don’t know the high water mark yet. (org.apache.kafka.image.loader.MetadataLoader)
[2025-06-17 08:17:47,639] INFO [broker-4-to-controller-heartbeat-channel-manager]: Recorded new KRaft controller, from now on will use node host80:9092 (id: 1 rack: null isFenced: false) (kafka.server.NodeToControllerRequestThread)
[2025-06-17 08:17:47,849] INFO [broker-4-to-controller-heartbeat-channel-manager]: Recorded new KRaft controller, from now on will use node host80:9092 (id: 1 rack: null isFenced: false) (kafka.server.NodeToControllerRequestThread)
[2025-06-17 08:17:48,149] INFO [MetadataLoader id=4] initializeNewPublishers: the loader is still catching up because we still don’t know the high water mark yet. (org.apache.kafka.image.loader.MetadataLoader)
[2025-06-17 08:17:48,412] INFO [broker-4-to-controller-heartbeat-channel-manager]: Recorded new KRaft controller, from now on will use node host80:9092 (id: 1 rack: null isFenced: false) (kafka.server.NodeToControllerRequestThread)

[2025-06-17 08:17:55,687] INFO [broker-4-to-controller-heartbeat-channel-manager]: Recorded new KRaft controller, from now on will use node host80:9092 (id: 1 rack: null isFenced: false) (kafka.server.NodeToControllerRequestThread)
[2025-06-17 08:17:55,924] INFO [broker-4-to-controller-heartbeat-channel-manager]: Recorded new KRaft controller, from now on will use node host80:9092 (id: 1 rack: null isFenced: false) (kafka.server.NodeToControllerRequestThread)
[2025-06-17 08:17:56,142] INFO [BrokerLifecycleManager id=4] Unable to register the broker because the RPC got timed out before it could be sent. (kafka.server.BrokerLifecycleManager)
[2025-06-17 08:17:56,436] INFO [broker-4-to-controller-heartbeat-channel-manager]: Recorded new KRaft controller, from now on will use node host80:9092 (id: 1 rack: null isFenced: false) (kafka.server.NodeToControllerRequestThread)

[2025-06-17 08:17:57,455] INFO [NodeToControllerChannelManager id=4 name=heartbeat] Client requested disconnect from node 1 (org.apache.kafka.clients.NetworkClient)
[2025-06-17 08:17:57,672] INFO [broker-4-to-controller-heartbeat-channel-manager]: Recorded new KRaft controller, from now on will use node host80:9092 (id: 1 rack: null isFenced: false) (kafka.server.NodeToControllerRequestThread)

[2025-06-17 08:17:58,434] INFO [MetadataLoader id=4] initializeNewPublishers: the loader is still catching up because we still don’t know the high water mark yet. (org.apache.kafka.image.loader.MetadataLoader)

[2025-06-17 08:17:59,705] INFO [broker-4-to-controller-heartbeat-channel-manager]: Recorded new KRaft controller, from now on will use node host80:9092 (id: 1 rack: null isFenced: false) (kafka.server.NodeToControllerRequestThread)
[2025-06-17 08:17:59,938] INFO [MetadataLoader id=4] initializeNewPublishers: the loader is still catching up because we still don’t know the high water mark yet. (org.apache.kafka.image.loader.MetadataLoader)
[2025-06-17 08:18:00,213] INFO [NodeToControllerChannelManager id=4 name=heartbeat] Client requested disconnect from node 1 (org.apache.kafka.clients.NetworkClient)
[2025-06-17 08:18:06,987] INFO [NodeToControllerChannelManager id=4 name=heartbeat] Client requested disconnect from node 1 (org.apache.kafka.clients.NetworkClient)
[2025-06-17 08:18:10,722] INFO [broker-4-to-controller-heartbeat-channel-manager]: Recorded new KRaft controller, from now on will use node host80:9092 (id: 1 rack: null isFenced: false) (kafka.server.NodeToControllerRequestThread)
[2025-06-17 08:18:10,989] INFO [broker-4-to-controller-heartbeat-channel-manager]: Recorded new KRaft controller, from now on will use node host80:9092 (id: 1 rack: null isFenced: false) (kafka.server.NodeToControllerRequestThread)
[2025-06-17 08:18:11,183] INFO [MetadataLoader id=4] initializeNewPublishers: the loader is still catching up because we still don’t know the high water mark yet. (org.apache.kafka.image.loader.MetadataLoader)
[2025-06-17 08:18:11,483] INFO [MetadataLoader id=4] initializeNewPublishers: the loader is still catching up because we still don’t know the high water mark yet. (org.apache.kafka.image.loader.MetadataLoader)
[2025-06-17 08:18:11,684] INFO [MetadataLoader id=4] initializeNewPublishers: the loader is still catching up because we still don’t know the high water mark yet. (org.apache.kafka.image.loader.MetadataLoader)
[2025-06-17 08:18:11,984] INFO [MetadataLoader id=4] initializeNewPublishers: the loader is still catching up because we still don’t know the high water mark yet. (org.apache.kafka.image.loader.MetadataLoader)
[2025-06-17 08:18:12,185] INFO [MetadataLoader id=4] initializeNewPublishers: the loader is still catching up because we still don’t know the high water mark yet. (org.apache.kafka.image.loader.MetadataLoader)
[2025-06-17 08:18:12,485] INFO [MetadataLoader id=4] initializeNewPublishers: the loader is still catching up because we still don’t know the high water mark yet. (org.apache.kafka.image.loader.MetadataLoader)
[2025-06-17 08:18:12,741] INFO [broker-4-to-controller-heartbeat-channel-manager]: Recorded new KRaft controller, from now on will use node host80:9092 (id: 1 rack: null isFenced: false) (kafka.server.NodeToControllerRequestThread)
[2025-06-17 08:18:12,884] INFO [broker-4-to-controller-heartbeat-channel-manager]: Recorded new KRaft controller, from now on will use node host80:9092 (id: 1 rack: null isFenced: false) (kafka.server.NodeToControllerRequestThread)
[2025-06-17 08:18:12,976] INFO [NodeToControllerChannelManager id=4 name=heartbeat] Client requested disconnect from node 1 (org.apache.kafka.clients.NetworkClient)
[2025-06-17 08:18:13,530] INFO [broker-4-to-controller-heartbeat-channel-manager]: Recorded new KRaft controller, from now on will use node host80:9092 (id: 1 rack: null isFenced: false) (kafka.server.NodeToControllerRequestThread)
[2025-06-17 08:18:14,031] INFO [broker-4-to-controller-heartbeat-channel-manager]: Recorded new KRaft controller, from now on will use node host80:9092 (id: 1 rack: null isFenced: false) (kafka.server.NodeToControllerRequestThread)
[2025-06-17 08:18:14,266] INFO [broker-4-to-controller-heartbeat-channel-manager]: Recorded new KRaft controller, from now on will use node host80:9092 (id: 1 rack: null isFenced: false) (kafka.server.NodeToControllerRequestThread)
[2025-06-17 08:18:14,511] INFO [broker-4-to-controller-heartbeat-channel-manager]: Recorded new KRaft controller, from now on will use node host80:9092 (id: 1 rack: null isFenced: false) (kafka.server.NodeToControllerRequestThread)
[2025-06-17 08:18:14,745] INFO [broker-4-to-controller-heartbeat-channel-manager]: Recorded new KRaft controller, from now on will use node host80:9092 (id: 1 rack: null isFenced: false) (kafka.server.NodeToControllerRequestThread)
[2025-06-17 08:18:15,030] INFO [broker-4-to-controller-heartbeat-channel-manager]: Recorded new KRaft controller, from now on will use node host80:9092 (id: 1 rack: null isFenced: false) (kafka.server.NodeToControllerRequestThread)
[2025-06-17 08:18:15,243] INFO [broker-4-to-controller-heartbeat-channel-manager]: Recorded new KRaft controller, from now on will use node host80:9092 (id: 1 rack: null isFenced: false) (kafka.server.NodeToControllerRequestThread)
[2025-06-17 08:18:15,527] INFO [broker-4-to-controller-heartbeat-channel-manager]: Recorded new KRaft controller, from now on will use node host80:9092 (id: 1 rack: null isFenced: false) (kafka.server.NodeToControllerRequestThread)
[2025-06-17 08:18:15,744] INFO [broker-4-to-controller-heartbeat-channel-manager]: Recorded new KRaft controller, from now on will use node host80:9092 (id: 1 rack: null isFenced: false) (kafka.server.NodeToControllerRequestThread)
[2025-06-17 08:18:16,028] INFO [broker-4-to-controller-heartbeat-channel-manager]: Recorded new KRaft controller, from now on will use node host80:9092 (id: 1 rack: null isFenced: false) (kafka.server.NodeToControllerRequestThread)
[2025-06-17 08:18:16,241] INFO [broker-4-to-controller-heartbeat-channel-manager]: Recorded new KRaft controller, from now on will use node host80:9092 (id: 1 rack: null isFenced: false) (kafka.server.NodeToControllerRequestThread)
[2025-06-17 08:18:16,508] INFO [broker-4-to-controller-heartbeat-channel-manager]: Recorded new KRaft controller, from now on will use node host80:9092 (id: 1 rack: null isFenced: false) (kafka.server.NodeToControllerRequestThread)
[2025-06-17 08:18:16,772] INFO [broker-4-to-controller-heartbeat-channel-manager]: Recorded new KRaft controller, from now on will use node host80:9092 (id: 1 rack: null isFenced: false) (kafka.server.NodeToControllerRequestThread)
[2025-06-17 08:18:17,007] INFO [broker-4-to-controller-heartbeat-channel-manager]: Recorded new KRaft controller, from now on will use node host80:9092 (id: 1 rack: null isFenced: false) (kafka.server.NodeToControllerRequestThread)
[2025-06-17 08:18:17,201] INFO [BrokerLifecycleManager id=4] Unable to register the broker because the RPC got timed out before it could be sent. (kafka.server.BrokerLifecycleManager)
[2025-06-17 08:18:17,499] INFO [MetadataLoader id=4] initializeNewPublishers: the loader is still catching up because we still don’t know the high water mark yet. (org.apache.kafka.image.loader.MetadataLoader)
[2025-06-17 08:18:17,699] INFO [MetadataLoader id=4] initializeNewPublishers: the loader is still catching up because we still don’t know the high water mark yet. (org.apache.kafka.image.loader.MetadataLoader)
[2025-06-17 08:18:18,000] INFO [MetadataLoader id=4] initializeNewPublishers: the loader is still catching up because we still don’t know the high water mark yet. (org.apache.kafka.image.loader.MetadataLoader)
[2025-06-17 08:18:18,200] INFO [MetadataLoader id=4] initializeNewPublishers: the loader is still catching up because we still don’t know the high water mark yet. (org.apache.kafka.image.loader.MetadataLoader)
[2025-06-17 08:18:18,501] INFO [MetadataLoader id=4] initializeNewPublishers: the loader is still catching up because we still don’t know the high water mark yet. (org.apache.kafka.image.loader.MetadataLoader)
[2025-06-17 08:18:18,701] INFO [MetadataLoader id=4] initializeNewPublishers: the loader is still catching up because we still don’t know the high water mark yet. (org.apache.kafka.image.loader.MetadataLoader)
[2025-06-17 08:18:19,002] INFO [MetadataLoader id=4] initializeNewPublishers: the loader is still catching up because we still don’t know the high water mark yet. (org.apache.kafka.image.loader.MetadataLoader)
[2025-06-17 08:18:19,202] INFO [MetadataLoader id=4] initializeNewPublishers: the loader is still catching up because we still don’t know the high water mark yet. (org.apache.kafka.image.loader.MetadataLoader)
[2025-06-17 08:18:19,503] INFO [MetadataLoader id=4] initializeNewPublishers: the loader is still catching up because we still don’t know the high water mark yet. (org.apache.kafka.image.loader.MetadataLoader)
[2025-06-17 08:18:19,703] INFO [MetadataLoader id=4] initializeNewPublishers: the loader is still catching up because we still don’t know the high water mark yet. (org.apache.kafka.image.loader.MetadataLoader)
[2025-06-17 08:18:20,004] INFO [MetadataLoader id=4] initializeNewPublishers: the loader is still catching up because we still don’t know the high water mark yet. (org.apache.kafka.image.loader.MetadataLoader)
[2025-06-17 08:18:20,204] INFO [MetadataLoader id=4] initializeNewPublishers: the loader is still catching up because we still don’t know the high water mark yet. (org.apache.kafka.image.loader.MetadataLoader)
[2025-06-17 08:18:20,537] INFO [broker-4-to-controller-heartbeat-channel-manager]: Recorded new KRaft controller, from now on will use node host80:9092 (id: 1 rack: null isFenced: false) (kafka.server.NodeToControllerRequestThread)
[2025-06-17 08:18:20,771] INFO [broker-4-to-controller-heartbeat-channel-manager]: Recorded new KRaft controller, from now on will use node host80:9092 (id: 1 rack: null isFenced: false) (kafka.server.NodeToControllerRequestThread)
[2025-06-17 08:18:21,045] INFO [broker-4-to-controller-heartbeat-channel-manager]: Recorded new KRaft controller, from now on will use node host80:9092 (id: 1 rack: null isFenced: false) (kafka.server.NodeToControllerRequestThread)
[2025-06-17 08:18:21,280] INFO [broker-4-to-controller-heartbeat-channel-manager]: Recorded new KRaft controller, from now on will use node host80:9092 (id: 1 rack: null isFenced: false) (kafka.server.NodeToControllerRequestThread)
[2025-06-17 08:18:21,543] INFO [broker-4-to-controller-heartbeat-channel-manager]: Recorded new KRaft controller, from now on will use node host80:9092 (id: 1 rack: null isFenced: false) (kafka.server.NodeToControllerRequestThread)
[2025-06-17 08:18:21,775] INFO [broker-4-to-controller-heartbeat-channel-manager]: Recorded new KRaft controller, from now on will use node host80:9092 (id: 1 rack: null isFenced: false) (kafka.server.NodeToControllerRequestThread)
[2025-06-17 08:18:22,056] INFO [broker-4-to-controller-heartbeat-channel-manager]: Recorded new KRaft controller, from now on will use node host80:9092 (id: 1 rack: null isFenced: false) (kafka.server.NodeToControllerRequestThread)
[2025-06-17 08:18:22,308] INFO [broker-4-to-controller-heartbeat-channel-manager]: Recorded new KRaft controller, from now on will use node host80:9092 (id: 1 rack: null isFenced: false) (kafka.server.NodeToControllerRequestThread)
[2025-06-17 08:18:22,520] INFO [broker-4-to-controller-heartbeat-channel-manager]: Recorded new KRaft controller, from now on will use node host80:9092 (id: 1 rack: null isFenced: false) (kafka.server.NodeToControllerRequestThread)
[2025-06-17 08:18:22,791] INFO [broker-4-to-controller-heartbeat-channel-manager]: Recorded new KRaft controller, from now on will use node host80:9092 (id: 1 rack: null isFenced: false) (kafka.server.NodeToControllerRequestThread)
[2025-06-17 08:18:23,025] INFO [broker-4-to-controller-heartbeat-channel-manager]: Recorded new KRaft controller, from now on will use node host80:9092 (id: 1 rack: null isFenced: false) (kafka.server.NodeToControllerRequestThread)
[2025-06-17 08:18:23,291] INFO [broker-4-to-controller-heartbeat-channel-manager]: Recorded new KRaft controller, from now on will use node host80:9092 (id: 1 rack: null isFenced: false) (kafka.server.NodeToControllerRequestThread)
[2025-06-17 08:18:23,552] INFO [broker-4-to-controller-heartbeat-channel-manager]: Recorded new KRaft controller, from now on will use node host80:9092 (id: 1 rack: null isFenced: false) (kafka.server.NodeToControllerRequestThread)
[2025-06-17 08:18:23,767] INFO [broker-4-to-controller-heartbeat-channel-manager]: Recorded new KRaft controller, from now on will use node host80:9092 (id: 1 rack: null isFenced: false) (kafka.server.NodeToControllerRequestThread)
[2025-06-17 08:18:24,034] INFO [broker-4-to-controller-heartbeat-channel-manager]: Recorded new KRaft controller, from now on will use node host80:9092 (id: 1 rack: null isFenced: false) (kafka.server.NodeToControllerRequestThread)
[2025-06-17 08:18:24,296] INFO [broker-4-to-controller-heartbeat-channel-manager]: Recorded new KRaft controller, from now on will use node host80:9092 (id: 1 rack: null isFenced: false) (kafka.server.NodeToControllerRequestThread)
[2025-06-17 08:18:24,566] INFO [broker-4-to-controller-heartbeat-channel-manager]: Recorded new KRaft controller, from now on will use node host80:9092 (id: 1 rack: null isFenced: false) (kafka.server.NodeToControllerRequestThread)
[2025-06-17 08:18:24,803] INFO [broker-4-to-controller-heartbeat-channel-manager]: Recorded new KRaft controller, from now on will use node host80:9092 (id: 1 rack: null isFenced: false) (kafka.server.NodeToControllerRequestThread)
[2025-06-17 08:18:25,027] INFO [MetadataLoader id=4] initializeNewPublishers: the loader is still catching up because we still don’t know the high water mark yet. (org.apache.kafka.image.loader.MetadataLoader)

[2025-06-17 08:18:28,538] INFO [MetadataLoader id=4] initializeNewPublishers: the loader is still catching up because we still don’t know the high water mark yet. (org.apache.kafka.image.loader.MetadataLoader)
[2025-06-17 08:18:28,739] INFO [MetadataLoader id=4] initializeNewPublishers: the loader is still catching up because we still don’t know the high water mark yet. (org.apache.kafka.image.loader.MetadataLoader)
[2025-06-17 08:18:29,039] INFO [MetadataLoader id=4] initializeNewPublishers: the loader is still catching up because we still don’t know the high water mark yet. (org.apache.kafka.image.loader.MetadataLoader)
[2025-06-17 08:18:29,240] INFO [MetadataLoader id=4] initializeNewPublishers: the loader is still catching up because we still don’t know the high water mark yet. (org.apache.kafka.image.loader.MetadataLoader)
[2025-06-17 08:18:29,570] INFO [broker-4-to-controller-heartbeat-channel-manager]: Recorded new KRaft controller, from now on will use node host80:9092 (id: 1 rack: null isFenced: false) (kafka.server.NodeToControllerRequestThread)
[2025-06-17 08:18:29,792] INFO [broker-4-to-controller-heartbeat-channel-manager]: Recorded new KRaft controller, from now on will use node host80:9092 (id: 1 rack: null isFenced: false) (kafka.server.NodeToControllerRequestThread)
[2025-06-17 08:18:30,080] INFO [broker-4-to-controller-heartbeat-channel-manager]: Recorded new KRaft controller, from now on will use node host80:9092 (id: 1 rack: null isFenced: false) (kafka.server.NodeToControllerRequestThread)
[2025-06-17 08:18:30,293] INFO [broker-4-to-controller-heartbeat-channel-manager]: Recorded new KRaft controller, from now on will use node host80:9092 (id: 1 rack: null isFenced: false) (kafka.server.NodeToControllerRequestThread)
[2025-06-17 08:18:30,557] INFO [broker-4-to-controller-heartbeat-channel-manager]: Recorded new KRaft controller, from now on will use node host80:9092 (id: 1 rack: null isFenced: false) (kafka.server.NodeToControllerRequestThread)
[2025-06-17 08:18:30,840] INFO [broker-4-to-controller-heartbeat-channel-manager]: Recorded new KRaft controller, from now on will use node host80:9092 (id: 1 rack: null isFenced: false) (kafka.server.NodeToControllerRequestThread)
[2025-06-17 08:18:31,054] INFO [broker-4-to-controller-heartbeat-channel-manager]: Recorded new KRaft controller, from now on will use node host80:9092 (id: 1 rack: null isFenced: false) (kafka.server.NodeToControllerRequestThread)
[2025-06-17 08:18:31,336] INFO [broker-4-to-controller-heartbeat-channel-manager]: Recorded new KRaft controller, from now on will use node host80:9092 (id: 1 rack: null isFenced: false) (kafka.server.NodeToControllerRequestThread)
[2025-06-17 08:18:31,548] INFO [broker-4-to-controller-heartbeat-channel-manager]: Recorded new KRaft controller, from now on will use node host80:9092 (id: 1 rack: null isFenced: false) (kafka.server.NodeToControllerRequestThread)
[2025-06-17 08:18:31,831] INFO [broker-4-to-controller-heartbeat-channel-manager]: Recorded new KRaft controller, from now on will use node host80:9092 (id: 1 rack: null isFenced: false) (kafka.server.NodeToControllerRequestThread)
[2025-06-17 08:18:32,092] INFO [broker-4-to-controller-heartbeat-channel-manager]: Recorded new KRaft controller, from now on will use node host80:9092 (id: 1 rack: null isFenced: false) (kafka.server.NodeToControllerRequestThread)
[2025-06-17 08:18:32,329] INFO [broker-4-to-controller-heartbeat-channel-manager]: Recorded new KRaft controller, from now on will use node host80:9092 (id: 1 rack: null isFenced: false) (kafka.server.NodeToControllerRequestThread)
[2025-06-17 08:18:32,591] INFO [broker-4-to-controller-heartbeat-channel-manager]: Recorded new KRaft controller, from now on will use node host80:9092 (id: 1 rack: null isFenced: false) (kafka.server.NodeToControllerRequestThread)
[2025-06-17 08:18:32,828] INFO [broker-4-to-controller-heartbeat-channel-manager]: Recorded new KRaft controller, from now on will use node host80:9092 (id: 1 rack: null isFenced: false) (kafka.server.NodeToControllerRequestThread)
[2025-06-17 08:18:33,089] INFO [broker-4-to-controller-heartbeat-channel-manager]: Recorded new KRaft controller, from now on will use node host80:9092 (id: 1 rack: null isFenced: false) (kafka.server.NodeToControllerRequestThread)
[2025-06-17 08:18:33,349] INFO [MetadataLoader id=4] initializeNewPublishers: the loader is still catching up because we still don’t know the high water mark yet. (org.apache.kafka.image.loader.MetadataLoader)
[2025-06-17 08:18:33,602] INFO [NodeToControllerChannelManager id=4 name=heartbeat] Client requested disconnect from node 1 (org.apache.kafka.clients.NetworkClient)
[2025-06-17 08:18:33,850] INFO [MetadataLoader id=4] initializeNewPublishers: the loader is still catching up because we still don’t know the high water mark yet. (org.apache.kafka.image.loader.MetadataLoader)
[2025-06-17 08:18:34,051] INFO [MetadataLoader id=4] initializeNewPublishers: the loader is still catching up because we still don’t know the high water mark yet. (org.apache.kafka.image.loader.MetadataLoader)
[2025-06-17 08:18:34,351] INFO [MetadataLoader id=4] initializeNewPublishers: the loader is still catching up because we still don’t know the high water mark yet. (org.apache.kafka.image.loader.MetadataLoader)
[2025-06-17 08:18:34,552] INFO [MetadataLoader id=4] initializeNewPublishers: the loader is still catching up because we still don’t know the high water mark yet. (org.apache.kafka.image.loader.MetadataLoader)

[2025-06-17 08:18:38,359] INFO [MetadataLoader id=4] initializeNewPublishers: the loader is still catching up because we still don’t know the high water mark yet. (org.apache.kafka.image.loader.MetadataLoader)
[2025-06-17 08:18:38,583] INFO [broker-4-to-controller-heartbeat-channel-manager]: Recorded new KRaft controller, from now on will use node host80:9092 (id: 1 rack: null isFenced: false) (kafka.server.NodeToControllerRequestThread)
[2025-06-17 08:18:38,860] INFO [MetadataLoader id=4] initializeNewPublishers: the loader is still catching up because we still don’t know the high water mark yet. (org.apache.kafka.image.loader.MetadataLoader)
[2025-06-17 08:18:39,086] INFO [broker-4-to-controller-heartbeat-channel-manager]: Recorded new KRaft controller, from now on will use node host80:9092 (id: 1 rack: null isFenced: false) (kafka.server.NodeToControllerRequestThread)
[2025-06-17 08:18:39,367] INFO [NodeToControllerChannelManager id=4 name=heartbeat] Client requested disconnect from node 1 (org.apache.kafka.clients.NetworkClient)
[2025-06-17 08:18:39,580] INFO [broker-4-to-controller-heartbeat-channel-manager]: Recorded new KRaft controller, from now on will use node host80:9092 (id: 1 rack: null isFenced: false) (kafka.server.NodeToControllerRequestThread)
[2025-06-17 08:18:39,870] INFO [MetadataLoader id=4] initializeNewPublishers: the loader is still catching up because we still don’t know the high water mark yet. (org.apache.kafka.image.loader.MetadataLoader)

[2025-06-17 08:18:41,579] INFO [broker-4-to-controller-heartbeat-channel-manager]: Recorded new KRaft controller, from now on will use node host80:9092 (id: 1 rack: null isFenced: false) (kafka.server.NodeToControllerRequestThread)
[2025-06-17 08:18:41,875] INFO [MetadataLoader id=4] initializeNewPublishers: the loader is still catching up because we still don’t know the high water mark yet. (org.apache.kafka.image.loader.MetadataLoader)
[2025-06-17 08:18:42,127] INFO [broker-4-to-controller-heartbeat-channel-manager]: Recorded new KRaft controller, from now on will use node host80:9092 (id: 1 rack: null isFenced: false) (kafka.server.NodeToControllerRequestThread)
[2025-06-17 08:18:42,377] INFO [MetadataLoader id=4] initializeNewPublishers: the loader is still catching up because we still don’t know the high water mark yet. (org.apache.kafka.image.loader.MetadataLoader)
[2025-06-17 08:18:42,634] INFO [broker-4-to-controller-heartbeat-channel-manager]: Recorded new KRaft controller, from now on will use node host80:9092 (id: 1 rack: null isFenced: false) (kafka.server.NodeToControllerRequestThread)
[2025-06-17 08:18:42,883] INFO [MetadataLoader id=4] initializeNewPublishers: the loader is still catching up because we still don’t know the high water mark yet. (org.apache.kafka.image.loader.MetadataLoader)
[2025-06-17 08:18:43,083] INFO [MetadataLoader id=4] initializeNewPublishers: the loader is still catching up because we still don’t know the high water mark yet. (org.apache.kafka.image.loader.MetadataLoader)

[2025-06-17 08:18:46,592] INFO [MetadataLoader id=4] initializeNewPublishers: the loader is still catching up because we still don’t know the high water mark yet. (org.apache.kafka.image.loader.MetadataLoader)
[2025-06-17 08:18:46,893] INFO [MetadataLoader id=4] initializeNewPublishers: the loader is still catching up because we still don’t know the high water mark yet. (org.apache.kafka.image.loader.MetadataLoader)
[2025-06-17 08:18:47,046] INFO [Transaction State Manager 4]: Shutdown complete (kafka.coordinator.transaction.TransactionStateManager)
[2025-06-17 08:18:47,261] INFO App info kafka.server for 4 unregistered (org.apache.kafka.common.utils.AppInfoParser)
===> Configuring …

Note: I removed a lot of duplicate log messages to keep within the post length.
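For anyone trimming a log like this before posting, here is a minimal Python sketch that collapses repeated messages by stripping the leading timestamp and counting what is left. The name `summarize` and the timestamp pattern are assumptions based on the bracketed format in the log above; this is not part of any Kafka tooling.

```python
import re
from collections import Counter

# Matches a leading "[2025-06-17 08:17:42,842] " style timestamp (assumed format).
TIMESTAMP = re.compile(r"^\[\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}\]\s*")


def summarize(lines):
    """Count log messages after removing the timestamp, in first-seen order."""
    counts = Counter()
    for line in lines:
        message = TIMESTAMP.sub("", line.strip())
        if message:  # skip blank lines
            counts[message] += 1
    # Counter keeps insertion order, so messages appear as first encountered.
    return list(counts.items())


sample = [
    "[2025-06-17 08:17:47,399] INFO [MetadataLoader id=4] still catching up",
    "[2025-06-17 08:17:48,149] INFO [MetadataLoader id=4] still catching up",
    "[2025-06-17 08:17:56,142] INFO [BrokerLifecycleManager id=4] Unable to register the broker",
]

for message, count in summarize(sample):
    print(f"{count:>4}x {message}")
```

Posting the counts alongside the unique lines keeps the information that a message repeated many times, which is itself a useful symptom here.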

It basically dies/restarts after “Configuring”.

I have no idea what happened; the config should not have changed, and there were no defects on the disks that I heard of.
On this box (80) there was one weird error where it tried accessing a nonexistent keystore file (that we never used), but that vanished when I updated the containers to the latest version, which is now

“release”: “8.0.0-47”,

I’ve verified network connectivity; it should be able to talk to the controller on the local host either way. It’s not throwing anything beyond INFO on any of the boxes.
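A connectivity check of the kind mentioned above could look roughly like this in Python. It is only a sketch: `can_connect` is a hypothetical helper, and the host and port in the comment are taken from the `host80:9092` controller address in the log. A successful TCP connect only shows the port is reachable; it says nothing about TLS, SASL, or whether the listener on that port is the one the broker expects.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a plain TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Example call (address taken from the log above, adjust to your setup):
#   can_connect("host80", 9092)
```

Running it from inside the broker container rather than from the host is the more telling test, since the container's network namespace and name resolution can differ.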

The client app can’t connect of course, since the broker won’t stay up.

Any idea what could cause this behavior?

I mean I see things like
[2025-06-17 08:17:56,142] INFO [BrokerLifecycleManager id=4] Unable to register the broker because the RPC got timed out before it could be sent.
but why would that happen? It’s not as if the box was doing much…

We run RHEL with SELinux enabled but have the appropriate permissions set. The box rebooted fine a couple of weeks ago with no intervention.

Can this be a problem with data corruption? But why are there no errors anywhere then?

Thanks

hey @Rand

I guess no changes in the config have happened, right?

anything in the controllers log?
any version change happened by accident?

just stumbled over

which obviously is not your case though I was wondering whether it could be related somehow

best,
michael

Hi Michael,

nah, nothing, just the forced power down.
The only update I did was after the cluster did not come up,
we went from 7.8.0-83 to the aforementioned 8.0 then, but no config changes at all.

I mean I can always re-init the cluster and hope that helps (if this is a data issue), but ideally I’d like to understand the cause just in case this ever happens in production too.

I just don’t see an actual cause for this; without an error message this is difficult.
If that one you highlighted is a potential issue, I really wonder why it’s flagged INFO and not ERROR :confused:

Edit

Controller Log looks fine to me...

[2025-06-18 07:14:53,341] WARN [QuorumController id=3] Renouncing the leadership due to a metadata log event. We were the leader at epoch 4, but in the new epoch 5, the leader is (none). Reverting to last stable offset 173144. (org.apache.kafka.controller.QuorumController)
[2025-06-18 07:14:53,426] WARN [NodeToControllerChannelManager id=3 name=registration] Attempting to close NetworkClient that has already been closed. (org.apache.kafka.clients.NetworkClient)
===> User
uid=1000(appuser) gid=1000(appuser) groups=1000(appuser)
===> Configuring …
===> Running preflight checks …
===> Check if /var/lib/kafka/data is writable …
===> Using provided cluster id xxx …
===> Launching …
===> Launching kafka …
2025-06-18 07:15:03.734 | main | INFO | io.prometheus.jmx.JavaAgent | Starting …
2025-06-18 07:15:03.965 | main | INFO | io.prometheus.jmx.JavaAgent | HTTP enabled [true]
2025-06-18 07:15:03.966 | main | INFO | io.prometheus.jmx.JavaAgent | HTTP host:port [0.0.0.0:8082]
2025-06-18 07:15:03.966 | main | INFO | io.prometheus.jmx.JavaAgent | OpenTelemetry enabled [false]
2025-06-18 07:15:04.010 | main | INFO | io.prometheus.jmx.JavaAgent | Running …
WARNING: A Java agent has been loaded dynamically (/tmp/javaagent-loader-1.3.70.jar)
WARNING: If a serviceability tool is in use, please run with -XX:+EnableDynamicAgentLoading to hide this warning
WARNING: If a serviceability tool is not in use, please run with -Djdk.instrument.traceUsage for more information
WARNING: Dynamic loading of agents will be disallowed by default in a future release

Hi @Rand

CP 8.0.0?
I’m a bit confused cause afaik the latest and greatest is CP 7.9

in general even with a minor upgrade you should consider this one:

Best,
Michael

Well, that’s what it is saying under release ;)

podman inspect cd651ad01748 |grep release
“release”: “8.0.0-47”,

Is that more helpful?

“vendor”: “Confluent”,
“version”: “284bf36”

It is whatever build was available as latest on dockerhub yesterday
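Rather than grepping, the same labels can be pulled out of the `podman inspect` JSON. The sketch below is an assumption-laden example: `image_labels` is a hypothetical helper, and it assumes the labels sit under `Config.Labels` in the first element of the inspect output, so check that against your own output. The sample values are the ones quoted above.

```python
import json


def image_labels(inspect_json, keys=("release", "vendor", "version")):
    """Extract selected image labels from `podman inspect` JSON text."""
    data = json.loads(inspect_json)
    # inspect normally prints a JSON array with one entry per object.
    entry = data[0] if isinstance(data, list) else data
    # Assumed location of the labels; missing keys come back as None.
    labels = (entry.get("Config") or {}).get("Labels") or {}
    return {key: labels.get(key) for key in keys}


sample = json.dumps([{
    "Config": {
        "Labels": {
            "release": "8.0.0-47",
            "vendor": "Confluent",
            "version": "284bf36",
        }
    }
}])

print(image_labels(sample))
```

In practice the input would be the captured output of `podman inspect <container id>` instead of the inline sample.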

Edit - I didn’t do the feature flag update obviously, thanks for pointing that out though :)

interesting :slight_smile:

which tag did you set exactly?

Hi,
we use
/confluentinc/cp-kafka:latest
:slight_smile:

interesting :slight_smile:
so according to dockerhub cp 8 seems to be available already somehow.

then you need to take the steps outlined in the link above.

Best,
Michael

But the only activity in the Upgrade link that is applicable is the feature increment.

  • All brokers and controllers use the same new image
  • I don’t have a license
  • I didn’t upgrade from 6.x so listeners should be fine
  • Same for security
  • I have 3 brokers so even if there is a new replication factor variable I’d fulfill the requirements (and I would expect a message if I didn’t)
  • All the other components are not applicable

Not sure what I should do here?

I assume there should be a newer doc for CP8 which covers the upgrade path to CP8,
but as it’s not officially released it might take some time until it’s publicly available

maybe at least switch from “latest” to exact version you’d like to have to prevent auto upgrades on prod :innocent:
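As a rough illustration of the pinning advice, a small Python check like the one below could be dropped into a deployment script to flag image references that would float to a new version. `is_pinned` is a hypothetical helper, and the `7.8.0` tag in the example is only illustrative.

```python
def is_pinned(image_ref):
    """Return True if the image reference names a fixed tag or a digest."""
    if "@" in image_ref:
        # A digest reference (name@sha256:...) always points at one image.
        return True
    # Only look at the last path segment so a registry port is not
    # mistaken for a tag (e.g. "registry:5000/confluentinc/cp-kafka").
    last_segment = image_ref.rsplit("/", 1)[-1]
    if ":" not in last_segment:
        return False  # no tag means the runtime falls back to "latest"
    tag = last_segment.split(":", 1)[1]
    return tag != "latest"


for ref in ("confluentinc/cp-kafka:latest", "confluentinc/cp-kafka:7.8.0"):
    print(ref, "pinned" if is_pinned(ref) else "NOT pinned")
```

A check like this would have flagged the `:latest` reference before the unplanned jump to a new major version.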

Not sure what I should do here?

not sure what would be best here. I stumbled over some posts and issues related to the missing upgrade steps, so I thought it might be best to try this first.

let me try to reproduce locally, I’m a bit curious :slight_smile:

Well, usually I don’t pull the image anyway, but I thought maybe it contained a fix for whatever I am encountering.
Didn’t expect to hit a major version.

I can of course simply downgrade to the latest of the old builds (or even the previous image) if you think this is upgrade related, but I don’t think so, as I only upgraded because I had issues in the first place.

mmh I see

it’s a best guess tbh, need some time to check, will come back later :slight_smile: