Kafka server fails to start due to the following error:
Readiness probe failed: dial tcp 10.244.2.23:9071: connect: connection refused
Kafka server deployed on AKS using this example. The condition of the k8s cluster is normal. Confluent operator is up and running. Zookeeper starts up fine.
Steps taken so far:
- Re-deployed the confluent cluster
- Increased timeout & initial delay of the probes
- Tried using different image version
First error in the Kafka pod log:
[ERROR] 2022-02-16 15:49:25,245 [controller-event-thread] state.change.logger error - [Controller id=0 epoch=5130] Controller 0 epoch 5130 failed to change state for partition _confluent-controlcenter-7-0-1-0-group-stream-extension-rekey-1 from OfflinePartition to OnlinePartition
Anyone familiar with this issue? Any help or tips will be highly appreciated.