CFK QuickStart Failed

Hi Team,
We at Daimler Trucks are evaluating Confluent for Kubernetes, and I followed the quickstart at
https://docs.confluent.io/operator/current/co-quickstart.html
After Step 3, I checked the state of the cluster and found that Kafka and Connect had failed. Could you please help me resolve this? It is critical for showcasing the demo to our teams.

To fit our internal K8s platform, I made two changes:
Changed the image registry to our internal private registry (Harbor proxy cache)
Used an internal StorageClass based on Cinder

$ k get pods
NAME                                  READY   STATUS             RESTARTS   AGE
confluent-operator-58cccbdf5c-jm5p7   1/1     Running            0          74m
connect-0                             0/1     CrashLoopBackOff   9          44m
kafka-0                               0/1     CrashLoopBackOff   12         43m
kafka-1                               0/1     CrashLoopBackOff   12         43m
kafka-2                               0/1     CrashLoopBackOff   12         43m
zookeeper-0                           1/1     Running            0          44m
zookeeper-1                           1/1     Running            0          44m
zookeeper-2                           1/1     Running            0          44m
$ k describe pod/connect-0
Name:         connect-0
Namespace:    confluent
Priority:     0
Node:         c53p077-md-69d7bb48cb-fk585/192.168.0.191
Start Time:   Wed, 23 Feb 2022 20:10:55 +0530
Labels:       app=connect
              clusterId=confluent
              confluent-platform=true
              controller-revision-hash=connect-7c47ddd796
              platform.confluent.io/type=connect
              statefulset.kubernetes.io/pod-name=connect-0
              type=connect
Annotations:  cni.projectcalico.org/containerID: d15ccff047cc1b632efe53070aca7ab6b1bebd51e9e7f92c8f4a6729b2ef9963
              cni.projectcalico.org/podIP: 192.168.131.192/32
              cni.projectcalico.org/podIPs: 192.168.131.192/32
              kubernetes.io/psp: dhc-secure
              prometheus.io/port: 7778
              prometheus.io/scrape: true
Status:       Running
IP:           192.168.131.192
IPs:
  IP:           192.168.131.192
Controlled By:  StatefulSet/connect
Init Containers:
  config-init-container:
    Container ID:  containerd://cc25781549cbcc472346425290bb30deccb40fce7174031fda4fd1ef4e033820
    Image:         registry.app.corpintra.net/dockerhub/confluentinc/confluent-init-container:2.2.0-1
    Image ID:      registry.app.corpintra.net/dockerhub/confluentinc/confluent-init-container@sha256:b4a0d40e57aa35e3f1c2b534ff558a6452fe6cb0d845b2c470ab3c37d6001d67
    Port:          <none>
    Host Port:     <none>
    Command:
      /bin/sh
      -xc
    Args:
      until [ -f /mnt/config/init/template.jsonnet ]; do echo "file not found"; sleep 10s; done; /opt/startup.sh
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Wed, 23 Feb 2022 20:11:19 +0530
      Finished:     Wed, 23 Feb 2022 20:11:20 +0530
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     500m
      memory:  1Gi
    Requests:
      cpu:     100m
      memory:  512Mi
    Environment:
      CAAS_POD_ID:       connect-0 (v1:metadata.name)
      HOST_IP:            (v1:status.hostIP)
      POD_IP:             (v1:status.podIP)
      POD_NAME:          connect-0 (v1:metadata.name)
      POD_NAMESPACE:     confluent (v1:metadata.namespace)
      OPERATOR_CP_TYPE:  connect
    Mounts:
      /mnt/config from pod-shared-workdir (rw)
      /mnt/config/init from init-config-volume (rw)
      /mnt/config/shared from shared-config-volume (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-vlljf (ro)
Containers:
  connect:
    Container ID:  containerd://4bdf387cb8ecfac32565c0122d8d5003a228e9806875a7d1364934aea77c9bb9
    Image:         registry.app.corpintra.net/dockerhub/confluentinc/cp-server-connect:7.0.1
    Image ID:      registry.app.corpintra.net/dockerhub/confluentinc/cp-server-connect@sha256:f682cde0761267df3f556c1e23e445729bdad18585c26ed1139883d079d084a4
    Ports:         8083/TCP, 7203/TCP, 7777/TCP, 7778/TCP
    Host Ports:    0/TCP, 0/TCP, 0/TCP, 0/TCP
    Command:
      /bin/sh
      -xc
    Args:
      /mnt/config/connect/bin/run
    State:          Running
      Started:      Wed, 23 Feb 2022 20:57:59 +0530
    Last State:     Terminated
      Reason:       Error
      Exit Code:    2
      Started:      Wed, 23 Feb 2022 20:50:54 +0530
      Finished:     Wed, 23 Feb 2022 20:52:58 +0530
    Ready:          False
    Restart Count:  10
    Liveness:       http-get http://:8083/v1/metadata/id delay=120s timeout=30s period=10s #success=1 #failure=10
    Readiness:      http-get http://:8083/v1/metadata/id delay=60s timeout=10s period=10s #success=1 #failure=3
    Environment:
      CAAS_POD_ID:    connect-0 (v1:metadata.name)
      HOST_IP:         (v1:status.hostIP)
      POD_IP:          (v1:status.podIP)
      POD_NAME:       connect-0 (v1:metadata.name)
      POD_NAMESPACE:  confluent (v1:metadata.namespace)
    Mounts:
      /mnt/config from pod-shared-workdir (rw)
      /mnt/config/init from init-config-volume (rw)
      /mnt/config/shared from shared-config-volume (rw)
      /opt/confluentinc from cp-operator-scripts (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-vlljf (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  cp-operator-scripts:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     Memory
    SizeLimit:  <unset>
  init-config-volume:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      connect-init-config
    Optional:  false
  pod-shared-workdir:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     Memory
    SizeLimit:  <unset>
  shared-config-volume:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      connect-shared-config
    Optional:  false
  kube-api-access-vlljf:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason       Age                    From               Message
  ----     ------       ----                   ----               -------
  Normal   Scheduled    48m                    default-scheduler  Successfully assigned confluent/connect-0 to c53p077-md-69d7bb48cb-fk585
  Warning  FailedMount  48m                    kubelet            MountVolume.SetUp failed for volume "init-config-volume" : failed to sync configmap cache: timed out waiting for the condition
  Warning  FailedMount  48m                    kubelet            MountVolume.SetUp failed for volume "shared-config-volume" : failed to sync configmap cache: timed out waiting for the condition
  Warning  FailedMount  48m                    kubelet            MountVolume.SetUp failed for volume "kube-api-access-vlljf" : failed to sync configmap cache: timed out waiting for the condition
  Normal   Pulling      48m                    kubelet            Pulling image "registry.app.corpintra.net/dockerhub/confluentinc/confluent-init-container:2.2.0-1"
  Normal   Pulled       48m                    kubelet            Successfully pulled image "registry.app.corpintra.net/dockerhub/confluentinc/confluent-init-container:2.2.0-1" in 21.225264228s
  Normal   Created      48m                    kubelet            Created container config-init-container
  Normal   Started      48m                    kubelet            Started container config-init-container
  Warning  Unhealthy    46m                    kubelet            Liveness probe failed: Get "http://192.168.131.192:8083/v1/metadata/id": dial tcp 192.168.131.192:8083: connect: connection refused
  Normal   Pulled       46m (x2 over 48m)      kubelet            Container image "registry.app.corpintra.net/dockerhub/confluentinc/cp-server-connect:7.0.1" already present on machine
  Normal   Created      46m (x2 over 48m)      kubelet            Created container connect
  Normal   Started      46m (x2 over 48m)      kubelet            Started container connect
  Warning  Unhealthy    7m52s (x61 over 47m)   kubelet            Readiness probe failed: Get "http://192.168.131.192:8083/v1/metadata/id": dial tcp 192.168.131.192:8083: connect: connection refused
  Warning  BackOff      3m47s (x118 over 44m)  kubelet            Back-off restarting failed container

Hi @atummid

Welcome :)

Did you already check the pods' log files?

best,
michael
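
For anyone landing here with the same symptoms, a minimal sketch of what checking the log files could look like. The function names `crashing_pods` and `previous_logs` are made up for illustration, and the `confluent` namespace is taken from the describe output above. `kubectl logs --previous` shows the log of the last terminated container, which is usually where a CrashLoopBackOff pod reports why it exited.

```shell
# List pods whose STATUS column is CrashLoopBackOff.
# Reads `kubectl get pods` output on stdin; column 1 is NAME, column 3 is STATUS.
crashing_pods() {
  awk 'NR > 1 && $3 == "CrashLoopBackOff" { print $1 }'
}

# Print the tail of the previous (crashed) container's log for each such pod.
# The namespace defaults to "confluent" (an assumption based on the output above).
previous_logs() {
  ns="${1:-confluent}"
  kubectl get pods -n "$ns" | crashing_pods | while read -r pod; do
    echo "=== $pod ==="
    kubectl logs -n "$ns" "$pod" --previous --tail=50
  done
}
```

Calling `previous_logs confluent` would then print the last 50 log lines of each crashing pod's previous container.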

The issue is resolved, @mmuehlbeyer. It had to do with disk space; once I increased it, everything returned to normal.
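
The thread does not say which disk was full (a persistent volume or the node itself), so the following is only a rough sketch of one way to spot a nearly full filesystem. The helper name `nearly_full` and the 90% default threshold are assumptions for illustration. It expects POSIX-format `df -P` output on stdin.

```shell
# Print mount points whose usage is at or above a threshold (default 90%).
# Reads `df -P` output on stdin: column 5 is Capacity (e.g. "95%"),
# column 6 is the mount point. Mount points containing spaces are not handled.
nearly_full() {
  awk -v limit="${1:-90}" 'NR > 1 {
    use = $5
    sub(/%/, "", use)                 # strip the percent sign
    if (use + 0 >= limit + 0) print $6, $5
  }'
}
```

Assuming `df` is available in the container image and the container stays up long enough between restarts, something like `kubectl exec -n confluent kafka-0 -- df -P | nearly_full 90` could be used, alongside `kubectl get pvc -n confluent` to see the capacity of each claim.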
