Kafka Connect distributed mode Dockerfile

Hey community,

We are working on deploying Kafka Connect in distributed mode hosted on AWS ECS with autoscaling via Fargate. To do this, we push to AWS ECR an image built from a Dockerfile that contains our Connect worker configuration:

FROM confluentinc/cp-kafka-connect-base:latest
EXPOSE 8083
COPY ./plugins/ /opt/kafka/plugins/
ENV CONNECT_PLUGIN_PATH=/opt/kafka/plugins,/usr/share/java
ENV CONNECT_CONFIG_STORAGE_REPLICATION_FACTOR=1
ENV CONNECT_CONFIG_STORAGE_TOPIC=connect-configs
ENV CONNECT_GROUP_ID=ibdata-mock-connect-cluster
ENV CONNECT_INTERNAL_KEY_CONVERTER=org.apache.kafka.connect.json.JsonConverter
ENV CONNECT_INTERNAL_VALUE_CONVERTER=org.apache.kafka.connect.json.JsonConverter
ENV CONNECT_KEY_CONVERTER=org.apache.kafka.connect.storage.StringConverter
ENV CONNECT_OFFSET_STORAGE_REPLICATION_FACTOR=1
ENV CONNECT_OFFSET_STORAGE_TOPIC=connect-offsets
ENV CONNECT_REST_PORT=8083
ENV CONNECT_STATUS_STORAGE_REPLICATION_FACTOR=1
ENV CONNECT_STATUS_STORAGE_TOPIC=connect-status
ENV CONNECT_VALUE_CONVERTER=io.confluent.connect.avro.AvroConverter

ENV CONNECT_VALUE_CONVERTER_SCHEMA_REGISTRY_URL=xxx
ENV CONNECT_BOOTSTRAP_SERVERS=xxx
ENV AWS_ACCESS_KEY_ID=xxx
ENV AWS_SECRET_ACCESS_KEY=xxx

ENV CONNECT_REST_ADVERTISED_HOST_NAME=test

The xxx values will be the same for all the workers in the cluster, but that's not the case for CONNECT_REST_ADVERTISED_HOST_NAME. Currently, with ECS autoscaling, every worker deployed in the Connect cluster shares the same REST host name, and that doesn't work properly (thanks Robin Moffatt).

Is there any way to set this variable dynamically for each container started from this image? Or maybe a workaround to make this work with the same advertised host name for all instances?

Thanks a lot in advance!
Brais

hey @kuro

welcome :slight_smile:

are you looking for something like this?

best,
michael

Hello @mmuehlbeyer ,

That’s exactly what we were looking for! I’ve tested the configuration for our Connect scenario and it worked: the variable was assigned properly at deploy time.

Here is the final product:

set_env.sh

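 #!/bin/bash
 set -x
 
 # Fetch this task's metadata from the ECS task metadata endpoint
 JSON=$(curl -s "${ECS_CONTAINER_METADATA_URI}/task")
 echo "$JSON"
 # Extract the first container's IPv4 address
 TASK=$(echo "$JSON" | jq -r '.Containers[0].Networks[0].IPv4Addresses[0]')
 echo "$TASK"
 
 # Start the worker, advertising the container IP; exec lets the worker receive signals directly
 CONNECT_REST_ADVERTISED_HOST_NAME=$TASK exec /etc/confluent/docker/run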
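
One thing worth guarding against when adapting this script: `jq -r` prints the literal string `null` when the path is missing from the metadata, so the worker could end up advertising `null` as its host name. Below is a minimal sketch of a check that could run before starting the worker. The helper name `is_ipv4` is hypothetical (it is not part of the Confluent image), and it only checks the dotted-quad shape, not the range of each octet.

```shell
#!/bin/sh
# Hypothetical helper, not part of the Confluent image.
# Succeeds only if $1 looks like a dotted-quad IPv4 address.
# Shape check only: it does not verify that each octet is <= 255.
is_ipv4() {
  printf '%s\n' "$1" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$'
}

# Example wiring inside set_env.sh (assumes $TASK was filled in by jq as above):
#   is_ipv4 "$TASK" || { echo "no IPv4 address in task metadata: $TASK" >&2; exit 1; }
```

Failing fast here makes the ECS task stop with a clear message instead of joining the cluster with an unusable advertised host name.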

Dockerfile

FROM confluentinc/cp-kafka-connect-base:latest
EXPOSE 8083
COPY ./plugins/ /opt/kafka/plugins/
ENV CONNECT_PLUGIN_PATH=/opt/kafka/plugins,/usr/share/java
ENV CONNECT_CONFIG_STORAGE_REPLICATION_FACTOR=1
ENV CONNECT_CONFIG_STORAGE_TOPIC=connect-configs
ENV CONNECT_GROUP_ID=ibdata-mock-connect-cluster
ENV CONNECT_INTERNAL_KEY_CONVERTER=org.apache.kafka.connect.json.JsonConverter
ENV CONNECT_INTERNAL_VALUE_CONVERTER=org.apache.kafka.connect.json.JsonConverter
ENV CONNECT_KEY_CONVERTER=org.apache.kafka.connect.storage.StringConverter
ENV CONNECT_OFFSET_STORAGE_REPLICATION_FACTOR=1
ENV CONNECT_OFFSET_STORAGE_TOPIC=connect-offsets
ENV CONNECT_REST_PORT=8083
ENV CONNECT_STATUS_STORAGE_REPLICATION_FACTOR=1
ENV CONNECT_STATUS_STORAGE_TOPIC=connect-status
ENV CONNECT_VALUE_CONVERTER=io.confluent.connect.avro.AvroConverter

ENV CONNECT_VALUE_CONVERTER_SCHEMA_REGISTRY_URL=xxx
ENV CONNECT_BOOTSTRAP_SERVERS=xxx
ENV AWS_ACCESS_KEY_ID=xxx
ENV AWS_SECRET_ACCESS_KEY=xxx

RUN wget -O /usr/local/bin/jq https://github.com/stedolan/jq/releases/download/jq-1.6/jq-linux64 && chmod +x /usr/local/bin/jq
COPY set_env.sh /etc/set_env.sh

CMD ["/etc/set_env.sh"]

I think it would be great to have the container's IP as the default value of CONNECT_REST_ADVERTISED_HOST_NAME, in the same way that the port defaults to “8083” here:
https://github.com/confluentinc/kafka-images/blob/4b8b751f0b0ea8a4473eedc1c5540a9e8fb9021c/kafka-connect-base/include/etc/confluent/docker/configure

Maybe with an approach similar to what KAFKA_JMX_HOSTNAME uses in the launch file?

export KAFKA_JMX_HOSTNAME=${KAFKA_JMX_HOSTNAME:-$(hostname -i | cut -d" " -f1)}
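
Applied to the Connect variable, the same defaulting pattern might look like the sketch below. This is only an illustration of the idea, not code from the Confluent images; the function name `set_default_advertised_host` is made up for the example. The `${VAR:-default}` expansion keeps any value the operator already set and only falls back to the first address from `hostname -i` when the variable is unset or empty.

```shell
#!/bin/sh
# Hypothetical helper, not taken from the Confluent images.
# Keeps an operator-supplied CONNECT_REST_ADVERTISED_HOST_NAME; otherwise
# falls back to the first address reported by `hostname -i`.
set_default_advertised_host() {
  export CONNECT_REST_ADVERTISED_HOST_NAME="${CONNECT_REST_ADVERTISED_HOST_NAME:-$(hostname -i | cut -d" " -f1)}"
}
```

Calling `set_default_advertised_host` before `/etc/confluent/docker/run` would leave an explicit `ENV CONNECT_REST_ADVERTISED_HOST_NAME=...` untouched and fill in the container address otherwise.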

Do you think that would be a useful addition to the configuration?

Thanks again for your help, I owe you a beer.
Cheers,
Brais

hi @kuro

ok cool :slight_smile:

mmh might be useful, but if the ip you get is a private one you might get some issues :wink:

best,
michael
