Hi! I can't download the HDFS Sink connector!
I'm trying to install a Kafka connector in Docker, and I need to download the JAR of the kafka-connect-hdfs connector before running it. When I go to the HDFS 3 Sink Connector | Confluent Hub page, the download button is not clickable. Could someone help me, please?
Here is the configuration of the service:
kafka-connect:
  image: confluentinc/cp-kafka-connect:latest
  container_name: kafka-connect
  volumes:
    - ./connector:/usr/share/java
  environment:
    CONNECT_BOOTSTRAP_SERVERS: 'kafka:9092'
    CONNECT_GROUP_ID: 'kafka-connect-group'
    CONNECT_CONFIG_STORAGE_TOPIC: 'kafka-connect-config'
    CONNECT_OFFSET_STORAGE_TOPIC: 'kafka-connect-offset'
    CONNECT_STATUS_STORAGE_TOPIC: 'kafka-connect-status'
    CONNECT_VALUE_CONVERTER: io.confluent.connect.avro.AvroConverter
    CONNECT_REST_ADVERTISED_HOST_NAME: 'kafka-connect'
    CONNECT_KEY_CONVERTER: io.confluent.connect.avro.AvroConverter
    CONNECT_PLUGIN_PATH: '/usr/share/java'
  ports:
    - 8083:8083
hey @Damilola
I would recommend installing it from the command line inside the container rather than downloading the JAR manually.
Add the following to your compose file:
command:
  - bash
  - -c
  - |
    echo "Installing the kafka-connect-hdfs3 connector"
    confluent-hub install --no-prompt confluentinc/kafka-connect-hdfs3:1.1.25
    #
    echo "Launching Kafka Connect worker"
    /etc/confluent/docker/run &
    #
    echo "Waiting for Kafka Connect to start listening on 0.0.0.0:8083 ⏳"
    while : ; do
      curl_status=$$(curl -s -o /dev/null -w %{http_code} http://0.0.0.0:8083/connectors)
      echo -e $$(date) " Kafka Connect listener HTTP state: " $$curl_status " (waiting for 200)"
      if [ $$curl_status -eq 200 ] ; then
        break
      fi
      sleep 5
    done
    sleep infinity
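Once the worker reports 200, you can also verify from the host that the plugin was actually picked up (the grep pattern assumes the hdfs3 connector installed above):

curl -s http://localhost:8083/connector-plugins | grep -i hdfs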
best,
michael
It works. Thanks!
But I have another question.
This is a snippet from my docker-compose. How do I connect kafka-connect to HDFS (the namenode)? Or is it when copying the data that I specify the URL of HDFS, here (localhost://50070)?
namenode:
  image: bde2020/hadoop-namenode:2.0.0-hadoop2.7.4-java8
  container_name: namenode
  volumes:
    - namenode:/hadoop/dfs/name
  environment:
    - CLUSTER_NAME=test
  env_file:
    - ./hadoop-hive.env
  ports:
    - "50070:50070"
  networks:
    - elk

datanode:
  image: bde2020/hadoop-datanode:2.0.0-hadoop2.7.4-java8
  container_name: datanode
  volumes:
    - datanode:/hadoop/dfs/data
  env_file:
    - ./hadoop-hive.env
  environment:
    SERVICE_PRECONDITION: "namenode:50070"
  ports:
    - "50075:50075"
  networks:
    - elk

kafka-connect:
  image: confluentinc/cp-kafka-connect:latest
  container_name: kafka-connect
  hostname: connect
  depends_on:
    - schema_registry
    - kafka
    - zookeeper
  environment:
    CONNECT_BOOTSTRAP_SERVERS: 'kafka:9092'
    CONNECT_REST_ADVERTISED_HOST_NAME: connect
    CONNECT_REST_PORT: 8083
    CONNECT_GROUP_ID: compose-connect-group
    CONNECT_CONFIG_STORAGE_TOPIC: docker-connect-configs
    CONNECT_CONFIG_STORAGE_REPLICATION_FACTOR: 1
    CONNECT_OFFSET_FLUSH_INTERVAL_MS: 10000
    CONNECT_OFFSET_STORAGE_TOPIC: docker-connect-offsets
    CONNECT_OFFSET_STORAGE_REPLICATION_FACTOR: 1
    CONNECT_STATUS_STORAGE_TOPIC: docker-connect-status
    CONNECT_STATUS_STORAGE_REPLICATION_FACTOR: 1
    CONNECT_KEY_CONVERTER: io.confluent.connect.avro.AvroConverter
    CONNECT_KEY_CONVERTER_SCHEMA_REGISTRY_URL: 'http://schema_registry:8082'
    CONNECT_VALUE_CONVERTER: io.confluent.connect.avro.AvroConverter
    CONNECT_VALUE_CONVERTER_SCHEMA_REGISTRY_URL: 'http://schema_registry:8082'
    CONNECT_INTERNAL_KEY_CONVERTER: org.apache.kafka.connect.json.JsonConverter
    CONNECT_INTERNAL_VALUE_CONVERTER: org.apache.kafka.connect.json.JsonConverter
    CONNECT_ZOOKEEPER_CONNECT: 'zookeeper:2181'
    CONNECT_PLUGIN_PATH: /usr/share/java/kafka-connect-*
    CONNECT_LOG4J_LOGGERS: org.apache.zookeeper=ERROR,org.I0Itec.zkclient=ERROR,org.reflections=ERROR
  ports:
    - 8083:8083
  command:
    - bash
    - -c
    - |
      confluent-hub install --no-prompt confluentinc/kafka-connect-hdfs:10.2.1
      /etc/confluent/docker/run
  networks:
    - elk
It's done via the connector configuration that you submit to the Kafka Connect worker (the hdfs.url property), not in docker-compose itself.
See HDFS 3 Sink Connector Configuration Properties | Confluent Documentation for reference.
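A minimal sketch of such a config, POSTed to the worker's REST API. The connector name, topic, and flush size are placeholders; hdfs.url assumes the namenode's default RPC port 8020 (50070 is only the web UI) and that kafka-connect is on the same elk network so the namenode hostname resolves:

curl -X POST http://localhost:8083/connectors \
  -H "Content-Type: application/json" \
  -d '{
        "name": "hdfs-sink",
        "config": {
          "connector.class": "io.confluent.connect.hdfs.HdfsSinkConnector",
          "tasks.max": "1",
          "topics": "test_hdfs",
          "hdfs.url": "hdfs://namenode:8020",
          "flush.size": "3"
        }
      }'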
best,
michael