Kafkacat connect failure to ConfCloud (SSL?)

Hi all,

I have a newly set up cluster in ConfCloud and am testing that I can run things against it from my local machine, but I’m running into issues. It looks like SSL negotiation may be failing. Googling did not turn up much beyond the usual broker connection issues, which is exactly the point I’m trying to confirm (I’m assuming the cloud setup already took care of this).

kafkacat works against a local install of the Conf Platform but can’t seem to connect to ConfCloud. It was built from GitHub (latest), with SSL support:


kafkacat - Apache Kafka producer and consumer tool

https://github.com/edenhill/kafkacat

Copyright (c) 2014-2021, Magnus Edenhill

Version 1.6.0-28-gb10431 (JSON, Avro, Transactions, IncrementalAssign, librdkafka 1.7.0 builtin.features=gzip,snappy,ssl,sasl,regex,lz4,sasl_gssapi,sasl_plain,sasl_scram,plugins,zstd,sasl_oauthbearer)

./kafkacat -b our_broker:9092 -L (plus the other required options) gets me:


% Fatal error at metadata_list:1196:

% ERROR: Failed to acquire metadata: Local: Broker transport failure (Are the brokers reachable? Also try increasing the metadata timeout with -m <timeout>?)

The (edited) command in full is:

./kafkacat -b XXXXXXXXX.confluent.cloud:9092 -X security.protocol=sasl_ssl -X sasl.mechanisms=PLAIN -X sasl.username=<access_key> -X sasl.password=<secret_key> -L

Adding debug (-d broker) shows the following


%7|1629212766.560|BRKMAIN|rdkafka#producer-1| [thrd::0/internal]: :0/internal: Enter main broker thread

%7|1629212766.560|BROKER|rdkafka#producer-1| [thrd:app]: sasl_ssl://pkc-e8mp5.eu-west-1.aws.confluent.cloud:9092/bootstrap: Added new broker with NodeId -1

%7|1629212766.560|CONNECT|rdkafka#producer-1| [thrd:app]: sasl_ssl://pkc-e8mp5.eu-west-1.aws.confluent.cloud:9092/bootstrap: Selected for cluster connection: bootstrap servers added (broker has 0 connection attempt(s))

%7|1629212766.560|INIT|rdkafka#producer-1| [thrd:app]: librdkafka v1.7.0 (0x10700ff) rdkafka#producer-1 initialized (builtin.features gzip,snappy,ssl,sasl,regex,lz4,sasl_gssapi,sasl_plain,sasl_scram,plugins,zstd,sasl_oauthbearer, STATIC_LINKING GCC GXX PKGCONFIG INSTALL GNULD LDS C11THREADS LIBDL PLUGINS STATIC_LIB_zlib ZLIB SSL SASL_CYRUS STATIC_LIB_libzstd ZSTD HDRHISTOGRAM SYSLOG SNAPPY SOCKEM SASL_SCRAM SASL_OAUTHBEARER CRC32C_HW, debug 0x2)

%7|1629212766.560|BRKMAIN|rdkafka#producer-1| [thrd:sasl_ssl://pkc-e8mp5.eu-west-1.aws.confluent.cloud:9092/bootstr]: sasl_ssl://pkc-e8mp5.eu-west-1.aws.confluent.cloud:9092/bootstrap: Enter main broker thread

%7|1629212766.560|CONNECT|rdkafka#producer-1| [thrd:app]: Not selecting any broker for cluster connection: still suppressed for 49ms: application metadata request

%7|1629212766.561|CONNECT|rdkafka#producer-1| [thrd:sasl_ssl://pkc-e8mp5.eu-west-1.aws.confluent.cloud:9092/bootstr]: sasl_ssl://pkc-e8mp5.eu-west-1.aws.confluent.cloud:9092/bootstrap: Received CONNECT op

%7|1629212766.561|STATE|rdkafka#producer-1| [thrd:sasl_ssl://pkc-e8mp5.eu-west-1.aws.confluent.cloud:9092/bootstr]: sasl_ssl://pkc-e8mp5.eu-west-1.aws.confluent.cloud:9092/bootstrap: Broker changed state INIT -> TRY_CONNECT

%7|1629212766.561|CONNECT|rdkafka#producer-1| [thrd:app]: Not selecting any broker for cluster connection: still suppressed for 49ms: application metadata request

%7|1629212766.561|CONNECT|rdkafka#producer-1| [thrd:sasl_ssl://pkc-e8mp5.eu-west-1.aws.confluent.cloud:9092/bootstr]: sasl_ssl://pkc-e8mp5.eu-west-1.aws.confluent.cloud:9092/bootstrap: broker in state TRY_CONNECT connecting

%7|1629212766.561|STATE|rdkafka#producer-1| [thrd:sasl_ssl://pkc-e8mp5.eu-west-1.aws.confluent.cloud:9092/bootstr]: sasl_ssl://pkc-e8mp5.eu-west-1.aws.confluent.cloud:9092/bootstrap: Broker changed state TRY_CONNECT -> CONNECT

%7|1629212766.561|CONNECT|rdkafka#producer-1| [thrd:app]: Not selecting any broker for cluster connection: still suppressed for 49ms: application metadata request

%7|1629212766.581|CONNECT|rdkafka#producer-1| [thrd:sasl_ssl://pkc-e8mp5.eu-west-1.aws.confluent.cloud:9092/bootstr]: sasl_ssl://pkc-e8mp5.eu-west-1.aws.confluent.cloud:9092/bootstrap: Connecting to ipv4#52.17.58.241:9092 (sasl_ssl) with socket 7

%7|1629212766.688|CONNECT|rdkafka#producer-1| [thrd:sasl_ssl://pkc-e8mp5.eu-west-1.aws.confluent.cloud:9092/bootstr]: sasl_ssl://pkc-e8mp5.eu-west-1.aws.confluent.cloud:9092/bootstrap: Connected to ipv4#52.17.58.241:9092

%7|1629212766.688|STATE|rdkafka#producer-1| [thrd:sasl_ssl://pkc-e8mp5.eu-west-1.aws.confluent.cloud:9092/bootstr]: sasl_ssl://pkc-e8mp5.eu-west-1.aws.confluent.cloud:9092/bootstrap: Broker changed state CONNECT -> SSL_HANDSHAKE

%7|1629212766.688|CONNECT|rdkafka#producer-1| [thrd:app]: Cluster connection already in progress: application metadata request

%7|1629212766.908|CONNECTED|rdkafka#producer-1| [thrd:sasl_ssl://pkc-e8mp5.eu-west-1.aws.confluent.cloud:9092/bootstr]: sasl_ssl://pkc-e8mp5.eu-west-1.aws.confluent.cloud:9092/bootstrap: Connected (#1)

%7|1629212766.908|FEATURE|rdkafka#producer-1| [thrd:sasl_ssl://pkc-e8mp5.eu-west-1.aws.confluent.cloud:9092/bootstr]: sasl_ssl://pkc-e8mp5.eu-west-1.aws.confluent.cloud:9092/bootstrap: Updated enabled protocol features +ApiVersion to ApiVersion

%7|1629212766.908|STATE|rdkafka#producer-1| [thrd:sasl_ssl://pkc-e8mp5.eu-west-1.aws.confluent.cloud:9092/bootstr]: sasl_ssl://pkc-e8mp5.eu-west-1.aws.confluent.cloud:9092/bootstrap: Broker changed state SSL_HANDSHAKE -> APIVERSION_QUERY

%7|1629212766.908|CONNECT|rdkafka#producer-1| [thrd:app]: Cluster connection already in progress: application metadata request

%7|1629212767.017|FEATURE|rdkafka#producer-1| [thrd:sasl_ssl://pkc-e8mp5.eu-west-1.aws.confluent.cloud:9092/bootstr]: sasl_ssl://pkc-e8mp5.eu-west-1.aws.confluent.cloud:9092/bootstrap: Updated enabled protocol features to MsgVer1,ApiVersion,BrokerBalancedConsumer,ThrottleTime,Sasl,SaslHandshake,BrokerGroupCoordinator,LZ4,OffsetTime,MsgVer2,IdempotentProducer,ZSTD,SaslAuthReq

%7|1629212767.017|AUTH|rdkafka#producer-1| [thrd:sasl_ssl://pkc-e8mp5.eu-west-1.aws.confluent.cloud:9092/bootstr]: sasl_ssl://pkc-e8mp5.eu-west-1.aws.confluent.cloud:9092/bootstrap: Auth in state APIVERSION_QUERY (handshake supported)

%7|1629212767.017|STATE|rdkafka#producer-1| [thrd:sasl_ssl://pkc-e8mp5.eu-west-1.aws.confluent.cloud:9092/bootstr]: sasl_ssl://pkc-e8mp5.eu-west-1.aws.confluent.cloud:9092/bootstrap: Broker changed state APIVERSION_QUERY -> AUTH_HANDSHAKE

%7|1629212767.017|CONNECT|rdkafka#producer-1| [thrd:app]: Cluster connection already in progress: application metadata request

%7|1629212767.125|SASLMECHS|rdkafka#producer-1| [thrd:sasl_ssl://pkc-e8mp5.eu-west-1.aws.confluent.cloud:9092/bootstr]: sasl_ssl://pkc-e8mp5.eu-west-1.aws.confluent.cloud:9092/bootstrap: Broker supported SASL mechanisms: PLAIN,OAUTHBEARER

%7|1629212767.125|AUTH|rdkafka#producer-1| [thrd:sasl_ssl://pkc-e8mp5.eu-west-1.aws.confluent.cloud:9092/bootstr]: sasl_ssl://pkc-e8mp5.eu-west-1.aws.confluent.cloud:9092/bootstrap: Auth in state AUTH_HANDSHAKE (handshake supported)

%7|1629212767.125|STATE|rdkafka#producer-1| [thrd:sasl_ssl://pkc-e8mp5.eu-west-1.aws.confluent.cloud:9092/bootstr]: sasl_ssl://pkc-e8mp5.eu-west-1.aws.confluent.cloud:9092/bootstrap: Broker changed state AUTH_HANDSHAKE -> AUTH_REQ

%7|1629212767.125|CONNECT|rdkafka#producer-1| [thrd:app]: Cluster connection already in progress: application metadata request

%7|1629212767.560|CONNECT|rdkafka#producer-1| [thrd:main]: Cluster connection already in progress: no cluster connection

%7|1629212768.560|CONNECT|rdkafka#producer-1| [thrd:main]: Cluster connection already in progress: no cluster connection

%7|1629212769.560|CONNECT|rdkafka#producer-1| [thrd:main]: Cluster connection already in progress: no cluster connection

%7|1629212770.560|CONNECT|rdkafka#producer-1| [thrd:main]: Cluster connection already in progress: no cluster connection

%7|1629212771.560|CONNECT|rdkafka#producer-1| [thrd:main]: Cluster connection already in progress: no cluster connection

%7|1629212771.561|CONNECT|rdkafka#producer-1| [thrd:app]: Not selecting any broker for cluster connection: still suppressed for 49ms: application metadata request

% ERROR: Failed to acquire metadata: Local: Broker transport failure (Are the brokers reachable? Also try increasing the metadata timeout with -m <timeout>?)

So it seems to be either:

  • broker access, connectivity ( I assume this is correct with the Cloud setup)

  • SSL negotiation/params, etc… with client request,

  • something else?
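One way to narrow these options down is to list the broker state transitions in the debug output above. The sketch below assumes the output has been saved or piped in as text; `list_transitions` is a hypothetical helper name, and the sample lines are abridged copies of the log lines shown earlier. If the last state reached is past SSL_HANDSHAKE, the TCP connection and the TLS handshake most likely completed, and the stall is in a later step.

```shell
# Hypothetical helper: print each broker state transition found in
# librdkafka debug output (e.g. from `kafkacat -d broker`) read on stdin.
list_transitions() {
  grep -o 'Broker changed state [A-Z_]* -> [A-Z_]*' |
    sed 's/^Broker changed state //'
}

# Sample lines abridged from the debug output above (thread names shortened to "...").
sample='%7|1629212766.688|STATE|...: Broker changed state CONNECT -> SSL_HANDSHAKE
%7|1629212766.908|STATE|...: Broker changed state SSL_HANDSHAKE -> APIVERSION_QUERY
%7|1629212767.017|STATE|...: Broker changed state APIVERSION_QUERY -> AUTH_HANDSHAKE
%7|1629212767.125|STATE|...: Broker changed state AUTH_HANDSHAKE -> AUTH_REQ'

transitions=$(printf '%s\n' "$sample" | list_transitions)
printf '%s\n' "$transitions"

# The right-hand side of the final transition is the last state reached.
last_state=$(printf '%s\n' "$transitions" | tail -n 1 | sed 's/.* -> //')
echo "last state reached: $last_state"
```

In the log above the client moves from SSL_HANDSHAKE through APIVERSION_QUERY and AUTH_HANDSHAKE to AUTH_REQ and then logs nothing further for that broker, which points at the SASL authentication step rather than at basic reachability or SSL negotiation.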

Any advice/pointers/things-I’ve-missed would be very welcome.

Thanks for reading

Mike Frohme

This appears to be one of the usual configuration/reachability issues.

Access with kafkacat works fine from an EC2 instance in the same AWS region.

CCloud has a 5s delay on SASL handshake failure (bad credentials), and kafkacat has a 5s metadata timeout. Since establishing the connection takes some extra time, the metadata timeout will hit before the client receives the SASL failure from the broker.

The metadata timeout can be changed with -m 10
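As a rough check of that timing argument, the arithmetic can be worked through with the timestamps from the debug output in the original post. This is only a sketch: the two timestamps are copied from that log, the 5s figures are the ones quoted above, and the variable names are made up for illustration.

```shell
start=1629212766.560      # first debug line: client starts up
auth_req=1629212767.125   # "Broker changed state AUTH_HANDSHAKE -> AUTH_REQ"
sasl_fail_delay=5         # seconds CCloud waits before reporting a SASL failure (as stated above)
metadata_timeout=5        # kafkacat metadata timeout in seconds (as stated above)

verdict=$(awk -v s="$start" -v a="$auth_req" \
              -v d="$sasl_fail_delay" -v m="$metadata_timeout" 'BEGIN {
  setup = a - s            # time spent on connect, TLS, ApiVersion and SASL handshake
  earliest = setup + d     # earliest moment a SASL failure could reach the client
  msg = (earliest > m) ? "timeout fires first" : "SASL error arrives first"
  printf "setup=%.3fs earliest_failure=%.3fs timeout=%ds -> %s\n", setup, earliest, m, msg
}')
echo "$verdict"
```

Setting `metadata_timeout=10` (the equivalent of -m 10) flips the result, which is why a longer timeout should let the underlying SASL error surface if bad credentials are the cause.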

Thanks, @rmoff … I think I covered that in my testing, but did not mention it in my post. I believe I originally tested this with -m 30 (or more) without success; at the time it did not appear to be solely a metadata timeout issue.

I may have the chance (or need) to revisit it, but as kcat works fine with our cloud environment from an EC2 instance in the same region, it’s not as high a priority for me at the moment.