Diagnose failed authentication

Hello all

I’m getting an error on my Kafka cluster in server.log.
The error is the following:

[2021-12-22 14:23:38,084] INFO [SocketServer brokerId=1] Failed authentication with /<node_ip> (SSL handshake failed) (org.apache.kafka.common.network.Selector)

This error appears on every node, every few seconds, and the IP in each message is always a Kafka broker IP.
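
One way to confirm that the failing peers really are only the brokers is to tally the log lines by peer address. Below is a minimal Python sketch, assuming the line format matches the single log line quoted above (other Kafka versions may format it differently). The function name count_failed_auth is made up for illustration.

```python
import re
from collections import Counter

# Pattern derived from the one log line quoted in this thread; treat it as an assumption.
FAILED_AUTH = re.compile(
    r"^\[(?P<ts>[^\]]+)\] INFO \[SocketServer brokerId=(?P<broker>\d+)\] "
    r"Failed authentication with (?P<peer>\S+) \((?P<reason>[^)]*)\)"
)


def count_failed_auth(lines):
    """Count failed-authentication log lines, keyed by (peer address, reason)."""
    counts = Counter()
    for line in lines:
        match = FAILED_AUTH.match(line)
        if match is None:
            continue
        # Kafka prints the peer as "hostname/ip" or "/ip"; keep only the part after the slash.
        peer = match.group("peer").rsplit("/", 1)[-1]
        counts[(peer, match.group("reason"))] += 1
    return counts
```

Feeding it the lines of server.log (for example `count_failed_auth(open(path))`) and comparing the keys with the list of broker IPs shows quickly whether anything other than the brokers is failing the handshake.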

I was wondering how you debug this kind of issue.
I’ve been running tcpdump on the Kafka service port and I think (but I’m not certain) the problem is caused by a misconfiguration of Kafka itself (it’s not related to an external client).
I use SSL only for transport encryption, with no authorization at all.

I tried setting the log level to TRACE, but either I did it wrong or it didn’t show me anything new.

So what’s your secret for handling this case? How would you go about it?

I’m not so much looking for an answer to this specific problem as trying to learn how to debug this kind of situation.
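
A generic first step for this kind of question is to attempt the handshake yourself from one broker host to another and read the client-side error, which is usually more specific than "SSL handshake failed". Below is a minimal Python sketch. It assumes you have the CA chain as a PEM file (the truststores in this thread are JKS, so that would need an export first), and the name probe_tls is made up for illustration.

```python
import socket
import ssl


def probe_tls(host, port, cafile=None, timeout=5.0):
    """Attempt one TLS handshake against host:port and return (ok, detail)."""
    # With cafile=None the system default CAs are used; pass the PEM CA chain otherwise.
    context = ssl.create_default_context(cafile=cafile)
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            # server_hostname turns on hostname verification against the certificate.
            with context.wrap_socket(raw, server_hostname=host) as tls:
                return True, f"{tls.version()} {tls.cipher()[0]}"
    except ssl.SSLCertVerificationError as exc:
        # Covers untrusted issuer, expired certificate, hostname mismatch, and similar.
        return False, f"certificate verification failed: {exc.verify_message}"
    except ssl.SSLError as exc:
        return False, f"TLS error: {exc}"
    except OSError as exc:
        return False, f"connection error: {exc}"
```

Calling it with the advertised host name and the inter-broker port (9091 in the config posted later in this thread) checks the same things a peer broker would check: trust chain, validity dates and host name. On the broker side, the JVM's javax.net.debug system property (for example -Djavax.net.debug=ssl,handshake, which I'd assume can be passed through KAFKA_OPTS) generally prints much more handshake detail than Kafka's own TRACE logging.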

Hi @mmacphail

Are you using a self-signed cert?
And what does your config look like?

Best,
Michael

Hi @mmuehlbeyer,

Certs were delivered by my organisation. I tested them individually and they work fine.

My config is the following:

# Maintained by Ansible
advertised.listeners=INTERNAL://<url>:9092,BROKER://<url>:9091
broker.id=1
confluent.balancer.topic.replication.factor=3
confluent.license.topic=_confluent-license
confluent.license.topic.replication.factor=3
confluent.metadata.topic.replication.factor=3
confluent.schema.registry.url=https://<url>:8081,https://<url>:8081
confluent.security.event.logger.exporter.kafka.topic.replicas=3
confluent.ssl.key.password=<pass>
confluent.ssl.keystore.location=/var/ssl/private/kafka_broker.keystore.jks
confluent.ssl.keystore.password=<pass>
confluent.ssl.truststore.location=/var/ssl/private/kafka_broker.truststore.jks
confluent.ssl.truststore.password=<pass>
confluent.support.customer.id=anonymous
confluent.support.metrics.enable=true
default.replication.factor=3
group.initial.rebalance.delay.ms=3000
inter.broker.listener.name=BROKER
kafka.rest.enable=false
listener.name.broker.ssl.key.password=<pass>
listener.name.broker.ssl.keystore.location=/var/ssl/private/kafka_broker.keystore.jks
listener.name.broker.ssl.keystore.password=<pass>
listener.name.broker.ssl.truststore.location=/var/ssl/private/kafka_broker.truststore.jks
listener.name.broker.ssl.truststore.password=<pass>
listener.name.internal.ssl.key.password=<pass>
listener.name.internal.ssl.keystore.location=/var/ssl/private/kafka_broker.keystore.jks
listener.name.internal.ssl.keystore.password=<pass>
listener.name.internal.ssl.truststore.location=/var/ssl/private/kafka_broker.truststore.jks
listener.name.internal.ssl.truststore.password=<pass>
listener.security.protocol.map=INTERNAL:SSL,BROKER:SSL
listeners=INTERNAL://:9092,BROKER://:9091
log.dirs=/opt/kafka_data/data
log.retention.check.interval.ms=300000
log.retention.hours=168
log.segment.bytes=1073741824
min.insync.replicas=2
num.io.threads=16
num.network.threads=8
num.partitions=1
num.recovery.threads.per.data.dir=2
offsets.topic.replication.factor=3
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
socket.send.buffer.bytes=102400
transaction.state.log.min.isr=2
transaction.state.log.replication.factor=3
zookeeper.clientCnxnSocket=org.apache.zookeeper.ClientCnxnSocketNetty
zookeeper.connect=<url>:2182,<url>:2182,<url>:2182
zookeeper.connection.timeout.ms=18000
zookeeper.ssl.client.enable=true
zookeeper.ssl.truststore.location=/var/ssl/private/kafka_broker.truststore.jks
zookeeper.ssl.truststore.password=<pass>

Thanks.
Is there more information regarding the cert in the server.log, or just this line?

Which Kafka version is in place?

This error should normally occur if the parameter

ssl.client.auth=required

is set.
Did you check the effective config for this parameter?
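
For checking that parameter from the properties file, here is a minimal Python sketch. It assumes that a listener-prefixed key (listener.name.<name>.ssl.client.auth) overrides the global ssl.client.auth, and that the default is none when neither is set. The helper names are made up for illustration.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and comment lines."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def effective_client_auth(props):
    """Return {listener name: ssl.client.auth value} for every entry in `listeners`."""
    result = {}
    for entry in props.get("listeners", "").split(","):
        name = entry.split("://", 1)[0].strip()
        if not name:
            continue
        # Assumption: the listener-prefixed key wins over the global key, default is "none".
        override = f"listener.name.{name.lower()}.ssl.client.auth"
        result[name] = props.get(override, props.get("ssl.client.auth", "none"))
    return result
```

This only reflects what is in the file. The broker also logs its full configuration at startup, which is the safer place to read the values it actually uses.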

Best,
Michael

One further question: what about

security.inter.broker.protocol=SSL

I can’t find it in your config.

Hello,

Happy new year!

No more information in server.log.
Kafka version is 2.6.0.

The parameter ssl.client.auth is set to none.
The parameter security.inter.broker.protocol is set to PLAINTEXT.

I think I see the problem.
The inter.broker.listener.name is set to BROKER, which is mapped to SSL in listener.security.protocol.map.
However, the security.inter.broker.protocol is set to PLAINTEXT.
It should be set to SSL. Let me correct this and check if my understanding of the problem is correct.
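
For reference, here is a minimal Python sketch of how the inter-broker protocol is resolved from these settings. It is based on my reading of the Kafka broker configuration docs, which describe inter.broker.listener.name and security.inter.broker.protocol as alternatives that must not both be set, so a PLAINTEXT value shown for the latter may simply be its unused default. The function names are made up for illustration.

```python
def parse_protocol_map(value):
    """Turn 'INTERNAL:SSL,BROKER:SSL' into {'INTERNAL': 'SSL', 'BROKER': 'SSL'}."""
    mapping = {}
    for pair in value.split(","):
        name, sep, protocol = pair.partition(":")
        if sep:
            mapping[name.strip().upper()] = protocol.strip().upper()
    return mapping


def inter_broker_protocol(props):
    """Resolve the security protocol used between brokers from a dict of properties."""
    listener = props.get("inter.broker.listener.name")
    legacy = props.get("security.inter.broker.protocol")
    if listener and legacy:
        # Assumption from the docs: setting both explicitly is a configuration error.
        raise ValueError(
            "set only one of inter.broker.listener.name and security.inter.broker.protocol"
        )
    if listener:
        mapping = parse_protocol_map(props.get("listener.security.protocol.map", ""))
        if listener.upper() not in mapping:
            raise KeyError(f"listener {listener} is missing from listener.security.protocol.map")
        return mapping[listener.upper()]
    # Neither key set explicitly: fall back to the documented default.
    return legacy or "PLAINTEXT"
```

With the values posted earlier in this thread (inter.broker.listener.name=BROKER and BROKER:SSL in the map), this resolves to SSL without security.inter.broker.protocol being set at all.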

This didn’t solve the problem but it expanded my understanding, so that’s something. I’m still working on it.


Did you manage to solve this problem? How?

Hello,

Our team solved this issue by renewing all the certificates.

After that, we used the Ansible playbooks for Confluent Platform to generate the keystores and truststores.
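
We did not dig into what exactly was wrong with the old certificates, so expiry is only one possibility. For anyone who wants to rule it out, here is a minimal Python sketch that turns a certificate's notAfter string into the number of days left. The name days_until_expiry is made up for illustration.

```python
import ssl
import time


def days_until_expiry(not_after, now=None):
    """Days left before a certificate expires; negative if it already has.

    not_after uses the format found in certificate dumps,
    e.g. 'Jan  5 09:34:43 2018 GMT'.
    """
    if now is None:
        now = time.time()
    # ssl.cert_time_to_seconds converts the certificate date format to epoch seconds.
    return (ssl.cert_time_to_seconds(not_after) - now) / 86400.0
```

The notAfter string can be taken from any dump of the broker certificate, or from the "notAfter" entry of getpeercert() on a verified Python ssl connection.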

@mmacphail you can close this post :wink: