Self-hosted Kafka with KRaft, SSL and SASL (SCRAM-SHA-256)

Hello,

I’m building a fresh Kafka cluster and I’m stuck getting SASL_SSL with SCRAM-SHA-256 working in KRaft mode. I’ve tried v3.5.1 and v3.6.0 from the tarballs on the Apache website (binaries built with Scala 2.13), as well as the confluent-server-7.5.0 RPM. Both distributions produce the same error on every host when I start my 3 nodes:

[2023-10-12 10:55:21,909] INFO [SocketServer listenerType=CONTROLLER, nodeId=1] Failed
authentication with /10.0.0.2 (channelId=10.0.0.1:9093-10.0.0.2:42552-8)
(errorMessage=Authentication failed during authentication due to invalid credentials
with SASL mechanism SCRAM-SHA-256 caused by Authentication failed: Invalid user
credentials) (org.apache.kafka.common.network.Selector)

The nodes can’t authenticate to each other. Each node’s hostname is added to /etc/hosts for easier name resolution.
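
For reference, the /etc/hosts entries look roughly like this (10.0.0.1 and 10.0.0.2 are the addresses from the log above; the third address is only an illustrative assumption):

# /etc/hosts on every node (illustrative)
10.0.0.1 kafka-test-01
10.0.0.2 kafka-test-02
10.0.0.3 kafka-test-03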

Here is the server config:

# /etc/kafka/server.properties
process.roles=broker,controller
# 1, 2, 3 on corresponding nodes
node.id=1
controller.quorum.voters=1@kafka-test-01:9093,2@kafka-test-02:9093,3@kafka-test-03:9093
listeners=BROKER://:9092,CONTROLLER://:9093
inter.broker.listener.name=BROKER
# kafka-test-01, kafka-test-02, kafka-test-03 on corresponding nodes
advertised.listeners=BROKER://kafka-test-01:9092
controller.listener.names=CONTROLLER
listener.security.protocol.map=BROKER:SASL_SSL,CONTROLLER:SASL_SSL
num.network.threads=3
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
log.dirs=/var/lib/kafka/kraft-combined-logs
num.partitions=1
num.recovery.threads.per.data.dir=1
offsets.topic.replication.factor=1
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1
log.retention.hours=168
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000
group.initial.rebalance.delay.ms=0
confluent.license.topic.replication.factor=1
confluent.metadata.topic.replication.factor=1
confluent.security.event.logger.exporter.kafka.topic.replicas=1
confluent.balancer.topic.replication.factor=1
confluent.cluster.link.enable=false
ssl.truststore.location=/etc/kafka/private/server.keystore.jks
ssl.truststore.password=keystore-secret
ssl.keystore.location=/etc/kafka/private/server.keystore.jks
ssl.keystore.password=keystore-secret
ssl.client.auth=required
sasl.enabled.mechanisms=SCRAM-SHA-256
sasl.mechanism.inter.broker.protocol=SCRAM-SHA-256
sasl.mechanism.controller.protocol=SCRAM-SHA-256
authorizer.class.name=org.apache.kafka.metadata.authorizer.StandardAuthorizer
allow.everyone.if.no.acl.found=true
super.users=User:admin
listener.name.broker.scram-sha-256.sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required \
   username="admin" \
   password="secret000";
listener.name.controller.scram-sha-256.sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required \
   username="admin" \
   password="secret000";

I’ve tried bootstrapping the KRaft storage both with a cleartext password and with a salted one, and I get the same “Invalid credentials” error either way.

sudo -u cp-kafka kafka-storage format -t "VfHmcdoORCC689JspCEe_w" -c /etc/kafka/server.properties \
  --add-scram 'SCRAM-SHA-256=[name="admin",password="secret000"]'

sudo -u cp-kafka kafka-storage format -t "VfHmcdoORCC689JspCEe_w" -c /etc/kafka/server.properties \
  --add-scram 'SCRAM-SHA-256=[name="admin",iterations=4096,salt="f9YMDEauA4S+cgRU1q6bNA==",saltedpassword="TrHbiqINO7ED1+cROTnvqko8ymieLoaTubMij2AsdUI="]'

Is there a way to validate that kafka-storage format produced a valid structure for the cluster to start and for the nodes to authenticate each other? Or is the JAAS config inside my server.properties invalid?
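
For what it’s worth, dumping the bootstrap records should at least show whether the SCRAM credential landed in the metadata. A rough sketch (I’m assuming kafka-dump-log can decode the bootstrap checkpoint via --cluster-metadata-decoder; the path matches my log.dirs):

sudo -u cp-kafka kafka-dump-log --cluster-metadata-decoder \
  --files /var/lib/kafka/kraft-combined-logs/bootstrap.checkpoint
# expecting something like a UserScramCredentialRecord entry for "admin" in the decoded output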

Hi

I have more or less the same problem. I’ve tried to create a Kafka setup using Docker and SASL_PLAINTEXT with SCRAM-SHA-256. It works if I only start a single broker, but as soon as I start a 3-node cluster, I get the same error.

It works if I change the controller listener to PLAINTEXT and the controller SASL mechanism to PLAIN:

listener.security.protocol.map=CONTROLLER:PLAINTEXT,LOCAL:SASL_PLAINTEXT,BROKER:SASL_PLAINTEXT,DOCKERHOST:SASL_PLAINTEXT,EXTERNAL:SASL_PLAINTEXT
sasl.enabled.mechanisms=SCRAM-SHA-256
sasl.mechanism.inter.broker.protocol=SCRAM-SHA-256
sasl.mechanism.controller.protocol=PLAIN
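
With the controller listener on PLAINTEXT the nodes come up, and SCRAM users for the SASL_PLAINTEXT broker listeners can then be managed at runtime. A rough sketch (assuming the initial admin credential was bootstrapped at format time with --add-scram; host, client properties file and password are placeholders):

kafka-configs.sh --bootstrap-server localhost:9092 \
  --command-config /etc/kafka/admin-client.properties \
  --alter --entity-type users --entity-name app1 \
  --add-config 'SCRAM-SHA-256=[iterations=4096,password=app1-secret]'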

@gschmutz could you please share your Docker file for the SASL_PLAINTEXT SCRAM setup?
Also the jaas.conf, if possible? Thanks.

I’ve managed to get Kafka v3.6.0 working with SASL_SSL and the PLAIN mechanism, using the binary build from the Apache website. It should also work with confluent-server. Here’s my server.properties:

process.roles=broker,controller
node.id=1
controller.quorum.voters=1@kafka-test-01:9093,2@kafka-test-02:9093,3@kafka-test-03:9093
listeners=BROKER://:9092,CONTROLLER://:9093
advertised.listeners=BROKER://:9092
inter.broker.listener.name=BROKER
controller.listener.names=CONTROLLER
listener.security.protocol.map=BROKER:SASL_SSL,CONTROLLER:SASL_SSL
listener.name.controller.ssl.client.auth=required
listener.name.broker.ssl.client.auth=required
num.network.threads=3
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
ssl.truststore.location=/etc/kafka/jks/keystore.jks
ssl.truststore.password=keystore-pass
ssl.keystore.location=/etc/kafka/jks/keystore.jks
ssl.keystore.password=keystore-pass
ssl.client.auth=required
authorizer.class.name=org.apache.kafka.metadata.authorizer.StandardAuthorizer
super.users=User:admin
sasl.enabled.mechanisms=PLAIN
sasl.mechanism.inter.broker.protocol=PLAIN
sasl.mechanism.controller.protocol=PLAIN
listener.name.controller.plain.sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required \
    username="admin" \
    password="secret000" \
    user_admin="secret000";
listener.name.broker.plain.sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required \
    username="admin" \
    password="secret000" \
    user_admin="secret000";
log.dirs=/var/lib/kafka/kraft-combined-logs
num.partitions=1
num.recovery.threads.per.data.dir=1
offsets.topic.replication.factor=1
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1
log.retention.hours=168
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000

The log dir was bootstrapped with:

sudo -u kafka /opt/kafka/bin/kafka-storage.sh format \
  -t "VfHmcdoORCC689JspCEe_w" -c /etc/kafka/server.properties
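
To confirm the quorum actually formed after startup, a check along these lines should work (the client properties reuse the values from the config above; adjust paths as needed):

# /etc/kafka/admin-client.properties
security.protocol=SASL_SSL
sasl.mechanism=PLAIN
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required \
  username="admin" \
  password="secret000";
ssl.truststore.location=/etc/kafka/jks/keystore.jks
ssl.truststore.password=keystore-pass
# the broker has ssl.client.auth=required, so the client needs a keystore too
ssl.keystore.location=/etc/kafka/jks/keystore.jks
ssl.keystore.password=keystore-pass

/opt/kafka/bin/kafka-metadata-quorum.sh --bootstrap-server kafka-test-01:9092 \
  --command-config /etc/kafka/admin-client.properties describe --status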

PLAIN is not as convenient or as secure as SCRAM auth: you can only add users with a full restart of the cluster. A line like the one below has to be added to both listener.name.*.plain.sasl.jaas.config properties for each user on each node (a fuller two-user sketch follows below):

    user_username="password" \  # end the last user's line with ";" instead of "\"
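
For example, a broker-listener entry carrying two users might look like this (the second user "app1" is purely hypothetical):

listener.name.broker.plain.sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required \
    username="admin" \
    password="secret000" \
    user_admin="secret000" \
    user_app1="app1-secret";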

Still waiting for a solution for SCRAM authentication.

Thanks. I tried it but I get errors in the logs.

[2024-04-23 07:04:20,152] ERROR [RaftManager id=3] Connection to node 1 (bosukafkbrkrd01/168.66.122.158:19092) failed authentication due to: SSL handshake failed (org.apache.kafka.clients.NetworkClient)
[2024-04-23 07:04:20,152] ERROR [kafka-3-raft-outbound-request-thread]: Failed to send the following request due to authentication error: ClientRequest(expectResponse=true, callback=org.apache.kafka.raft.KafkaNetworkChannel$$Lambda/0x00007f5bb03e8000@6e1c6d9e, destination=1, correlationId=4, clientId=raft-client-3, createdTimeMs=1713870259184, requestBuilder=VoteRequestData(clusterId='QsS3S9p6RfeozyiDcyRBbw', topics=[TopicData(topicName='__cluster_metadata', partitions=[PartitionData(partitionIndex=0, candidateEpoch=59, candidateId=3, lastOffsetEpoch=57, lastOffset=7739)])])) (org.apache.kafka.raft.KafkaNetworkChannel$SendThread)
[2024-04-23 07:04:20,164] INFO [MetadataLoader id=2] initializeNewPublishers: the loader is still catching up because we still don't know the high water mark yet. (org.apache.kafka.image.loader.MetadataLoader)
[2024-04-23 07:04:20,153] ERROR Request OutboundRequest(correlationId=4, data=VoteRequestData(clusterId='QsS3S9p6RfeozyiDcyRBbw', topics=[TopicData(topicName='__cluster_metadata', partitions=[PartitionData(partitionIndex=0, candidateEpoch=59, candidateId=3, lastOffsetEpoch=57, lastOffset=7739)])]), createdTimeMs=1713870259184, destinationId=1) failed due to authentication error (org.apache.kafka.raft.KafkaNetworkChannel)
org.apache.kafka.common.errors.SslAuthenticationException: SSL handshake failed

This is the file I was using. Any thoughts on this?

process.roles=broker,controller
node.id=1
controller.quorum.voters=1@bosukafkbrkrd01:19092,2@bosukafkbrkrd01:19093,3@bosukafkbrkrd01:19094
listeners=BROKER://:9092,CONTROLLER://:19092
advertised.listeners=BROKER://:9092
inter.broker.listener.name=BROKER
controller.listener.names=CONTROLLER
listener.security.protocol.map=BROKER:SASL_SSL,CONTROLLER:SASL_SSL
listener.name.controller.ssl.client.auth=required
listener.name.broker.ssl.client.auth=required
num.network.threads=3
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
ssl.keystore.location=/opt/kafka/kafka-poc.mfs.com/kafka-poc.mfs.com.jks
ssl.keystore.password=KeepMeSecure
ssl.key.password=KeepMeSecure
ssl.truststore.location=/opt/kafka/kafka-poc.mfs.com/kafka-poc.mfs.com.p12
ssl.truststore.password=KeepMeSecure
ssl.enabled.protocols=TLSv1.2,TLSv1.1,TLSv1
ssl.client.auth=required

authorizer.class.name=org.apache.kafka.metadata.authorizer.StandardAuthorizer
super.users=User:admin
sasl.enabled.mechanisms=PLAIN
sasl.mechanism.inter.broker.protocol=PLAIN
sasl.mechanism.controller.protocol=PLAIN
listener.name.controller.plain.sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required \
    username="admin" \
    password="secret000" \
    user_admin="secret000";
listener.name.broker.plain.sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required \
    username="admin" \
    password="secret000" \
    user_admin="secret000";
log.dirs=/opt/kafka/kraft-combined-logs-1
num.partitions=1
num.recovery.threads.per.data.dir=1
offsets.topic.replication.factor=1
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1
log.retention.hours=168
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000

Is there more information in the logs after this one?

Seems to be SSL-related.

I did a grep of all the errors in the log and I get the following. The keystore/truststore are good, so I am confused as to why it is choking. Any thoughts would be greatly appreciated.

[2024-04-23 07:04:20,152] ERROR [RaftManager id=3] Connection to node 1 (bosukafkbrkrd01/168.66.122.158:19092) failed authentication due to: SSL handshake failed (org.apache.kafka.clients.NetworkClient)
[2024-04-23 07:04:20,152] ERROR [kafka-3-raft-outbound-request-thread]: Failed to send the following request due to authentication error: ClientRequest(expectResponse=true, callback=org.apache.kafka.raft.KafkaNetworkChannel$$Lambda/0x00007f5bb03e8000@6e1c6d9e, destination=1, correlationId=4, clientId=raft-client-3, createdTimeMs=1713870259184, requestBuilder=VoteRequestData(clusterId='QsS3S9p6RfeozyiDcyRBbw', topics=[TopicData(topicName='__cluster_metadata', partitions=[PartitionData(partitionIndex=0, candidateEpoch=59, candidateId=3, lastOffsetEpoch=57, lastOffset=7739)])])) (org.apache.kafka.raft.KafkaNetworkChannel$SendThread)
[2024-04-23 07:04:20,153] ERROR Request OutboundRequest(correlationId=4, data=VoteRequestData(clusterId='QsS3S9p6RfeozyiDcyRBbw', topics=[TopicData(topicName='__cluster_metadata', partitions=[PartitionData(partitionIndex=0, candidateEpoch=59, candidateId=3, lastOffsetEpoch=57, lastOffset=7739)])]), createdTimeMs=1713870259184, destinationId=1) failed due to authentication error (org.apache.kafka.raft.KafkaNetworkChannel)
[2024-04-23 07:04:20,166] ERROR [kafka-3-raft-outbound-request-thread]: Failed to send the following request due to authentication error: ClientRequest(expectResponse=true, callback=org.apache.kafka.raft.KafkaNetworkChannel$$Lambda/0x00007f5bb03e8000@57e223be, destination=1, correlationId=6, clientId=raft-client-3, createdTimeMs=1713870260154, requestBuilder=VoteRequestData(clusterId='QsS3S9p6RfeozyiDcyRBbw', topics=[TopicData(topicName='__cluster_metadata', partitions=[PartitionData(partitionIndex=0, candidateEpoch=60, candidateId=3, lastOffsetEpoch=57, lastOffset=7739)])])) (org.apache.kafka.raft.KafkaNetworkChannel$SendThread)
[2024-04-23 07:04:20,166] ERROR Request OutboundRequest(correlationId=6, data=VoteRequestData(clusterId='QsS3S9p6RfeozyiDcyRBbw', topics=[TopicData(topicName='__cluster_metadata', partitions=[PartitionData(partitionIndex=0, candidateEpoch=60, candidateId=3, lastOffsetEpoch=57, lastOffset=7739)])]), createdTimeMs=1713870260154, destinationId=1) failed due to authentication error (org.apache.kafka.raft.KafkaNetworkChannel)
[2024-04-23 07:04:20,167] ERROR [RaftManager id=3] Unexpected error NETWORK_EXCEPTION in VOTE response: InboundResponse(correlationId=6, data=VoteResponseData(errorCode=13, topics=), sourceId=1) (org.apache.kafka.raft.KafkaRaftClient)
[2024-04-23 07:04:20,189] ERROR [RaftManager id=3] Connection to node 2 (bosukafkbrkrd01/168.66.122.158:19093) failed authentication due to: SSL handshake failed (org.apache.kafka.clients.NetworkClient)
[2024-04-23 07:04:20,189] ERROR [kafka-3-raft-outbound-request-thread]: Failed to send the following request due to authentication error: ClientRequest(expectResponse=true, callback=org.apache.kafka.raft.KafkaNetworkChannel$$Lambda/0x00007f5bb03e8000@5669e55, destination=2, correlationId=5, clientId=raft-client-3, createdTimeMs=1713870259184, requestBuilder=VoteRequestData(clusterId='QsS3S9p6RfeozyiDcyRBbw', topics=[TopicData(topicName='__cluster_metadata', partitions=[PartitionData(partitionIndex=0, candidateEpoch=59, candidateId=3, lastOffsetEpoch=57, lastOffset=7739)])])) (org.apache.kafka.raft.KafkaNetworkChannel$SendThread)
[2024-04-23 07:04:20,189] ERROR Request OutboundRequest(correlationId=5, data=VoteRequestData(clusterId='QsS3S9p6RfeozyiDcyRBbw', topics=[TopicData(topicName='__cluster_metadata', partitions=[PartitionData(partitionIndex=0, candidateEpoch=5:

Seems like it still complains about SSL.
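
One way to narrow it down is to hit a controller port directly and check whether the TLS handshake completes and which certificate comes back, and to make sure the stores open as the type Kafka expects. A rough sketch using the host and paths from the config above (note that ssl.truststore.type defaults to JKS, while the truststore path ends in .p12):

# does the handshake complete, and which certificate does the controller present?
openssl s_client -connect bosukafkbrkrd01:19092 </dev/null 2>/dev/null \
  | openssl x509 -noout -subject -issuer -dates

# can the truststore be opened as PKCS12?
keytool -list -keystore /opt/kafka/kafka-poc.mfs.com/kafka-poc.mfs.com.p12 -storetype PKCS12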