Kafka does not work through NAT as expected

Hello,
There is one scenario that I cannot achieve with Kafka.

  1. There is a cluster of three Kafka servers. The Kafka servers have private IPs and are only accessible from their own VPN, say KAFKA VPN.
  2. There is one public IP that is accessible when a machine is connected to KAFKA VPN + Office Network. This public IP is only used to NAT the private IPs so that people on the Office Network can reach the brokers over the internet.
  3. There is one consumer on the Office Network. That consumer can consume topics from Kafka using the public IP when it is connected to KAFKA VPN.

Now I want to connect to the public IP without being connected to KAFKA VPN. Is that possible? The network is working fine; I have checked traceroute and reverse traceroute.

I think there is some problem with the producer / Kafka server / consumer configuration.

NAT Configuration::

10.XX.XX.XX:9092 -> AA.XX.XX.XX:9095
10.XX.XX.XY:9092 -> AA.XX.XX.XX:9093
10.XX.XX.XZ:9092 -> AA.XX.XX.XX:9094

Producer Configuration::

bootstrap-servers=10.XX.XX.XX:9092,10.XX.XX.XY:9092,10.XX.XX.XZ:9092

Kafka Configuration::

listeners=SASL_SSL://0.0.0.0:9092,EXTERNAL://0.0.0.0:9095
advertised.listeners=SASL_SSL://10.XX.XX.XX:9092,EXTERNAL://AA.XX.XX.XX:9095
listener.security.protocol.map=SASL_SSL:SASL_SSL,EXTERNAL:SASL_SSL

Consumer Configuration::

sh kafka-console-consumer.sh --bootstrap-server AA.XX.XX.XX:9093,AA.XX.XX.XX:9094,AA.XX.XX.XX:9095 --topic test --consumer.config consumer.properties

Error that I am getting when I am not connected to KAFKA VPN::

[2022-03-02 17:35:58,236] WARN [Consumer clientId=consumer-test_group-1, groupId=test_group] Connection to node 2147483646 (10.XX.XX.XX/10.XX.XX.XX:9092) could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
[2022-03-02 17:36:26,421] WARN [Consumer clientId=consumer-test_group-1, groupId=test_group] Connection to node 2 (10.XX.XX.XY/10.XX.XX.XY:9092) could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
[2022-03-02 17:36:47,467] WARN [Consumer clientId=consumer-test_group-1, groupId=test_group] Connection to node 3 (10.XX.XX.XZ/10.XX.XX.XZ:9092) could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)

My questions are:

  1. Even though I am using the public IP to connect to Kafka, why does it show the private IPs in the logs?
  2. Must the bootstrap servers in the producer and the consumer be exactly the same?
  3. How can I check the response to a metadata request? Does Kafka ship a tool for that, like it does with kafka-console-consumer?
  4. How should I fix this?
  • Assume the SASL and SSL properties and certificates are in place, and that the network is set up correctly.

** I have attached architecture details in this forum

It would be great if someone could help me with this.

Thanks,
Milan K

First off, welcome to the forum and bravo on a great post with clear details of the problem and even a diagram :clap:

Let’s address the non-VPN case to start with, and then I’ll come back to the VPN case afterwards.

Public IP → NAT → Internal

When your external client connects to AA.XX.XX.XX, it is routed by NAT to the internal IP and ports of the Kafka brokers. This is one of the three 10.XX.XX.X_:9092 boxes.

What this means is that your traffic hits the brokers on the internal listener (labelled SASL_SSL in your config) and not the EXTERNAL listener. Because of that, when the broker replies it provides the metadata of the internal listener.

Since your client receives the internal listener metadata, it then tries to connect to 10.XX.XX.X_:9092 directly, which fails because that address is not accessible externally.
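To make the mechanism concrete, here is a minimal sketch in Python. It is a toy model, not broker code: it assumes the advertised address is chosen purely by which listener port the connection arrives on, and the names `BROKER` and `advertised_for` are made up for illustration. The addresses are the placeholders from the question.

```python
# Toy model of how a broker picks the address it advertises to a client.
# Listener names, ports, and addresses mirror the placeholders in this thread;
# BROKER and advertised_for are illustrative names, not Kafka APIs.

BROKER = {
    "listeners": {"SASL_SSL": 9092, "EXTERNAL": 9095},
    "advertised": {
        "SASL_SSL": "10.XX.XX.XX:9092",
        "EXTERNAL": "AA.XX.XX.XX:9095",
    },
}


def advertised_for(broker, arrival_port):
    """Return the advertised address of the listener bound to arrival_port."""
    for name, port in broker["listeners"].items():
        if port == arrival_port:
            return broker["advertised"][name]
    raise ValueError(f"no listener bound to port {arrival_port}")


# The original NAT rule forwards AA.XX.XX.XX:9095 to 10.XX.XX.XX:9092,
# so the connection arrives on the internal listener's port.
print(advertised_for(BROKER, 9092))  # prints 10.XX.XX.XX:9092

# Had the traffic arrived on the EXTERNAL listener's port instead:
print(advertised_for(BROKER, 9095))  # prints AA.XX.XX.XX:9095
```

The first call is the situation in the question: the client asked via the public IP, but the answer it gets back is the private address.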

But why does it work on VPN?

The same as above happens - the external connection goes through NAT, is translated to 10.XX.XX.X_:9092, and internal listener metadata is sent back.

But because the client is on the VPN, its subsequent connection to the address from the metadata (10.XX.XX.X_:9092) succeeds, since the VPN gives it direct access to those IPs and ports.

How to fix it?

Instead of NAT’ing the external IP to the same internal IP/port that is used internally and on the VPN, you need to NAT each external IP/port to one of the brokers on a new, unused port on that broker. Then configure each broker with a listener on that port, and set the matching entry in advertised.listeners to the corresponding external IP/port.

So you’ll have three brokers configured thus. Note that the EXTERNAL entry in advertised.listeners varies per broker, whilst listeners stays the same.

  • KAFKA01 (10.XX.XX.XX)

    listeners=SASL_SSL://0.0.0.0:9092,EXTERNAL://0.0.0.0:9095
    advertised.listeners=SASL_SSL://10.XX.XX.XX:9092,EXTERNAL://AA.XX.XX.XX:9093
    listener.security.protocol.map=SASL_SSL:SASL_SSL,EXTERNAL:SASL_SSL
    
  • KAFKA02 (10.XX.XX.XY)

    listeners=SASL_SSL://0.0.0.0:9092,EXTERNAL://0.0.0.0:9095
    advertised.listeners=SASL_SSL://10.XX.XX.XY:9092,EXTERNAL://AA.XX.XX.XX:9094
    listener.security.protocol.map=SASL_SSL:SASL_SSL,EXTERNAL:SASL_SSL
    
  • KAFKA03 (10.XX.XX.XZ)

    listeners=SASL_SSL://0.0.0.0:9092,EXTERNAL://0.0.0.0:9095
    advertised.listeners=SASL_SSL://10.XX.XX.XZ:9092,EXTERNAL://AA.XX.XX.XX:9095
    listener.security.protocol.map=SASL_SSL:SASL_SSL,EXTERNAL:SASL_SSL
    

Now you configure NAT thus:

  • AA.XX.XX.XX:9093 -> 10.XX.XX.XX:9095
  • AA.XX.XX.XX:9094 -> 10.XX.XX.XY:9095
  • AA.XX.XX.XX:9095 -> 10.XX.XX.XZ:9095
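As a sanity check on the mapping above, here is a small hypothetical Python sketch that cross-checks a NAT table against the brokers' EXTERNAL settings. It assumes the three-broker layout from this thread; `nat_problems` and the dictionaries are illustrative names, and the addresses are the same placeholders.

```python
# Hypothetical self-check: does every NAT rule land on a broker's EXTERNAL
# listener port, and does that broker advertise the same public endpoint?
# All names are illustrative; addresses are the thread's placeholders.

EXTERNAL_PORT = 9095  # port bound by EXTERNAL://0.0.0.0:9095 on every broker

# broker private IP -> EXTERNAL entry of its advertised.listeners
ADVERTISED_EXTERNAL = {
    "10.XX.XX.XX": "AA.XX.XX.XX:9093",
    "10.XX.XX.XY": "AA.XX.XX.XX:9094",
    "10.XX.XX.XZ": "AA.XX.XX.XX:9095",
}

# public endpoint -> internal endpoint, as in the NAT table above
NAT = {
    "AA.XX.XX.XX:9093": "10.XX.XX.XX:9095",
    "AA.XX.XX.XX:9094": "10.XX.XX.XY:9095",
    "AA.XX.XX.XX:9095": "10.XX.XX.XZ:9095",
}


def nat_problems(nat, advertised, external_port):
    """Return a list of human-readable mismatches (empty means consistent)."""
    problems = []
    for public, internal in nat.items():
        host, port = internal.rsplit(":", 1)
        if host not in advertised:
            problems.append(f"{public}: unknown broker {host}")
            continue
        if int(port) != external_port:
            problems.append(f"{public}: forwards to port {port}, not the EXTERNAL port")
        if advertised[host] != public:
            problems.append(f"{public}: broker {host} advertises {advertised[host]}")
    return problems


print(nat_problems(NAT, ADVERTISED_EXTERNAL, EXTERNAL_PORT))  # prints []
```

Feeding it a rule in the style of the original setup, such as `{"AA.XX.XX.XX:9095": "10.XX.XX.XX:9092"}`, reports both a wrong target port and a mismatched advertised endpoint.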

So a client connecting externally uses the same IP address but different ports, and traffic to each port is routed to one of the three brokers internally. That traffic now hits the broker on its EXTERNAL listener port, so the metadata the broker sends back contains the correct EXTERNAL advertised listener (the NAT’d IP + port), which the client can reach.

As for checking the metadata response: kcat (formerly called kafkacat) can do this, with the -L flag. You can also use the Python or Golang programs here to help validate it:

References

Hello @rmoff
Thanks for your prompt response. I spent nearly 5 days on this and was really stuck.
It worked well for me!! Big Thanks to you. :clap: :clap: :clap:
