Exposing internal Kafka to external Systems

Hey all, we are looking for architectural patterns to expose Kafka (or any message queue in general) securely for pub-sub to the outside world. With Kafka Connect we can integrate external systems (like Splunk or Elasticsearch), but that requires DNS whitelisting and doesn't work with a token-based subscription.
Even where IP whitelisting would work, it needs to be done on both sides: on the side where Kafka Connect is running and on the side where Splunk is running. With bigger enterprises in the picture, the whitelisting process itself takes significant time, which is not a good customer experience. To make the experience seamless, we are looking for an alternative way for external systems to subscribe to our internal Kafka securely. Any clue about this is really appreciated :slight_smile:
Thanks
Vikas

Hi @vikaskulkarni

welcome to the forum :slight_smile:

I think there are some possible solutions/options to get this done.

some questions regarding the external systems:

  • should they be able to consume or produce or both? ( guess both, just to be sure :wink: )
  • talking about the outside world: what’s outside world for you? internet? partner systems? other network segments? you name it
  • are you aware of the systems producing and consuming data to and from kafka? which applications/systems etc?

First thought that came to mind was the Kafka REST Proxy
(we basically built a similar solution based on Logstash and the Kafka REST Proxy some time ago):
https://docs.confluent.io/platform/current/kafka-rest/index.html
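As a sketch of what the REST Proxy's v2 consumer flow looks like for an external party (hostname, topic, group, and consumer names below are made up for illustration; this needs a running proxy to actually execute):

```shell
# 1. Create a consumer instance in a consumer group (hypothetical host/names)
curl -s -X POST \
  -H "Content-Type: application/vnd.kafka.v2+json" \
  --data '{"name": "splunk-consumer", "format": "json", "auto.offset.reset": "earliest"}' \
  https://rest-proxy.example.com/consumers/splunk-group

# 2. Subscribe that instance to a topic
curl -s -X POST \
  -H "Content-Type: application/vnd.kafka.v2+json" \
  --data '{"topics": ["events"]}' \
  https://rest-proxy.example.com/consumers/splunk-group/instances/splunk-consumer/subscription

# 3. Poll for records (repeat this call to keep consuming)
curl -s \
  -H "Accept: application/vnd.kafka.json.v2+json" \
  https://rest-proxy.example.com/consumers/splunk-group/instances/splunk-consumer/records
```

With basic auth enabled on the proxy, the partner just adds `-u user:password` to each call, so a token-/credential-based subscription works over plain HTTPS without any broker access from outside.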

Authentication and so on could also be achieved with HTTP basic auth or mTLS,
see:
https://docs.confluent.io/platform/current/kafka-rest/production-deployment/rest-proxy/security.html#kafkarest-security
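For reference, enabling HTTPS plus basic auth on the proxy comes down to a few settings in `kafka-rest.properties` (a sketch based on the security docs linked above; paths, passwords, and the role name are placeholders):

```properties
# kafka-rest.properties - HTTPS listener (placeholder paths/passwords)
listeners=https://0.0.0.0:8082
ssl.keystore.location=/var/private/ssl/kafka-rest.keystore.jks
ssl.keystore.password=change-me

# HTTP basic auth against a JAAS login module
authentication.method=BASIC
authentication.realm=KafkaRest
authentication.roles=external-consumer

# alternatively, for mTLS, require client certificates instead:
# ssl.client.auth=true
```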

However, some whitelisting or blacklisting might still be necessary/useful depending on your setup/use case.

best,
michael

In case you'd like to use K8s and Strimzi, the HTTP bridge might be useful.
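For the Strimzi route, the bridge is deployed via a `KafkaBridge` custom resource; a minimal sketch (the bridge name and bootstrap address are assumptions for illustration) could look like:

```yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaBridge
metadata:
  name: external-bridge
spec:
  replicas: 1
  # internal bootstrap service of the Strimzi-managed Kafka cluster
  bootstrapServers: my-cluster-kafka-bootstrap:9092
  http:
    port: 8080
```

The bridge then exposes an HTTP API very similar to the REST Proxy's consumer endpoints; you'd still put TLS and auth termination (e.g. an Ingress or gateway) in front of it before exposing it to partners.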

Thanks for your amazing and quick response. You probably answered what I am looking for.
To clarify your points;
  • should they be able to consume or produce or both? ( guess both, just to be sure :wink: )
For now, the external systems (for example Splunk) would ONLY consume. We are the producers of events.

  • talking about the outside world: what's outside world for you? internet? partner systems? other network segments? you name it
By external I mean partner systems that are publicly available on the Internet.

  • are you aware of the systems producing and consuming data to and from kafka? which applications/systems etc?
The producers in this case are our own applications. For example, we have a K8s cluster of microservices, and Kafka is one of the services running in it. But Kafka is a black box, completely hidden and available only to the internal microservices. Our applications push events to this Kafka, and we want the external systems to consume them. So the consumers are systems that are not part of our infrastructure.

Hope I was able to clarify your questions. Nevertheless, thanks for mentioning the Kafka REST Proxy and Strimzi. Will check them out.
Vikas

hi @vikaskulkarni

thanks for the information.

ok I see

ok, so from a security perspective something like IP whitelisting might still be necessary to prevent DDoS or similar attacks.

According to your answers, the Kafka REST Proxy or the Strimzi HTTP bridge seems like a good place to start :slight_smile:

let me know if you need further details

best,
michael

This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.