Deploy a multi-node Kafka cluster in KRaft mode for production

Hi Team,

We have the enterprise version of Confluent Kafka and want to deploy a new on-premises production environment in KRaft mode. Can you please provide your recommendations? We couldn’t find much information on building a KRaft-based multi-node Kafka cluster.

Check out the hardware and config guidance here: Configure KRaft in Production

A few things to highlight:

  1. The recommended hardware footprint and number of nodes are the same as for ZooKeeper
  2. Run in isolated mode in production (i.e., controllers run as dedicated nodes with process.roles=controller rather than sharing a process with brokers)
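
To make the isolated-mode point concrete, here is a minimal sketch of what the two kinds of node configuration might look like. The hostnames, node IDs, and ports are placeholders I made up, and PLAINTEXT listeners are used only to keep the example short, so treat this as an illustration and follow the linked guide for the real settings (security, log directories, and so on).

```properties
# --- controller node (one of three), hypothetical hostnames and IDs ---
process.roles=controller
node.id=1
controller.quorum.voters=1@controller-1.example.internal:9093,2@controller-2.example.internal:9093,3@controller-3.example.internal:9093
controller.listener.names=CONTROLLER
listeners=CONTROLLER://controller-1.example.internal:9093
listener.security.protocol.map=CONTROLLER:PLAINTEXT

# --- broker node, separate process and host ---
process.roles=broker
node.id=101
controller.quorum.voters=1@controller-1.example.internal:9093,2@controller-2.example.internal:9093,3@controller-3.example.internal:9093
controller.listener.names=CONTROLLER
listeners=PLAINTEXT://broker-1.example.internal:9092
advertised.listeners=PLAINTEXT://broker-1.example.internal:9092
inter.broker.listener.name=PLAINTEXT
listener.security.protocol.map=CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT
```

The contrast is with combined mode, where a single process would set process.roles=broker,controller.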

This Docker Compose file for playing with Multi-Region Clusters is not for Production but gives you a working KRaft configuration with 3 isolated controllers.

Thanks @dtroiano,

I would like to get the community’s thoughts on the following.

I want to use a load balancer in front of the primary and secondary Confluent Kafka environments. The load balancer would monitor the availability of the Quorum Controller in the primary environment. If the primary is not reachable, it will redirect traffic to the secondary environment. Confluent Replicator will be used to replicate data from the primary Kafka to the secondary Kafka.

Is using a load balancer in this scenario a recommended approach for achieving high availability?

I would appreciate your insights and guidance on the above.

You might use a load balancer as the bootstrap servers endpoint that Kafka clients use so that, in the case of an environment outage, you can start using the new environment without having to update client configuration (but you’d still need to restart clients). This could help with disaster recovery (easier failover) but isn’t really an HA strategy. I’m assuming HA in the question means zero-downtime resiliency with respect to an environment outage.

I don’t see load balancers helping with environment outage resiliency because after bootstrapping, clients connect directly to individual brokers. That post-bootstrapping communication wouldn’t be able to happen through a load balancer.
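
As a rough illustration of that split, assuming a made-up load balancer hostname and made-up broker hostnames: the client only uses the load balancer address for the initial metadata request, and from then on it talks to whatever each broker advertises.

```properties
# --- client configuration: the only place the load balancer appears ---
# kafka-bootstrap.example.internal is a hypothetical load balancer address
bootstrap.servers=kafka-bootstrap.example.internal:9092

# --- broker 101 configuration: the address returned to clients in metadata ---
advertised.listeners=PLAINTEXT://broker-1.example.internal:9092

# --- broker 102 configuration ---
advertised.listeners=PLAINTEXT://broker-2.example.internal:9092
```

After bootstrapping, produce and fetch requests go straight to broker-1 and broker-2 at their advertised addresses, so repointing the load balancer at the secondary environment does nothing for clients that are already connected until they are restarted.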

One other thing to keep in mind for Replicator-based DR is that Replicator is async so, on failover, there would potentially be missing data in the secondary environment (“RPO > 0”). If RPO=0 is a requirement then you’d need a single stretch cluster spanning environments (regions or data centers), e.g., Confluent Platform Multi-Region Clusters.

Hi All,
We currently use the Confluent Helm charts to maintain our Kafka cluster in Kubernetes,
running Confluent Platform 7.4.0 (Kafka 3.4).
We want to migrate to KRaft mode for later Kafka releases, i.e., 3.5 onward, where ZooKeeper is deprecated.
Is there a way to migrate to KRaft mode with the Confluent Helm charts?