What is the best configuration of Apache Kafka when preparing to build a pipeline?
In my previous question, the response suggested a 3-node cluster.
So what is the optimal configuration for Apache Kafka?
- Number of brokers?
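3 brokers will work fine with the recommended replication factor of 3 and min ISR of 2, but you then cannot afford to lose 2 brokers at the same time: with only one in-sync replica left, producers using acks=all will have their writes rejected.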
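To make the arithmetic behind that trade-off concrete, here is a minimal sketch. The function name is hypothetical, and it assumes producers use acks=all and that every partition has a replica on each counted broker, which is the case with 3 brokers and a replication factor of 3.

```python
def tolerable_broker_losses(replication_factor: int, min_insync_replicas: int) -> dict:
    """Rough count of replicas a partition can lose before writes or reads stop.

    Assumes producers use acks=all; this is a simplification, not a Kafka API.
    """
    if not 1 <= min_insync_replicas <= replication_factor:
        raise ValueError("min_insync_replicas must be between 1 and replication_factor")
    return {
        # acks=all writes need at least min_insync_replicas replicas in sync
        "writes_acks_all": replication_factor - min_insync_replicas,
        # reads only need one surviving in-sync replica to act as leader
        "reads": replication_factor - 1,
    }


print(tolerable_broker_losses(3, 2))  # {'writes_acks_all': 1, 'reads': 2}
```

With the values from the answer, one broker can go down without blocking acks=all producers, while a second failure leaves the data readable but not writable.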
Yes, more than one broker forms a cluster. It’s unclear what you’re asking here.
Without knowing your data set, there’s no way to address how many topics you’d create.
None of this is specific to the Kafka Connect framework, since it acts as a client like any other application.
A Kafka cluster is composed of multiple brokers. To build a strong pipeline in Kafka, should I have many clusters or just one cluster?
Sure, companies typically run a prod and a non-prod cluster.
Going by your question, I assume you are currently exploring Kafka and don’t have a production system yet. Is that right? Also, since you mentioned Kafka Connect, you presumably have an existing data source that you want to ingest into Kafka. There is no such thing as a best configuration, only the most suitable one for your case. To start with, consider a single cluster of n Kafka brokers, where “n” is determined by the amount of data you are going to ingest and the storage capacity of your servers. The number of topics also depends on what data you are ingesting, as topics roughly correspond to relational database tables or MongoDB collections. So for now, it depends primarily on your data source.
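As a rough illustration of how “n” could fall out of ingest volume and storage capacity, here is a back-of-the-envelope sketch. The function, its parameters, the 70% disk headroom, and the sample numbers are all assumptions for illustration; it ignores throughput, partition counts, and compression, so treat the result as a starting point only.

```python
import math


def estimate_broker_count(daily_ingest_gb: float,
                          retention_days: int,
                          replication_factor: int,
                          disk_per_broker_gb: float,
                          headroom: float = 0.7) -> int:
    """Storage-only estimate of the number of brokers (hypothetical helper)."""
    # Every retained byte is stored once per replica.
    total_storage_gb = daily_ingest_gb * retention_days * replication_factor
    # Keep part of each disk free; the 0.7 default is an assumed safety margin.
    usable_per_broker_gb = disk_per_broker_gb * headroom
    by_storage = math.ceil(total_storage_gb / usable_per_broker_gb)
    # A replication factor of N needs at least N brokers to place the replicas.
    return max(by_storage, replication_factor)


# Hypothetical workloads: 100 GB/day and 500 GB/day, 7-day retention, 2 TB disks.
print(estimate_broker_count(100, 7, 3, 2000))
print(estimate_broker_count(500, 7, 3, 2000))
```

For the smaller workload the replication factor sets the floor, so the estimate stays at 3 brokers; the larger workload is limited by disk space instead.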