How many partitions to use if I'm not using any key

This is originally from a Slack thread. Copied here to make it available permanently.
You can join the community Slack here

sam

Let’s say I’ve got 3 Kafka brokers and have min.insync.replicas set to 2.
How do I decide the no. of partitions if I’m not using any partition key?

Neil Buesing

If you do not have a partition key, so overall order doesn’t matter in your processing, I would start with the number of partitions equal to the number of consumers you want to run to process the work, and increase partitions only if you need more (don’t increase by 1 every time, but you don’t have to go wild and over-partition either).

Neil Buesing

Also, even if 1 partition would technically be sufficient for your workload, I never start at 1, mostly because I don’t want anyone to come along and consume it expecting overall ordering of messages. Once that happens you cannot increase partitions, because it would break some other consumer you were not aware of.

Neil Buesing

Now I just realized this is in a streams discussion. I was thinking (because you do not have a key) that this was just a standard producer/consumer setup. Please provide more information on your streams use case.

sam

@nbuesing I’m not looking for answers or suggestions for a specific use case; rather, I’m looking for a general rule to apply when deciding on the no. of partitions.

sam

I’m thinking about following this:
Set the default partition count to 3 (the no. of nodes).
When I expect more read/write load on a particular topic, increase its partition count by 3 or a multiple of 3.
@nbuesing what do you think?
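
As an illustration of the rule sam proposes, here is a minimal Python sketch. The function name `next_partition_count` and its parameters are hypothetical and not part of any Kafka API; it only does the arithmetic of rounding a required partition count up to a multiple of the step (3 here). It never returns less than the current count, since Kafka lets you add partitions to a topic but not remove them.

```python
def next_partition_count(current, needed, step=3):
    """Return the partition count to use under the 'grow in multiples of step' rule.

    current: partitions the topic has today
    needed:  partitions you estimate the load requires (an assumption you supply)
    step:    growth unit; 3 matches the broker count in the question
    """
    if needed <= current:
        # Partitions can only be added, so keep what is already there.
        return current
    # Round `needed` up to the nearest multiple of `step` (integer ceiling division).
    return -(-needed // step) * step


print(next_partition_count(3, 2))  # load fits in the existing partitions
print(next_partition_count(3, 7))  # grows to the next multiple of 3
```

How `needed` is estimated is left open here; the thread does not give a formula for it.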

Neil Buesing

That is a fair approach. The downside of 3 is that the only way to have even workloads across consumers is to run 1 or 3 of them. Personally, I like 4, as I can run 2 consumers and get an even workload, and then scale to 4 and still have even workloads. Unless your cluster is highly dedicated, aligning partitioning to the number of brokers is overthinking it (IMHO): the brokers can handle and scale well beyond the levels of the producers/consumers, so partition based on those rather than on the brokers.

That being said, I know of others who suggest 3 as well (I like 4).
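
The point about 3 versus 4 comes down to divisors. A small Python sketch, assuming the consumer group spreads a topic's partitions as evenly as it can and each partition goes to exactly one consumer (the helper names are made up for illustration):

```python
def partitions_per_consumer(partitions, consumers):
    """How many partitions each consumer gets under an as-even-as-possible split."""
    base, extra = divmod(partitions, consumers)
    # The first `extra` consumers take one more partition than the rest.
    return [base + 1 if i < extra else base for i in range(consumers)]


def even_consumer_counts(partitions):
    """Consumer counts that split the partitions with no remainder."""
    return [c for c in range(1, partitions + 1) if partitions % c == 0]


print(even_consumer_counts(3))        # [1, 3]
print(even_consumer_counts(4))        # [1, 2, 4]
print(partitions_per_consumer(3, 2))  # [2, 1] -> one consumer does twice the work
print(partitions_per_consumer(4, 2))  # [2, 2] -> even
```

With 3 partitions, running 2 consumers leaves one of them with double the load; with 4 partitions, both 2 and 4 consumers split evenly.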

Another thread worth looking at, that explores other considerations (throughput, etc) and when you do use a key: How many partitions should a topic have? - #2 by mmacphail