What happens if I increase `log.segment.bytes`?

Hi,

We have a topic with a few months of retention and around 20 TB in Kafka. Because the default log.segment.bytes is around 1 GB and Kafka keeps a file descriptor for every segment, open or closed, this boils down to roughly 20k open file descriptors.
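To make the arithmetic behind that estimate explicit, here is a rough sketch. It assumes binary units (TiB/GiB) and counts one descriptor per segment as described above; real brokers also hold index files per segment, so treat the result as a lower bound rather than an exact figure. The function name is made up for illustration.

```python
TIB = 1024 ** 4
GIB = 1024 ** 3

def segment_count(total_bytes: int, segment_bytes: int) -> int:
    """Number of segments needed to hold total_bytes (rounded up)."""
    return -(-total_bytes // segment_bytes)  # integer ceiling division

# Current situation: ~20 TiB stored in ~1 GiB segments.
current = segment_count(20 * TIB, 1 * GIB)

# Same data volume if segments were 4x larger (the size being considered).
larger = segment_count(20 * TIB, 4 * GIB)

print(current, larger)
```

The count scales inversely with the segment size, so quadrupling the segment size cuts the number of segments (and the descriptors tied to them) to a quarter once the old segments have aged out.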

We noticed things like broker restarts scaling with the number of file descriptors. Therefore we were thinking about increasing log.segment.bytes to something like 4 GB. However, looking for sources online and in “Kafka: The Definitive Guide”, we only found the opposite case of lowering log.segment.bytes on low-volume topics.

Before doing this in production, I would therefore like to ask whether someone here already has experience with this or knows how Kafka would react to that.

We know from experience with lowering log.segment.bytes that this will have no effect on existing, closed segments. They will remain 1 GB large, but eventually fall out of retention. Also, we can leave log.index.size.max.bytes untouched, as it should be large enough for 5 GB segments according to the Strimzi blog post “Deep dive into Apache Kafka storage internals: segments, rolling and retention”.
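The 5 GB figure can be reproduced with a quick back-of-the-envelope calculation. This sketch assumes the default values of log.index.size.max.bytes (10 MiB) and index.interval.bytes (4096), and an offset index entry size of 8 bytes; those are assumptions on our side, so please check them against your own broker settings. It only covers the offset index.

```python
INDEX_MAX_BYTES = 10 * 1024 * 1024   # assumed default of log.index.size.max.bytes
INDEX_INTERVAL_BYTES = 4096          # assumed default of index.interval.bytes
OFFSET_INDEX_ENTRY_BYTES = 8         # assumed size of one offset index entry

def max_indexed_log_bytes(
    index_max_bytes: int = INDEX_MAX_BYTES,
    index_interval_bytes: int = INDEX_INTERVAL_BYTES,
    entry_bytes: int = OFFSET_INDEX_ENTRY_BYTES,
) -> int:
    """Log bytes one offset index can cover if an entry is written
    every index_interval_bytes of appended data."""
    entries = index_max_bytes // entry_bytes
    return entries * index_interval_bytes

covered = max_indexed_log_bytes()
print(covered / 1024 ** 3, "GiB")
```

Since an index entry is written at most once per index.interval.bytes of appended data, larger batches produce fewer entries, so this is a conservative estimate of how much log a full index covers.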

However:

  • is our reasoning correct that this will reduce the number of file descriptors and this in turn improve cluster performance?
  • will this heavily influence the memory footprint?
  • are there more consequences / side effects we are not aware of?

Hi @maow

Regarding the file descriptors: from my understanding and the docs, yes, it should.
I would expect at least a slightly smaller number of open files.

Regarding the memory footprint: I think so, more memory will be needed to keep all the files open.

Regarding side effects: one thing which came to my mind is Confluent Tiered Storage.
If you’d like to use that feature, 4 GB segments would take a bit more time to copy to the remote location.

In general, I think a good way to try this is to change log.segment.bytes on a “test topic”.
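A rough sketch of what that could look like from Python, assuming the confluent-kafka package is installed and using the topic-level override segment.bytes. The helper names, the bootstrap address and the topic name are placeholders, not anything from your setup. One thing worth double-checking in the docs for your Kafka version: segment.bytes is documented as an int, so as far as I can tell values above 2147483647 (just under 2 GiB) are rejected, which would rule out 4 GB.

```python
INT32_MAX = 2 ** 31 - 1  # segment.bytes is documented as type int

def segment_bytes_override(size_bytes: int) -> dict[str, str]:
    """Build the topic config override, rejecting values outside int range."""
    if not 0 < size_bytes <= INT32_MAX:
        raise ValueError(f"segment.bytes must be in 1..{INT32_MAX}, got {size_bytes}")
    return {"segment.bytes": str(size_bytes)}

def apply_to_topic(bootstrap_servers: str, topic: str, size_bytes: int) -> None:
    """Apply the override to one topic (requires the confluent-kafka package)."""
    # Imported lazily so the helper above can be used without the library.
    from confluent_kafka.admin import AdminClient, ConfigResource

    admin = AdminClient({"bootstrap.servers": bootstrap_servers})
    resource = ConfigResource(
        ConfigResource.Type.TOPIC,
        topic,
        set_config=segment_bytes_override(size_bytes),
    )
    # alter_configs() returns one future per resource; result() raises on failure.
    for future in admin.alter_configs([resource]).values():
        future.result()

# Example call (placeholder broker address and topic name):
# apply_to_topic("localhost:9092", "test-topic", INT32_MAX)
```

Note that alter_configs() is the older, non-incremental API: it replaces the topic's full set of overrides, so it is best tried on a throwaway topic that has no other overrides you care about.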

hth,
michael
