Topic disk size utilised

Hi all

Is there a way to find out how much disk space a topic is occupying and the total number of messages on it, i.e. whatever is currently retained under the retention settings?



I’m curious to try throwing 100,000 records structured as JSON at one topic and then doing the same using Protobuf at a second topic,
to see the difference in space utilised and then, with a decent number of records, compare the produce time…


Still hoping someone can point me to how to calculate the space consumed by a topic…

Performance-wise, I’ve found my Protobuf version did 8,000/sec while the JSON one only got to 500, so space is definitely not going to be the decider between the two; it’s performance vs upstream sources.

This tutorial is a kcat-based approach for counting messages.

For determining the space that a topic uses, take a look at Apache Kafka’s kafka-log-dirs utility. You would need to sum up the byte sizes returned for each partition:

$ kafka-log-dirs --describe --bootstrap-server broker:9092 --topic-list foo
Querying brokers for log directories information
Received log directory information from brokers 1
{
  "brokers": [
    {
      "broker": 1,
      "logDirs": [
        {
          "partitions": [
            {
              "partition": "foo-4",
              "size": 588,
              "offsetLag": 0,
              "isFuture": false
            },
            {
              "partition": "foo-5",
              "size": 0,
              "offsetLag": 0,
              "isFuture": false
            },
            {
              "partition": "foo-2",
              "size": 0,
              "offsetLag": 0,
              "isFuture": false
            },
            {
              "partition": "foo-3",
              "size": 0,
              "offsetLag": 0,
              "isFuture": false
            },
            {
              "partition": "foo-0",
              "size": 392,
              "offsetLag": 0,
              "isFuture": false
            },
            {
              "partition": "foo-1",
              "size": 196,
              "offsetLag": 0,
              "isFuture": false
            }
          ],
          "error": null,
          "logDir": "/var/lib/kafka/data"
        }
      ]
    }
  ],
  "version": 1
}
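
The summing step can be scripted. Below is a minimal Python sketch, assuming the kafka-log-dirs output has been captured as text and that the JSON document follows the status lines as in the output above. The names topic_log_bytes and SAMPLE are made up for illustration, and SAMPLE is an abbreviated copy of the output above.

```python
import json

# Abbreviated copy of the kafka-log-dirs output shown above (illustrative only).
SAMPLE = (
    "Querying brokers for log directories information\n"
    "Received log directory information from brokers 1\n"
    + json.dumps({
        "version": 1,
        "brokers": [{
            "broker": 1,
            "logDirs": [{
                "logDir": "/var/lib/kafka/data",
                "error": None,
                "partitions": [
                    {"partition": "foo-4", "size": 588, "offsetLag": 0, "isFuture": False},
                    {"partition": "foo-5", "size": 0, "offsetLag": 0, "isFuture": False},
                    {"partition": "foo-0", "size": 392, "offsetLag": 0, "isFuture": False},
                    {"partition": "foo-1", "size": 196, "offsetLag": 0, "isFuture": False},
                ],
            }],
        }],
    })
)


def topic_log_bytes(cli_output: str, topic: str) -> int:
    """Sum the "size" field over every reported partition of `topic`."""
    # Skip the status lines: the JSON document starts at the first "{".
    report = json.loads(cli_output[cli_output.index("{"):])
    total = 0
    for broker in report["brokers"]:
        for log_dir in broker["logDirs"]:
            for part in log_dir["partitions"]:
                # Partition names look like "<topic>-<number>"; split on the last "-"
                # so topics that contain dashes are still matched exactly.
                name, _, _number = part["partition"].rpartition("-")
                if name == topic:
                    total += part["size"]
    return total


if __name__ == "__main__":
    print(topic_log_bytes(SAMPLE, "foo"))  # 588 + 0 + 392 + 196 = 1176
```

If the topic has a replication factor above 1, each replica is reported by the broker that hosts it, so this sum counts every copy.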

Note that this is what the Kafka admin client’s describeLogDirs method returns. It’s the byte size on disk of a partition’s .log segment files. It doesn’t include the other supporting data on disk, e.g., index and timeindex files. To get the total size needed to support a topic, you’d need broker access so that you can run du. E.g., for a topic foo, you’d go to the Kafka log directory on every broker and run:

$ du -ch foo-*
12K	foo-0
12K	foo-1
8.0K	foo-2
8.0K	foo-3
12K	foo-4
8.0K	foo-5
60K	total
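
To see how that total splits between .log data and the supporting files, something like the following Python sketch could be run on a broker. It is only a sketch under assumptions: topic_disk_usage is a hypothetical helper, the log directory path is taken from the kafka-log-dirs output above, and st_blocks is assumed to count 512-byte units as it does on Linux. It matches directories named exactly <topic>-<number>, so unlike the foo-* glob it will not pick up another topic whose name merely starts with "foo-".

```python
import os
import re
from collections import defaultdict

LOG_DIR = "/var/lib/kafka/data"  # assumption: the logDir reported by kafka-log-dirs


def topic_disk_usage(log_dir: str, topic: str) -> dict:
    """Return {file extension: (apparent_bytes, allocated_bytes)} for one topic."""
    partition_dir = re.compile(rf"^{re.escape(topic)}-\d+$")
    usage = defaultdict(lambda: [0, 0])
    for entry in os.scandir(log_dir):
        if not (entry.is_dir() and partition_dir.match(entry.name)):
            continue
        for f in os.scandir(entry.path):
            if not f.is_file():
                continue
            st = f.stat()
            # Group by extension (.log, .index, .timeindex, ...); files without
            # an extension are grouped under their full name.
            ext = os.path.splitext(f.name)[1] or f.name
            usage[ext][0] += st.st_size
            # Allocated space is what du reports; assumed 512-byte blocks.
            usage[ext][1] += getattr(st, "st_blocks", 0) * 512
    return {ext: tuple(sizes) for ext, sizes in usage.items()}


if __name__ == "__main__" and os.path.isdir(LOG_DIR):
    for ext, (apparent, allocated) in sorted(topic_disk_usage(LOG_DIR, "foo").items()):
        print(f"{ext:>24}  apparent={apparent:>10}  allocated={allocated:>10}")
```

The allocated figures will not add up exactly to the du total, since du also counts the blocks used by the directories themselves.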
