Strategies for Preventing Data Loss in Kafka KRaft Mode Due to /tmp Directory Clearance

I’m deploying Kafka in KRaft mode and utilizing the default log storage paths within the /tmp directory, specifically for various node configurations (/tmp/kraft-combined-logs/, /tmp/kraft-broker-logs/, /tmp/kraft-controller-logs/). Given the ephemeral nature of /tmp in Linux environments—where it’s cleared upon reboot or periodically—I’m concerned about the resilience of Kafka’s data, particularly in a production setting.

Context: My Kafka setup consists of three separate controllers and brokers. Despite the systemd unit configurations designed to mitigate risks (e.g., setting KAFKA_CLUSTER_ID and pre-formatting storage to ensure node identification even if /tmp is cleared), I’m uncertain about the implications of potential data loss for logs stored in /tmp.

Specifically, my questions are:

  1. What impact does clearing the /tmp directory have on Kafka’s operation in KRaft mode?
  2. Are there Kafka mechanisms or best practices to recover or rebuild logs if they are lost due to /tmp clearance?

Attempts to Solve:

  • Kept the default log directory paths to observe the implications before changing anything.
  • Applied systemd unit configurations for resilience, including pre-formatting storage with KAFKA_CLUSTER_ID to safeguard against file loss. Example below:
ExecStartPre=/bin/bash -c '/opt/kafka/bin/kafka-storage.sh format -t $KAFKA_CLUSTER_ID -c /opt/kafka/config/kraft/ --ignore-formatted'
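For reference, here is a fuller sketch of the kind of unit I'm working from (the unit name, user, and `server.properties` path are illustrative, not taken verbatim from my setup; only the `kafka-storage.sh` pre-format step is the part I described above):

```ini
# /etc/systemd/system/kafka.service  (illustrative sketch)
[Unit]
Description=Apache Kafka (KRaft mode)
After=network.target

[Service]
Environment=KAFKA_CLUSTER_ID=<your-cluster-id>
# Re-format storage on start if the log directory was wiped;
# --ignore-formatted makes this a no-op when metadata already exists.
ExecStartPre=/bin/bash -c '/opt/kafka/bin/kafka-storage.sh format -t $KAFKA_CLUSTER_ID -c /opt/kafka/config/kraft/server.properties --ignore-formatted'
ExecStart=/opt/kafka/bin/kafka-server-start.sh /opt/kafka/config/kraft/server.properties
Restart=on-failure

[Install]
WantedBy=multi-user.target
```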

What I add to the systemd unit might recover meta.properties, which contains critical identification data such as the metadata version, cluster ID, and node ID. But what about the other files and logs in that directory? What happens if they are lost?

Goal: I aim to understand potential risks and establish a recovery strategy for Kafka data stored in /tmp, ensuring the system’s stability and integrity in production.

Thank you for your guidance.

The auto-cleaned /tmp directory shouldn't be used for the log.dirs property in production, or even in a non-prod setting where losing data on reboot isn't acceptable. The default lends itself to getting started quickly and leaving no trace behind, but it's risky.
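The minimal fix is to point log.dirs at a persistent path owned by the Kafka user. The /var/lib/kafka path below is a common convention, not a Kafka default:

```properties
# server.properties — persistent storage instead of the /tmp default
log.dirs=/var/lib/kafka/kraft-combined-logs
```

Create the directory first (e.g. `mkdir -p /var/lib/kafka && chown kafka: /var/lib/kafka`), then run the `kafka-storage.sh format` step once against the updated config.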

You’ll lose data (topics and offsets). In a multi-node cluster the data might be recoverable from replicas, but that would be good fortune rather than a sound recovery practice. A single-node deployment would restart with a clean slate.

Replication from other nodes might be able to save the data. Beyond that, you might consider DR mechanisms where all data is replicated to another cluster, or a process to rebuild from source data. These wouldn't be best practices for mitigating the risks of the /tmp directory, though. Really, /tmp shouldn't be used in the first place, so that there's nothing to mitigate.
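As a belt-and-braces deployment check, a start-up script can refuse to proceed if the broker config still points log.dirs anywhere under /tmp. This is a sketch; the function name and usage are hypothetical, not part of Kafka:

```shell
#!/usr/bin/env bash
# Fail if log.dirs in a Kafka properties file points anywhere under /tmp.
check_log_dirs() {
  local cfg="$1" dirs
  # log.dirs may be a comma-separated list of paths
  dirs=$(grep -E '^log\.dirs=' "$cfg" | cut -d= -f2-)
  case ",$dirs," in
    *,/tmp/*) echo "WARNING: log.dirs under /tmp: $dirs"; return 1 ;;
    *)        echo "OK: log.dirs=$dirs";                  return 0 ;;
  esac
}
```

Calling `check_log_dirs /opt/kafka/config/kraft/server.properties` from an ExecStartPre= line (without the --ignore-formatted crutch) would catch the misconfiguration before any data is written.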