Backing up the Kafka cluster data

whatsupbros · 4 February 2021 09:35

Hi Kafkadmins!

I have a simple question: what options are out there to back up and restore the data in our Kafka clusters?

Is creating a replica cluster the recommended way (as mentioned in the official docs), and the only way, to back up the data in the topics?

Or are there other, possibly third-party, options that you have already used and tested?

Or is it perhaps okay (which I doubt) to back up the cluster data just by copying the broker logs with something like rsync and then compressing the copy with zip or gzip?

Thanks for sharing your experience!

roadSurfer · 4 February 2021 12:44

At the moment I am using the latter approach, with bind volumes in a container going to a host-mounted network share, which is then backed up on a regular basis, runs on RAID, etc.

The one downside I see in this is that if there is a catastrophic failure, there won’t be automatic fail-over to some other replica and any restore will only be as good as the last backup.

Be interested to see what other folks are doing.

Edit: Thinking about this, if something takes out the entire storage array, then Kafka will be the least of our problems!

gkoenig · 4 February 2021 12:51

Hi @whatsupbros ,
the use of a dedicated replica/backup cluster is certainly an option, but it also needs to be operated and monitored, which adds quite some work, and you need the additional resources to run that cluster.
Another approach could be to use Kafka Connect and dump the data into an object store (S3 or compatible) or into an RDBMS; maybe you already have something available.
And always ensure that you have a tested and reliable restore process!
I’d not recommend the low-level rsync/gzip approach to copying data, because you are only copying the binary format of the partitions hosted by a particular broker. It can be a solution if you want to replace a single failed broker: you can spin up a new machine, start the broker with the same ID as the failed broker, and copy your data back.
In the end, as so often, it also depends on your use case. For example, if you want to be able to restore only certain topics, you need a backup at the “data” level (see the first two examples), not at the “storage” level (as with rsync/gzip).
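
As a rough illustration of the Kafka Connect route, the sketch below builds a sink connector definition and shows how it could be submitted to the Kafka Connect REST API. It assumes the Confluent S3 sink connector is installed on the Connect workers; the connector name, topic, bucket, region and Connect URL are placeholders, not values anyone in this thread is using.

```python
import json
import urllib.request


def s3_sink_config(name, topics, bucket, region, store_url=None):
    """Build a connector definition for the Confluent S3 sink connector.

    All argument values are placeholders chosen by the caller.
    """
    config = {
        "connector.class": "io.confluent.connect.s3.S3SinkConnector",
        "storage.class": "io.confluent.connect.s3.storage.S3Storage",
        # Store raw bytes so the backup does not depend on a schema.
        "format.class": "io.confluent.connect.s3.format.bytearray.ByteArrayFormat",
        "key.converter": "org.apache.kafka.connect.converters.ByteArrayConverter",
        "value.converter": "org.apache.kafka.connect.converters.ByteArrayConverter",
        "topics": ",".join(topics),
        "s3.bucket.name": bucket,
        "s3.region": region,
        "flush.size": "1000",  # records per object; tune for your volume
        "tasks.max": "1",
    }
    if store_url:
        # For S3-compatible stores, point the connector at a custom endpoint.
        config["store.url"] = store_url
    return {"name": name, "config": config}


def submit(connect_url, definition):
    """POST the definition to the Kafka Connect REST API (/connectors)."""
    request = urllib.request.Request(
        connect_url.rstrip("/") + "/connectors",
        data=json.dumps(definition).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())


# Placeholder values for illustration only.
definition = s3_sink_config("backup-orders", ["orders"], "kafka-backup", "eu-central-1")
print(json.dumps(definition, indent=2))
# submit("http://localhost:8083", definition)  # needs a running Connect worker
```

As far as I know, this connector writes only record values by default, so check its documentation for the settings that also store keys and headers before relying on it as a backup.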

HTH

whatsupbros · 4 February 2021 13:10

@roadSurfer, do you mean the rsync approach? Have you already tried to restore the cluster on a different machine after that?

I mean, probably one of the most interesting points here would be whether the cluster is in a consistent state, and if one can successfully start the restored cluster after this hot-copy of the Broker log-files.

whatsupbros · 4 February 2021 13:20

Totally agree with this point, and this is why I am also looking for alternatives.

Hmm, this is actually a really interesting approach. The complexity here would probably be in the opposite process, the restore. I think that is going to be even harder than backing up the data…
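
To make the restore direction a bit more concrete: a data-level backup can only be replayed if each stored record keeps enough metadata (topic, partition, timestamp, key, value). The sketch below is plain Python with no Kafka client; the `Record` type and the JSON-lines layout are made up for illustration. In a real setup the input to `backup` would come from a consumer and the output of `restore` would be fed to a producer.

```python
import base64
import io
import json
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional, TextIO


@dataclass(frozen=True)
class Record:
    """Hypothetical stand-in for a consumed Kafka record."""
    topic: str
    partition: int
    offset: int
    timestamp_ms: int
    key: Optional[bytes]
    value: Optional[bytes]  # None models a tombstone


def _encode(data: Optional[bytes]) -> Optional[str]:
    return None if data is None else base64.b64encode(data).decode("ascii")


def _decode(text: Optional[str]) -> Optional[bytes]:
    return None if text is None else base64.b64decode(text)


def backup(records: Iterable[Record], out: TextIO) -> int:
    """Write one JSON object per record; returns the number written."""
    count = 0
    for r in records:
        row = {
            "topic": r.topic,
            "partition": r.partition,
            "offset": r.offset,
            "timestamp_ms": r.timestamp_ms,
            "key": _encode(r.key),
            "value": _encode(r.value),
        }
        out.write(json.dumps(row) + "\n")
        count += 1
    return count


def restore(src: TextIO) -> Iterator[Record]:
    """Read the backup file back into records, in the order written."""
    for line in src:
        if not line.strip():
            continue
        row = json.loads(line)
        yield Record(
            topic=row["topic"],
            partition=row["partition"],
            offset=row["offset"],
            timestamp_ms=row["timestamp_ms"],
            key=_decode(row["key"]),
            value=_decode(row["value"]),
        )


original = [
    Record("orders", 0, 0, 1612431300000, b"k1", b"v1"),
    Record("orders", 0, 1, 1612431301000, None, b"\x00\xff"),
    Record("orders", 0, 2, 1612431302000, b"k1", None),
]
buffer = io.StringIO()
written = backup(original, buffer)
buffer.seek(0)
restored = list(restore(buffer))
print(written, restored == original)
```

Note that replaying records into a fresh cluster generally produces new offsets, so consumer group offsets would have to be handled separately; that part is not covered here.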

I listed it just as an option, because this is a standard approach to backing things up on Unix systems. I agree that it only backs up the data of one broker, but the same process could be applied to the other brokers as well, right?

This is a good point. When I talked about a “backup solution”, I meant something where you don’t have to think about the contents. You just back up and restore the data, and the result should be a cluster restored to a consistent state.

An example of what I mean is the RMAN utility for Oracle Database, which can create backup sets and can also restore and recover data past the point when the backup set was created, using the archived and redo logs.

roadSurfer · 4 February 2021 14:00

Yes @whatsupbros, I mean the rsync approach, although I don’t know which backup strategy is used for the share itself. I would imagine it’s using Z-send or something similar.
It works in a simple sense, but it may not fit all use cases.

We could, for example, push data into our MinIO cluster and use that as a backup. This is early days for us at the moment.

whatsupbros · 4 February 2021 14:19

Okay, thanks for your input. Yes, same here: we are currently evaluating everything. It has already been decided that we are giving it a try, but we are only at the very beginning of our journey.

That is why I have so many questions

roadSurfer · 4 February 2021 14:51

I came across the below as an example of how to push data into MinIO:

odedia · 15 February 2021 16:47

I wonder, would Kubernetes-native solutions like Velero work for clusters deployed to Kubernetes? Velero backs up the persistent volumes as well as the etcd control plane database to allow solutions such as backup and restore or data migration.

mmuehlbeyer · 16 February 2021 06:48

Hi @whatsupbros ,

some time ago I’ve stumbled over

Never used it in production but a short test was promising.

HTH

Kris · 16 October 2024 14:36

We found out that the person that created the open source kafka-backup solution didn’t have time to continue. It was also very limited.

I’m happy to show you our Kannika Armory product in a demo. You can already find all information on our kannika dot io website and even try it out.
