Incremental Backups of KVM virtual machines on LVM with Borgbackup

Hello there.

I am currently running two home servers with KVM for virtualization, as I have mentioned a few times already. For the vdisks I am using LVM, which should give me slightly better performance, since the VM's I/O does not have to go through an additional filesystem layer (as it would with qcow2 image files). But the main reason is that I like to work with LVM.

Anyway, I have been looking for a reliable way to back up my VMs for a very long time. A few years ago, I stumbled onto “borgbackup”. As the website says,

“borgbackup is a deduplicating archiver with compression and encryption”.

This is an excellent backup software. Fast, flexible and reliable.

Let me try to explain the deduplication part. It is a method of removing redundant data and only storing a single instance of that.

Here is a simple example. Let’s say you have 5 identical Linux VMs. Most of the data blocks on their disks are the same, and deduplication keeps only one instance of each duplicate block. This means identical data is only stored once in the backup, no matter how many VMs contain it, which saves a ton of space.

Maybe a bad example, but there are already excellent explanations of deduplication. Just search for “what is deduplication” and you will find it.
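
If you want to see the idea in action, here is a toy sketch. It is not how borg works internally: I am assuming fixed 4 KiB chunks here, while borg uses content-defined chunking. The function name count_chunks is made up for this example. It splits the given files into chunks, hashes every chunk and compares the total number of chunks with the number of unique ones.

```shell
#!/usr/bin/env bash
# Toy illustration of deduplication (NOT borg's real algorithm).
# Assumption: fixed 4 KiB chunks; borg uses content-defined chunking.

count_chunks() {
    local workdir total unique i=0 f
    workdir=$(mktemp -d) || return 1
    for f in "$@"; do
        i=$((i + 1))
        # Split each file into 4096-byte pieces with a per-file prefix.
        split -a 6 -b 4096 "$f" "$workdir/f${i}_"
    done
    # Every piece counts towards the total ...
    total=$(find "$workdir" -type f | wc -l)
    # ... but only distinct hashes would have to be stored.
    unique=$(find "$workdir" -type f -exec sha256sum {} + | awk '{print $1}' | sort -u | wc -l)
    rm -rf "$workdir"
    echo "total=$((total)) unique=$((unique))"
}
```

Calling it with two identical disk images (for example count_chunks vm1.img vm2.img, both hypothetical files) reports twice as many total chunks as unique chunks, which is exactly the part a deduplicating backup does not have to store again.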

Let’s get to the topic: incremental backups for KVM VMs. I wrote another script for this, which is very specific to my system and not that great. I will attach it at the bottom if you care.

But the main thing I want to show you is how to create a (somewhat) consistent backup of your VMs. Keep in mind that you should flush and lock your databases if you want to back those up. I actually do not do this, but I never had any issues and am not running any critical databases, to begin with.

Let’s go through the steps. I am using Rocky Linux 8 for this.

Installation of Borgbackup

First, we need to install the EPEL repository and enable “powertools”.

rockylinux :: ~ » sudo dnf install epel-release -y
rockylinux :: ~ » sudo dnf config-manager --set-enabled powertools

Now we can install “borgbackup”. If you get an error message saying that “nothing provides python3.6dist(packaging)” then you probably have not enabled powertools.

rockylinux :: ~ » sudo dnf install borgbackup -y

OK, the installation is done. Next, we need to create a repository for the backups. For the encryption, I will use the repokey mode. This will store the key in the repository, so I will only need the passphrase to access the backup. But it is still recommended to export the key.

Creating the repository and exporting the key

I will use an SMB share I mounted on /mnt/backup.

rockylinux :: ~ » sudo borg init --encryption=repokey-blake2 /mnt/backup/borg-backup-repo
Enter new passphrase: 
Enter same passphrase again:
Do you want your passphrase to be displayed for verification? [yN]: y
Your passphrase (between double-quotes): "SECURE-PASSWORD"
Make sure the passphrase displayed above is exactly what you wanted.

Let’s export the key. Keep this file in a safe place.

rockylinux :: ~ » sudo borg key export /mnt/backup/borg-backup-repo ~/borg-backup-repo.key
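
For unattended runs (a cronjob, for example) you probably do not want to type the passphrase every time. A minimal sketch, assuming the passphrase is stored in a file that only root can read. The path /root/.borg-passphrase is just a placeholder I made up. BORG_REPO and BORG_PASSCOMMAND are environment variables borg reads.

```shell
# Assumption: the passphrase lives in a root-only file at this hypothetical path.
export BORG_REPO=/mnt/backup/borg-backup-repo
export BORG_PASSCOMMAND='cat /root/.borg-passphrase'

# With BORG_REPO set, the repository part can be left out, e.g.:
#   borg create ::LOGS-BACKUP-$(date +%F_%R:%S) /var/log
#   borg list
```

Keep in mind that sudo usually resets the environment, so this is meant for a root shell or a script that is run by root directly.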

Creating the first backup

Now that we are done with the preparation, we can start to back up files. I will demonstrate this by backing up the “logs” folder.

// (date +%F_%R:%S generates an output like this: 2021-11-27_18:36:13)

rockylinux :: ~ » sudo borg create --progress --info --list --stats /mnt/backup/borg-backup-repo::LOGS-BACKUP-$(date +%F_%R:%S) /var/log
Enter passphrase for key /mnt/backup/borg-backup-repo: 
Creating archive at "/mnt/backup/borg-backup-repo::LOGS-BACKUP-2021-11-27_18:36:13"
A /var/log/anaconda/anaconda.log                                                                                                   
A /var/log/anaconda/syslog
A /var/log/anaconda/X.log
...
A /var/log/hawkey.log
d /var/log
------------------------------------------------------------------------------                                                     
Archive name: LOGS-BACKUP-2021-11-27_18:36:13
Archive fingerprint: b0d9894e19dc050020035a671437d474fc149a0feddbb064b2e325d81c2cdfcb
Time (start): Sat, 2021-11-27 18:36:15
Time (end):   Sat, 2021-11-27 18:36:15
Duration: 0.15 seconds
Number of files: 28
Utilization of max. archive size: 0%
------------------------------------------------------------------------------
                       Original size      Compressed size    Deduplicated size
This archive:                4.78 MB            793.63 kB            793.63 kB
All archives:                4.78 MB            793.63 kB            793.63 kB

                       Unique chunks         Total chunks
Chunk index:                      26                   26
------------------------------------------------------------------------------

I will create another backup to show what deduplication does. The interesting part is the “Deduplicated size” column.

rockylinux :: ~ » sudo borg create --progress --info --list --stats /mnt/backup/borg-backup-repo::LOGS-BACKUP-$(date +%F_%R:%S) /var/log
...
------------------------------------------------------------------------------                                                       
Archive name: LOGS-BACKUP-2021-11-27_18:39:19
Archive fingerprint: e055a7aafd4a4d441682352ee95b0ba9ae4ad4cde224cc1aa3d51f50afa58f99
Time (start): Sat, 2021-11-27 18:39:21
Time (end):   Sat, 2021-11-27 18:39:21
Duration: 0.09 seconds
Number of files: 28
Utilization of max. archive size: 0%
------------------------------------------------------------------------------
                       Original size      Compressed size    Deduplicated size
This archive:                4.78 MB            793.89 kB             32.87 kB
All archives:                9.56 MB              1.59 MB            826.50 kB

                       Unique chunks         Total chunks
Chunk index:                      29                   52
------------------------------------------------------------------------------

We can see that the “Compressed size” is almost identical for both backups (793.63 kB vs. 793.89 kB). Those are basically the same files, so there shouldn’t be much of a difference. Under “Deduplicated size” we see that the new archive only added 32.87 kB to the repository, because almost all of its chunks were already stored by the first backup.

Let me show you an example of my actual backup I am running every week.

KVM :: ~ » borg info borg-kvm-router::kvm-router-backup-2021-11-15_18:00:01
Comment: 
Hostname: kvm-router
Username: root
Time (start): Mon, 2021-11-15 18:01:01
Time (end): Mon, 2021-11-15 18:25:24
Duration: 24 minutes 23.13 seconds
Number of files: 4
Command line: /bin/borg create --progress --info --list --stats --read-special /mnt/VM-BACKUP/BORG-kvm-router::kvm-router-backup-2021-11-15_18:00:01 /dev/VM/pihole-snap /dev/VM/opnsense-snap /dev/VM/nextcloud-snap /dev/VM/paperless-snap
Utilization of maximum supported archive size: 0%
------------------------------------------------------------------------------
                       Original size      Compressed size    Deduplicated size
This archive:               83.75 GB             49.49 GB              2.78 GB
All archives:              416.62 GB            245.36 GB             64.35 GB

                       Unique chunks         Total chunks
Chunk index:                   36802               145164

Here we can see that the “original size” would be around 416GB. The compression lowered it to 245GB and deduplication down to only 64GB.
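
To put these numbers into perspective, here is the arithmetic as a small sketch. The helper name percent is my own invention; it just divides two of the values from the “All archives” line above.

```shell
# percent PART WHOLE -> PART as a percentage of WHOLE, one decimal place
percent() {
    LC_ALL=C awk -v part="$1" -v whole="$2" 'BEGIN { printf "%.1f\n", part / whole * 100 }'
}

percent 245.36 416.62   # compressed vs. original size    -> 58.9
percent 64.35 416.62    # deduplicated vs. original size  -> 15.4
```

So the repository only takes up roughly 15% of the space the raw volumes of all archives would need.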

Pruning Backups

Of course, I do not have endless disk space, so I want to prune older backups.

rockylinux :: ~ » sudo borg prune --stats --list --keep-hourly=3 --keep-daily=5 --keep-weekly=3 --keep-yearly=1 /mnt/backup/borg-backup-repo/
Keeping archive: LOGS-BACKUP-2021-11-27_18:42:25      Sat, 2021-11-27 18:42:27 [39b0edb3fba5c65fe128a2b23bb122424e8ca7a4a3eb78b58fef3565ec60c6f2]
Pruning archive: LOGS-BACKUP-2021-11-27_18:39:19      Sat, 2021-11-27 18:39:21 [e055a7aafd4a4d441682352ee95b0ba9ae4ad4cde224cc1aa3d51f50afa58f99] (1/2)
Pruning archive: LOGS-BACKUP-2021-11-27_18:36:13      Sat, 2021-11-27 18:36:15 [b0d9894e19dc050020035a671437d474fc149a0feddbb064b2e325d81c2cdfcb] (2/2)
------------------------------------------------------------------------------
                       Original size      Compressed size    Deduplicated size
Deleted data:               -9.56 MB             -1.59 MB            -65.48 kB
All archives:                4.80 MB            794.76 kB            794.76 kB

                       Unique chunks         Total chunks
Chunk index:                      26                   26
------------------------------------------------------------------------------
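
If you are not sure what your keep rules will do, you can preview the result first. A small sketch, assuming it is run as root. The wrapper name prune_preview is made up, the borg options are the same as above plus --dry-run.

```shell
# Assumption: run as root (or via sudo) against the repository from the examples.
prune_preview() {
    local repo="$1"
    # --dry-run only reports what would be kept or pruned; nothing is deleted.
    borg prune --dry-run --list \
        --keep-hourly=3 --keep-daily=5 --keep-weekly=3 --keep-yearly=1 \
        "$repo"
}
```

Usage would be prune_preview /mnt/backup/borg-backup-repo. One note in case you are on a newer version than I used here: with Borg 1.2 and later, the disk space is only freed after you also run borg compact on the repository.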

Restoring Backups

Let’s say I deleted a file by accident, or it got corrupted. Here is how we can restore it.

First I want to take a look at the content of the backup.

// List the content
rockylinux :: ~ » sudo borg list /mnt/backup/borg-backup-repo::LOGS-BACKUP-2021-11-27_18:42:25
drwxr-xr-x root   root          0 Sat, 2021-11-27 18:42:21 var/log
drwxr-xr-x root   root          0 Thu, 2021-11-25 21:28:45 var/log/anaconda
-rw------- root   root      37372 Thu, 2021-11-25 21:28:45 var/log/anaconda/anaconda.log
-rw------- root   root     629987 Thu, 2021-11-25 21:28:45 var/log/anaconda/syslog
-rw------- root   root      30733 Thu, 2021-11-25 21:28:45 var/log/anaconda/X.log
-rw------- root   root      10904 Thu, 2021-11-25 21:28:45 var/log/anaconda/program.log

Now let’s restore the “program.log” file. The extract command always restores into the folder you are executing it from. So if we want the file back in its original path, we have to change to the root folder first. This will overwrite the original file without asking for confirmation.

rockylinux :: ~ » cd /
rockylinux :: / » sudo borg extract /mnt/backup/borg-backup-repo::LOGS-BACKUP-2021-11-27_18:42:25 var/log/anaconda/program.log
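
If you would rather look at the file before overwriting anything, you can extract into a scratch folder instead. A minimal sketch, assuming it is run as root. The function name restore_to_scratch and the default folder /tmp/borg-restore are placeholders I picked.

```shell
# Assumption: /tmp/borg-restore is a hypothetical scratch directory.
restore_to_scratch() {
    local archive="$1" member="$2" scratch="${3:-/tmp/borg-restore}"
    mkdir -p "$scratch" || return 1
    # Subshell, so the working directory of the caller stays untouched.
    ( cd "$scratch" && borg extract "$archive" "$member" )
}
```

After restore_to_scratch /mnt/backup/borg-backup-repo::LOGS-BACKUP-2021-11-27_18:42:25 var/log/anaconda/program.log the file ends up under /tmp/borg-restore/var/log/anaconda/, where you can compare it with the original and copy it over by hand.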

OK. These were some of the basics of “borg”. Let’s take a look at how to back up the VMs.

Backing up virtual machines (KVM)

Freezing the filesystem

To get a consistent backup, we will use the snapshot feature of LVM. This allows us to create a “frozen” state of the data. We want to avoid data being changed during the backup process, which would cause an inconsistent state, which in turn could corrupt files.

I will use my server for the example. First, let me show you the volumes.

KVM :: ~ » sudo lvs
  LV         VG              Attr       LSize    Pool Origin Data%  Meta%  Move Log... 
  EXT-BACKUP EXTERNAL-BACKUP -wi-a----- <465.76g                                                    
  nextcloud  VM              -wi-ao----   32.00g                                                    
  opnsense   VM              -wi-ao----    9.00g                                                    
  paperless  VM              -wi-ao----   20.00g                                                    
  pihole     VM              -wi-ao----   17.00g                                                    
  root       centos          -wi-ao----   30.00g                                                    
  swap       centos          -wi-a-----   <3.88g                                                    

We can see the “LV (logical volumes)” in the “VG (volume group)”. The LVs under “VM” are assigned to my virtual machines. I will use “paperless” as an example. My VMs have the same names as their volumes.

First, we will freeze the filesystems. Note that this only works if the QEMU guest agent is installed and running inside the VM.

KVM :: ~ » sudo virsh domfsfreeze paperless
Froze 2 filesystem(s)

Creating the snapshot

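Next, we will create the snapshot using LVM. This command creates a snapshot with a max size of 5GB. The snapshot only has to hold the blocks that change on the original volume while it exists, so this should be enough for the backup to finish. If it fills up completely, the snapshot becomes invalid.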

// -s tells lvcreate to create a snapshot
KVM :: ~ » sudo lvcreate -L5G -s -n paperless-snap VM/paperless

Let’s take another look at the LVs.

KVM :: ~ » sudo lvs
  LV             VG              Attr       LSize    Pool Origin    Data%  Meta%  Move Log.. 
  EXT-BACKUP     EXTERNAL-BACKUP -wi-a----- <465.76g                                                       
  nextcloud      VM              -wi-ao----   32.00g                                                       
  opnsense       VM              -wi-ao----    9.00g                                                       
  paperless      VM              owi-aos---   20.00g                                                       
  paperless-snap VM              swi-a-s---    5.00g      paperless 0.00                                   
  pihole         VM              -wi-ao----   17.00g                                                       
  root           centos          -wi-ao----   30.00g                                                       
  swap           centos          -wi-a-----   <3.88g                                                       
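
The “Data%” column shows how full the snapshot is. For longer backups it can be worth keeping an eye on it. Here is a sketch of how that could look. The function names and the 80% default limit are my own choices; the only real command used is lvs with its data_percent field, and I assume it is run as root.

```shell
# snapshot_usage_ok USED_PERCENT LIMIT_PERCENT -> exit 0 while usage is below the limit
snapshot_usage_ok() {
    LC_ALL=C awk -v used="$1" -v limit="$2" 'BEGIN { exit !(used + 0 < limit + 0) }'
}

# Assumption: run as root; reads the Data% value of the snapshot via lvs.
check_snapshot() {
    local lv="$1" limit="${2:-80}" used
    used=$(LC_ALL=C lvs --noheadings -o data_percent "$lv" | tr -d ' ')
    [ -n "$used" ] || return 1
    if snapshot_usage_ok "$used" "$limit"; then
        echo "$lv: ${used}% used"
    else
        echo "$lv: ${used}% used, above ${limit}%" >&2
        return 1
    fi
}
```

check_snapshot VM/paperless-snap would print the current usage and return a non-zero exit code once the snapshot is more than 80% full.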

Thawing the filesystem

Now that we have a consistent snapshot we can thaw the filesystem.

KVM :: ~ » sudo virsh domfsthaw paperless
Thawed 2 filesystem(s)

Creating the backup

The volume is prepared. We can start the backup process. I will assume that I have the same backup repository mounted as in the first example.

KVM :: ~ » sudo borg create --progress --info --list --stats --read-special /mnt/backup/borg-backup-repo::paperless-$(date +%F_%R:%S) /dev/VM/paperless-snap

We could also add multiple volumes to the same backup. An example.

KVM :: ~ » sudo borg create --progress --info --list --stats --read-special /mnt/backup/borg-backup-repo::paperless-$(date +%F_%R:%S) /dev/VM/paperless-snap /dev/VM/pihole-snap /dev/VM/opnsense-snap

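After the backup, we get an overview. Once the backup is done, the snapshot is no longer needed and should be removed (with lvremove), otherwise it keeps filling up with every change on the original volume.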
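
The whole sequence (freeze, snapshot, thaw, backup, remove the snapshot) can be sketched as one function. This is not my actual script, just a minimal outline under a few assumptions: it runs as root, the VM has the same name as its LV, the volume group is “VM”, and the passphrase is handled elsewhere. The function name backup_vm is made up.

```shell
#!/usr/bin/env bash
# Sketch only: freeze, snapshot, thaw, back up, remove the snapshot.
# Assumptions: run as root, VM name == LV name, 5G snapshot as in the text.

backup_vm() {
    local vm="$1" vg="${2:-VM}" repo="${3:-/mnt/backup/borg-backup-repo}"
    local snap="${vm}-snap" rc=0

    virsh domfsfreeze "$vm" || return 1

    # Thaw again in any case, even if the snapshot could not be created.
    if ! lvcreate -L5G -s -n "$snap" "$vg/$vm"; then
        virsh domfsthaw "$vm"
        return 1
    fi
    virsh domfsthaw "$vm" || echo "warning: thawing $vm failed" >&2

    # --read-special makes borg read the content of the block device.
    borg create --stats --read-special \
        "${repo}::${vm}-$(date +%F_%R:%S)" "/dev/$vg/$snap" || rc=$?

    # The snapshot is only needed during the backup; remove it afterwards.
    lvremove -y "$vg/$snap" || rc=1
    return "$rc"
}
```

A call like backup_vm paperless would then run through all the steps shown above. The filesystems are only frozen for the short moment it takes to create the snapshot, not for the whole backup.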

Restoring a virtual machine (KVM)

We have a backup now. Great. Quite useless if we don’t know how to restore it. 🙂

Let’s take a look at the restore process. If your backup repository has multiple VMs, then you have to single out the volume you want to restore and pipe it into “dd”.

This will of course overwrite the volume, so make sure the VM is shut down before you start.

KVM :: ~ » sudo borg extract --stdout /mnt/backup/borg-backup-repo::kvm-router-backup-2021-05-07_18:00:01 dev/VM/paperless-snap | sudo dd of=/dev/VM/paperless bs=10M

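We could also create a new logical volume and restore it there. The new volume has to be at least as large as the original one (20G in the case of “paperless”).

KVM :: ~ » sudo lvcreate -L 20G -n paperless-restore VM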
  Logical volume "paperless-restore" created.
KVM :: ~ » sudo borg extract --stdout /mnt/backup/borg-backup-repo::kvm-router-backup-2021-05-07_18:00:01 dev/VM/paperless-snap | sudo dd of=/dev/VM/paperless-restore bs=10M
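
Since restoring onto a volume that is in use by a running VM is a bad idea, a small safety check does not hurt. Here is a sketch of how I would wrap the restore. The function name restore_lv is made up, it assumes bash and root privileges, and it relies on “shut off” being the state that virsh domstate reports for a stopped VM.

```shell
#!/usr/bin/env bash
# Sketch only. Assumptions: run as root; the optional 4th argument is the
# name of the VM that uses the target volume.
restore_lv() {
    local archive="$1" member="$2" target="$3" domain="${4:-}"
    if [ -n "$domain" ] && [ "$(LC_ALL=C virsh domstate "$domain")" != "shut off" ]; then
        echo "refusing to restore: $domain is not shut off" >&2
        return 1
    fi
    # pipefail, so a failing borg is not hidden by a successful dd.
    ( set -o pipefail
      borg extract --stdout "$archive" "$member" | dd of="$target" bs=10M )
}
```

For the first example this would be restore_lv /mnt/backup/borg-backup-repo::kvm-router-backup-2021-05-07_18:00:01 dev/VM/paperless-snap /dev/VM/paperless paperless. When restoring into a fresh volume like paperless-restore, the last argument can simply be left out.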

That’s pretty much it. As I mentioned earlier, I created a script to automate most of this (the repo creation is still a manual process). There are way better scripts out there, but I like to create these things myself if I can. 😉

(Bonus) Script

I know I said this about every script, but this one is quite bad… but it’s been working for me for years.

Usage:

KVM :: ~ » kvm-backup-borg.sh -h
Options:
        -h      Help
        -g      LVM Groupname in which the VM is
        -p      Where to Backup the Virtual Mashine
        -R      Backup Repo path
        -r      Delete backups older than X Weeks
        -s      Delete backups older than X hours
        -d      Delete backups older than X Days
        -M      Delete backups older than x months
        -l      List VMs and Groups
        -L      Logging Path
        -n      notification e-mail address
KVM :: ~ » kvm-backup-borg.sh -n [E-MAIL Address] -g [VOLUME GROUP] -M [PRUNE OLDER THAN x MONTHS] -r [OLDER THAN WEEKS] -d [OLDER THAN DAYS] -s [OLDER THAN HOURS] -R [BACKUP REPO] -p [BACKUP-PATH] -L [LOG PATH]

Here is an example I use as a cronjob.

KVM :: ~ » /root/vm-backup/kvm-backup-borg-all.sh -n email-address -g VM -M 0 -r 2 -d 3 -s 0 -R /mnt/VM-BACKUP/BORG- -p /mnt/VM-BACKUP/ -L /var/log/backup/ > /var/log/backup/kvm-backup-all-$(date +\%F).log 2>&1

That’s it… I have been using borgbackup for around 3 years now, if not longer, and never had any problems with it during that time. I am still very happy with this solution.
