[solved] backup qubes.tarwriter extremely slow

ckN6QwSZ · November 9, 2021, 8:26am

While backing up VMs I have read the handful of threads dealing with the slow backup mechanism of Qubes OS, especially

Indeed, my external USB-C backup HDD even goes to sleep for a long time (30 min at least!) while trying to back up a rather large HVM. So I used “top” in dom0 to examine which process is eating resources, and indeed

python3 -m qubes.tarwriter --override-name=vm41/root.img /dev/qubes_dom0/vm-hvm-arch-root

is on top in terms of CPU and MEM usage.

What does qubes.tarwriter do and why is that so inefficient?

PS: that happened to a lesser extent with other HVMs (20GB), too, and compression has always been switched off.

rustybird · November 9, 2021, 12:04pm

It’s backing up one of the VM’s volumes.

In the default installation layout, volumes are presented as block devices in an lvm_thin pool. GNU tar can’t back up the contents of block devices, so the backup system falls back on its own tar implementation named qubes.tarwriter, which reads the full volume data and then throws away stretches of just zeros. The sparser your VM volumes (i.e. when they’re using a small fraction of their available size), the more time is wasted on reading data that will never be written to the backup drive. If these periods of time are long enough, the idle backup drive might spin down.
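To make the "reads everything, then throws away the zeros" behaviour concrete, here is a minimal sketch of that kind of scan. It is not the actual qubes.tarwriter code; the function name find_data_extents and the 4096-byte chunk size are assumptions chosen for illustration. The point is that every byte of the volume has to pass through read(), even though only the non-zero stretches end up in the backup.

```python
import io

CHUNK = 4096  # hypothetical chunk size, not taken from qubes.tarwriter


def find_data_extents(stream, chunk_size=CHUNK):
    """Read the whole stream and return ([(offset, length), ...], total_size).

    Only chunks that contain at least one non-zero byte count as data.
    All-zero chunks are still read in full; they are just not recorded.
    """
    extents = []
    offset = 0
    start = None  # offset where the current data stretch began, if any
    zeros = bytes(chunk_size)
    while True:
        buf = stream.read(chunk_size)
        if not buf:
            break
        is_zero = buf == zeros[:len(buf)]
        if not is_zero and start is None:
            start = offset  # a data stretch begins here
        elif is_zero and start is not None:
            extents.append((start, offset - start))  # data stretch ended
            start = None
        offset += len(buf)
    if start is not None:
        extents.append((start, offset - start))
    return extents, offset


# Example: 8 KiB of zeros, 4 KiB of data, 4 KiB of zeros, 100 bytes of data.
image = io.BytesIO(b"\0" * 8192 + b"a" * 4096 + b"\0" * 4096 + b"b" * 100)
extents, total = find_data_extents(image)
```

With a mostly empty volume, the loop spends nearly all of its time on chunks that produce no output, which would match the long idle periods on the backup drive.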

GNU tar, which is used to back up regular files, e.g. in the Btrfs installation layout, can simply skip over the holes in the volume data without actually reading them. (Unfortunately, restoring is still inefficient in a similar way - for every type of pool / installation layout: There’s essentially a restore pipeline where the left side inflates any holes in the backed up data into possibly many gigabytes of zeros, and the right side has to read the stream and detect stretches of zeros in it…)
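For comparison, this is roughly how holes in a regular sparse file can be skipped without reading them. It is a sketch of the general SEEK_DATA / SEEK_HOLE technique on Linux, not GNU tar's actual implementation; the function name data_extents is made up for illustration. On a filesystem that does not track holes, the whole file is simply reported as one data extent.

```python
import errno
import os


def data_extents(path):
    """Return [(offset, length), ...] for the data regions of a regular file.

    Uses lseek() with SEEK_DATA / SEEK_HOLE, so holes are skipped via
    filesystem metadata instead of being read and compared against zeros.
    """
    extents = []
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        pos = 0
        while pos < size:
            try:
                start = os.lseek(fd, pos, os.SEEK_DATA)
            except OSError as e:
                if e.errno == errno.ENXIO:
                    break  # nothing but a hole up to the end of the file
                raise
            # There is always an implicit hole at end of file,
            # so this returns an offset no larger than the file size.
            end = os.lseek(fd, start, os.SEEK_HOLE)
            extents.append((start, end - start))
            pos = end
    finally:
        os.close(fd)
    return extents
```

The cost here scales with the number of data regions rather than with the nominal size of the file, which is why a sparse image file can be walked much faster than the same data exposed as a block device that must be read end to end.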

ckN6QwSZ · November 9, 2021, 1:03pm

Ahhh, thank you. Yes, that HVM image probably has a lot of zeros. So, it’s a trade-off: speed is sacrificed in order to leave out the zeros (which is not yet “compression”).

Wouldn’t mounting and rsync -av --progress source destination be an alternative? Duplicating the entire block device would eat disk space, but hey, time is more valuable.

rustybird · November 9, 2021, 1:45pm

I don’t think that leaving out the holes from the written data makes the backup slower. qubes.tarwriter scanning for zeros probably still takes less time than the alternative (don’t scan, but then actually write those zeros to the destination instead).

It’s just unfortunate that on LVM it’s reading+scanning the zeros at all. If it could just skip over them (like in regular files) then it would take roughly the same time to back up an LVM volume with 10 of 500 GB used as it would for one with 10 of 50 GB.

ckN6QwSZ · November 9, 2021, 2:24pm

Since it is recommended to have more than one backup, I rsync-ed two verified qubes-backup-2021-11-08Txxxxxx to another HDD, which is a lot faster. Cloning backups like this might be an alternative, at least for existing backups. Unfortunately, verifying Qubes backups also takes hours and does not report progress.

I have read
https://docs.openstack.org/cinder/train/admin/blockstorage-backup-disks.html
and

However, in my eyes those solutions are not for the faint of heart like me. I’m not very familiar with LVM thin provisioning, other than f…king up LVM pools or drives beyond repair. I see LVM’s advantages, but I don’t like to touch them.