Qubes 4.0 Backup VMs slooow (gzip)

luja · February 21, 2022, 2:49pm

Hi, in order to migrate my production system to Qubes 4.1, I am doing a full backup of the VMs using the export feature.

gzip shows 90% CPU load while the system as a whole is 86% idle.
So it looks like only 1 of 12 cores is compressing the huge tarball.

This should be improved, as Qubes users have many cores and a lot of data.
300 GB versus 500 GB on a NAS makes a difference, so compression is worth using when it is offered.

So please use multithreaded gzip or some other compression with multithreading here.
This is also important for VM import in Qubes R4.1.

Thanks for Qubes, it is hot.

BTW,

Will there be support for openbsd (firewall pf:-) ) VM?

Cheers

luja

fsflover · February 22, 2022, 9:39am

Yes, if someone helps with it: Use OpenBSD for sys-net · Issue #5294 · QubesOS/qubes-issues · GitHub and Add First-Class Support for OpenBSD · Issue #4551 · QubesOS/qubes-issues · GitHub.


adw · February 25, 2022, 10:56pm

The speed of backup compression also depends on the data being backed up, in my experience. When trying to back up a large amount of data that’s already compressed, it seems to take much longer. In those cases, I prefer to use the --no-compress option, which greatly speeds up backup creation.

luja · February 27, 2022, 11:07am

This is not the point.

The point is to implement it properly so it uses all available CPUs.
Using a proper compression library is not so difficult, imho.

Cheers

luja

rustybird · February 27, 2022, 12:59pm

It’s possible to use alternative (de)compressors such as pigz, a parallelized version of gzip:

$ sudo qubes-dom0-update pigz
$ qvm-backup --compress-filter=pigz ...
$ qvm-backup-restore --compression-filter=pigz ...

adw · February 27, 2022, 6:31pm

Feel free to open an enhancement request.

luja · February 28, 2022, 3:09am

Then I would like to ask the maintainers to use pigz as the default. Qubes depends on people having hardware with many cores, so when backing up VMs one could expect all available cores to be used to speed up the process.
So please make pigz the default when someone uses the GUI to back up the VMs, and also with the CLI.
Using gzip should then be a choice made by setting compressor=gzip.

Not everybody reads the man pages for every command, so the default should always be whatever is most usable by the masses. And when backing up 500G of VMs, a multi-core compressor should be the default.
The Facebook compressor should be a choice if the GUI has a submenu for compressors, as this compressor can be adjusted through compression preferences (speed vs. size vs. compatibility).
As a first step, pigz should be chosen as the default.

Cheers,

luja

adw · February 28, 2022, 8:40am

I think this already is one of the main considerations, but perhaps not in the way you’re thinking. Gzip has been around for a long time and is widely available across many different kinds of systems, which makes the manual recovery of files from a Qubes backup without a Qubes system more likely to be viable in an emergency situation (e.g., when all you have access to is an old Ubuntu disc). I’m not sure if the same can be said of pigz. If not, then this might be one consideration against making it the default. However, it might make sense to offer the user a menu of compression filter choices (e.g., to support the choice of trading compatibility for performance).

luja · February 28, 2022, 8:56am

Hi, here you should *read* the manual!
Pigz is a parallel gzip implementation!
https://zlib.net/pigz/

Pigz is gzip, but parallel.

GWeck · February 28, 2022, 5:27pm

I tried to put the selection of pigz into a backup profile, by specifying compress-filter: pigz in /etc/qubes/backup/qubes-manager-backup.conf. It seems to work, but I have no idea whether pigz is used or not. Anyhow, this line is automatically removed when the backup operation finishes.

rustybird · March 1, 2022, 12:24pm

This can be checked by attempting to restore (or verify) the backup. It should complain: “Unusual compression filter ‘pigz’ found. Use --compression-filter=pigz to use it anyway.” Which also means it’s currently not possible to restore such backups through the GUI - you have to do it through the qvm-backup-restore CLI.

adw · March 1, 2022, 3:06pm

I had already gathered this from the preceding posts in the thread. My comment was made with this understanding already in mind. Just because something is a parallelized implementation of X doesn’t mean it will be equally supported everywhere X is, especially in the sort of case I mentioned, where all you have on hand is some old Linux distro ISO that may predate the parallelized implementation of X.

Also, please note that an antagonistic tone is not conducive to productive discussion. I suggest reconsidering that approach.

luja · March 2, 2022, 6:02am

I am quite sure you did not read about pigz.
Please evaluate the compressed data generated by pigz against gz.

luja · March 2, 2022, 6:17am

As pigz implements gz,
you shall be able to use gz as a restore filter.
Restore is slow then.

So I strongly suggest to use pigz for backup and restore as default in the gui and the cli, as one wants to save time on both operations, backup and restore!

Also, the resulting compressed data is readable with gz!
So in an emergency with only Slackware from 1996 or Solaris 7 with GNU tools at hand, one could still decompress the tarball.

It is better to use parallel implementations, as Qubes machines tend to have many cores.

So please implement pigz for VM backup and recovery as default in gui and cli, as it saves a lot of time, especially if one wants to do a clean install of a new qubes because of bit rot, or because of a security incident.

It is a technical discussion, don’t try to include some “personal” or “politics” fnord, thanks a lot.

luja

luja · March 2, 2022, 6:39am

You can run top in dom0.

If you see 90 to 100% CPU load on the compressor and 60% idle overall, then it is not pigz: a parallel implementation uses all available CPU cores, so there should be nearly 0% idle while pigz is running.

Also look at the processes running.
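
A minimal sketch of that process check, meant to be run in dom0 while a backup is in progress. It assumes `pgrep` and `nproc` are available; the function name `running_compressors` is made up for illustration.

```shell
#!/bin/sh
# Sketch: report which compressor processes are running right now.

running_compressors() {
    found=""
    for name in pigz gzip; do
        # pgrep -x matches the exact process name; exit status 0 means a match.
        if pgrep -x "$name" >/dev/null 2>&1; then
            found="$found $name"
        fi
    done
    if [ -n "$found" ]; then
        # Strip the leading space before printing the list.
        echo "${found# }"
    else
        echo "none"
    fi
}

echo "cores available: $(nproc)"
echo "compressors running: $(running_compressors)"
```

If pigz shows up here, top would typically list it as a single process with a CPU figure well above 100%, since its threads are summed.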

deeplow · March 3, 2022, 10:17am

Can someone test that the Emergency backup recovery (v4) | Qubes OS does indeed work with a pigz-compressed backup?

Following that logic, when restoring we should not need to specify that pigz was used, correct? Having to remember that we used an alternative compression method, and which one, is a drawback. But if it does the same thing, one shouldn’t have to. Correct?

deeplow · March 3, 2022, 10:23am

A post was split to a new topic: Will there be support for openbsd (firewall pf:-) ) VM?

luja · March 3, 2022, 10:37am

Hi, this should be the case.

So any compressed backup generated using pigz
should be (and will be) readable using gzip. Why?
Pigz is a parallel implementation of the gzip algorithm.

So as I need to cure some bit rot (and maybe an incident), I could volunteer to test it.
I will back up my VMs again to my NAS (plenty of space) using pigz. Then I will restore using gz.

The transition script from 4.0 to 4.1 just dies utterly because the template VMs were upgraded through many versions of Fedora and Debian respectively. Much stuff was installed, and dependency hell has now broken loose.

Also, having experienced nasty behaviour from some VMs while using Firefox that may have been exploited somehow (weird disk activity, unusable PS/2), the best thing is to simply install the shit again.

It works on my machine (T7500, see the HCL), and now I just need to migrate the data and do a fresh install on the production disk:
“A man’s got to do what a man’s got to do.”

Cheers

taradiddles · March 7, 2022, 3:08pm

Just tried it out, out of curiosity: made a backup with qvm-backup --compress-filter=pigz, then restored by copy/pasting (verbatim) the recovery instructions: it works as expected, i.e. no problem decompressing pigz content with gzip -d.
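
The same round trip can be sketched outside of qvm-backup. This is only an illustration of the compatibility claim, not the recovery procedure itself: the sample data is made up, and the script falls back to gzip when pigz is not installed so that it still runs.

```shell
#!/bin/sh
# Sketch: compress with pigz (if installed), decompress with plain gzip.

workdir=$(mktemp -d)

# Hypothetical sample data standing in for a VM volume.
seq 1 200000 > "$workdir/sample.txt"

# Fall back to gzip when pigz is missing (assumption made for portability).
if command -v pigz >/dev/null 2>&1; then
    compressor=pigz
else
    compressor=gzip
fi

"$compressor" -c "$workdir/sample.txt" > "$workdir/sample.txt.gz"

# Decompress with stock gzip, as one would on a system without pigz.
gzip -dc "$workdir/sample.txt.gz" > "$workdir/restored.txt"

# Compare the restored file byte for byte with the original.
if cmp -s "$workdir/sample.txt" "$workdir/restored.txt"; then
    result=identical
else
    result=different
fi
echo "compressor=$compressor result=$result"

rm -rf "$workdir"
```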

I’ve also tested qvm-backup-restore: as @rustybird pointed out it does need --compression-filter=pigz (because of “compression-filter=pigz” in backup-header; if pigz becomes the default filter it should ideally be shown as “gzip”).

That said - to get back to the OP’s point: on my T450s (2 cores) scrypt is the CPU hog, not gzip ; there’s zero difference between gzip and pigz:

time qvm-backup -y -p somefile --compress-filter=pigz /somepath fedora-dvm
[...]
real    0m23.295s
user    0m0.112s
sys     0m0.045s

time qvm-backup -y -p somefile /somepath fedora-dvm
[...]
real    0m23.717s
user    0m0.115s
sys     0m0.049s

(fedora-dvm’s private volume size is ~250MB)
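
To see how much of the wall-clock time is due to the compressor alone, one could time gzip and pigz on the same input outside of qvm-backup, so that scrypt and the backup target are not part of the measurement. A rough sketch, where the helper `time_compressor` and the generated sample file are made up for illustration and the timing only has one-second resolution:

```shell
#!/bin/sh
# Sketch: time each compressor on the same input, outside of qvm-backup.

# Print the whole seconds a tool needs to compress a file to /dev/null.
time_compressor() {
    tool=$1
    file=$2
    start=$(date +%s)
    "$tool" -c "$file" > /dev/null
    end=$(date +%s)
    echo $((end - start))
}

# Hypothetical test data; substitute a large file that resembles real VM contents.
sample=$(mktemp)
seq 1 2000000 > "$sample"

tested=0
for tool in gzip pigz; do
    if command -v "$tool" >/dev/null 2>&1; then
        echo "$tool: $(time_compressor "$tool" "$sample") s"
        tested=$((tested + 1))
    else
        echo "$tool: not installed, skipped"
    fi
done

rm -f "$sample"
```

With an input this small both tools will likely report 0 or 1 seconds; a multi-gigabyte file is needed before a difference between one core and many becomes visible.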

@luja : care to do a similar test as the one above - maybe with a larger private volume - and report your findings ?

luja · April 19, 2022, 1:42am

Hi, I will do the test when migrating my production Qubes OS 4.0 to 4.1, which is a big pain that I have successfully been procrastinating on until now.