Hi, in order to migrate my production system to Qubes 4.1, I am doing a full backup of the VMs using the export feature.
Gzip is at 90% CPU load while the system is 86% idle,
so it looks like only 1 of 12 cores is compressing the huge tarball.
This should be improved, as Qubes users tend to have many cores and a lot of data.
300 GB versus 500 GB on a NAS makes a difference, so compression is worth using if offered.
So please use multithreaded gzip or some other compression with multithreading here.
This also matters for VM import in Qubes R4.1.
Thanks for Qubes, it is hot.
BTW, will there be support for an OpenBSD (firewall pf :-) ) VM?
The speed of backup compression also depends on the data being backed up, in my experience. When trying to back up a large amount of data that's already compressed, it seems to take much longer. In those cases, I prefer to use the --no-compress option, which greatly speeds up backup creation.
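To illustrate why already-compressed data gains little from another compression pass, here is a minimal sketch. It assumes a Linux shell with gzip and coreutils; `gzip_sizes` is a hypothetical helper name, and random bytes only stand in for already-compressed content such as video or archives.

```shell
# Hypothetical helper: print "<original bytes> <gzip-compressed bytes>" for a file.
gzip_sizes() {
    orig=$(wc -c < "$1")
    comp=$(gzip -c "$1" | wc -c)
    echo "$orig $comp"
}

tmpdir=$(mktemp -d)
# 1 MiB of random bytes stands in for already-compressed data.
head -c 1048576 /dev/urandom > "$tmpdir/random.bin"
# 1 MiB of zero bytes stands in for highly compressible data.
head -c 1048576 /dev/zero > "$tmpdir/zeros.bin"

echo "random: $(gzip_sizes "$tmpdir/random.bin")"
echo "zeros:  $(gzip_sizes "$tmpdir/zeros.bin")"
rm -rf "$tmpdir"
```

The random file should come out about as large as it went in, while the zero-filled file shrinks to a tiny fraction. In the first case the CPU time spent compressing buys nothing, which is where --no-compress pays off.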
Then I would like to ask the maintainers to use pigz as the default. Qubes lives from people having hardware with many cores, so when backing up VMs one could expect all available cores to be used to speed up the process.
So please make pigz the default when someone uses the GUI to back up the VMs, and also with the CLI.
Using gzip should then be a choice made by setting compressor=gzip.
Not everybody reads the man pages for every command, so the default should always be what is most usable by the masses. And when backing up 500 GB of VMs, a multi-core compressor should be the default.
The Facebook compressor should be a choice if the GUI gets a submenu for compressors, as this compressor can be tuned via compression preferences (speed vs. size vs. compatibility).
As a first step, pigz should be chosen as the default.
I think this already is one of the main considerations, but perhaps not in the way you're thinking. Gzip has been around for a long time and is widely available across many different kinds of systems, which makes the manual recovery of files from a Qubes backup without a Qubes system more likely to be viable in an emergency situation (e.g., when all you have access to is an old Ubuntu disc). I'm not sure if the same can be said of pigz. If not, then this might be one consideration against making it the default. However, it might make sense to offer the user a menu of compression filter choices (e.g., to support the choice of trading compatibility for performance).
I tried to put the selection of pigz into a backup profile by specifying compress-filter: pigz in /etc/qubes/backup/qubes-manager-backup.conf. It seems to work, but I have no idea whether pigz is actually used. In any case, this line is automatically removed when the backup operation finishes.
This can be checked by attempting to restore (or verify) the backup. It should complain: "Unusual compression filter 'pigz' found. Use --compression-filter=pigz to use it anyway." Which also means it's currently not possible to restore such backups through the GUI - you have to do it through the qvm-backup-restore CLI.
I had already gathered this from the preceding posts in the thread. My comment was made with this understanding already in mind. Just because something is a parallelized implementation of X doesn't mean it will be equally supported everywhere X is, especially in the sort of case I mentioned, where all you have on hand is some old Linux distro ISO that may predate the parallelized implementation of X.
Also, please note that an antagonistic tone is not conducive to productive discussion. I suggest reconsidering that approach.
As pigz implements gzip,
you should be able to use gzip as a restore filter.
Restore will be slow then, though.
So I strongly suggest using pigz for backup and restore as the default in the GUI and the CLI, as one wants to save time on both operations, backup and restore!
Also, the resulting compressed data is readable with gzip!
So in an emergency with only Slackware from 1996 or Solaris 7 with GNU tools at hand, one could still decompress the tarball.
It is better to use parallel implementations, as Qubes machines tend to have many cores.
So please implement pigz for VM backup and recovery as the default in the GUI and CLI, as it saves a lot of time, especially if one wants to do a clean install of a new Qubes because of bit rot or because of a security incident.
It is a technical discussion, don't try to include some "personal" or "politics" fnord, thanks a lot.
If you have 90 to 100% CPU load and 60% idle, then it is not pigz: a parallel implementation uses all available CPU cores, so there should be nearly 0% idle if pigz is used.
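Besides watching the idle percentage, one could check directly whether a compressor process with the expected name exists while the backup is running. The following is only a sketch: `is_running` is a hypothetical helper, it is Linux-only because it reads /proc, and it assumes the compressor shows up as a process named pigz (or gzip) in the place where you run it, presumably dom0.

```shell
# Hypothetical helper: succeed if any running process has exactly this command name.
# Linux-only: reads /proc/<pid>/comm, which holds the (truncated) executable name.
is_running() {
    for f in /proc/[0-9]*/comm; do
        # A process may exit between the glob expansion and the read; skip it.
        [ -r "$f" ] || continue
        if read -r name < "$f" && [ "$name" = "$1" ]; then
            return 0
        fi
    done
    return 1
}

# Run this in a second terminal while the backup is in progress.
for tool in pigz gzip; do
    if is_running "$tool"; then
        echo "$tool is running"
    else
        echo "$tool is not running"
    fi
done
```

If only gzip shows up during a backup that was supposed to use pigz, the filter setting was probably not picked up.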
Following that logic, when restoring we should not need to specify that the backup used pigz, correct? Having to remember that we used an alternative compression method, and which one, is a drawback. But if it does the same thing, one shouldn't have to. Correct?
So any compressed backup generated using pigz
should be (and will be) readable using gzip. Why?
Because pigz is a parallel implementation of the gzip algorithm.
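The compatibility claim is easy to check outside of Qubes, since pigz writes standard gzip-format output. The sketch below compresses a sample with pigz and decompresses it with plain gzip. `roundtrip_ok` is a hypothetical helper name, and the fallback to gzip when pigz is not installed exists only so the script still runs; in that case it demonstrates nothing about pigz.

```shell
# Use pigz when it is installed; otherwise fall back to gzip so the sketch still runs.
if command -v pigz >/dev/null 2>&1; then
    COMPRESSOR=pigz
else
    COMPRESSOR=gzip
fi

# Hypothetical helper: compress "$1" with the chosen compressor, decompress with
# plain gzip, and succeed only if the result matches the input byte for byte.
roundtrip_ok() {
    "$COMPRESSOR" -c "$1" | gzip -dc | cmp -s - "$1"
}

sample=$(mktemp)
# base64-encoded random bytes: compressible test data of about 1.4 MB.
head -c 1048576 /dev/urandom | base64 > "$sample"
if roundtrip_ok "$sample"; then
    echo "compressed with $COMPRESSOR, decompressed with gzip: identical"
else
    echo "round trip failed"
fi
rm -f "$sample"
```

This only covers the compression layer. A real Qubes backup also involves encryption and the backup header, so it does not replace an actual restore test.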
So as I need to cure some bit rot (and maybe an incident), I could volunteer to test it.
I will back up my VMs again on my NAS (plenty of space) using pigz. Then I will restore using gzip.
The transition script from 4.0 to 4.1 just dies utterly, because the template VMs were upgraded through many versions of Fedora and Debian respectively. Much stuff was installed, and now dependency hell has broken loose.
Also, having experienced nasty behaviour of some VMs in the context of using Firefox that may have been exploited somehow (weird disk activity, unusable PS/2), the best option is to simply install the shit again.
It works on my machine, a T7500 (see the HCL), and now I just need to migrate the data and do a fresh install on the production disk:
"A man's got to do what a man's got to do."
Just tried it out of curiosity: made a backup with qvm-backup --compress-filter=pigz, then restored by copy/pasting (verbatim) the recovery instructions: it works as expected, i.e. no problem decompressing pigz content with gzip -d.
I've also tested qvm-backup-restore: as @rustybird pointed out, it does need --compression-filter=pigz (because of "compression-filter=pigz" in backup-header; if pigz becomes the default filter it should ideally be shown as "gzip").
That said, to get back to the OP's point: on my T450s (2 cores) scrypt is the CPU hog, not gzip; there's zero difference between gzip and pigz:
time qvm-backup -y -p somefile --compress-filter=pigz /somepath fedora-dvm
[...]
real 0m23.295s
user 0m0.112s
sys 0m0.045s
time qvm-backup -y -p somefile /somepath fedora-dvm
[...]
real 0m23.717s
user 0m0.115s
sys 0m0.049s
(fedora-dvm's private volume size is ~250MB)
@luja: care to do a similar test as the one above, maybe with a larger private volume, and report your findings?
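To separate the compressor's share from scrypt's, it may also help to time gzip and pigz on their own, outside of qvm-backup. The following is a rough sketch, not a proper benchmark: `bench_compressors` is a hypothetical helper, the generated sample does not compress like a real VM image, and `date +%s` only has whole-second resolution, so the sample has to be large for the numbers to mean anything.

```shell
# Hypothetical helper: time gzip and pigz on the same generated sample.
# $1 = sample size in MiB (default 8; use something much larger for real numbers).
bench_compressors() {
    size_mb=${1:-8}
    sample=$(mktemp)
    # base64-encoded random bytes: compressible, but not trivially so.
    head -c $((size_mb * 1024 * 1024)) /dev/urandom | base64 > "$sample"
    for tool in gzip pigz; do
        if command -v "$tool" >/dev/null 2>&1; then
            start=$(date +%s)
            "$tool" -c "$sample" > /dev/null
            end=$(date +%s)
            echo "$tool: $((end - start)) s"
        else
            echo "$tool: not installed, skipped"
        fi
    done
    rm -f "$sample"
}

bench_compressors 8
```

On a machine with many cores, pigz should pull ahead as the sample grows. If the full qvm-backup run still takes the same time with either filter, the bottleneck is somewhere other than compression.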