[Ideas] Why do people not make backups? How to improve that?

One workaround is:

Back up essential qubes only. You should be able to get this down to 20-40GB.

And if you have a lot of non-OS files in that 300GB, consider putting those in a Vault qube or something, and backing that up completely separately. Or even leaving those files on an HDD and not bringing them into Qubes.


If you put the files on an external drive, this just offloads the backup process to somewhere else.

I am curious about your setup because the last time I tried to automate my backups with wyng I ran into some issues. Do you back up your volumes locally first and then manually copy the backup archive to the off-site storage pool, or do you directly make your backups on the off-site storage pool with wyng --dest=ssh://...?

Dom0 directories I care about are compressed into a tar archive using rsync/tar, copied to a disposable (qvm-copy-to-vm), and then sent with scp (maybe I once used rsync, can’t remember) to the remote server; wyng uses the same disposable to back up over ssh to the remote server (the disposable is based on an AppVM which holds the ssh key). Both backups are automated through a script called daily via a cronjob (a log and notify-send notifications keep me aware of anything that may go wrong with the script and let me check which files were updated, etc.).
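Not the exact script described above, just a minimal sketch of how that dom0 side could look; `backup-dvm`, the remote host, and all paths are hypothetical:

```bash
#!/bin/bash
# Sketch of a daily dom0 backup script; qube, host, and path names
# are placeholders. Run from cron, e.g.: 0 3 * * * /home/user/bin/backup.sh
set -e
ARCHIVE="dom0-$(date +%F).tar.gz"

# 1. Archive the dom0 directories of interest.
tar czf "/tmp/$ARCHIVE" /home/user/bin /home/user/.config/autostart

# 2. Stream the archive through a fresh disposable (based on an AppVM
#    that holds the ssh key) straight to the remote server.
qvm-run --dispvm=backup-dvm --pass-io -- \
    "ssh user@backup.example.com 'cat > backups/$ARCHIVE'" \
    < "/tmp/$ARCHIVE" \
    && notify-send "dom0 backup OK" \
    || notify-send --urgency=critical "dom0 backup FAILED"

# 3. wyng backs up the VM volumes over ssh via the same disposable
#    (invocation elided; see the wyng docs for --dest=ssh:// syntax).
```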


It’s worth occasionally verifying that you can actually restore, and not only that the backup script is working.

You should also make sure you can still reach the backup if your machine dies: having both SSH access to the remote server and the backup passphrase is important.

An important note: my script also handles restore and search sessions. I test the available VMs monthly, but really only the critical ones; I don’t spend time testing the restore for all of them.
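For the built-in backup format, that kind of periodic restore test can be scripted along these lines; this is only a sketch, and `backup-qube`, the path, and the qube names are placeholders. `--verify-only` checks the archive without writing anything back, and the command prompts for the backup passphrase:

```bash
# Monthly verify of only the critical qubes, run in dom0.
qvm-backup-restore --verify-only -d backup-qube \
    /mnt/backups/qubes-backup work-data mail-data \
  && notify-send "backup verify OK" \
  || notify-send --urgency=critical "backup verify FAILED"
```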


I suppose while we’re at it: given that wyng makes its own small snapshots that linger for a time, I’ve found overall system performance benefits from a defrag of /var/lib/qubes every so often. All of which reminds me I should build that, as well as restore tests, into the script!
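Assuming /var/lib/qubes sits on an ext4 filesystem (the file-pool case), the defrag itself can be as small as this, run in dom0:

```bash
sudo e4defrag -c /var/lib/qubes   # -c only reports the fragmentation score
sudo e4defrag /var/lib/qubes      # then actually defragment if it looks bad
```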


This happened to me. I created a “dropbox” qube, which mounted a directory on my NAS. I had it set up so you could write to that directory, but not read from it. Unfortunately something happened in 4.2 (or Debian 12?) where you can’t write a file to a place you have no read access to: cp now tries to stat something on the destination drive, and if it can’t, cp fails. So my “dropping” of backup files onto the NAS was no longer working, and I didn’t realize it for months, because this was all happening in a bash script in a disposable. (Fortunately, my “internal” backups to a different SSD not part of the Qubes OS filesystem were still working, so I wouldn’t have been totally SOL unless I lost the entire machine.)
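For anyone who hits the same thing, one possible workaround (untested here, just the general idea) is to skip cp entirely: plain shell redirection only opens the target for writing and never needs to stat or list anything on the write-only share:

```bash
# Hypothetical write-only NAS mount where cp fails:
cat backup.tar.gz > /mnt/nas/dropbox/backup.tar.gz \
  || notify-send --urgency=critical "NAS drop failed"
```

And whatever the method, having the script check the exit status (as above) is what keeps a failure like this from going unnoticed for months.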


If you put the files on an external drive, this just offloads the backup process to somewhere else.

Yes, I meant don’t have them in Qubes (or on your machine). So say I have a huge-files AppVM with 250GB of files in it. I instead move those files to an HDD, and they are not in Qubes at all and not subject to Qubes backup wait times.

It’s not a perfect solution for a number of reasons but it would make Qubes backups faster.


I like this idea, but it would take me about 12 hours to understand and then apply it.

It’s easier for me, as a very low skill Qubes user, to just use the regular backup feature. However, since it requires an enormous amount of space and time to do a backup, it will just always be precariously infrequent.

If someone came up with a community guide for this, it would be more likely that I would try it, since I sometimes follow community guides.


Last night I tried the following:

When I went to bed, I started a backup with a few clicks. As soon as I saw the progress bar, I locked my screen.

When I logged on today, I was presented with the most beautiful welcome screen I think I’ve ever seen. The backup was complete.

I don’t know how long that backup took but it only took about 12 seconds of my time or attention. Found another Qubes ninja move.


There’s a “poor man’s” version of this, not as elegant but it’s vastly better than needlessly backing up templates, and the learning curve isn’t quite as steep. It’s basically what I did before I moved to using salt.

That is to put all of the installation commands into a bash script. And in doing that, you have two choices. In both cases the script must be kept and maintained on dom0.

  1. Fill the script with `qvm-run` commands; e.g., `qvm-run TEMPLATE --pass-io "sudo apt install firefox-esr"`. You can then run the script on dom0 once you have created the template. You can even put the template creation command (by cloning a different template, perhaps) in the script. Also, you could copy files (e.g., scripts, configuration files) to the template this way, meaning you have to keep copies of those files on dom0 as well. (A sketch of this method follows at the end of this post.)

  2. Copy the script to the template and run it there. Your commands would look something like `sudo apt install firefox-esr`. This means copying the script to the template, then issuing just one `qvm-run` command to run it. The script will be a bit cleaner that way, but you still have to copy any other files to the template from dom0.

Both methods might require copying configuration, setup, policy files, etc. from dom0 to the template. There are ways to make that less cumbersome, but I don’t want to make this explanation too complicated; just realize this has to be taken care of.

OH, and you really, really need to make sure all involved files, the script, config files, you name it, are…(wait for it)…backed up.
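A minimal sketch of method 1, with the template, package, and file names all made up for the example:

```bash
#!/bin/bash
# Dom0 script that rebuilds a template from scratch (method 1).
set -e

qvm-clone debian-12 my-template                 # create the template
qvm-run --pass-io my-template "sudo apt update"
qvm-run --pass-io my-template "sudo apt install -y firefox-esr"

# Push a config file kept in dom0 next to this script.
qvm-copy-to-vm my-template ./user.js

qvm-shutdown --wait my-template
```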


Is there a very poor man’s version of the poor man’s version?

I don’t know of anything more basic than bash scripts, honestly, other than typing it in at the command line by hand and from memory…which is worse.

Bash scripts can be as simple as a sequence of commands exactly how you’d type them into the command line anyway.
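For instance, the entire “script” can be nothing more than the commands saved to a file (method 2 style, run inside the template after copying it there; package names are just examples):

```bash
#!/bin/bash
# setup-template.sh -- exactly what you'd type by hand, just saved.
sudo apt update
sudo apt install -y firefox-esr keepassxc
```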

Here’s the short version:

  1. Create some offline qubes, e.g., one for each major area of your digital life.
  2. Keep all of your important data in these offline qubes.
  3. For your most frequent backups, back up all and only these offline qubes.

Done. Now you’re usually backing up only a small subset of qubes instead of your entire system. Since they’re offline qubes with only your important data, they take up much less space and back up much more quickly. (For example, they don’t include bulky-but-replaceable data like browser caches.)

Bonus tip: To make step 3 easier, enable the “Include in backups by default” option for all and only these offline qubes. Now you don’t even have to manually select the important qubes when you do backups, saving even more time.
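In dom0 terms, step 3 plus the bonus tip come down to something like this; the qube names are made up for the example:

```bash
# Mark only the offline data qubes for backup by default.
for q in finance-data projects-data personal-data; do
    qvm-prefs "$q" include_in_backups True
done
qvm-prefs work include_in_backups False   # everything else stays out

# The backup tool now pre-selects just those qubes; from the CLI:
qvm-backup /mnt/removable/qubes-backups
```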


I would make a list of what precisely you want to back up frequently. That is likely not a huge backup to do, especially if you’re willing to move some stuff around.

I do backups I call FULL irregularly and MIN very frequently. MIN contains the list I just suggested you make. And it’s about 1/5th the size of FULL.

I even do a super-fast backup that is just some key files, nothing OS-related. That’s a simple drag-and-drop to an external drive. It may be that this is all you need for your MIN.

I always knew Qubes had a backup system, but I didn’t really check it out until the day I booted up and none of my templates existed anymore :skull: Pretty sure you were there when I came crying about that one lol @solene

That was the day I learned mdadm (the disk tool, not the drug) can’t turn a box of second-hand laptop disks in questionable condition into a bulletproof NAS that backs itself up :clown_face:

This is not as easy as it sounds. For that, I would have to organize regular copying of my data between “ordinary” and “data” qubes, which means a lot of cumbersome scripts to write. And whenever something changes in the “ordinary” qubes concerning the data structure, the scripts must be adjusted. Not to mention verifying such a backup: how can I make sure that I didn’t miss something important?

Do you care about possible Spectre- and Meltdown-like CPU vulnerabilities? Do you use your Qubes OS normally while backing it up or do you Pause All VMs for Secure Operation?

(Also, some slower machines might be not very usable while backup is being made.)

I believe that this relatively simple feature is not sufficiently prioritized in Qubes development. It affects most new users, endangering their data due to unnecessarily complicated verification. It’s a very old issue, and I’ve complained about it before.

In any case it also makes the backups more cumbersome…


I don’t think that’s the only way to go about it. You could just keep each piece of data in its designated qube and work on it there. Also, it’s not a hard requirement that all the qubes be offline. That just simplifies the example. But email qubes, for example, are unlikely to be offline, and that’s fine. The idea is just to understand the general principles and apply them to your own situation in a way that makes sense for you.

Yes, why?

I usually shut down all qubes (except for the backup qube) while I back up.
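In dom0 that’s roughly a one-liner; `backup-qube` here is a placeholder for whatever qube holds the destination drive, and I believe `--exclude` pairs with `--all` on current releases, so treat this as a sketch:

```bash
# Shut down everything except the qube receiving the backup.
qvm-shutdown --all --wait --exclude=backup-qube
```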


This is why:

So how is it possible that the time you “spend on the backup process isn’t that long”? You can’t work on your machine for 5 hours; then you must return to it for a few seconds to start a verification and wait a few more hours. It means you can’t leave the backup running overnight (or you have to wake up in the middle of the night). You have to be somewhere around but can’t use the machine. To me this feels like wasting ~8 hours of work, unless you have another equivalent machine nearby.

And this is exactly my problem with the current lack of the automatic verification.
