How do you organize your backups?

spergynerd · May 9, 2021, 2:51pm

Before I discovered “Qubes OS” (it’s smart, and now I’m addicted), I was using Debian, and my backups were done with rsync to two external media and with “duplicati” to a remote server. The advantage was that my backups were incremental, so I didn’t have to do a full backup every time.

If I understand correctly, I have to make a full backup (of each VM) each time, which can quickly become time-consuming if my VMs occupy a lot of space.

I would like to have your point of view on how you organize your backups.

unman · May 9, 2021, 3:20pm

You don’t need to make a full Qubes backup every time.
The Qubes backup mechanism backs up the qube configuration and the
data that it contains.
There is no reason why you should not back up just the data using
whatever means you are comfortable with.

Since you have used rsync in the past, you can use that now.
You can attach a USB device to the qube, and rsync to that.
If you want to rsync to another qube, you could use an offline qube to
capture backups from many qubes - take a look at unman/qubes-sync on
GitHub (simple syncing between qubes over qrexec) for a suggestion of
how you might do this.
You can then write off the data archive to external media, or use an
online qube and duplicati.

Once you have this in place you can also run full Qubes backups when
you like.

behemothwerecat · May 9, 2021, 3:24pm

From Backup docs

A standard recommendation is to make backups at least weekly: three copies in two different formats, one off-site.

I am curious how others do the ‘two different formats’. I will continue to use the Qubes Backup app because I back up to two local external drives and the lack of de-duplication is not an issue for me, but for security I would like a second format as well. I was planning on also using rsync (or perhaps Restic, which has de-duplication and good encryption), but it’s not clear to me how to carry this out for several qubes without manually making the backup from each qube, which would be far too time-consuming to do weekly. Presumably rsync/restic would need to run from dom0; how would the qubes be selected?

Edit: Just seeing Unman’s qubes-sync which looks like it answers this question.

tasket · May 10, 2021, 12:56am

Wyng is an incremental disk image backup tool that can handle VM data in a fraction of the time it takes rsync, restic, etc., without breaching the Qubes security model. I’ve been working on it for a couple of years and it’s been very reliable.

What is different about Wyng is that it uses data change information (deltas) that’s already available to the system (specifically, LVM) so it doesn’t have to scan the source data for changes each time a backup is started; Wyng knows what has changed pretty much instantly. It can also perform de-duplication to reduce the backup size.

One thing I’ve noticed about file-based backups (like rsync) on Qubes is that, besides their inherent security risks, they also take a gamble with data corruption when they use file timestamps as a way to shorten backup times. This is because Qubes VMs don’t track the overall system time accurately, and the clock synchronization system Qubes uses is less than perfect; if a VM’s clock snaps backward, files will be written with earlier timestamps.
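To make the timestamp risk concrete, here is a minimal sketch of a generic "newer than the last run" filter. It is an assumption for illustration only: real tools differ in the exact comparison they make, and the file names and times are made up.

```python
def select_changed(files, last_backup_time):
    """Return names of files whose mtime is later than the last backup.

    `files` maps a file name to its recorded modification time (seconds).
    This is a generic shortcut, not any specific tool's algorithm.
    """
    return sorted(name for name, mtime in files.items() if mtime > last_backup_time)


# Hypothetical timeline: the last backup ran at t=1000.
last_backup_time = 1000

files = {
    "notes.txt": 900,    # untouched since before the last backup
    "report.odt": 1100,  # edited after the backup while the clock was correct
    "keys.kdbx": 950,    # edited after the backup, but the VM clock had
                         # snapped backward, so the recorded mtime looks old
}

selected = select_changed(files, last_backup_time)
print(selected)
```

In this sketch `keys.kdbx` is never selected even though it changed after the last backup, which is the kind of silent miss described above.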

So overall, the most Qubes-like approach is to back up at the disk image level; it is just tossing blocks around and not interfacing with server or VM filesystems, so there’s less risk. This is what Qubes Backup does, except it sends only whole disk images. Wyng asks Linux LVM which bits of a disk image have changed and sends only those blocks, adding them to the earlier blocks in the backup archive.
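The idea of sending only changed blocks and layering them over earlier ones can be sketched roughly as follows. This is not Wyng’s code or archive format; the `changed` set is a stand-in for the delta information that would come from LVM, and the tiny block size and function names are assumptions chosen for readability.

```python
BLOCK_SIZE = 4  # tiny block size so the example is easy to follow


def split_blocks(image: bytes) -> list[bytes]:
    """Cut a disk image into fixed-size blocks."""
    return [image[i:i + BLOCK_SIZE] for i in range(0, len(image), BLOCK_SIZE)]


def backup_session(image: bytes, changed: set[int]) -> dict[int, bytes]:
    """Store only the blocks whose indices are reported as changed."""
    blocks = split_blocks(image)
    return {i: blocks[i] for i in sorted(changed)}


def restore(sessions: list[dict[int, bytes]]) -> bytes:
    """Rebuild the image by layering sessions, oldest first."""
    merged: dict[int, bytes] = {}
    for session in sessions:
        merged.update(session)  # newer blocks replace older ones
    return b"".join(merged[i] for i in sorted(merged))


# First session: every block counts as changed.
v1 = b"AAAABBBBCCCCDDDD"
full = backup_session(v1, changed={0, 1, 2, 3})

# Second session: block 2 is (hypothetically) reported as changed.
v2 = b"AAAABBBBXXXXDDDD"
delta = backup_session(v2, changed={2})

assert restore([full]) == v1
assert restore([full, delta]) == v2
print(len(delta), "of", len(full), "blocks sent in the second session")
```

No scan of the source data is needed in the second session because the list of changed blocks is supplied from outside; that is where the time saving comes from.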

behemothwerecat · May 10, 2021, 1:43am

Incredible! Do you use this as a second format to Qubes Backup, or exclusively? I noticed Cryfs is mentioned as an encryption option - do people have thoughts on how it compares to gocryptfs for this use case? Is such an encrypted volume + LUKS overkill you think?

adw · May 10, 2021, 5:48am

“Format” here typically refers to the physical medium on which the backup resides. Here are some examples of things commonly regarded as different formats: external hard drive, cloud storage, flash drive, optical disc, tape.

tasket · May 10, 2021, 1:35pm

Wyng is my primary choice for everything. I use it on my non-Qubes systems also, as long as they use LVM thin pools for storage.

Yes, it would be overkill to combine an encrypted filesystem with LUKS. The Wyng notes mention using LUKS (or dm-crypt) for Qubes, as this puts every sensitive aspect under dom0 control without exposing dom0.

OTOH, an encrypted filesystem like gocryptfs would work, but it’s mostly compromises compared to using LUKS.

spergynerd · May 11, 2021, 7:22am

Thank you for all these recipes, I will try to concoct mine from yours

behemothwerecat · May 30, 2021, 3:41pm

I am really excited to use Wyng once I understand LVM, but haven’t gotten there yet, so I’m looking for a noob friendly option in the meantime.

I was wondering what people think of the Qubes backup advice given at anonymousplanet.org (The Hitchhiker’s Guide to Online Anonymity): use Clonezilla while the machine is powered down.

Regarding solutions like Qubes Backup:

These backup utilities will not be able to restore your encrypted drive as-is as they do not support those encrypted file systems natively. And so, these restore will require more work to restore your system in an encrypted state (re-encryption after restore).

Regarding the offline clonezilla proposal:

This backup will back up the encrypted disk as-is and therefore will be encrypted by default with the same mechanism (it is more like a fire and forget solution). The restore will also restore the encryption as-is and your system will immediately be ready to use after a restore.

Regarding Qubes specifically:

Qubes OS recommends using their own utility for backups[…]. But I think it is just a hassle and provides limited added value unless you just want to back-up a single Qube. So instead, I am also recommending just making a full image with Clonezilla which will remove all the hassle and bring you back a working system in a few easy steps.

Would restoring using this clonezilla method be potentially problematic? I don’t know enough about LUKS headers etc to know.
Is there a benefit to Qubes Backup that they aren’t considering? It’s not clear to me what ‘added value’ they reference actually is.

adw · May 31, 2021, 12:51am

One of the main advantages of the Qubes backup system is that the backups it creates are not only encrypted but authenticated (and thereby also integrity-protected). Most other backup methods, including DIY procedures, do not feature this property or do not implement it correctly. (For example, if you’re not sure whether you should encrypt-then-MAC or MAC-then-encrypt, or if you don’t see the difference, then you should not be taking a DIY approach to this if security matters.) Authenticated backups protect against attacks in which an adversary, unbeknownst to you, maliciously modifies or replaces your backup and, upon restoring from that backup, you compromise your system.

unman · May 31, 2021, 1:47pm

It should also be said that taking a full disk image is not in any sense
a useful backup in Qubes. Yes, it will allow you to restore the system at
a specific point in time, but that doesn’t make for a good backup regime.
Regardless of the time involved, and the multiplicity of cloned drives
needed, it is really not suited to the problem.

How often will you be making that disk image?
What will you do with old images?
Will you take a new clone every time you update dom0 or a template?
Every time you do some work?
Really??
What happens if you discover you deleted a crucial entry in KeePassXC
4 months ago?

Clonezilla is a great tool for disk imaging, no doubt. But I don’t see
full disk imaging as particularly useful in Qubes, and certainly not to
replace qube and data backups.

deeplow · June 7, 2021, 9:14pm

4 posts were split to a new topic: Backing stuff into a qube; then backing up that qube?

Rhys-Hussain · August 16, 2022, 1:17pm

I think every encrypted file is authenticated, is that right? If a malicious user modifies your encrypted file, then you can’t decrypt it with your password.

adw · August 17, 2022, 12:06am

No. You are conflating two distinct properties: confidentiality and authenticity.

Basically: Confidentiality means that the data is kept secret from all but authorized entities. Authenticity means that it really came from the source it claims to come from and hasn’t been modified in the meantime.

The fact that encrypted data can’t be decrypted without the key ensures confidentiality (as long as the key remains protected), because only those who possess the key can read it.

But an attacker could modify your encrypted data when you’re not looking or replace it with his own encrypted data. When you go to decrypt it, you’re not decrypting what you think you’re decrypting. The data is not authentic. This could be used to exploit a vulnerability in your decryption software, for example. That’s why it’s important to authenticate the data before you try to decrypt it. Simply trying to decrypt it and seeing if it works is not safe, because by then you have already performed complex operations on the data, which is an attack vector. Unless you have some means of verifying authenticity prior to significant parsing of the data (like the Qubes backup system does), you’re vulnerable to this.
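A minimal sketch of "authenticate before you decrypt", using an HMAC over the ciphertext (encrypt-then-MAC). This is not the Qubes backup format; the key, the placeholder ciphertext, and the function names are assumptions for illustration, and the decryption step itself is left out.

```python
import hashlib
import hmac

TAG_LEN = 32  # length of an HMAC-SHA256 tag in bytes


def seal(ciphertext: bytes, mac_key: bytes) -> bytes:
    """Append an HMAC-SHA256 tag computed over the ciphertext."""
    tag = hmac.new(mac_key, ciphertext, hashlib.sha256).digest()
    return ciphertext + tag


def open_sealed(blob: bytes, mac_key: bytes) -> bytes:
    """Return the ciphertext only if its tag verifies; raise otherwise."""
    if len(blob) < TAG_LEN:
        raise ValueError("blob too short to contain a tag")
    ciphertext, tag = blob[:-TAG_LEN], blob[-TAG_LEN:]
    expected = hmac.new(mac_key, ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication failed: refusing to decrypt")
    return ciphertext  # only now would it be handed to the decryption step


mac_key = b"hypothetical-mac-key"        # in practice derived from a passphrase
ciphertext = b"...already encrypted..."  # placeholder for real ciphertext
blob = seal(ciphertext, mac_key)

assert open_sealed(blob, mac_key) == ciphertext

# Flip one bit, as an attacker with write access to the backup could.
tampered = bytes([blob[0] ^ 1]) + blob[1:]
try:
    open_sealed(tampered, mac_key)
except ValueError as err:
    print(err)
```

The point of the ordering is that the tampered blob is rejected by a simple, fixed-size comparison before any complex parsing or decryption code touches it.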

Rhys-Hussain · August 17, 2022, 11:18pm

This is the thing I didn’t consider.
I think a hash can prove authenticity. You discard any encrypted data whose hash doesn’t match. The hash can be stored in a different place, or even published online, so a malicious user can’t modify it.
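A rough sketch of that idea, assuming the digest really is kept somewhere the attacker cannot change (the function names and the placeholder data are made up):

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Hex SHA-256 digest of the data."""
    return hashlib.sha256(data).hexdigest()


def matches_trusted_digest(data: bytes, trusted_hex: str) -> bool:
    """Compare the data's digest with one obtained from a trusted place."""
    return hmac.compare_digest(sha256_hex(data), trusted_hex)


backup = b"encrypted backup bytes (placeholder)"
trusted = sha256_hex(backup)  # recorded at backup time, stored elsewhere

unmodified_ok = matches_trusted_digest(backup, trusted)
modified_ok = matches_trusted_digest(backup + b"!", trusted)
print(unmodified_ok, modified_ok)
```

The check is only as strong as the storage of the digest: if the attacker can replace both the backup and the recorded hash, it proves nothing.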

adrelanos · July 5, 2023, 8:44pm

Why do I like raw disk backups?

Upgrading Qubes, specifically dom0, has the risk of either bricking boot or networking. Both happened to me several times in the past. These are situations which are difficult to recover from and break productivity.

How else could I test upgrades with reasonable certainty that my system won’t get bricked? Perhaps by having an identical spare notebook of the same type with the very same hardware.

As for the time required to perform a raw disk backup: it means going to the boot menu, booting a Live DVD or Live USB, and doing the backup. With an internal 1 TB SSD (M.2 NVMe) backed up to an external USB 3.1 NVMe drive, it took ~1 hour. That’s quite okay for my use case.

These aren’t the only backups I am doing. I am also doing backups using the Qubes Backup tool.

Raw disk backups are useful for:

A) recovering from a brick after upgrades
B) avoiding becoming a victim of a similar backup horror story or of other Qubes Backup tool issues. (In the past, due to some issue with scrypt, I couldn’t restore backups on a newer Qubes version.)

Documentation on how to create full raw disk backups:

Insurgo · July 23, 2023, 7:41am

@adrelanos

I would be curious about your input on the latest Clonezilla, which supposedly now backs up the LVM volumes inside the LUKS containers for those raw backups?

adrelanos · July 26, 2023, 12:02pm

LVM can be great for expert users and sysadmins, but it is not needed for most users. At least, I personally feel I don’t need it.

As of Debian bookworm,

the Debian installer (DI) (the “console looking one”, “classic installer”) still creates an encrypted LVM.
when booting the Debian Live DVD and starting the new, graphical calamares based installer, it creates an encrypted /boot without LVM.

The latter is much better for me personally.
(Except for the grub stage1 keyboard layout issues, see “Encryption does not work well with non-QWERTY keyboards”, calamares/calamares issue #1203 on GitHub, but that’s a different issue.)

Because when using LVM, if I boot from a Live system and want to mount my disk, I need to:

cryptsetup mount
LVM mount
mount
do what I actually wanted to do
umount
LVM umount
cryptsetup umount
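Spelled out as actual commands, the sequence above might look like the sketch below. The script only prints the commands rather than running them, and the device path, mapper name, volume group, and logical volume names are placeholders, not taken from any real system.

```python
def live_mount_steps(device, mapper_name, vg_name, lv_name, mountpoint):
    """Return the command sequence for working on an LVM-on-LUKS disk."""
    lv_path = f"/dev/{vg_name}/{lv_name}"
    return [
        ["cryptsetup", "open", device, mapper_name],  # unlock the LUKS container
        ["vgchange", "-ay", vg_name],                 # activate the volume group
        ["mount", lv_path, mountpoint],               # mount the logical volume
        # ... do what you actually wanted to do ...
        ["umount", mountpoint],                       # unmount the filesystem
        ["vgchange", "-an", vg_name],                 # deactivate the volume group
        ["cryptsetup", "close", mapper_name],         # lock the container again
    ]


# All names below are placeholders.
steps = live_mount_steps("/dev/sdX2", "cryptdisk", "vg0", "root", "/mnt")
for cmd in steps:
    print(" ".join(cmd))
```

Without LVM, the two `vgchange` steps drop out and the mount targets the opened mapper device directly, which is the simplification being argued for here.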

I don’t want the extra complexity layer of LVM.

Qubes also might be planning to move away from LVM?

github.com/QubesOS/qubes-issues

Switch default pool from LVM to BTRFS-Reflink

opened 09:46AM - 22 Mar 21 UTC

DemiMarie

T: enhancement P: default C: storage

**The problem you're addressing (if any)**

In R4.0, the default install uses …LVM thin pools. However, LVM appears to be optimized for servers, which results in several shortcomings:

- Space exhaustion is handled poorly, requiring manual recovery. This recovery may sometimes fail.
- It is not possible to shrink a thin pool.
- Thin pools slow down system startup and shutdown.

Additionally, LVM thin pools do not support checksums. This can be achieved via dm-integrity, but that does not support TRIM.

**Describe the solution you'd like**

I propose that R4.1 use BTRFS+reflinks by default. This is a proposal ― it is by no means finalized.

**Where is the value to a user, and who might that user be?**

BTRFS has checksums by default, and has full support for TRIM. It is also possible to shrink a BTRFS pool without a full backup+restore. BTRFS does not slow down system startup and shutdown, and does not corrupt data if metadata space is exhausted.

When combined with LUKS, BTRFS checksumming provides authentication: it is not possible to tamper with the on-disk data (except by rolling back to a previous version) without invalidating the checksum. Therefore, this is a first step towards untrusted storage domains. Furthermore, BTRFS is the default in Fedora 33 and openSUSE.

Finally, with BTRFS, VM images are just ordinary disk files, and the storage pool the same as the dom0 filesystem. This means that issues like #6297 are impossible.

**Describe alternatives you've considered**

None that are currently practical. bcachefs and ZFS are long-term potential alternatives, but the latter would need to be distributed as source and the former is not production-ready yet.

**Additional context**

I have had to recover manually from LVM thin pool problems (failure to activate, IIRC) on more than one occasion. Additionally, the only supported interface to LVM is the CLI, which is rather clumsy. The LVM pool requires nearly twice the amount of code as the BTRFS pool, for example.

**Relevant [documentation](https://www.qubes-os.org/doc/) you've consulted**

`man lvm`

**Related, [non-duplicate](https://www.qubes-os.org/doc/reporting-bugs/#new-issues-should-not-be-duplicates-of-existing-issues) issues**

#5053 #6297 #6184 #3244 (really a kernel bug) #5826 #3230 ― since reflink files are ordinary disk files we could just rename them without needing a copy #3964 everything in https://github.com/QubesOS/qubes-issues/search?q=lvm+thin+pool&state=open&type=issues

I didn’t look into it.

I don’t even consider this a raw disk backup? Unless I am misunderstanding something. Raw for me implies 1 to 1 exact copy.

(Ideally it would even be bootable. With legacy booting it was. But… see “Raw disk backup of Qubes EFI is unbootable”, QubesOS/qubes-issues issue #8363 on GitHub.)

It seems complexity-wise somewhat “in the middle” or “different”? It’s not the sophisticated, complex Qubes Backup tool with all of its current issues, and it’s also not the simple raw backup?

unman · July 26, 2023, 2:55pm

I’ve said before that I don’t consider raw disk, Qubes backups, or LVM
backups like Wyng useful as backups in most cases. They have their place in
particular situations, but lack what I consider to be essential features
in any backup/restore process. (I think that this was what the original
thread was about.)

With data spread across various qubes, how is the user to back up that
data? How is this to be done in such a way that a user can readily
restore a specific file from (say) 3 days prior, or compare different
versions of a file over the past month?
I use various approaches.

One method is to qvm-block attach the vm-private volume to a backup qube,
mount the drive, and take a backup. My favourite mechanism is zpaqfranz,
which creates incremental backups in a single file. I haven’t found
anything to match speed and compression in creating encrypted backups for my data.
The basic method is in the docs.
Because the latest backup is appended to the file, you can use rsync --append
to quickly transfer that to off site storage.
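Why an append-only archive transfers quickly can be sketched as below: only the bytes beyond the destination’s current length need to move. This illustrates the idea only and is not how rsync implements `--append`; the file names and the `append_tail` helper are made up.

```python
import os
import tempfile


def append_tail(src_path: str, dst_path: str) -> int:
    """Copy only the bytes of src that lie beyond dst's current length.

    Assumes dst is an unmodified prefix of src, which holds for an
    append-only archive. Returns the number of bytes copied.
    """
    have = os.path.getsize(dst_path) if os.path.exists(dst_path) else 0
    copied = 0
    with open(src_path, "rb") as src, open(dst_path, "ab") as dst:
        src.seek(have)
        while chunk := src.read(1 << 20):
            dst.write(chunk)
            copied += len(chunk)
    return copied


with tempfile.TemporaryDirectory() as tmp:
    local = os.path.join(tmp, "archive.local")
    offsite = os.path.join(tmp, "archive.offsite")

    with open(local, "wb") as f:
        f.write(b"version-1;")
    first = append_tail(local, offsite)

    with open(local, "ab") as f:  # a new backup version is appended
        f.write(b"version-2;")
    second = append_tail(local, offsite)

    with open(offsite, "rb") as f:
        final_copy = f.read()

print(first, second, final_copy)
```

The second call moves only the newly appended version, however large the archive has already grown.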

Another approach is to use qubes-sync to sync data across to
another qube. I use an aggregator qube for this, using rsync with the --link-dest option
to provide a Time Machine like backup and save on space. I run this sync
hourly.
I also use zpaqfranz running in a qube to push incremental backups to
other qubes using rsync or sshfs over qrexec.
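A Time Machine style run with `--link-dest` could be assembled along these lines. The sketch only builds and prints the rsync command lines; the paths, snapshot names, and the `snapshot_command` helper are placeholders, and the exact options used in the setup described above are not known.

```python
import posixpath


def snapshot_command(source, backup_root, stamp, previous=None):
    """Build an rsync command that creates a new dated snapshot directory."""
    dest = posixpath.join(backup_root, stamp)
    cmd = ["rsync", "-a", "--delete"]
    if previous is not None:
        # Files unchanged since the previous snapshot are hard-linked to it
        # instead of being copied again, which saves space.
        cmd.append("--link-dest=" + posixpath.join(backup_root, previous))
    cmd += [source.rstrip("/") + "/", dest + "/"]
    return cmd


# Paths and snapshot names are placeholders.
first_run = snapshot_command("/home/user/data", "/backups", "2023-07-26T15")
next_run = snapshot_command("/home/user/data", "/backups", "2023-07-26T16",
                            previous="2023-07-26T15")
print(" ".join(first_run))
print(" ".join(next_run))
```

Each snapshot directory then looks like a complete copy, so restoring a file from three days ago is just a matter of copying it out of the right directory.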

I do use Syncthing, both to sync data between qubes, and to sync to
external storage. For offline qubes I run Syncthing over qrexec. There’s a
salt version in unman/shaker on GitHub, and a package
installable from the Task tool at qubes.3isec.org.

All these mechanisms are data focussed, not qube centric.
Every user should look for approaches that secure their data, and make
it easy to access and restore. If the Qubes backup is sufficient for
your use case, make sure you use it, and test your ability to restore
your data.

I never presume to speak for the Qubes team.
When I comment in the Forum or in the mailing lists I speak for myself.