[Ideas] Why do people not make backups? How to improve that?

Each time you run Wyng backup, it transfers to the destination (a remote server over SSH, or a local directory in a qube) only what has changed since the last backup. Also, data that is present in multiple qubes only takes disk space once (this is called deduplication, as it avoids duplicating data).

Of course, each backup produces a full snapshot of your qubes; you can restore them to any point in time at which you made a backup.

The Qubes OS backup tool saves all qubes separately, without deduplication, and does not support incremental backups. If you have 600 GB to save, every backup produces an archive of around 600 GB. With Wyng, each backup only adds the delta.

Of course, you can remove older backups in Wyng to free some space, but it is good practice to keep at least one backup that is a few weeks or months old, plus the last n backups, in case something went unnoticed.
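For intuition, here is a toy sketch (in Python, and definitely not Wyng's actual on-disk format) of why deduplicated, incremental backups are cheap: chunks are stored under their content hash, so a chunk shared by several qubes, or unchanged since the last backup, is stored exactly once, while every backup still keeps a full manifest you can restore from:

```python
import hashlib

CHUNK_SIZE = 64 * 1024  # toy fixed-size chunks; real tools chunk more cleverly

store = {}  # content hash -> chunk bytes, shared across all backups and qubes

def backup(path):
    """Snapshot one volume: returns the ordered list of chunk hashes."""
    manifest = []
    with open(path, "rb") as f:
        while chunk := f.read(CHUNK_SIZE):
            key = hashlib.sha256(chunk).hexdigest()
            store.setdefault(key, chunk)  # new data costs space, old data is free
            manifest.append(key)
    return manifest

def restore(manifest, path):
    """Rebuild the full volume from any manifest, old or new."""
    with open(path, "wb") as f:
        for key in manifest:
            f.write(store[key])
```

A second backup of a mostly unchanged volume only adds the few chunks whose hashes are new, yet its manifest is still a complete, independently restorable snapshot.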

So let me see if I understand this correctly.

I have some qube, call it A. If I back it up with Wyng, it will of course copy the entire qube the first time. But then if I go into that qube and change something, say I add an alias to .bashrc, Wyng will basically copy just that bit of it (and leave itself a note that this is the changed part of the original backup). Subsequent changes are stored as, basically, “this is what has changed since the last time.”

You still end up restoring the entire qube if you need that file for whatever reason, but at least you didn’t back up complete copies every single time. And presumably Wyng is smart enough to be able to go back three backups and restore the qube as it was then, ignoring the subsequent increments.

So, even though a regular backup will consume less space, you still have each qube being treated as an atomic unit on restore; you have to restore the entire thing. What Wyng does, basically, is store the backups more efficiently.

I could see a strategy like that working for me in a very different context, actually…with regard to my data (which sits in VeraCrypt containers). Those get “backed up” according to a very primitive scheme, with only the most recent ones saved, but at least only if they’ve changed. So I should maybe see if I can get Wyng to work with that. (I assume it’s FOSS?) It would have to be able to run on my NAS, though.

My qube strategy is essentially to store almost nothing on most qubes. I have perhaps five or six actual “regular” AppVMs, and I make Qubes OS backups (i.e., full copies of them) only after I access them. (I’ve essentially gimmicked all menu entries for those qubes to put a flag on my system that says “this qube has been accessed; if it’s not running right now, back it up.”) That at least saves me from totally useless identical copies, but still every backup is a full backup.
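(For anyone curious about the “gimmick”: it can be as small as pointing each menu entry at a wrapper like the hypothetical sketch below; the flag directory is made up, and only qvm-run is a real command.)

```python
#!/usr/bin/env python3
# Hypothetical dom0 wrapper: note that a qube was used, then launch the app.
# A later cron job can back up every flagged qube that is not running.
import pathlib
import subprocess
import sys

FLAG_DIR = pathlib.Path.home() / ".qube-accessed"  # made-up flag location

def main():
    qube, *command = sys.argv[1:]      # e.g.: wrapper.py work firefox
    FLAG_DIR.mkdir(exist_ok=True)
    (FLAG_DIR / qube).touch()          # flag: "this qube has been accessed"
    subprocess.run(["qvm-run", qube, *command])

if __name__ == "__main__":
    main()
```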

Other VMs (most of them) I can literally regenerate from scratch if they somehow get corrupted.

yes

yes, but being efficient this way allows you to run backups often with less storage burden, a win-win

A backup strategy can’t both save the whole volume and let you pick a single file out of that volume. You need to decide what to save. I guess you can ask yourself the following questions:

  • do I often need to recover files?
  • do I want to restore whole qubes automatically?
  • do I need to save files more often than I stop my qubes? (Block-level backups require the qube to be shut down, as they always run on a derived disk that you can roll back, which incidentally could be used to recover a file modified since boot, if you revert the whole volume.)
  • can I wait to restore a whole qube if I need to recover a single file?

It’s a single Python script. Wyng-util-qubes is another. You could SSH to your NAS.


Hi!

Just sharing my experience with you (and some code).

For the last decade I have been doing recovery / backups in two main ways:

  • Data backups: daily backups of data I know I want to preserve: documents, photos, work projects, code, etc. I carefully select the directories to include and exclude. Remote, encrypted, incremental.
  • Ready-to-use system: once every couple of months, a dd to a spare disk and a boot check.

The ready-to-use system gives me a working setup where I can restore from the data backups.

I want to continue with this scheme, as it has worked for me for many years. I don’t know whether I will still be using Qubes OS in, say, 10 years, so I prefer not to tie myself to something Qubes-specific.

I moved to Qubes OS in mid-2024, and I went six months without data backups until, two weeks ago, I found the time to think properly about what to do.

I wrote a simple Python script to do data backups with borg: Backup files from multiple qubes to the same borg repo · GitHub

It uses a specific qube that acts as the BackupVM (firewalled, with access only to BorgBase). The private volumes of the qubes to be backed up are mounted read-only to a directory inside the BackupVM (e.g. /home/user/bkps).

Borg is then run, passing this directory and a list of exclusions.
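(The heart of the script is just that borg invocation; below is a minimal sketch of what it reduces to, where the repo URL, mount root, and exclusion patterns are placeholders and the private volumes are assumed to be already mounted.)

```python
#!/usr/bin/env python3
# Minimal sketch: run borg over the read-only mounts inside the BackupVM.
import subprocess

REPO = "ssh://user@repo.example.com/./backups"  # placeholder repo URL
MOUNT_ROOT = "/home/user/bkps"                  # private volumes mounted here
EXCLUDES = ["*/.cache", "*/node_modules"]       # example exclusion patterns

cmd = ["borg", "create", "--stats", "--compression", "zstd"]
for pattern in EXCLUDES:
    cmd += ["--exclude", pattern]
# borg itself expands {hostname} and {now} to name the archive
cmd += [f"{REPO}::{{hostname}}-{{now}}", MOUNT_ROOT]

subprocess.run(cmd, check=True)
```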

It is far from perfect, but with that scheme I was able to back up to the same borg repo that I was using in my old system.

It is very easy to adapt to other backup tools.

Let me know if you think this data backup system is flawed; I am not a security expert.

After reading the whole thread, I think there is no solution that is good for everyone; having more options is good (sometimes!).

Thanks everyone for sharing your insights! :slight_smile:

1 Like

Not sure about this, but is there anything preventing a malicious file in one of the backed-up qubes from exploiting a bug in borgbackup, compromising the BackupVM and thus accessing the content of all the qubes attached to the BackupVM?

I had a quite similar configuration when I started using Qubes OS. I came to the conclusion that it wasn’t consistent with how Qubes OS works because, as far as I understand, even dom0 doesn’t parse the content of the private volume.

I don’t think that access to the files is an issue, as the VM is firewalled to only have access to a specific remote server. So an attacker would have to exploit an issue in borgbackup and also one in the remote server, and from there send the files elsewhere. (In any case, passwords and SSH keys are also encrypted on disk.)

The issue I see is that if it attacks borgbackup, it can tamper with the files and, much later, run an exploit when the files are recovered (e.g. injecting an attack into a backed-up .bashrc).

I think I can live with that, as I seldom recover files, and if the borgbackup exploit becomes known I can ditch the borg repo and start over.
What do you think? Am I being too optimistic?

I think not, but then I use your approach.

Like you, I run data backups, some daily and some more frequently. I rarely use the Qubes Backup tool. I’ve written before about why I think that block-based backups - incremental or not - don’t suit me and, I think, many users.

I run backups on a disposable, rather than a persistent BackupVM, which mitigates risk to some extent.

I never presume to speak for the Qubes team. When I comment in the Forum I speak for myself.

31st March was World Backup Day.

To celebrate it, I made a PR to the classic restore tool, adding support for automatic recognition of some popular compression algorithms (more details here).
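(As a generic illustration of what “automatic recognition” means, not the PR's actual code: a compression format can usually be identified from the first few magic bytes of the stream.)

```python
# Generic illustration: identify a compression format by its magic bytes.
MAGIC = {
    b"\x1f\x8b": "gzip",
    b"BZh": "bzip2",
    b"\xfd7zXZ\x00": "xz",
    b"\x28\xb5\x2f\xfd": "zstd",
    b"\x04\x22\x4d\x18": "lz4",
}

def detect_compression(path):
    with open(path, "rb") as f:
        header = f.read(8)
    for magic, name in MAGIC.items():
        if header.startswith(magic):
            return name
    return "unknown (perhaps uncompressed)"
```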

Here I share some ideas on how to improve your backup routine (general and Qubes OS specific).

  1. Many of us have replaced our mechanical HDDs with faster SSDs. While the old HDDs are almost unusable for general day-to-day usage, they are perfectly usable for yearly backups. Even if they have bad sectors, an imperfect backup is better than no backup at all. Having a cheap USB-SATA cable or a bunch of enclosures should allow you to reuse them for backups.

  2. Many of us have USB 2.0 sticks which are either too slow or too worn out, or some of those fancy gift USB sticks which are useless for day-to-day usage. Their capacity might be limited, but they are perfectly usable for backups.

  3. Similarly, some of us have lots of unused/slow microSD cards. A cheap microSD reader should allow their reuse for backups.

  4. Donating to the Wyng project is highly recommended, as I have received news that it might become an integral part of Qubes OS in future releases. This should allow the Wyng developer to dedicate more time to the project. Details are available in this forum post.

  5. Always have yearly backups, preferably off-site. For example, if you pay a visit to your parents, in-laws, children, … you could drop an (encrypted) backup there.

  6. Prioritize backups. The content you create personally (e.g. your photos, documents you write, spreadsheets, …) has a much higher priority than funny videos available online or the OS installation partition. If you do not have enough storage to back up everything, back up the most important data. Do not wait until you have purchased more backup media.

  7. Cheap 3D-printed organizers or ordinary organizers can be used to organize your backup media.


Let’s remind new community members to not forget about doing proper backups :smiley:


Actually, I think this could be quite useful, since it would be one step closer to an automatic backup system. Following Qubes’ “don’t trust the infrastructure” principle, if the backup files could be made reasonably secure, then they could just be exported to a cloud service (or a self-hosted one) from there.

I’d agree with @XMachina that backups need to be automatic (or as close as possible), and I would add that they should be deduplicated, incremental, easy to find and restore on a per-file basis, etc. Any steps that get closer to that decrease the burden on the user and increase the number and quality of backups people make and keep.

In an ideal world, it’s one click to set up a basic backup qube and a URL (or several) to send the backups to (incrementally), with more advanced configuration for those who need/want it.

The Qubes backup tool and Wyng are not suited for this use case. I use zpaqfranz to archive the data in my qubes and to store it off Qubes, and that covers all your requirements. But it isn’t for everyone. Borg would probably do.

It’s a fairly straightforward setup to run a shell script that spawns a disposable, attaches the private disk, and runs an incremental backup of the required data. I run this via cron at varying intervals (hourly, daily, etc.).
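(Roughly, the shape of that script, translated to Python here. The qube names, the dom0 loop-device trick, and the in-disposable device path are assumptions that may need adjusting for your Qubes version, and run-my-backup stands in for your actual backup command.)

```python
#!/usr/bin/env python3
# Rough dom0 sketch: expose a halted qube's private volume read-only,
# attach it to a disposable, and run an incremental backup inside it.
import subprocess

QUBE = "work"                                  # qube whose data we back up
DISP = "backup-dvm"                            # an already-running disposable
VOLUME = f"/dev/qubes_dom0/vm-{QUBE}-private"  # typical LVM path; may differ

def run(cmd, **kwargs):
    return subprocess.run(cmd, check=True, **kwargs)

# Read-only loop device over the private volume (the qube must be halted).
loop = run(["losetup", "--find", "--show", "--read-only", VOLUME],
           capture_output=True, text=True).stdout.strip()
dev = "dom0:" + loop.split("/")[-1]            # e.g. dom0:loop0 (assumption)
try:
    run(["qvm-block", "attach", DISP, dev])
    # Inside the disposable: mount the volume and run the backup tool.
    # /dev/xvdi is the usual first attached device; run-my-backup is a placeholder.
    run(["qvm-run", "--pass-io", DISP,
         "sudo mount -o ro /dev/xvdi /mnt && run-my-backup /mnt"])
finally:
    run(["qvm-block", "detach", DISP, dev])
    run(["losetup", "--detach", loop])
```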
I don’t see this as an automatic process, since you probably want to choose which qubes to back up and what data within those qubes, but that’s your call.

I never presume to speak for the Qubes team. When I comment in the Forum I speak for myself.

That sounds great. Are there any security concerns with doing so? Like if I backed up a qube with malware?