[Ideas] Why do people not make backups? How to improve that?

It appears that I have written 19 TiB to the drive (a 256 GB one), according to the SMART log.

The btrfs check managed to repair the drive from a live boot and I can mount it for the time being. I will take the necessary snapshots and make a backup of the current state again, then try to boot Qubes again and make a regular backup with the Qubes Backup app. But I guess I should retire this drive. It is not trustworthy for development anymore.

4 Likes

Since the primary topic of this thread was backups (and not hardware failures), I would like to share the benefits of BTRFS. Having block-level subvolume send/receive and differential backups is what makes it very appealing. It is not up to me to decide, but I would have switched from LVM-Thin to BTRFS in r4.3 (just like Fedora did).
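
To make the differential part concrete, here is a minimal sketch of what send/receive looks like on a plain btrfs setup; the subvolume and mount paths are illustrative, not what Qubes creates out of the box:

```
# First run: take a read-only snapshot and send it in full.
btrfs subvolume snapshot -r /pool/data /pool/data-mon
btrfs send /pool/data-mon | btrfs receive /mnt/backup/

# Later runs: send only what changed since the parent snapshot (-p),
# which is what makes differential backups cheap.
btrfs subvolume snapshot -r /pool/data /pool/data-tue
btrfs send -p /pool/data-mon /pool/data-tue | btrfs receive /mnt/backup/
```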

I also suggested separate pools (actually BTRFS subvolumes) for root and private volumes by default. The proposal was rejected. It would have given the user the flexibility to snapshot all private volumes together (which allows for easy differential backups) while remaining backward compatible.

2 Likes

Btrfs is flexible enough to do this ad hoc: Make a read-write snapshot, delete the image files in the snapshot that you don't want (by volume type, by VM name, whatever), flip the snapshot to read-only. Then you can btrfs send it.
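
A rough sketch of those steps, assuming /var/lib/qubes is (or has been made) its own subvolume with the usual file-reflink layout; the paths and the example qube name are illustrative:

```
# Read-write snapshot of the pool subvolume.
btrfs subvolume snapshot /var/lib/qubes /var/lib/qubes-snap

# Drop whatever should not go into the backup, e.g. a whole qube
# (hypothetical name) or a class of image files.
rm -rf /var/lib/qubes-snap/appvms/untrusted

# Flip the snapshot to read-only, then send it.
btrfs property set /var/lib/qubes-snap ro true
btrfs send /var/lib/qubes-snap | btrfs receive /mnt/backup/
```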

4 Likes

There were other reasons in the proposal for separate pools. One was putting the root pool on a high-endurance SLC/MLC/TLC based drive and the private pool on a lower cost/GiB drive (since private volumes do not wear out as quickly as root volumes). But as Marek and Marmarta suggested, that is a heavily customized scenario. I guess very few people write 19 TiB to a 256 GB drive.

And automatic snapshots (like openSUSE does with snapper and its complex subvolume scheme) are also appealing. A custom time machine could easily be built with that.
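
As a sketch of that idea: a small dom0 script run from cron or a systemd timer could keep a rolling window of read-only snapshots. Everything below (script name, paths, retention count) is made up and assumes the data sits on its own btrfs subvolume:

```
#!/bin/sh
# Hypothetical /usr/local/bin/snap-rotate.sh
set -eu
SRC=/var/lib/qubes        # subvolume to protect
DST=/var/lib/snapshots    # must be on the same filesystem as SRC
KEEP=14                   # number of snapshots to retain

mkdir -p "$DST"
btrfs subvolume snapshot -r "$SRC" "$DST/$(date +%Y%m%d-%H%M)"

# Delete everything older than the newest $KEEP snapshots.
ls -1d "$DST"/* | head -n -"$KEEP" | while read -r old; do
    btrfs subvolume delete "$old"
done
```

Hooked to a timer, that is already most of a poor man's time machine.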

1 Like

Theoretically, even that is still only a quarter of the 80 TBW that the drive is spec'd for:

OK. The laptop boots again. Time to make new regular backups…

Maybe this one is an older generation. Considering the SMART logs, I cannot trust this drive anymore. Since this laptop does not appear to support PCIe M.2, I am stuck with mSATA variants. They are scarce these days but still cheap: a 256 GB model is around $25 and a 2 TB one is $120.

I have to admit something…

I have made a video on backups, where I tried to explain the easiest way I could find to do backups. There is now a better and equally fast way to do it with a dedicated disposable opened from the devices menu (thanks marmarta!), but that was not a possibility at the time. Here it is:

Unfortunately, the video has some audio issues that I cannot fix anymore because I no longer have access to the source files. The reason – very ironically – is that I didn't have backups for that video editing machine. (In case anyone has been wondering, that has been part of the reason why I haven't posted more tutorials yet.) And the dumbest part is that I lost the data because I was switching SSDs back and forth trying to get GPU passthrough working on 4.2, and one day I forgot to switch and overwrote the working 4.1 drive… :grimacing:

Why did I not back up?

I keep asking myself why I postponed backing up a machine even as I was doing a video about it. And I think the reasons were:

  1. Large files / volumes - in particular video files and the many templates from my attempts to build a video editing template. Backing all of that up would have taken literally hours.
  2. Lack of other drives - to back up 1 TB worth of Qubes data you need at least 4 TB. Otherwise you either need to script automatic removal of old backups (see the pruning sketch after this list) or do it manually after every other backup. I do have a hard drive for my main machine's backups, but I had not yet purchased one for this machine.
  3. Lack of habit - I don't use this machine every week, so I can't include it in my daily / weekly habits.
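
The pruning in point 2 does not have to be fancy. A one-liner along these lines, run wherever the backup drive is mounted, would do; the path and retention period are just examples:

```
# Delete backup archives older than 60 days from the external drive.
find /run/media/user/backups -maxdepth 1 -type f \
     -name 'qubes-backup-*' -mtime +60 -delete
```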

In my view the current backup tooling is insufficient to address 1 and 2. Only with incremental backups can that be alleviated, both in terms of time and space. Point 3 is more human than anything, but if a backup tool could start automatically in the background (time-machine-like), I might still have the video files…

Lastly, if anyone is wondering why my email is deeplower@ instead of deeplow@, it was my first big life lesson on backups many years ago.

3 Likes

It's really hard to do this for every template. I don't always faithfully migrate from one template to the next (like reinstalling and moving everything to fedora-40-xfce) after a new template is installed, and I don't know how to do in-place template upgrades. Qubes is disorganized for me because of time constraints and low skill level. I have multiple Fedora templates and only recently got rid of fedora-37.

Running these commands in every qube could take me several hours to do and catalogue correctly.

It would be so much easier if there were a "Save Template Config" option and a "Refresh Template State" option that just allowed downloading the packages as part of the backup process.

Technically this is a true statement. It is not difficult at all, it's quite easy. In the same way it would be easy to stare at a wall for two hours. Nothing hard about that, just look at the wall.

I do regular backups because backing up something like Qubes is just not optional to me. But I hate it. I dread it. Is it easy? Yep. But it takes an extremely long time and I wish it did not.

What you can do is speed it up. I leave it to you how to accomplish that. But dear god, make it not take so long.

2 Likes

Actually here is one (probably not new) idea:

I must be backing up a lot of the same stuff over and over again - the exact same code, etc. Consider doing something like rsync, where you're only backing up code that has changed, or some more complicated version of that concept. I don't need to keep backing up the same boilerplate Linux or Fedora or Qubes code.

Incremental backups are what we call that, and the idea has been echoed in this thread, yes. It becomes more necessary the larger the pool of data you're backing up.
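
Outside the Qubes backup tool, the classic file-level version of this is rsync with --link-dest: only changed files are transferred, and unchanged files are hard-linked against the previous run, so each dated directory looks like a full backup while costing only the delta. A sketch with made-up paths:

```
PREV=/mnt/backup/2024-08-23
CUR=/mnt/backup/2024-08-24
rsync -a --delete --link-dest="$PREV" /home/user/projects/ "$CUR"/
```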

It would make sense to me for it to be implemented for btrfs-backed pools, perhaps not for LVM. It would also make sense to promote btrfs to Qubes's default in tandem with that. The vibe I get is that Qubes will move away from the LVM default eventually. @alimirjamali, do you know the reasoning why that hasn't happened yet?

There are open issues about it on GitHub. I have not even dared to look at them. File system debates can get very heated. At some point, people religiously defend or attack a particular file system. It is something I try to avoid as much as possible.

BTW, incremental backups are still doable via LVM or other file systems; Wyng is proof of that. I personally prefer BTRFS because of various features such as CoW, transparent compression, snapshots, quotas, deduplication, … which fit my own personal use cases. It might be different for other users.

My thought is that if btrfs is to be the future, then the Qubes project should not create even more code to maintain by supporting LVM incremental backups. But perhaps I'm assuming too much.

1 Like

Erm, how is making false assertions about verifiable matters of fact insightful? :sweat_smile:

No, it doesn't demonstrate that, because we don't know why they said what they said. It could be that they misspoke, got confused, made a hasty assumption instead of asking first, or were too lazy to read the documentation. Heck, for all we know, they could have been intentionally employing Cunningham's Law.

Sometimes the reason is "It seems too hard," but other times it's more like, "It's not hard; I'm just too lazy" or one of a hundred other reasons.

Isn't this true whenever anyone tries anything new? Like if I try rock climbing for the first time, I don't know what I don't know about rock climbing, and I don't even know what questions I should be asking about rock climbing.

Already linked above:

You don't have to sit there and stare at it. You can go AFK and do other things while it runs.

Depending on the type of data in your backup, you may be able to make it faster by turning off compression. (Uncheck the "Compress backup" option in the GUI or use the --no-compress flag on the command line.)

You're an extremely valuable project contributor. You shouldn't be deterred from looking at certain GitHub issues for fear of people waging religious wars over file systems in the comments. We already moderate the comments on issues for off-topic and inappropriate comments like these. If we're ever being too lenient, let me know, and I'll take care of it. I'll also take another look at that specific issue now. I don't want you or anyone else to have to worry about stuff like this, and I certainly won't allow it to stand in the way of productive work.

4 Likes

Yes, I just had a 36-hour command run and was shocked to find that my laptop didn't blink. No screen, no activity for 12 hours; it just kept doing its thing.

I am new to Linux; maybe this is all normal here. I'm not used to it.

  • "Focus on qubes" stated differently: Qubes' "backup" is a VM recovery tool, not a backup utility. You usually don't end up with a copy on different media, so it is not really a backup.
  • Security gets in the way, which is understandable. Connecting external storage (a disk or a qube's network) to dom0 is a small DIY engineering effort rather than a config option. Some config detail (DNS, network, proxy, disk type, filesystem, etc.) somewhere is always brittle and may stop working depending on whether the instructions you follow were written for your exact config or not (a slightly different OS or Qubes version, a template for a different distro, USB attached in this place rather than that, etc.).
  • What would work for me:
    • A dedicated "backup proxy" qube that can be easily configured to attach a physical device, mount NFS, or access the Internet. You can power it on/off and enable one or more of these targets. That's all. I know "there's a similar proxy VM template that can be easily configured to do that", and I disagree with that statement. I'd rather have a backup proxy VM that does just one thing and one thing alone than spend hours troubleshooting a generic proxy setup to save the 1 GB of extra disk space a dedicated backup proxy VM would take.
    • A backup CLI which reads a CSV with a qube_name, [dir_name_1, dir_name_2] type of config and recursively copies files to backup archives, with or without encryption (see the sketch after this list).
    • I don't care about a GUI or any fancy options. I just want one config file for the backup, another for the proxy VM (to enable/configure a particular backend), and then to run backup --target bkpTgt02 --password s3cr3T --suffix '-20240824' in dom0 or even in a qube (not everyone here is a secret agent).
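
The CSV-driven CLI from the list above could start as something as small as this dom0 sketch; the script, file names, and one-directory-per-line format are all hypothetical, and only qvm-run --pass-io is doing the real work:

```
#!/bin/sh
# Hypothetical backup.csv format, one qube/directory pair per line:
#   work,/home/user/Documents
#   personal,/home/user/projects
set -eu
CSV=/home/user/backup.csv
OUT=/home/user/backups/$(date +%Y%m%d)
mkdir -p "$OUT"

while IFS=, read -r qube dir; do
    # Stream a tarball of the directory out of the qube over qrexec.
    # Pipe through gpg -c (or similar) here if encryption is wanted.
    qvm-run --pass-io "$qube" "tar -cz -C / '${dir#/}'" \
        > "$OUT/${qube}-$(basename "$dir").tar.gz"
done < "$CSV"
```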

In my case I can barely get the basic features to work, and every time my Qubes 4.2 boots it still takes me to that "finish Qubes configuration" screen, so I don't believe I could actually restore VMs in any case. Like many others, I don't care that much about VM/OS backup, so I don't use the built-in backup tool either. It's faster for me to reinstall than to restore (assuming a restore could successfully complete).

Does this impact me? Well, because of these difficulties with backups, I tend to use Qubes for tasks that don't create or change data (paying internet bills, for example) and also don't store data in offline VMs such as the vault (too complicated to get data in); instead, I use a web vault from my Internet-connected VM.

This data protection feature isn't ideal, and it's the main reason why I am not fully committed to Qubes - it works great for 5% of my needs, but I use a lousy non-Qubes system for the other 95% (including a few CUDA-related apps). I still wonder if it'd be better to consolidate all that in a KVM box for a higher level of "average" security and easier data management.

Why don't people make backups

I have 40 qubes to back up in my Qubes OS. The backup takes around 300 GB of space on my external HDD, and, worse, it takes 3 to 4 hours to finish. Such a long time span requires me to reserve a time period in advance, which is a hindrance. I can't just sit down, plug in the external drive, and do a quick backup.

If you salt your templates, then you have a self-documenting system, and
all you have to do is back up those salt files.
The change in your working practice is this: instead of installing
programs in the template using dnf or apt, you make a note of what you
want installed by creating, with salt, a file named install.sls in
/srv/salt/ like this:

my_new_template_packages:
  pkg.installed:
    - refresh: True
    - pkgs: 
      - qubes-core-agent-networking
      - qubes-core-agent-passwordless-root
      - qubes-gpg-split
      - openssl
      - neovim

and apply it with:
`sudo qubesctl --skip-dom0 --targets=TEMPLATE --show-output state.apply install`

If you want to install another program just add it to the pkgs list
and apply the file again.
After a while, this becomes second nature.

An advantage of using salt is that the same file can be used for
various distributions. Sometimes, you hit an issue where a package has a
different name in (e.g.) Debian or Fedora - salt has simple mechanisms for
dealing with this. In my experience even "unsophisticated" users are
able to get a working knowledge of basic salt.

I never presume to speak for the Qubes team.
When I comment in the Forum I speak for myself.

4 Likes

Installing Qubes with btrfs, using wyng and wyng-util-qubes, and salting my qubes (all of these steps together, not individually) have encouraged me to back up more regularly and to automate much of what I had once considered burdensome. Having wyng run daily backups to an off-site storage pool gives great peace of mind, and when changes are made to standalones or templates it has quickly become second nature to update the salt scripts (in dom0), which are automatically backed up by the wyng script I've written for my use case.

But previously, yes, I had a calendar reminder to back up all qubes using the GUI, once weekly, before bed… and I had to purchase ever-larger storage drives to contain at least three months' worth of backups.

2 Likes

I've finally done a couple of back-ups and restores of my 'personal' Qube, at 300 GB a time.

Before this thread was started, I wasn't aware of this facility.

I did back-ups to the /home folder in Dom0, on the internal drive. Each back-up and restore took about 10 seconds.

I didn't need to read the documentation. I think the average person will not have any problems using this software, once they're aware of it.