[Ideas] Why do people not make backups? How to improve that?

It appears that I have written 19 TiB to the drive (a 256 GB one), according to the SMART log.

The btrfs check managed to repair the drive from a live boot and I can mount it for the time being. I will take the necessary snapshots and make a backup of the current state again, then try to boot Qubes again and make a regular backup with the Qubes Backup app. But I guess I should retire this drive. It is not trustworthy for development anymore.

4 Likes

Since the primary topic of this thread was backups (and not hardware failures), I would like to share the benefits of BTRFS. Having block-level subvolume send/receive and differential backups is what makes it very appealing. It is not up to me to decide, but I would have switched from LVM-Thin to BTRFS in r4.3 (just like Fedora did).
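
To make the differential part concrete, here is a minimal sketch of what send/receive looks like on a plain btrfs setup; the subvolume and mount paths are illustrative, not what Qubes creates out of the box:

```
# First run: take a read-only snapshot and send it in full.
btrfs subvolume snapshot -r /pool/data /pool/data-mon
btrfs send /pool/data-mon | btrfs receive /mnt/backup/

# Later runs: send only what changed since the parent snapshot (-p),
# which is what makes differential backups cheap.
btrfs subvolume snapshot -r /pool/data /pool/data-tue
btrfs send -p /pool/data-mon /pool/data-tue | btrfs receive /mnt/backup/
```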

I also suggested separate pools (actually BTRFS subvolumes) for root and private volumes by default. The proposal was rejected. It would have given the user the flexibility to snapshot all private volumes together (which allows for easy differential backups) while remaining backward compatible.

2 Likes

Btrfs is flexible enough to do this ad hoc: Make a read-write snapshot, delete the image files in the snapshot that you don't want (by volume type, by VM name, whatever), flip the snapshot to read-only. Then you can btrfs send it.
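
A rough sketch of those steps, assuming /var/lib/qubes is (or has been made) its own subvolume with the usual file-reflink layout; the paths and the example qube name are illustrative:

```
# Read-write snapshot of the pool subvolume.
btrfs subvolume snapshot /var/lib/qubes /var/lib/qubes-snap

# Drop whatever should not go into the backup, e.g. a whole qube
# (hypothetical name) or a class of image files.
rm -rf /var/lib/qubes-snap/appvms/untrusted

# Flip the snapshot to read-only, then send it.
btrfs property set /var/lib/qubes-snap ro true
btrfs send /var/lib/qubes-snap | btrfs receive /mnt/backup/
```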

4 Likes

There were other reasons in the proposal for separate pools. One was putting the root pool on a high-endurance SLC/MLC/TLC based drive and the private pool on a lower cost/GiB drive (since private volumes do not wear out as quickly as root volumes). But as Marek and Marmarta suggested, that is a heavily customized scenario. I guess very few people write 19 TiB to a 256 GB drive.

And automatic snapshots (like openSUSE does with snapper and its complex subvolume scheme) are also appealing. A custom time machine could easily be built with that.
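
As a sketch of that idea: a small dom0 script run from cron or a systemd timer could keep a rolling window of read-only snapshots. Everything below (script name, paths, retention count) is made up and assumes the data sits on its own btrfs subvolume:

```
#!/bin/sh
# Hypothetical /usr/local/bin/snap-rotate.sh
set -eu
SRC=/var/lib/qubes        # subvolume to protect
DST=/var/lib/snapshots    # must be on the same filesystem as SRC
KEEP=14                   # number of snapshots to retain

mkdir -p "$DST"
btrfs subvolume snapshot -r "$SRC" "$DST/$(date +%Y%m%d-%H%M)"

# Delete everything older than the newest $KEEP snapshots.
ls -1d "$DST"/* | head -n -"$KEEP" | while read -r old; do
    btrfs subvolume delete "$old"
done
```

Hooked to a timer, that is already most of a poor man's time machine.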

1 Like

Theoretically, even that is still only a quarter of the 80 TBW that the drive is spec'd for:

OK. The laptop boots again. Time to make new regular backups…

Maybe this one is an older generation. Considering the SMART logs, I cannot trust this drive anymore. Since this laptop does not appear to support PCIe M.2, I am stuck with mSATA variants. They are scarce these days but still cheap: a 256 GB model is around $25 and a 2 TB one is $120.

I have to admit something…

I have made a video on backups, where I tried to explain the easiest way I could find to do backups. There is now a better and equally fast way to do it with a dedicated disposable opened from the devices menu (thanks marmarta!), but that was not a possibility at the time. Here it is:

Unfortunately, the video has some audio issues that I cannot fix anymore because I no longer have access to the source files. The reason – very ironically – is that I didn't have backups for that video editing machine. (In case anyone has been wondering, that has been part of the reason why I haven't posted more tutorials yet.) And the dumbest part is that I lost the data because I was switching SSDs back and forth trying to get GPU passthrough working on 4.2, and one day I forgot to switch and overwrote the working 4.1 drive… :grimacing:

Why did I not back up?

I keep asking myself why I postponed backing up a machine even as I was doing a video about it. And I think the reasons were:

  1. Large files / volumes - in particular video files and the many templates from my attempts to build a video editing template. Backing all of that up would have taken literally hours.
  2. Lack of other drives - to back up 1 TB worth of Qubes data you need at least 4 TB. Otherwise you either need to script automatic removal of old backups (see the pruning sketch after this list) or do it manually after every other backup. I do have a hard drive for my main machine's backups, but I had not yet purchased one for this machine.
  3. Lack of habit - I don't use this machine every week, so I can't include it in my daily / weekly habits.
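
The pruning in point 2 does not have to be fancy. A one-liner along these lines, run wherever the backup drive is mounted, would do; the path and retention period are just examples:

```
# Delete backup archives older than 60 days from the external drive.
find /run/media/user/backups -maxdepth 1 -type f \
     -name 'qubes-backup-*' -mtime +60 -delete
```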

In my view the current backup tooling is insufficient to address 1 and 2. Only with incremental backups can that be alleviated, both in terms of time and space. Point 3 is more human than anything, but if a backup tool could start automatically in the background (time-machine-like), I might still have the video files…

Lastly, if anyone is wondering why my email is deeplower@ instead of deeplow@, it was my first big life lesson on backups many years ago.

3 Likes

It's really hard to do this for every template. I don't always faithfully migrate from one template to the next (like reinstalling and moving everything to fedora-40-xfce) after a new template is installed, and I don't know how to do in-place template upgrades. Qubes is disorganized for me because of time constraints and low skill level. I have multiple Fedora templates and only recently got rid of fedora-37.

Running these commands in every qube could take me several hours to do and catalogue correctly.

It would be so much easier if there were a "Save Template Config" option and a "Refresh Template State" option that just allowed downloading the packages as part of the backup process.

Technically this is a true statement. It is not difficult at all, it's quite easy. In the same way it would be easy to stare at a wall for two hours. Nothing hard about that, just look at the wall.

I do regular backups because backing up something like Qubes is just not optional to me. But I hate it. I dread it. Is it easy? Yep. But it takes an extremely long time and I wish it did not.

What you can do is speed it up. I leave it to you how to accomplish that. But dear god, make it not take so long.

2 Likes

Actually here is one (probably not new) idea:

I must be backing up a lot of the same stuff over and over again - the exact same code, etc. Consider doing something like rsync, where you're only backing up code that has changed, or some more complicated version of that concept. I don't need to keep backing up the same boilerplate Linux or Fedora or Qubes code.

Incremental backups are what we call that, and the idea has been echoed in this thread, yes. It becomes more necessary the larger the pool of data you're backing up.
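
Outside the Qubes backup tool, the classic file-level version of this is rsync with --link-dest: only changed files are transferred, and unchanged files are hard-linked against the previous run, so each dated directory looks like a full backup while costing only the delta. A sketch with made-up paths:

```
PREV=/mnt/backup/2024-08-23
CUR=/mnt/backup/2024-08-24
rsync -a --delete --link-dest="$PREV" /home/user/projects/ "$CUR"/
```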

It would make sense to me for it to be implemented for btrfs-backed pools, perhaps not for LVM. It would also make sense to promote btrfs to Qubes's default in tandem with that. The vibe I get is that Qubes will move away from the LVM default eventually. @alimirjamali, do you know the reasoning why that hasn't happened yet?

There are open issues about it on GitHub. I have not even dared to look at them. File system debates can get very heated. At some point, people religiously defend or attack a particular file system. It is something I try to avoid as much as possible.

BTW, incremental backups are still doable via LVM or other file systems; Wyng is proof of that. I personally prefer BTRFS because of various features such as CoW, transparent compression, snapshots, quotas, deduplication, … which fit my own personal use cases. It might be different for other users.

My thought is that if btrfs is to be the future, then the Qubes project should not create even more code to maintain by supporting LVM incremental backups. But perhaps I'm assuming too much.

1 Like

Erm, how is making false assertions about verifiable matters of fact insightful? :sweat_smile:

No, it doesn't demonstrate that, because we don't know why they said what they said. It could be that they misspoke, got confused, made a hasty assumption instead of asking first, or were too lazy to read the documentation. Heck, for all we know, they could have been intentionally employing Cunningham's Law.

Sometimes the reason is "It seems too hard," but other times it's more like, "It's not hard; I'm just too lazy" or one of a hundred other reasons.

Isn't this true whenever anyone tries anything new? Like if I try rock climbing for the first time, I don't know what I don't know about rock climbing, and I don't even know what questions I should be asking about rock climbing.

Already linked above:

You don't have to sit there and stare at it. You can go AFK and do other things while it runs.

Depending on the type of data in your backup, you may be able to make it faster by turning off compression. (Uncheck the "Compress backup" option in the GUI or use the --no-compress flag on the command line.)

You're an extremely valuable project contributor. You shouldn't be deterred from looking at certain GitHub issues for fear of people waging religious wars over file systems in the comments. We already moderate the comments on issues for off-topic and inappropriate comments like these. If we're ever being too lenient, let me know, and I'll take care of it. I'll also take another look at that specific issue now. I don't want you or anyone else to have to worry about stuff like this, and I certainly won't allow it to stand in the way of productive work.

4 Likes

Yes, I just had a 36-hour command run and was shocked to find that my laptop didn't blink. No screen, no activity for 12 hours; it just kept doing its thing.

I am new to Linux; maybe this is all normal here. I'm not used to it.

  • "Focus on qubes" stated differently: Qubes' "backup" is a VM recovery tool, not a backup utility. You usually don't end up with a copy on different media, so it is not really a backup.
  • Security gets in the way, which is understandable. Connecting external storage (a disk or a qube's network) to dom0 is a small DIY engineering effort rather than a config option. Some config detail (DNS, network, proxy, disk type, filesystem, etc.) somewhere is always brittle and may stop working depending on whether the instructions you follow were written for your exact config or not (a slightly different OS or Qubes version, a template for a different distro, USB attached in this place rather than that, etc.).
  • What would work for me:
    • A dedicated "backup proxy" qube that can be easily configured to attach a physical device, mount NFS, or access the Internet. You can power it on/off and enable one or more of these targets. That's all. I know "there's a similar proxy VM template that can be easily configured to do that", and I disagree with that statement. I'd rather have a backup proxy VM that does just one thing and one thing alone than spend hours troubleshooting a generic proxy setup to save the 1 GB of extra disk space a dedicated backup proxy VM would take.
    • A backup CLI which reads a CSV with a qube_name, [dir_name_1, dir_name_2] type of config and recursively copies files to backup archives, with or without encryption (see the sketch after this list).
    • I don't care about a GUI or any fancy options. I just want one config file for the backup, another for the proxy VM (to enable/configure a particular backend), and then to run backup --target bkpTgt02 --password s3cr3T --suffix '-20240824' in dom0 or even in a qube (not everyone here is a secret agent).
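
The CSV-driven CLI from the list above could start as something as small as this dom0 sketch; the script, file names, and one-directory-per-line format are all hypothetical, and only qvm-run --pass-io is doing the real work:

```
#!/bin/sh
# Hypothetical backup.csv format, one qube/directory pair per line:
#   work,/home/user/Documents
#   personal,/home/user/projects
set -eu
CSV=/home/user/backup.csv
OUT=/home/user/backups/$(date +%Y%m%d)
mkdir -p "$OUT"

while IFS=, read -r qube dir; do
    # Stream a tarball of the directory out of the qube over qrexec.
    # Pipe through gpg -c (or similar) here if encryption is wanted.
    qvm-run --pass-io "$qube" "tar -cz -C / '${dir#/}'" \
        > "$OUT/${qube}-$(basename "$dir").tar.gz"
done < "$CSV"
```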

In my case I can barely get the basic features to work, and every time my Qubes 4.2 boots it still takes me to that "finish Qubes configuration" screen, so I don't believe I could actually restore VMs in any case. Like many others, I don't care that much about VM/OS backup, so I don't use the built-in backup tool either. It's faster for me to reinstall than to restore (assuming a restore could successfully complete).

Does this impact me? Well, because of these difficulties with backups, I tend to use Qubes for tasks that don't create or change data (paying internet bills, for example) and also don't store data in offline VMs such as the vault (too complicated to get data in); instead, I use a web vault from my Internet-connected VM.

This data protection feature isn't ideal, and it's the main reason why I am not fully committed to Qubes - it works great for 5% of my needs, but I use a lousy non-Qubes system for the other 95% (including a few CUDA-related apps). I still wonder if it'd be better to consolidate all that in a KVM box for a higher level of "average" security and easier data management.

Why don't people make backups

I have 40 qubes to back up in my Qubes OS. The backup takes around 300 GB of space on my external HDD, and, worse, it takes 3 to 4 hours to finish. Such a long time span requires me to reserve a time period in advance, which is a hindrance. I can't just sit down, plug in the external drive, and do a quick backup.

If you salt your templates, then you have a self-documenting system, and
all you have to do is back up those salt files.
The change in your working practice is this: instead of installing
programs in the template using dnf or apt, you make a note of what you
want installed by creating, with salt, a file named install.sls in
/srv/salt/ like this:

my_new_template_packages:
  pkg.installed:
    - refresh: True
    - pkgs: 
      - qubes-core-agent-networking
      - qubes-core-agent-passwordless-root
      - qubes-gpg-split
      - openssl
      - neovim

and apply it with:
`sudo qubesctl --skip-dom0 --targets=TEMPLATE --show-output state.apply install`

If you want to install another program just add it to the pkgs list
and apply the file again.
After a while, this becomes second nature.

An advantage of using salt is that the same file can be used for
various distributions. Sometimes, you hit an issue where a package has a
different name in (e.g.) Debian or Fedora - salt has simple mechanisms for
dealing with this. In my experience even "unsophisticated" users are
able to get a working knowledge of basic salt.

I never presume to speak for the Qubes team.
When I comment in the Forum I speak for myself.

4 Likes

Installing Qubes with btrfs, using wyng and wyng-util-qubes, and salting my qubes (all of these steps together, not individually) have encouraged me to back up more regularly and to automate much of what I had once considered burdensome. Having wyng run daily backups to an off-site storage pool gives great peace of mind, and when changes are made to standalones or templates it has quickly become second nature to update the salt scripts (in dom0), which are automatically backed up by the wyng script I've written for my use case.

But previously, yes, I had a calendar reminder to back up all qubes using the GUI, once weekly, before bed… and I had to purchase ever-larger storage drives to contain at least three months' worth of backups.

2 Likes

I've finally done a couple of back-ups and restores of my 'personal' Qube, at 300 GB a time.

Before this thread was started, I wasn't aware of this facility.

I did back-ups to the /home folder in Dom0, on the internal drive. Each back-up and restore took about 10 seconds.

I didn't need to read the documentation. I think the average person will not have any problems using this software, once they're aware of it.