Backup strategies for Qubes OS

@disp6252

If you have ever used Bacula, you would know that Wyng (which is still not ready, IIUC) is no match for it. AFAIK, Wyng is not even planned to have Bacula’s advanced features. The two are quite different.

Just type “qvm-copy {filename}” in the source qube,
or open the file browser, right-click the file to transfer and select “copy to VM”.

Never mind. I just saw the “isolated” part.

There was some discussion somewhere about moving to ZFS which, if implemented, would make backups (i.e. copying snapshots) easy (for example with sanoid/syncoid). “Incremental” would be handled automatically.

@ddevz

There was some discussion somewhere about moving to ZFS which, if implemented, would make backups (i.e. copying snapshots) easy (for example with sanoid/syncoid). “Incremental” would be handled automatically.

With that approach:

Can you control which files (not) to back up? (i.e. have complex file set rules for each qube)
Can you define different backup schedules for different file sets?
Can you back up DBs incrementally?
Can you back up to tape?

If you can find the proper discussion and link to it, we can probably talk there.

I don’t know about “certain files” for a backup…that would require it to look inside the qube.

But a simple option to not back up (again) a qube that hasn’t been accessed at all since the last backup should be doable. It would probably require a time-of-last-access property, which as far as I can tell doesn’t exist.
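A minimal sketch of that selection logic, in Python. The `last_started` mapping is purely hypothetical (as said above, no such property seems to exist), and the sketch assumes every qube was included in the previous backup:

```python
from datetime import datetime


def qubes_needing_backup(last_started, last_backup):
    """Return the sorted names of qubes started after the last backup.

    last_started maps qube name -> datetime of its last start, or None if
    it was never started. This mapping is a hypothetical stand-in for a
    time-of-last-access property.
    """
    return sorted(
        name
        for name, started in last_started.items()
        if started is not None and started > last_backup
    )


last_backup = datetime(2024, 1, 10, 22, 0)
last_started = {
    "work": datetime(2024, 1, 11, 9, 0),
    "vault": datetime(2024, 1, 5, 8, 0),
    "personal": datetime(2024, 1, 12, 18, 30),
    "archive": None,
}
selected = qubes_needing_backup(last_started, last_backup)
print(selected)
```

Here only the qubes started after the last backup are selected; "vault" and "archive" would be skipped.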

  • Collaboration with grsecurity to get grsecurity kernels and userspace protection on Qubes.

That would be nice indeed.

@SteveC

I don’t know about “certain files” for a backup…that would require it to look inside the qube.

Not necessarily.

Bacula (which I mentioned on many occasions) can back up a whole network without looking inside each machine in the (insecure) sense you seem to imply. It is a multi-component system:

  • Director (which controls all other components, jobs, schedules, file sets etc.)
  • Catalog (DB for various backup related data)
  • Storage (does the actual reading and writing)
  • FileDaemon (a client running on every machine) - only this one has access to whatever you actually backup/restore inside the particular machine
  • Monitor

There can be many Storages (e.g. backup to tape, to hdd, to remote location) and FileDaemons (one for each machine).

In the case of Qubes OS replace “machine” with “qube”. The Director, the Catalog and Storage(s) can be separate qubes (sys-backup-dir, sys-backup-db, sys-backup-storage), strictly controlled by dom0 (which can look inside anything anyway).

Bacula is a network backup tool, but I suppose it would not be a great effort to customize it (e.g. through a module) to transfer data between qubes through Qubes OS’s internal mechanisms rather than over the network. I suppose it is even possible to have disposable FileDaemons, similar to default-mgmt-dvm (which handles updates), running only during backups, thus not requiring a FileDaemon service running in the qube.

So, no qube will just look inside any other qube. It may work like (simplified):

  • sys-backup-dir tells qubeN: Give me your data for backup based on file set X.
  • dom0 starts a disposable management qube which sends (encrypted) data from qubeN to sys-backup-dir. qubeN may receive some instructions (data from catalog), so it will know which files to send (incremental/differential backup).
  • sys-backup-dir tells sys-backup-storage: Store this data.
  • sys-backup-dir tells sys-backup-db: Write data about this backup job in the Catalog.
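To make the flow above more concrete, here is a toy Python simulation. It is not Bacula code and uses no Bacula API: `Director`, `file_daemon`, the in-memory catalog and storage, and the path-prefix "file set" are all illustrative assumptions, and encryption and inter-qube transport are left out:

```python
import hashlib


def file_daemon(qube_files, file_set, known_hashes):
    """Runs next to the qube's data; returns files in the file set that changed.

    file_set is modelled as a simple path prefix (an assumption).
    """
    changed = {}
    for path, content in qube_files.items():
        if not path.startswith(file_set):
            continue
        digest = hashlib.sha256(content).hexdigest()
        if known_hashes.get(path) != digest:
            changed[path] = (digest, content)
    return changed


class Director:
    """Stands in for sys-backup-dir; it never reads qube files itself."""

    def __init__(self):
        self.catalog = {}  # stands in for sys-backup-db: qube -> {path: digest}
        self.storage = []  # stands in for sys-backup-storage: stored jobs

    def run_job(self, qube_name, qube_files, file_set):
        known = self.catalog.setdefault(qube_name, {})
        # Only the file daemon touches the qube's files.
        changed = file_daemon(qube_files, file_set, known)
        # "Store this data."
        self.storage.append((qube_name, {p: c for p, (_, c) in changed.items()}))
        # "Write data about this backup job in the Catalog."
        for path, (digest, _) in changed.items():
            known[path] = digest
        return sorted(changed)


director = Director()
files = {
    "/home/user/doc.txt": b"v1",
    "/home/user/notes.txt": b"n1",
    "/tmp/cache": b"x",
}
first = director.run_job("qubeN", files, "/home/user/")
files["/home/user/doc.txt"] = b"v2"
second = director.run_job("qubeN", files, "/home/user/")
print(first, second)
```

The second job transfers only the changed file, because the catalog already knows the digest of the unchanged one.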

*Sorry for the long post, I hope it is not too off-topic.

Incremental backups, please.

Can someone please describe their system, how it is organized, and the case in which a second backup is needed, meaning what exactly from their system should be included in the second backup?

I really don’t get what there is to back up every day, or every week. Please don’t generalize, just give your clear use cases.

I want to understand. Thank you in advance.

I see no need to describe my system. It will be of no help to you.

Just think of what will happen if your box is stolen, or suffers a catastrophic drive failure.
If you check your emails, and download them to your Qubes box, then you need to back up as often as you do this. (If you leave email on the server, fine, but you may lose detail on what’s read/unread/tagged etc.)
If you create or edit files during the day, then you should back them up.
If you do not take a backup then you will lose data. This may not matter to you if you already have data stored online (e.g. the email position above), or in a git repository, or a remote server, or you just don’t care about what you are doing.
If you only use Qubes for secure browsing, I can imagine having a completely throwaway system. But once you start using bookmarks or storing cookies, you will probably want to keep them somewhere.

So the need for a backup depends entirely on what data you produce/consume and how important it is to you.
How often you back up will depend on how comfortable you are with losing data. That’s a judgement only you can make.

I hope that helps.

I never presume to speak for the Qubes team. When I comment in the Forum or in the mailing lists I speak for myself.

There is the 3-2-1 rule to know for backups (see “What is the 3-2-1 Backup Strategy?” on TechTarget):

  • 3 copies
  • 2 kinds of storage
  • 1 off-site storage location
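As a quick illustration, the rule can be written as a small check in Python. The `Copy` record and its fields are made up for this sketch, and it assumes the working copy counts as one of the three:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    label: str
    media: str     # e.g. "ssd", "hdd", "tape", "cloud" (illustrative values)
    offsite: bool


def satisfies_3_2_1(copies):
    """True if there are >= 3 copies, on >= 2 kinds of storage, >= 1 off site."""
    return (
        len(copies) >= 3
        and len({c.media for c in copies}) >= 2
        and any(c.offsite for c in copies)
    )


laptop_only = [Copy("laptop", "ssd", False)]
full_plan = [
    Copy("laptop", "ssd", False),
    Copy("usb disk at home", "hdd", False),
    Copy("encrypted remote storage", "cloud", True),
]
print(satisfies_3_2_1(laptop_only), satisfies_3_2_1(full_plan))
```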

In addition, as your backups should be encrypted, especially on remote storage, you must make sure the credentials/keys/hardware token required to access the remote backup are still available if your home burns. I’ve seen too many people locked out of their backups because their credentials were only on the computer that got destroyed/stolen.

As for the frequency: when you work with your computer, losing a day of data means wasting a day of work. This has a cost in morale, credibility, and paycheck.

Or a nuclear attack? Then cloud backup won’t do me any good either.

I would deserve this if I didn’t care for my hardware and didn’t check it on a regular basis.

I wouldn’t agree. I produce files, download emails/attachments, use Telegram Desktop, keep my phone data stored off the phone, and whatnot, yet I have it all safe and I don’t need to “backup”, incrementally or not.

Or have I organized myself so that I don’t need backups as often?

Well, it seems obvious to me that thinking about Qubes while my home burns is not high on my list. It just shows that such a thing, thank God, has never happened to you.

Does anyone who chooses Qubes? Most probably not, I would say.

Exactly. And, funnily enough, I especially like that the word “copy” was used, and not “backup”.

And how exactly will any kind of backup help you with this? This is simply beside the point, and it only proves my point that the solution lies not in the backup but in you. Because, if I can witness a home burning, nuclear attacks, catastrophic drive failures, theft (of my beauty of a laptop) and whatnot, what makes you think Qubes won’t screw up your backup while creating it in the first place? Because it does, and not that rarely, if you read the forum. Now you see why I emphasized the word “copy” in the previous paragraph.

Qubes taught me (how) to be more secure and made me feel more secure, and thus immensely more relaxed than I was. Screaming for backups only shows that people believe neither in Qubes, nor in hardware, nor in their own capability to keep their data safe, and just tend to transfer responsibility to the outer world.

Since I work in disposables only, you can imagine where the data ends up at the end of the day… That’s what I like to call “thinking with Qubes”.

Backups should be regularly tested. And copies refers to backup copies.
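One simple way to test a restore is to compare checksums of the original and the restored files. The following Python sketch does that for two directory trees; the function names and the demo files are invented for illustration, and it only checks that every original file came back intact:

```python
import hashlib
import tempfile
from pathlib import Path


def tree_digests(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        str(p.relative_to(root)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_restore(original, restored):
    """Return sorted relative paths that are missing or differ after restore."""
    src, dst = tree_digests(original), tree_digests(restored)
    return sorted(path for path, digest in src.items() if dst.get(path) != digest)


with tempfile.TemporaryDirectory() as tmp:
    original, restored = Path(tmp, "original"), Path(tmp, "restored")
    original.mkdir()
    restored.mkdir()
    (original / "a.txt").write_text("hello")
    (original / "b.txt").write_text("world")
    (restored / "a.txt").write_text("hello")
    (restored / "b.txt").write_text("w0rld")  # simulated corruption
    problems = verify_restore(original, restored)

print(problems)
```

An empty list means the restored tree matches; anything else names the files to investigate.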

You’re assuming a “backup” must entail some sort of utility that you will need in order to restore the backup. Others are using the term in a more generic sense of having other copies of data both nearby (for quick access if you type rm * in the wrong place) and in a safe distant location (in case your computer is stolen or destroyed). That general sense is also a valid use of the word backup, contrary to your (apparent) assumption.

That said, certainly the backup utility that comes with Qubes is in the class of backup utilities that will leave you in a world of hurt without the utility itself. So I don’t rely on it for anything more than saving my system configuration (yes it captures some data, but I ultimately don’t trust it for that, and am working on ensuring the data is backed up [broad sense of the term] in other ways). If I lose Qubes itself somehow, the configuration is useless; if I don’t lose it, I will be able to restore the configuration.

I use the backup utility to back up VMs, which have some of my data in them (and I need to be better at making sure all of that data is duplicated elsewhere). The overwhelming majority of my data, on the other hand, is handled…differently, let’s just leave it at that.

@unman gave you an entirely reasonable answer that matches my own case:

Re: frequency

  • emails are only stored locally and not on the server
  • using Qubes OS to do your actual work involving local files

Re: reason (things that actually happened to me)

  • motherboard dies while working at a customer’s site
  • I accidentally delete important data because I am tired / distracted

For me that’s the same as with security. I don’t need to justify my need with catastrophic scenario like “nuclear attack”, “stolen”, “government agencies” … it’s the mundane stuff I worry about. Mostly me being an “idiot” (tired/distracted) and low-life cyber criminals trying to extract data / money from me.

But I have empathy for your attitude towards backups. Most people (including myself) only get it AFTER losing valuable data and time. Once that has happened, you’ll get it too :wink:

I tried to respond to you via PM, but kissed the door.

Apologies to the rest of you - read above

Since I started to use Qubes, I haven’t lost a single byte. I am not into quoting and repeating myself from my previous post there. I have copies of my data, but not copies of backups. I have a single backup copy only, to restore Qubes.
Far too much hardware would have to die, not just the motherboard, for me to lose my data. Deleting my data is unlikely because of my routine. Even if I did, I would have to delete it accidentally in more than one location in order to lose it.

As I said, there is only one thing I will repeat: people tend to transfer responsibility off themselves.

@tempmail

I really don’t get what there is to back up every day, or every week. Please don’t generalize, just give your clear use cases.

I want to understand.

Consider:

  • a hospital processing patient data
  • a company issuing invoices every day
  • a lawyer working on clients’ cases
  • an investigating journalist compiling important evidence
  • a writer drafting the new chapter of a novel
  • a photographer capturing important images
  • the daily activity of a digital artist working on important projects
    etc.

As you can see, there are cases beyond your particular way of using computers, and in those backup is very much justified. Data (especially personal data) is today’s gold.

I would deserve this if I didn’t care for my hardware and didn’t check it on a regular basis.

It is not as simple as “It spins and software X shows there are no issues, so it is fine.” Computer chips have defects at the time of manufacturing - all of them. Some hardware is deliberately manufactured to fail after a certain time (so you can buy new stuff). Hardware is also proprietary (with very few exceptions). The point is: you cannot be sure that what seems to work now won’t fail in 1 day. However, if you have a backup, you can restore your data on different hardware.

Screaming for backups only shows that people believe neither in Qubes, nor in hardware, nor in their own capability to keep their data safe, and just tend to transfer responsibility to the outer world.

So, what is responsible? Not to have backup and “believe”?
Let’s rather have secure and efficient backup.

Oh man… Let me try to sum it up: it is obvious to me that my simple, explicit question is being perceived in exactly the same way backups are perceived. Not to mention that I’m not talking about backing up data, but about the adding of backup utility features that is being screamed for here.
Not to mention, also, that the next feature wanted would be VM indexing, then file indexing, then file content indexing of a backup.
Because, besides the fact that it would be insane to create a backup after each “new chapter of a novel”, it would be insanely hard to find a previous version of a chapter among dozens, or even hundreds, of backup copies. But the most insane thing would be incrementing a backup and losing previous versions. Oh, no. Yes, this can be overcome by file naming, track changes, etc. Oh, more things to remember and learn how to do, and in the end: “couldn’t this be simpler, I just want to be a writer…”

And all of this can be prevented at the moment of creation by manually copying A FILE to a backup location (both internal and external).

I am not writing this to respond to you. I am writing this to give future inexperienced users a right to a different, hopefully out-of-the-box, perspective.

Why would the backup tool not be able to save each increment, allowing you to restore each previous version while saving you a lot of disk space?

I assume that was what was asked for with “incremental backups”.
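To illustrate that both are possible at once, here is a toy Python model of an incremental store that keeps every version. It is only a sketch of the general idea (content-addressed blobs plus one manifest per run), not how the Qubes backup tool, Wyng, or Bacula actually work; `IncrementalStore` and the file names are made up:

```python
import hashlib


class IncrementalStore:
    """Keeps every snapshot restorable while storing each unique content once."""

    def __init__(self):
        self.blobs = {}      # digest -> content
        self.snapshots = []  # one manifest (path -> digest) per backup run

    def backup(self, files):
        """Record a snapshot; return how many new blobs had to be stored."""
        manifest = {}
        new_blobs = 0
        for path, content in files.items():
            digest = hashlib.sha256(content).hexdigest()
            if digest not in self.blobs:
                self.blobs[digest] = content
                new_blobs += 1
            manifest[path] = digest
        self.snapshots.append(manifest)
        return new_blobs

    def restore(self, index):
        """Rebuild the full file set as it was at snapshot number index."""
        return {p: self.blobs[d] for p, d in self.snapshots[index].items()}


store = IncrementalStore()
added_first = store.backup({"ch1.txt": b"draft one", "ch2.txt": b"outline"})
added_second = store.backup({"ch1.txt": b"draft two", "ch2.txt": b"outline"})
print(added_first, added_second, store.restore(0)["ch1.txt"])
```

The second run stores only the changed chapter, yet the first version of it can still be restored from snapshot 0.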

I think a huge part of this argument lies in the fact that one side is disputing the value of a “backup utility” (but usually just calls it “backups”) while everyone else is assuming that individual is advocating for never backing up data by any means.

No, Tempmail’s issue is not with making backups, it’s with backup utility software.

And to that extent s/he has a point. A lot of such software is set up in such a way that the software is required to restore the backup. If that software was installed on the machine that just died…you could have a problem, depending. If it’s the qubes backup software, you need a working installation of Qubes in order to do a restore.

And what if there’s an encryption key that you didn’t copy off the machine that died? Yes, that does happen. Fortunately the Qubes OS software uses a simple password as the basis of the key, rather than randomly generating some keyfile you have to remember to keep a copy of elsewhere.

There’s nothing wrong with automating backups (even if it’s just a bash script copying things). But it’s necessary to ask yourself if you can restore the backups onto a totally new machine…because your old machine is now a brick, or a puddle of slag, or gracing the shelf of your neighborhood burglar’s ‘fence.’ If your answer is yes, then my counter is “are you SURE”?

As I’ve said before, I don’t trust my data (except for a very minimal amount, which I constantly work on reducing), but rather just my configuration (pretty dang complicated) to Qubes OS’s backups; I have other methods for data. But even so it would still be nice to have a mode where a qube does not get backed up if it hasn’t been started up since the last backup.

[For people who didn’t grow up speaking English: ‘fence’ in this context is slang for the individual to whom a burglar takes stolen goods, in exchange for money.]
