Why would I compartmentalize always-offline data?

leesa-felonia · July 12, 2024, 11:26am

The question is simple, but I’m really excited and waiting for the answer.
Is there any benefit to separating/isolating data that will never exist unencrypted in a VM or on a real machine that has network access? I used to use one AppVM for reading and writing many different kinds of files that will never touch a network, such as password databases, notes, books, etc.

As I currently see it, as long as the files are in a VM that is not networked, and will never be in a VM or real machine that is networked, the most that can be done to them is deletion (which is not an issue, since I do regular backups). So I see no reason to separate those files, or even to switch the TemplateVM to a minimal one. I’d like to know where I’m wrong and where I’m right.

Why would one have an AppVM for password databases, another AppVM for notes, and another AppVM for books, when none of the three will ever be networked (apart from organization)? Why would one use a minimal template for that one AppVM or those three AppVMs, when it/they will never be networked?

aronowski · July 12, 2024, 11:34am

Different policies for what they are allowed to do, e.g. acting on behalf of a GPG frontend domain, copying passwords, or one-way copying of documents to the offline domain but not from it.

Example here:

- notice how, for instance, the “vault-pgp” domain is not allowed to utilize anything regarding the “qubes.Clipboard” policy.

tempmail · July 12, 2024, 1:40pm

Exactly because of that. Any malicious code can destroy your data, regardless of whether it is offline or not. Not only application software can contain such code, but even system software. Data is never safe. There are unsafe places and less unsafe places. For example, a vault VM is the least unsafe place, not safe by any means. And data that is never run in the qube where it is stored is the least unsafe. So I keep all data in one place but modify it in different dispVMs: copy it to a dispVM, modify it as needed, and copy it back to the storage VM.

leesa-felonia · July 12, 2024, 2:48pm

From what I understand from you, I would say the following: no other VM would ever need to talk in any way to my vault AppVM. I’m asking why I would compartmentalize its data. It’s only used to work with the data inside it, none of which will ever go anywhere else.

leesa-felonia · July 12, 2024, 2:55pm

I’ve already mentioned that I have no problem with data being destroyed, since I would then be having a backup. I can afford losing the data that’s been added after the backup.

This is interesting though, thanks!
But does that make any difference? It’s still the case that the most that can be done is data being deleted, which I can afford since I do regular backups. Isn’t that right?

aronowski · July 12, 2024, 4:39pm

If there’s a KeePassXC database in that offline vault and the clipboard-related policies are permitted for it, it’s possible to accidentally copy something other than the intended password and paste it into another domain.

Separation and proper policies can help with this issue. Same thing as allowing only some domains to kindly ask the GPG backend domain to perform a cryptographic operation and serve the result - wouldn’t want an untrusted domain doing so and me risking a GUI attack that would make me press the Enter key accidentally and have an attacker sign something with my private key.

boreas · July 12, 2024, 6:37pm

If you were to connect your (maybe) single up-to-date backup (that is offline) to your backup disk (also offline) with malware inside, it could lead to the deletion of the backup, as well as of the offline qubes, afaik.

leesa-felonia · July 12, 2024, 7:03pm

I’ll kindly say that it’s always possible to accidentally do something wrong at any time and in any place. The only thing one can do is pay attention to one’s health, and perhaps not stay up late doing serious work with data.
I’m sorry that I’m not quite convinced by this idea, and I’ll just move on.
Keep in mind that what you’ve just said is interesting to me and I’ll think about it more later. Thank you, of course!

Could you please provide an example? I don’t understand. I said that I’ll copy nothing from that AppVM to any other VM. Doesn’t that mean what you said is unrelated here? I’m sorry. I kindly ask you to help me understand what you mean by this.

leesa-felonia · July 12, 2024, 7:51pm

Here’s what I would do: The backup is ‘backup.tar.gz.gpg’; I’d create a new DispVM that is connected only to the disk and nothing else. Then I’d send the backup from that DispVM to another new DispVM, and decrypt and decompress it there.

I believe now this is the ultimate question: Is there a lockout? If the answer is yes, then I’m absolutely convinced to compartmentalize always-offline data.

Here, by lockout I mean that I’ll no longer be able to use the backup due to the malware inside it.

Here’s what I think: I’d decrypt and decompress the backup and have it. What I believe is the following: there’s no way for any malware to run while I’m decrypting and decompressing. If that’s true, then I believe there’s no lockout, since I could then scan the backup, remove the malware, and have lost nothing.

–
Is the following true? Malware could be inside the always-offline data and delete parts of it without one noticing. This way, regular backups would become useless, since the malware could remove data after it was added and before it was backed up, without one noticing.

If it is, then I’m convinced. If it is true, I’d like many people to say so. Accepting “yes” should be harder than accepting “no”, because accepting “no” would just make one search more, but accepting “yes” would make one do something, which may be dangerous if “yes” was not the correct answer. Thus, I would like to hear “yes” from many people.

tempmail · July 14, 2024, 9:33am

Yes, I read what you wrote. Without going further into why I think your approach is not failsafe, even without knowing your threat model, I must say that in decades I have never seen anyone serious about computers say they don’t care about data because “they have a backup”.
Your idea looks to me like negating the whole base philosophy of Qubes: compartmentalization. I am afraid that when you learn this, it’ll hurt, and I would bet that you would then somehow blame Qubes.

And, it’s not about IF, but WHEN.

I apologize if this looks harsh, but I find it better to say it and have it never happen than to know it, remain silent, and have it happen.

XMachina · July 14, 2024, 10:06am

I choose to compartmentalize offline data for a few reasons, but mainly because unless that qube is entirely isolated (no filecopy in, no copy/paste in, CPU cores isolated, data always at rest and only opened one-at-a-time in dispVMs, etc.) there are still ways the data can be extracted. Computers are fickle, and it is unwise to trust that you are fully in control if you don’t fully understand the system. It wouldn’t be trivial, but for example, a qvm-copy could be hijacked to export sensitive data to a non-isolated qube for extraction.

Also, a word of caution (regarding your stance on data destruction): don’t be too quick to discount things because you’ve considered the possibility. Things don’t go wrong where you’ve planned for them, but where a small hole in that plan gets exploited.

gonzalo-bulnes · July 14, 2024, 1:06pm

Unless I’m missing something myself, I think you’re right.

Putting aside the comments along the lines of “you may not be doing your backups correctly”, because that was not your question, I would mention that, depending on the data and how you use it, deletion may not be the worst scenario: data could merely be modified, but in a way that isn’t immediately obvious.

Whether that risk is significant for you, only you can tell. But in theory, the malware that modifies your banking data might have got to your storage qube via an entertainment movie and have become active when you played that movie with XYZ video player. In that sense, if that risk is significant in your threat model, then more compartmentalization, strictly storage-only qubes (maybe with disposable qubes for working with the data), and minimal templates could all be part of risk mitigation.

Now, frankly, a scenario like that seems very targeted, and certainly fairly sophisticated, so it’s probably less than likely for most people in most circumstances, and the usability vs security trade-off is likely significant. That is to say: I mention this scenario because I think it matches your question, but use your threat model; not all risks are worth the cost of mitigation!
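
To illustrate how silent deletion or modification between backups could at least be noticed, here is a minimal Python sketch that records a SHA-256 manifest of a directory and compares two manifests. The function names `build_manifest` and `diff_manifests` are hypothetical, it is not a Qubes tool, and it assumes the older manifest is kept somewhere the software in the qube cannot rewrite; otherwise malware could simply update it too.

```python
import hashlib
from pathlib import Path


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    root = Path(root)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            manifest[path.relative_to(root).as_posix()] = digest
    return manifest


def diff_manifests(old, new):
    """Return (deleted, modified, added) as sorted lists of relative paths."""
    deleted = sorted(set(old) - set(new))
    added = sorted(set(new) - set(old))
    modified = sorted(p for p in set(old) & set(new) if old[p] != new[p])
    return deleted, modified, added
```

One possible use would be to build a manifest right after each backup and diff it against a fresh one before the next backup: any path reported as deleted or modified that you did not change yourself deserves a closer look. This only detects changes; it does not prevent them.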

aronowski · July 14, 2024, 8:57pm

Yes.

Let’s consider a scenario where an attacker wants to sign something with my private key, which is located in the GPG backend domain, and by definition is not extractable.

One way of performing such an attack would be to compromise some domain that can talk to the GPG backend domain and monitor my keystrokes to invoke the request of “Do you want to sign this?” the very moment I’m about to press the Enter key, making me press it the very moment a new window spawns from the GPG backend domain, not the compromised frontend one.

One way of mitigating this is to set proper policies, so that:

- only certain frontend domains (e.g. the one with my Git projects, or my email client) are allowed to talk to the backend one
- I need to additionally confirm the qubes.Gpg operation from a dom0 prompt, or even better manually type the backend domain’s name

The first one is more on-topic here. There’s no reason why I’d want a domain used for random web browsing allowed to talk to the GPG backend domain. Or for the sake of compartmentalizing offline domains, we can substitute “random web browsing” with “running an offline malware analysis lab”.
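
As a rough illustration of why per-domain policies need separate domains, here is a toy Python model of rule matching. It is not Qubes’ actual policy engine or policy file syntax; the `Rule` class, the `decide` function, the first-match-wins ordering, and the qube names `work-git`, `work-email`, and `malware-lab` are all assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    service: str  # RPC service name, e.g. "qubes.Gpg"
    source: str   # calling qube, or "*" for any qube
    target: str   # qube that would serve the call, or "*" for any qube
    action: str   # "allow", "ask", or "deny"


def decide(rules, service, source, target):
    """Return the action of the first matching rule; deny if nothing matches."""
    for rule in rules:
        if (rule.service == service
                and rule.source in ("*", source)
                and rule.target in ("*", target)):
            return rule.action
    return "deny"


# Hypothetical rule set: only two frontends may ask the GPG backend to sign.
RULES = [
    Rule("qubes.Gpg", "work-git", "vault-pgp", "ask"),
    Rule("qubes.Gpg", "work-email", "vault-pgp", "ask"),
    Rule("qubes.Gpg", "*", "*", "deny"),
]

print(decide(RULES, "qubes.Gpg", "work-git", "vault-pgp"))     # ask
print(decide(RULES, "qubes.Gpg", "malware-lab", "vault-pgp"))  # deny
```

The point of the model is that rules are keyed on whole domains: if the Git projects and the malware lab lived in the same qube, no rule could tell their requests apart.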

adw · July 14, 2024, 10:03pm

Summary of reasons, including some already mentioned by others above:

- RPC policy rules and native Qubes backups apply on a per-qube basis. For example, you might want to back up some data separately or compress some qubes but not others. (For example, text files are generally highly compressible, whereas image files are not, because they’re usually already in compressed file formats.) You might want to have different RPC policy rules for different qubes.
- Data from different sources likely has different trust levels and associated risks. For example, you might not want to mix files from the Internet with files you create yourself in an offline qube, since files from the Internet may contain malware. You may have some data with high integrity or assurance needs that you don’t want to risk being modified by malware embedded in cat videos.
- Side channel attacks make it at least theoretically possible for malware to leak data even from qubes that are completely offline.
- After you commingle data of different trust levels, the trust level of the commingled data set generally drops to the lowest common denominator. Since you can’t undo this commingling, a cautious approach would be to avoid it in the first place in case you ever change your mind in the future and decide that you want or need some of the data to exist outside of the offline qube. It’s up to you how cautious you want to be and which trade-offs between security and convenience are worthwhile to you.
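
The “lowest common denominator” point can be expressed as a tiny Python sketch. The `Trust` levels, the `qube_trust` function, and the file names are hypothetical and only illustrate the reasoning; real trust assessment is a judgment call, not a number.

```python
from enum import IntEnum


class Trust(IntEnum):
    UNTRUSTED = 0  # e.g. files downloaded from the Internet
    PERSONAL = 1   # e.g. notes you wrote yourself
    HIGH = 2       # e.g. password databases


def qube_trust(data_sets):
    """Trust of a qube holding mixed data: that of its least trusted data set.

    Expects a non-empty mapping of data set name to Trust level.
    """
    return min(data_sets.values())


separate = {"passwords.kdbx": Trust.HIGH}
commingled = {"passwords.kdbx": Trust.HIGH, "cat-video.mp4": Trust.UNTRUSTED}

print(qube_trust(separate).name)    # HIGH
print(qube_trust(commingled).name)  # UNTRUSTED
```

In this model, adding one untrusted file drags the whole qube down to `UNTRUSTED`, and removing the file later does not restore the earlier level, because anything it may have touched is now suspect as well.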

It’s at least theoretically possible for maliciously malformed data to exploit a hypothetical vulnerability in the decryption and/or decompression software that is parsing the data.

In general, you can’t assume that you’d always be able to detect and remove malware.

If you run any software in the offline qube, then yes, this is possible. That’s how some ransomware operates.