Securing file integrity in offline storage qubes

LadyPropane · November 1, 2024, 4:31pm

This is likely discussed elsewhere, but I couldn’t find anything.

Threat model: a sophisticated and motivated adversary that targets us specifically. They can’t compromise the Qubes isolation - they can’t “break out of” a qube by compromising Xen or any of the Qubes code that ensures isolation, or by exploiting a hardware side-channel vulnerability. Their goal would be to exfiltrate data or compromise as many qubes as possible.

We have an offline storage qube storage-1 that acts as file storage for documents, books, media and so on. If all we do is send files to storage-1 and open them in storage-1, any malicious file that could exploit storage-1 (e.g., a pdf exploiting the pdf reader) shouldn’t be able to exfiltrate any data since storage-1 is offline. An “offline” qube is, of course, one with “none” as the netvm; the same goes for any “offline” disposable qubes spawned from it.

However, we often need to send files from storage-1 to another, networked qube - in order to send them to someone else, for example.

How can we be sure that the file we sent to storage-1 (e.g., doc.pdf) is the same file leaving storage-1 afterwards, and not a file that renders as the same pdf, but, for example, includes a list of all other files in storage-1? Surely a malicious pdf would be able to somehow append “find ~ -type f” to any file without us noticing.
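To make the concern concrete, here is a rough illustration (not a real exploit). It assumes the pdf reader ignores trailing bytes after the %%EOF marker, which may not hold for every reader. The file contents and the appended listing are made up for the example.

```python
import hashlib

# Hypothetical stand-in for doc.pdf; the body is placeholder text, not a valid PDF.
original = b"%PDF-1.4\n...document body...\n%%EOF\n"

# Hypothetical payload: a file listing appended after the end-of-file marker.
listing = b"/home/user/secret-notes.txt\n/home/user/taxes-2024.pdf\n"
tampered = original + listing

original_digest = hashlib.sha256(original).hexdigest()
tampered_digest = hashlib.sha256(tampered).hexdigest()

# The original bytes are still there, untouched, at the start of the file...
print(tampered.startswith(original))
# ...but the SHA-256 digest of the whole file is different.
print(original_digest == tampered_digest)
```

So a file altered this way could look the same when opened, yet it would no longer be byte-identical to what was sent in.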

I can think of a few solutions:

1. Never open anything in storage-1.
1.1. Always open files in another offline qube using the “View in disposable qube” option.
This would ensure the integrity of the files in storage-1, as the “View in disposable qube” option doesn’t modify anything on the source qube.
The downside is that it would be slow, as for each file we want to view, we would have to spawn a new disposable qube.
1.2. Always open files in another offline qube (“storage-2”) that’s dedicated to viewing files, but is not disposable.
This would be much faster, as we would only need to qvm-copy the file from storage-1 to storage-2.
Possibly we can make this easier from a UI point of view with dedicated GUI options like “View in storage-2”.
However, storage-1 could still be compromised even without opening files in it - for example, by the file manager that parses the filenames and metadata.
2. Have storage-1 access the storage in a read-only way.
I’m not sure how that would work, but we can send a file to a qube (“storage-0”) that only stores the files and doesn’t spawn a file manager or much of anything. We can then access that storage from storage-1 by exposing storage-0’s storage as read-only. One issue with that is that we’d need to organize the files in storage-0, which would leave us vulnerable to file manager or terminal exploits via filenames or file metadata when we move files around in storage-0. Otherwise, we would have all files in storage-0 in ~/QubesIncoming, which is not practical. Also, it seems like some of the filename or metadata exploits would still affect the storage-0 qube.
3. Calculate the hashes of files sent to storage-1 and compare them to the hashes of files sent from storage-1. If we receive “doc.pdf” and send it to storage-1, another qube (or service?) in between could hash the file and add an entry like “f2ca1bb6c7e907d06dafe4687e579fce76b37e4e93b7605022da52e6ccc26fd2 doc.pdf” outside storage-1. Later, when we want to send doc.pdf from storage-1, the same qube or service could hash the file again and check if it has that hash (“f2ca1bb6c7e907d06dafe4687e579fce76b37e4e93b7605022da52e6ccc26fd2”) in its records, and if the filename is still “doc.pdf”. This could be presented to the user with a prompt that shows the hash, the filename and whether the hash or the filename has changed since that entry was recorded. There could be several scenarios for what the prompt would show:

“hash and filename match” - the user is safe to assume that this is the same doc.pdf they sent to storage-1 in the first place;
“hash match and name mismatch” - the user would have to decide whether they renamed doc.pdf to something else, or if doc.pdf was maliciously switched with another file (e.g., “super_secret_doc.pdf”).
“hash mismatch” - the user would have to decide whether they edited doc.pdf. They could run a diff in another qube to see the changes. That could happen after extracting an archive, as well, since the hashes of the newly-extracted files wouldn’t be recorded. So the user would have to consider that.
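The three scenarios above can be sketched roughly as follows. This is only a minimal illustration of the bookkeeping, assuming the ledger lives in a qube other than storage-1; the names `sha256_of`, `record` and `classify` are hypothetical, not an existing Qubes service, and a real version would have to hook into the inter-qube copy mechanism somehow.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def record(ledger: dict, path: Path) -> None:
    """Record an incoming file: digest -> set of filenames seen with it."""
    ledger.setdefault(sha256_of(path), set()).add(path.name)


def classify(ledger: dict, path: Path) -> str:
    """Classify an outgoing file against the ledger."""
    names = ledger.get(sha256_of(path))
    if names is None:
        return "hash mismatch"  # unknown contents: edited, extracted, or tampered
    if path.name in names:
        return "hash and filename match"
    return "hash match and name mismatch"  # known contents under a new name


# Small demo in a temporary directory with made-up file contents.
with tempfile.TemporaryDirectory() as tmp:
    ledger = {}
    doc = Path(tmp) / "doc.pdf"
    doc.write_bytes(b"%PDF-1.4 original contents")
    record(ledger, doc)  # file enters storage-1

    same = classify(ledger, doc)

    renamed = Path(tmp) / "super_secret_doc.pdf"
    renamed.write_bytes(doc.read_bytes())
    renamed_result = classify(ledger, renamed)

    doc.write_bytes(b"%PDF-1.4 original contents plus appended data")
    changed = classify(ledger, doc)

    print(same)
    print(renamed_result)
    print(changed)
```

Keeping a set of filenames per hash means identical files stored under different names don’t overwrite each other’s entries. It doesn’t address the archive case: files extracted inside storage-1 would still show up as “hash mismatch”.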

I’m not sure about how likely it would be to compromise a qube just via the filenames and metadata, how easy and useful it would be to expose one qube’s storage as read-only to another qube, or how easy and useful it would be to calculate the hashes of incoming and outgoing files. I’m looking forward to a productive discussion on this.

FranklyFlawless · November 2, 2024, 7:26am

You can create multiple storage VMs, then dedicate them for one file each so any malicious PDFs would only be able to exfiltrate the same file list as any other storage VM. You can also convert the PDF to a trusted PDF, using Qubes’ built-in tool or Dangerzone, before sending it to the storage VM. Additionally, you can change the file permissions of the PDF to read-only so no malicious code execution could take place.

theo_wreck · November 2, 2024, 7:47am

You can create multiple storage VMs, then dedicate them for one file each so any malicious PDFs would only be able to exfiltrate the same file list as any other storage VM.

Can you expand on that…? I don’t get it… a whole VM for one file??? Isn’t that what disposables are for, like in the DVM section of the qube itself?

FranklyFlawless · November 2, 2024, 8:08am

There are a few issues with DispVMs that make this workflow problematic for file transfer. Since a DispVM shuts down when all applications and windows belonging to it are closed, the incoming and outgoing file transfers to and from the DispVM have to occur in the same session. If power to the machine is abruptly interrupted during this session, the files within the DispVM may be lost unless you have knowledge of file recovery techniques using digital forensics.

theo_wreck · November 2, 2024, 9:11am

So what’s the solution to that… to create a qube each time for just one file???

FranklyFlawless · November 3, 2024, 9:39am

I provided three different suggestions, so feel free to carefully read them again. If there is any need for more, I will generate them.