Double door system when copying files between qubes

The sanitization part sounds really good actually

+1 for sanitization, it would be really nice to have

@qubist

Sometimes you do have to follow that workflow. A quick example is something along the lines of getting a USB stick with sensitive data on it. You would want to attach the USB stick to sys-usb, send the files somewhere for sanitization, then move them to a vault VM for long-term storage.

@qubist wrote:

There is a misunderstanding IMHO. I’ve given and focused on two examples where and how a double-door could help (sanitizing and doublechecking). The story of the investigative journalist is just a scenario where those two examples of double-doors are used. The process of redacting is not part of those two examples but part of the scenario.

Considering the rest of the discussion, it seems to me the confusion comes from the fact that you use generic terms (copying files between qubes) for actually describing a very specific thing (copying one particular file type, PDF, processing it with a particular rasterizer tool).

Ah, that’s indeed confusing. I’m sorry for that!

Imagine, you want to copy a secret file from a more trusted qube X to a less trusted qube Z.

When I read this, I think of something of ultimate secrecy, e.g. a private key. My first association in regards to sanitizing a received file is “removing malware from a file” (e.g. antivirus cleanup).

Sorry for giving a confusing example or description. TBH I focused on the principle of how to construct the double door system rather than why.

After your example about the journalist, the suggested workflow implies any file format (not just PDF) and is not limited to a particular tool (qube’s rasterizer). The security model in my workflow is based entirely on compartmentalization and not on a particular “sanitizer” tool, which itself can be vulnerable (e.g. through the libs it uses) and perhaps propagate the dirtiness of data to the “sanitized” file.

Thank you for clarification.

IIUC, your workflow (second example) suggests:

distrusted input → [less trusted qube X] → “sanitize” → review → [more trusted qube Z]

Personally, I would never do that and I would dislike seeing such a suggestion in a doc, because in such a workflow, everything is only as secure as the sanitizer package. The very concept of a data flow in the direction distrusted → trusted is insecure IMO, whatever in-between steps there might be.

It’s rather the following data flow:

1. [X] --copy-distrusted-file-F-to--> [Y]
2. [Y] --sanitize-F-in-a-disposable-qube-and-send-sanitized-file-F'-back-to--> [Y]
3. [Y] --move-F'-to--> [Z]
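
The three steps above can be sketched as a small Python model. This is only an illustration of the data flow, not how Qubes OS implements it: the `Qube` class, `sanitize_in_disposable` and `double_door` are hypothetical names, and the "sanitizer" is a stand-in that merely drops non-printable bytes.

```python
from dataclasses import dataclass, field


@dataclass
class Qube:
    """Toy stand-in for a qube: a name plus the files it holds."""
    name: str
    files: dict = field(default_factory=dict)  # filename -> bytes


def sanitize_in_disposable(data: bytes) -> bytes:
    """Hypothetical sanitizer running in a throwaway qube.

    As a placeholder it keeps only printable ASCII, tabs and newlines.
    """
    disposable = Qube("disp-sanitizer", {"in": data})
    cleaned = bytes(
        b for b in disposable.files["in"] if 32 <= b < 127 or b in (9, 10)
    )
    return cleaned  # the disposable goes away when the function returns


def double_door(x: Qube, y: Qube, z: Qube, name: str) -> str:
    # Step 1: X copies the distrusted file F to Y.
    y.files[name] = x.files[name]
    # Step 2: Y has F sanitized in a disposable and receives F' back.
    sanitized_name = name + ".sanitized"
    y.files[sanitized_name] = sanitize_in_disposable(y.files[name])
    # Step 3: Y moves F' to Z (a move, so no copy of F' stays in Y).
    z.files[sanitized_name] = y.files.pop(sanitized_name)
    return sanitized_name
```

Note that Z only ever receives F', never F, and that the result is exactly as trustworthy as the placeholder sanitizer, which is the point debated in this thread.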

Indeed, the sanitizer is crucial. But you will never get perfect security.

You can, of course, consider all qubes to have the same security level.

I don’t think in levels. I think of purpose and data flow - one qube does one thing only and any qube “tells less than it hears”.

I like this approach. But I also think in security levels. :wink:

Opening a potentially malicious file could be harmful for the whole system.

The rasterizer also opens the file it rasterizes.

And: Since a sanitized file is more trustworthy, you can save it to a “more trusted” qube (whatever “more trusted” means to you).

As per my previous reply, without an objective and verifiable measure it means nothing. Any file not created by me from scratch in a clean offline qube is distrusted, and so is its presence.

Opening a file from an unknown source could be harmful. That’s why the mentioned sanitization tool(s) exist. If you don’t open any file then you have – roughly said – a write-only workflow.

It’s a question of what risk you want to take. If you open a distrusted file in a disposable VM, bad things could happen. If you “sanitize” it beforehand, bad things could happen, too.

The only exception might be a short plain-text file.

A short plain-text file could also be dangerous (think of UTF-8). :wink:

When connecting several double-doors in a row, it looks like a multi-door, of course.

You can name it multi-single-door, if you will. Or even-more-multi-half-door :slight_smile:

A door heap. :slight_smile:

It’s a question of what risk you want to take. If you open a distrusted file in a disposable VM, bad things could happen. If you “sanitize” it beforehand, bad things could happen, too.

Assuming an offline disposable, I don’t see what bad things could happen. If that is an actual problem (e.g. a malicious PDF can result in VM escape), that would mean the whole security model of Qubes OS is useless, which is not the case.

A short plain-text file could also be dangerous (think of UTF-8). :wink:

Any examples?

@qubist wrote:

It’s a question of what risk you want to take. If you open a distrusted file in a disposable VM, bad things could happen. If you “sanitize” it beforehand, bad things could happen, too.

Assuming an offline disposable, I don’t see what bad things could happen. If that is an actual problem (e.g. a malicious PDF can result in VM escape), that would mean the whole security model of Qubes OS is useless, which is not the case.

You may choose what’s more risky for you: VM escape from an offline disposable VM by a malicious PDF or by a “sanitized” PDF. It depends on what you trust more: an unknown PDF or a sanitizer tool from a known source.

A short plain-text file could also be dangerous (think of UTF-8). :wink:

Any examples?

I found this but I can’t say whether it’s applicable to plain-text files or not:

You may choose what’s more risky for you: VM escape from an offline disposable VM by a malicious PDF or by a “sanitized” PDF. It depends on what you trust more: an unknown PDF or a sanitizer tool from a known source.

You are missing the point. The question is what bad things can happen in an isolated VM and can those things affect anything outside that VM.

In the end, this is not a comparison between unknown and sanitized PDF. The comparison is between the attack surface of Python (quite big), used by the sanitizer tool, and that of e.g. xpdf (less than 4 MiB with all dependencies). Suppose there is a bug in a Python lib, which, when properly exploited, preserves the dangerous parts of a malicious PDF inside the “sanitized” PDF, which you trust.

So, it is not a matter of choice but of objective factors. What you trust is what can affect you. My workflows suggest zero trust and zero assumption that the incoming data is safe. Of course, I reserve my right to be wrong at all times.

A short plain-text file could also be dangerous (think of UTF-8). :wink:

Any examples?

I found this but I can’t say whether it’s applicable to plain-text files or not:

Special characters in window titles do not render · Issue #1059 · QubesOS/qubes-issues · GitHub

IIUC, this is about dom0 (which displays the window titles), not about opening a UTF-8 text file in an offline disposable and the potential dangers of it.

Use your Debian template to build another door: in the Debian template, use Synaptic to install all forensics, then build a VM called “door” (or whatever you like). When you want to copy or move a file to another VM, copy or move it to “door” first, check the integrity of the file there, and then copy or move it again.
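
One possible reading of "check the integrity" is comparing the file against a digest obtained from a trusted channel. A minimal sketch with Python's standard `hashlib`, assuming such a reference digest exists; the function names are made up for illustration:

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the data as lowercase hex."""
    return hashlib.sha256(data).hexdigest()


def is_intact(data: bytes, expected_hex: str) -> bool:
    """True if the data matches the expected digest (case-insensitive)."""
    # compare_digest compares in constant time.
    return hmac.compare_digest(sha256_hex(data), expected_hex.lower())
```

A matching digest only shows that the file is the one the publisher hashed. It says nothing about whether the content is benign.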

@qubist wrote:

You may choose what’s more risky for you: VM escape from an offline disposable VM by a malicious PDF or by a “sanitized” PDF. It depends on what you trust more: an unknown PDF or a sanitizer tool from a known source.

You are missing the point. The question is what bad things can happen in an isolated VM and can those things affect anything outside that VM.

In the end, this is not a comparison between unknown and sanitized PDF. The comparison is between the attack surface of Python (quite big), used by the sanitizer tool, and that of e.g. xpdf (less than 4 MiB with all dependencies). Suppose there is a bug in a Python lib, which, when properly exploited, preserves the dangerous parts of a malicious PDF inside the “sanitized” PDF, which you trust.

Okay, you focus on the attack surface (big vs. small) while I focus on known vs. unknown source (open source code vs. unknown PDF source). Python is big, so is the attack surface. I agree.

The “sanitized” PDF could indeed have dangerous parts preserved. Thus, it could be as dangerous as the original file. That’s why a disposable qube is important even when opening the “sanitized” file. I can’t disagree. But I think it could be helpful to wipe out dangerous parts from a PDF file even if you can’t catch them all.

IMHO, sanitizing files can reduce the attack surface.

Imagine a restaurant where there are two tables: a wiped one and an unwiped one. Which table would you prefer, the wiped table or the unwiped table? Both could be dirty but I prefer the wiped one. My intuition says that I can trust it more. Of course, I wouldn’t lick it either (zero trust etc. :wink: ).

So, it is not a matter of choice but of objective factors. What you trust is what can affect you. My workflows suggest zero trust and zero assumption that the incoming data is safe. Of course, I reserve my right to be wrong at all times.

Reciting your workflow:

The safe workflow would be:

  1. Receive the file in networked qube, using a particular identity (‘receiver’)
  2. Open the file in an offline disposable to review it
  3. Redact the file in another offline qube (‘redactor’).
  4. Store the redacted version in offline qube ‘redacted-docs’.

When revealing time comes

  1. Copy the redacted file from ‘redacted-docs’ to offline disposable and view it to double check it is the right one
  2. Copy to ‘publisher’ qube and reveal it.
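
The recited steps can be read as an allowlist of qube-to-qube transfers. The sketch below is only a Python model of that idea: the edge set is my reading of the steps (in particular, that the file reaches ‘redactor’ from ‘receiver’), and it is not qrexec policy syntax.

```python
# Allowed transfers, as (source, destination) pairs taken from the steps above.
ALLOWED = {
    ("receiver", "disposable"),       # review the incoming file
    ("receiver", "redactor"),         # assumed source for the redaction step
    ("redactor", "redacted-docs"),    # store the redacted version
    ("redacted-docs", "disposable"),  # double-check before revealing
    ("redacted-docs", "publisher"),   # reveal
}


def path_allowed(path):
    """True if every hop along the path is an allowed transfer."""
    return all((src, dst) in ALLOWED for src, dst in zip(path, path[1:]))
```

For example, `path_allowed(["receiver", "publisher"])` is false: the unredacted file has no direct route to the publishing qube, and nothing ever flows out of a disposable.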

When you don’t assume that the incoming data is safe, why do you do something with it (like opening, redacting and even publishing (OMG! :scream:) it)?

A short plain-text file could also be dangerous (think of UTF-8). :wink:

Any examples?

I found this but I can’t say whether it’s applicable to plain-text files or not:

Special characters in window titles do not render · Issue #1059 · QubesOS/qubes-issues · GitHub

IIUC, this is about dom0 (which displays the window titles), not about opening a UTF-8 text file in an offline disposable and the potential dangers of it.

Okay, I just wanted to say that UTF-8 is potentially dangerous and should be distrusted.

IMHO, this discussion is going off-topic (this thread should focus on examples of how and when to construct a double door system). We should come to an end. :wink:

@pjmbraet wrote:

Use your Debian template to build another door: in the Debian template, use Synaptic to install all forensics, then build a VM called “door” (or whatever you like). When you want to copy or move a file to another VM, copy or move it to “door” first, check the integrity of the file there, and then copy or move it again.

Thank you for sharing this! That’s indeed another use case for a “door”.

Okay, you focus on the attack surface (big vs. small) while I focus on known vs. unknown source (open source code vs. unknown PDF source).

I simply explained the implications of opening the untrusted PDF in a sanitizer (and then in a viewer) vs opening it in a viewer. Considering both use FOSS, I don’t quite see what you mean by your focus.

But I think it could be helpful to wipe out dangerous parts from a PDF file even if you can’t catch them all.

Only if you intend to assume that the output is trusted. Otherwise, it has zero practical value. In fact, bitmap input can be more cumbersome to redact (depending on specifics), i.e. less practical.

When you don’t assume that the incoming data is safe, why do you do something with it (like opening, redacting

Distrusting is safe. Trusting is not. The more you trust, the more vulnerable you are. Most data is unsafe, and approaching it accordingly is the main lesson Qubes teaches and allows you to apply.

and even publishing (OMG! :scream:) it)?

Publishing can be done with a warning or through a final rasterization (you can publish bitmaps) or conversion to plain text.

@qubist wrote:

Okay, you focus on the attack surface (big vs. small) while I focus on known vs. unknown source (open source code vs. unknown PDF source).

I simply explained the implications of opening the untrusted PDF in a sanitizer (and then in a viewer) vs opening it in a viewer. Considering both use FOSS, I don’t quite see what you mean by your focus.

I just wanted to make sure that we are on the same page („focus“).

But I think it could be helpful to wipe out dangerous parts from a PDF file even if you can’t catch them all.

Only if you intend to assume that the output is trusted.

IIUC, the output of qvm-convert-pdf back to the caller’s qube is essentially a list of RGB values. No space for (malicious) code by definition IIUC.

Otherwise, it has zero practical value. In fact, bitmap input can be more cumbersome to redact (depending on specifics), i.e. less practical.

I don’t know.

When you don’t assume that the incoming data is safe, why do you do something with it (like opening, redacting

Distrusting is safe. Trusting is not. The more you trust, the more vulnerable you are. Most data is unsafe, and approaching it accordingly is the main lesson Qubes teaches and allows you to apply.

An incoming PDF document can be malicious or not. This fact does not depend on whether you distrust it or not.

BTW, your answer does not fit my question IMO.

Concerning opening a PDF document, here are my options, ordered from most risky to least risky:

  1. Opening the PDF on a non-Qubes-OS computer.
  2. Opening the PDF in a non-disposable qube.
  3. Opening the PDF in a disposable qube.
  4. Converting the PDF using qvm-convert-pdf. Then opening the result on a non-Qubes-OS computer.
  5. Converting the PDF using qvm-convert-pdf. Then opening the result in a non-disposable qube.
  6. Converting the PDF using qvm-convert-pdf. Then opening the result in a disposable qube.
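
The list above is the combination of two factors, with conversion weighing more than the environment. A short sketch that reproduces the same order; the ranking itself is this post's opinion and is disputed elsewhere in the thread, and the names are illustrative:

```python
from itertools import product

# Environments from most to least risky, as ranked in the list above.
ENVIRONMENTS = ["non-Qubes-OS computer", "non-disposable qube", "disposable qube"]


def ranked_options():
    """Return the six options, most risky first."""
    options = []
    # product() varies the environment fastest, so "converted" dominates.
    for converted, env in product([False, True], ENVIRONMENTS):
        action = "convert with qvm-convert-pdf, then open" if converted else "open"
        options.append(f"{action}: {env}")
    return options
```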

and even publishing (OMG! :scream:) it)?

Publishing can be done with a warning or through a final rasterization (you can publish bitmaps) or conversion to plain text.

If just one person downloads the malicious PDF and opens it the normal way (or provides it to other people), bad things could happen (and spread out). So, a warning is not enough IMO.

“Dangerzone” is for rasterization and optionally OCR for getting the text content AFAIK.

I just wanted to make sure that we are on the same page („focus“).

I am not on the page “open source code vs. unknown PDF source”, simply because I don’t understand what you mean by that. Opposing the openness of code (which code?) to the source (origin? or contents?) of a PDF file makes no sense to me.

IIUC, the output of qvm-convert-pdf back to the caller’s qube is essentially a list of RGB values. No space for (malicious) code by definition IIUC.

My note (to which you replied) was in regards to how you intend to treat the data processed by a particular tool, and, based on that, to consider the actual value of “sanitization”. This means:

  • if you sanitize and decide to trust the output (for whatever reason), then sanitizing probably has value in that workflow

  • if you sanitize and still distrust the output, sanitizing is simply a waste of time and resources

You explain that there is no space for malicious code in output data because it is “RGB values”. The fact is, even a rasterized PDF is not just RGB values. It still has a header and structure, which is other data. There is also metadata. How all this is processed further by a viewer/editor software and what vulnerabilities it might trigger is a complex question.
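
Both sides of this point can be illustrated with a toy example. Raw RGB data with known dimensions can be validated by a simple length check, but as soon as it is wrapped into a file, a header appears that a viewer has to parse. The sketch uses the PPM format purely because it is simple; it is an assumption for illustration and not a description of what qvm-convert-pdf actually emits.

```python
def validate_rgb(width: int, height: int, pixels: bytes, max_side: int = 10_000) -> None:
    """Reject pixel data that does not match the declared dimensions."""
    if not (0 < width <= max_side and 0 < height <= max_side):
        raise ValueError("implausible dimensions")
    if len(pixels) != width * height * 3:
        raise ValueError("pixel data length does not match dimensions")


def to_ppm(width: int, height: int, pixels: bytes) -> bytes:
    """Wrap validated RGB bytes in a binary PPM (P6) container."""
    validate_rgb(width, height, pixels)
    # The header is extra structure that every viewer must parse.
    header = f"P6\n{width} {height}\n255\n".encode("ascii")
    return header + pixels
```

The pixel bytes themselves leave little room for surprises, while the container around them is where a viewer's parser comes into play. Formats such as PNG or PDF add considerably more structure and metadata than this minimal header.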

The point is: If you are extra careful about a plain-text UTF-8, which may in some way be dangerous, you should be even more careful with far more complex data, like PDF.

An incoming PDF document can be malicious or not. This fact does not depend on whether you distrust it or not.

Of course. I never said the opposite. Distrusting something is about how you approach something safely, not about making the thing safe.

BTW, your answer does not fit my question IMO.

Your question was why I would even process distrusted data. I answered by explaining that Qubes allows you to do that safely.

If just one person downloads the malicious PDF and opens it the normal way (or provides it to other people), bad things could happen (and spread out). So, a warning is not enough IMO.

I answered what you asked. Now you are criticizing the answer for not addressing in depth a newly introduced, previously unasked question (how to safeguard the arbitrary Internet user).

To your new question: I don’t know what is “enough” and for what purpose. In any case, signing potentially dangerous files and publishing them without a warning, thus creating a false sense of safety, is far worse than publishing them with a warning.

If you have other off-topic questions, it would be better to discuss them in separate threads.