Is it safe to use hyper-threading (SMT) with Qubes OS if done the 'correct way'?

barto · February 18, 2024, 10:14pm

[irrelevant comment retracted]

PeakUnshift · February 18, 2024, 11:46pm

When reading these discussions it looks like the threat is major, but if everyone running Windows/OSX/Fedora/Debian/Ubuntu/Archlinux/etc. is vulnerable (servers included?), then I’m quite confused, because everyone not running QubesOS should be sweating. I’m not saying it isn’t serious or that security isn’t important. I’m just trying to find my personal balance between security and usability, and a way to avoid going back to Fedora because I can’t play a video smoothly.

barto · February 19, 2024, 12:05am

[irrelevant comment retracted]

capsizebacklog · February 10, 2025, 2:16pm

I don’t know how you feel about me bumping a year-old topic, but it was either that or starting a new one, and I think it makes more sense to continue this one.

It seems the reason SMT is disabled on Qubes OS isn’t that it is a known vulnerability in itself, but that it is unnecessary attack surface where a new 0-day exploit could be found some day in the future.

That raises the question: where do we draw the line on what counts as unnecessary attack surface? An internet connection is the greatest attack surface you can have, yet we accept that threat.

I am no expert at this, but my research suggests that all the Spectre and Meltdown vulnerabilities are mitigated, or don’t even exist, if you have a modern laptop and keep your system up to date. For example, Debian systems with an Intel CPU have the intel-microcode package to keep the CPU microcode updated.

You can also test whether your system is vulnerable with inxi and with speed47/spectre-meltdown-checker on GitHub, a vulnerability/mitigation checker for Linux & BSD that covers Reptar, Downfall, Zenbleed, ZombieLoad, RIDL, Fallout, Foreshadow, Spectre and Meltdown. It shows whether you are affected by, vulnerable to, or mitigated against each of the Spectre and Meltdown variants. “Affected” means the hardware (the CPU as it left the factory) is known to be subject to a given vulnerability. “Vulnerable” means your system has no mitigation in place against it.

If you run this script on an up-to-date system, it should show that you’re not vulnerable even with hyper-threading enabled.
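For a quick look without installing anything, reasonably recent Linux kernels also publish their own view of these issues under `/sys/devices/system/cpu/vulnerabilities/`, and the SMT state under `/sys/devices/system/cpu/smt/active`. The sketch below just reads those files; it is not how spectre-meltdown-checker works internally, and the function names are made up for illustration. One thing worth watching for is a status that starts with `Mitigation:` but still ends in `SMT vulnerable`, which some kernels report for a few issues while hyper-threading is on. Under Xen (so in Qubes dom0 or a qube), the guest kernel’s view may not tell the whole story, since the hypervisor applies its own mitigations.

```python
from pathlib import Path

# Standard Linux sysfs location; other systems will not have it.
SYSFS_CPU = Path("/sys/devices/system/cpu")


def _read(path):
    """Return the stripped file content, or None if it cannot be read."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def read_vulnerabilities(cpu_dir=SYSFS_CPU):
    """Return {issue name: kernel status line} for every readable entry."""
    vuln_dir = Path(cpu_dir) / "vulnerabilities"
    if not vuln_dir.is_dir():
        return {}
    report = {}
    for entry in sorted(vuln_dir.iterdir()):
        status = _read(entry)
        if status is not None:
            report[entry.name] = status
    return report


def smt_active(cpu_dir=SYSFS_CPU):
    """True/False if the kernel reports the SMT state, None if unknown."""
    value = _read(Path(cpu_dir) / "smt" / "active")
    return None if value is None else value == "1"


def still_vulnerable(report):
    """Issues whose status line starts with 'Vulnerable'."""
    return [n for n, s in report.items() if s.startswith("Vulnerable")]


def smt_caveats(report):
    """Issues the kernel marks as mitigated but still 'SMT vulnerable'."""
    return [n for n, s in report.items()
            if not s.startswith("Vulnerable") and "SMT vulnerable" in s]


if __name__ == "__main__":
    report = read_vulnerabilities()
    for name, status in report.items():
        print(f"{name:36} {status}")
    print("SMT active:        ", smt_active())
    print("Unmitigated:       ", still_vulnerable(report) or "none reported")
    print("Mitigated, SMT gap:", smt_caveats(report) or "none reported")
```

An empty “Unmitigated” list only means the kernel knows of no unmitigated issue today; it says nothing about issues that have not been discovered yet.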

But as I said at the beginning, that doesn’t rule out new 0-day vulnerabilities in the future that might be avoided with SMT disabled.

OvalZero · February 10, 2025, 6:41pm

SMT implementations typically share TLB and L1 caches between threads. This can make cache timing attacks much easier, and one has to assume that this will make several “spectre-like” bugs exploitable. While it’s generally a bad idea to run different security domains on different processor threads on the same core, it’s not trivial to modify a scheduler to take this into account (gang scheduling → schedule different security domains on different physical processors).

Xen implements this partially (see “Xen Project Schedulers”) … but strict core granularity doesn’t work with newer hybrid chips (e.g. Alder Lake and newer) … yet.
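To make the “one security domain per physical core” rule concrete, here is a toy sketch. It is not how Xen’s scheduler works (a real core-aware scheduler time-slices whole cores between domains rather than partitioning them statically); it only illustrates the invariant that sibling threads of a core never belong to two domains at once. The sibling strings use the format Linux exposes in `/sys/devices/system/cpu/cpuN/topology/thread_siblings_list`; the function names and domain labels are hypothetical.

```python
from collections import defaultdict


def parse_cpu_list(text):
    """Parse a kernel CPU list such as '0,4' or '0-3,8' into sorted ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def group_by_core(sibling_lists):
    """Turn {cpu: siblings string} into a sorted list of cores.

    Each core is a tuple of the hardware threads that share it, so a
    core without SMT shows up as a one-element tuple.
    """
    return sorted({tuple(parse_cpu_list(s)) for s in sibling_lists.values()})


def assign_domains(cores, domains):
    """Hand out whole cores round-robin; a core is never split."""
    if len(domains) > len(cores):
        raise ValueError("more domains than cores: static split impossible")
    plan = defaultdict(list)
    for i, core in enumerate(cores):
        plan[domains[i % len(domains)]].extend(core)
    return dict(plan)


if __name__ == "__main__":
    # Example topology: 4 cores with 2 threads each, plus 2 cores without SMT.
    siblings = {0: "0,4", 4: "0,4", 1: "1,5", 5: "1,5",
                2: "2,6", 6: "2,6", 3: "3,7", 7: "3,7",
                8: "8", 9: "9"}
    cores = group_by_core(siblings)
    print(cores)
    print(assign_domains(cores, ["work", "personal"]))
```

The mixed core sizes in the example hint at why hybrid chips are awkward for strict core granularity: cores no longer all have the same number of threads.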

So the question – oversimplified and assuming your main goal is compartmentalisation – from a QubesOS user’s perspective is this: What’s the point of compartmentalisation if the compartments, AKA qubes, “share” TLBs/L1 caches, thus enabling cross-boundary attacks between VMs/qubes, even in unpredictable ways? Well … there is none. You’ll have to choose… “best performance”^[1] or “security”^[2]?

[1] While SMT/HT often helps a bit with power savings, SMT doesn’t necessarily have a positive effect on performance; it depends on your workload: intensive parallel tasks would benefit, while purely computational tasks often suffer. However, this also needs to be mentioned: The main performance killer for everyday users under QubesOS seems to be the graphics software rendering in qubes.

[2] People often try to argue that a little protection is better than no protection at all. A possible reply – formulated casually and with a wink … adapted from here – could be: Where do you live? On the ground floor? Do you have windows there (pun intended)? Some that you can’t open because they’ll fall on your feet? Hey, at least they keep the insects out! How about on the 3rd floor? Do you have an elevator that sometimes crashes or gets stuck? But at least it takes you up sometimes? That’s better than nothing, isn’t it? And you can always take the stairs. You can’t lean against the handrail, because it could break off and cause serious injury, but it’s better than nothing, isn’t it?

Silliness aside … The safety requirements discussed here are in many ways the same as for a handrail or banister: It has to provide support. Security is about reliability, whether in a computer or in life. It has to be deterministic about which attacks it will help against and which it won’t. If the railing can withstand a maximum pressure of 250 kilograms, then you can calculate with that. There is no such guarantee with these mitigations alone, because they are – again, oversimplifying – special treatments against specific attacks (“RIDL”, “Fallout”, “Zombieload”, “Store-to-Leak forwarding”, “Meltdown”, etc.), but they don’t solve a more general underlying problem. All these attacks are more or less variants of the same exploit of the speculative execution model of Intel CPUs. Therefore, a reliable general fix must protect VMs/Qubes from cross-boundary attacks in general. So, you’ll need them all: specific mitigations, SMT disabled, firmware/BIOS updates.