Use a Qubes OS machine as home lab?

Hi there community, nice to meet you all.
I’m curious if Qubes OS is suited for my use case or if I’m trying to abuse it for something that it was never built for.

My intention
I want to have a desktop platform where I

  1. Can easily do my regular work on one screen (Windows VM)
  2. Can easily spin up and down images/software to have a lab for testing things / trying out architectures / opening and analyzing files without breaking anything
  3. Can suspend the whole thing when I go to sleep

Putting a hypervisor to sleep - is it a good idea?
We get a lot of software/OSes that we have to work with during our assignments and I like the idea of having everything isolated. Also to have a Windows and Linux desktop running in parallel to use as needed, depending on what I’m currently working on. However when I thought about it for a moment it kind of seemed a bit silly to suspend a hypervisor with everything that’s running on top of it every night. I’ve never seen anyone suspend their ESXi hosts, that’s just something that never happens in my experience.
So I’ve asked myself if I’m trying to abuse Qubes OS for something that it wasn’t really built for?

I bricked it once - should I brick it twice or should I just give up?
I did a test setup of Qubes OS and bricked it when I tried to put it to sleep (black screen, could not get back in regardless of what I tried).

Thought I’d ask the community what they think of my use case before I spend too much time troubleshooting that sleep problem. I’d have a bit of a learning curve ahead of me as I’m not that experienced in running and troubleshooting a Linux platform. Do you think Qubes OS is suited for the use case described above or will I run into more issues down the road if I (ab)use it like that? If not, would you mind sharing how you’d set up a home lab where you can quickly spin up and down new machines without breaking the bank account?

Guess my biggest problem is that I don’t want to have the lab system up and running 24/7. I just want to use it like I’d use a regular laptop where I just close the lid and walk away once I’m done. And I don’t want to spend a fortune on cloud services either.

Any feedback or idea donations welcome.
Thank you.

Edit:
Specs of the machine I have in mind:

  • HP Victus 15L
  • *64GB RAM (Crucial 64 GB , DDR4-RAM , 3200 MHz , 288-Pin DIMM , DDR4-3200 (PC4-25600))
  • AMD RYZEN 7 5600G
  • AMD Radeon RX 6600XT
  • Reno2 mainboard
  • 2 TB WD Black SN770 (M.2 - it’s fast)
  • It supports way more RAM than the 16GB HP states in its official documentation for the Reno2 mainboard. sudo dmidecode -t 16 tells me the Reno2 mainboard would support up to 128GB of RAM. I have it up and running with 64GB.

I mention the specs since according to the documentation it’s not a tested configuration. I have Qubes OS running on it though. Only had some minor issues with the screen (Benq GW2765) that was plugged in. X for some reason didn’t like it and the setup would come to a hard stop with a “Warning: dracut-initqueue timeout - starting timeout scripts” (which led me down a completely wrong rabbit hole). The solution was easy: Once I plugged in a different (newer) screen the setup succeeded without any major issues.
If you are a Linux pro who can immediately tell that this setup is going to cause trouble, let me know and stop me from proceeding with Qubes OS on it.

Edit - Why brick it once when you can brick it twice:

  • Qubes 4.1.2, vanilla installation fresh out of the box
  • Hardware see above (HP Victus 15L). Supports sleeping states S0, S3, S5 (dmesg | grep S3)
  • Don’t do any config, just put system to sleep // PC appears to be powered off
  • Wake PC up by pushing the power button since no reaction to user inputs from keyboard/mouse
  • Screen stays black
  • No reaction at all to user inputs and waiting 10min doesn’t help either
  • Push power button for 10sec to shut down the machine then restart
  • After a while get asked for disk password, enter
  • Qubes boots normally // it was a hard shutdown, not a resume
  • journalctl tells me the OS never received a wake up signal
DPM: suspend entry (deep)
-- Reboot --
  • Suspended the system with sudo systemctl suspend from the console, same problem
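
The journal check described in the list above can be turned into a small helper. This is only a sketch: the function name suspend_status is made up for illustration, and it assumes the kernel writes a line containing "suspend entry" when going to sleep and "suspend exit" when it comes back. If the last entry has no matching exit, the machine never resumed before it was power-cycled.

```shell
#!/usr/bin/env bash
# suspend_status (hypothetical helper): reads kernel log lines on stdin and
# reports what happened to the most recent suspend attempt.
# Assumption: the kernel logs "... suspend entry ..." before sleeping and
# "... suspend exit" after a successful resume.
suspend_status() {
    awk '
        /suspend entry/ { state = "no-resume" }   # system went to sleep
        /suspend exit/  { state = "resumed" }     # system came back
        END { print (state == "" ? "no-suspend" : state) }
    '
}

# Usage in dom0, looking at the kernel messages of the previous boot:
#   sudo journalctl -k -b -1 | suspend_status
```

A result of no-resume matches what is described above: the log ends at the suspend entry and the next thing recorded is a fresh boot.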

Saw some people suggesting to put a new kernel in place to solve the problem. That still seems a bit challenging to me as I don’t even have internet connectivity up and running yet. Never replaced a kernel either.

Seems like I’ll have to put some work into it to get it to work the way I want to use it.
So the initial question remains: is Qubes OS suited as a home lab the way I intend to use it, or would I be better off not wasting any more time and setting up some other system that allows me to quickly spin up machines?


Qubes OS can work as a home lab, but suspend has varying support across devices.

I use the Purism Librem 14 v1 and suspend works exactly like this.


Thank you for the feedback and hardware recommendations.

This indeed seems like the right order to get started with Qubes OS:

  1. First pick and buy hardware from the hardware compatibility list
  2. Install Qubes OS on it
  3. Have a good experience where you don’t run into too many unknown bugs

I did it the other (wrong) way round and it’s a journey of pain and suffering for me as a beginner so far.

  1. Buy hardware without ever looking into the hardware compatibility list
  2. Discover Qubes OS
  3. Install Qubes OS on said hardware because it seems cool
  4. Installation was successful, only ran into some minor problems (which I could fix)
  5. Once it’s installed run into real problems for my use case (suspend not working)
  6. Discover the hardware compatibility list and notice that my platform isn’t tested (yet)
  7. Spend a lot of time trying to get things working anyway
  8. (Probably will give up in the next 3-4h if I don’t get it to work the way I want it to work)

I’ll play around with it a bit more for now though. Still kind of curious if Qubes OS would make for a good lab box. So far nobody has told me that it’s a terrible idea.

As for my suspend problem on the HP Victus 15L machine:
tl;dr: Kernel update didn’t help. Checking some logs and settings next before I give up.

  1. Managed to get networking working on Qubes OS, it was surprisingly painless (Just plug in the ethernet cable then start a browser, I couldn’t be bothered with the WiFi)
  2. Downloaded kernel 6.3.9-1 on some qube by following this guide (was painless too)
  3. Took me a good while to get qvm-copy-to-dom0 working (pain). If you are used to copying and pasting everything everywhere within seconds, the journey of copying the code from here into a qube and then getting it over to dom0 (it’s all described in the link above) is quite scenic and long. I can never confess to anyone that I spent over half an hour trying to copy and paste 20 lines of code. If you’re a beginner and just want to get the kernel update done, fire up nano in dom0 and type those 20 lines by hand. On the bright side, you learn why copying things into dom0 is made that difficult on purpose.
  4. Once I finally had the qvm-copy-to-dom0 script, all that was left to do was to make it executable (chmod u+x qvm-copy-to-dom0) and then run it
    After that getting the kernel.tar file over to dom0 was easy with qvm-copy-to-dom0
  5. Unpacking and installing the kernel files in dom0 was surprisingly easy and painless, seemed like a much scarier thing to me as a novice (I tweaked the command to sudo dnf install -y kernel-latest-*.rpm which just installed all four kernel .rpms I had downloaded. When I followed this guide word by word it complained something about dom0 not having internet access (guess I made a typo somewhere and it tried to find the .rpm in an online repo which obviously will fail on dom0))
  6. After a reboot the new kernel was up and running (uname -r)
  7. However suspend still isn’t working properly on my HP Victus 15L with the new kernel 6.3.9-1… or well suspend is but resume not really, screen just stays black after resume

Will waste a bit more time on debugging it this week, but I might eventually opt for the less painful way: either get hardware that lets Qubes OS behave the way I want out of the box, or work with what I have and find some completely different hypervisor that is less pain to get up and running on my HP box.

Hope those experiences help some other beginners that are looking into Qubes OS.


That process can be refined with this thread for a turnkey solution.


All right, thought I’d give an update.
My goal is to get this working as a lab machine. I like the idea of a desktop that can easily and securely spin up machines as needed. Suspend still isn’t working as of now. Around 6-8h in reading and debugging so far (I’m a beginner). If you’re a beginner in the same situation who doesn’t want to waste time on debugging but wants a system that works out of the box for the intended purpose, I’d suggest walking away and looking for some other solution. Either that or opt for hardware that has proven to be reliable.

Here is what I did so far:
1) Get the system to a [Minimal Configuration] state
Why: Reduce search scope. Eliminate as many potential sources of the suspend/resume problem as possible. The fewer potential sources you have, the quicker you’ll solve it.

In my case that meant physically removing the graphics card and unplugging a screen as the first step.

HP Victus 15L

  • *64GB RAM (Crucial 64 GB , DDR4-RAM , 3200 MHz , 288-Pin DIMM , DDR4-3200 (PC4-25600))
  • AMD RYZEN 7 5600G
  • AMD Radeon RX 6600XT
  • External screen 1
  • External screen 2
  • Reno2 mainboard
  • 2 TB WD Black SN770 (M.2 - it’s fast)
  • Any USB device that’s not needed

After that I simply reinstalled Qubes from scratch (4.1.2 - 5.15.94-1.qubes.fc32.x86_64) on the machine to get a clean start. Once booted up I only opened a single bash window, to keep started processes at a minimum too, before I sent it to suspend through the GUI menu.
Result: Suspend/resume problem persists even in Minimal physical Configuration state. Keyboard LEDs are not reacting after resume, mouse seems dead too. Fans are spinning though.
Learnings: I don’t need to debug anything related to the RX 6600XT driver for now and can put that aside.

1.1) Verify if you really are in a [Minimal Configuration] state
Why: Unplugging hardware alone won’t get you to a minimal configuration. There is software too to take care of.

  1. go with minimal config, turn off drivers like USB, AGP you don’t really need
  2. turn off APIC and preempt
  3. use ext2. At least it has working fsck. [If something seems to go wrong, force fsck when you have a chance]
  4. turn off modules
  5. use vga text console, shut down X. [If you really want X, you might want to try vesafb later]
  6. try running as few processes as possible, preferably go to single user mode.
  7. due to video issues, swsusp should be easier to get working than S3. Try that first.

When you make it work, try to find out what exactly was it that broke suspend, and preferably fix that.

Source/Credits: Pavel Machek pavel@ucw.cz / https://www.kernel.org/doc/Documentation/power/tricks.txt

To cross 2 and 5 off that list above and put the system into S3 deep sleep on suspend here is what I did:

  1. sudo nano /etc/default/grub

GRUB_CMDLINE_LINUX="rd.luks.uuid=luks-fc438d0c-2755-4faf-8a4c-dd249cbe7e3c rd.lvm.lv=qubes_dom0/root rd.lvm.lv=qubes_dom0/swap plymouth.ignore-serial-consoles rd.driver.pre=btrfs rhgb quiet noapic acpi=off mem_sleep_default=deep vga=0"

Note: Apparently there are better ways than vga=0 to do this → GRUB/Tips and tricks - ArchWiki (Sidenote: Outdated documentation seems to be a problem when troubleshooting. Make sure you find recent documentation.)

  2. Apply the config (you need to specify the output path of grub.cfg with -o; if you don’t, it will just print the new config to the terminal without actually changing it. Check here for more information.)
    sudo grub2-mkconfig -o /boot/efi/EFI/qubes/grub.cfg

  3. Restart system
    Result (in my case): Won’t boot. If you’re stuck in a grub boot cycle you can edit the grub config during startup by pushing e in the selection. After manually deleting noapic from the config posted above the system started up after hitting F10 but the GUI was very slow and laggy (with acpi=off). Suspend didn’t work, the screen just froze. Removing the acpi=off option while leaving noapic there resulted in being stuck in a failed boot cycle again.

  4. Check if suspend/resume now works //for me it didn’t

  5. Make 1 modification to the settings at step 1 and try again (Trial & error)
    There are countless things you can change. On my machine which uses an AMD graphics card for example:

amdgpu.gpu_recovery=1		//Activate a mechanism that whenever there is a GPU timeout detected, it will automatically attempt to reset the GPU and bring it back up.
amdgpu.dpm=0				//Dynamic Power Management
amdgpu.dc=0					//Display Core
...
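
The "change one thing, then test" loop above is easier to keep honest if each change is a single command. Below is a rough sketch of a helper that appends exactly one parameter to the GRUB_CMDLINE_LINUX line. The function name add_kernel_param is made up for illustration, and it assumes GNU sed and a defaults file where the value sits on one double-quoted line, as in the example further up. Try it on a copy of the file first. Also worth keeping in mind: acpi=off switches off ACPI altogether, and S3 is an ACPI sleep state, so that option is unlikely to help with suspend.

```shell
#!/usr/bin/env bash
# add_kernel_param (hypothetical helper): append one kernel parameter to the
# GRUB_CMDLINE_LINUX="..." line of a grub defaults file.
#   $1 = parameter, e.g. amdgpu.dpm=0
#   $2 = path to the file, e.g. a copy of /etc/default/grub
# Assumes GNU sed (-i) and a single-line, double-quoted value.
add_kernel_param() {
    local param="$1" file="$2"
    # Do nothing if the parameter is already on that line.
    grep -q "^GRUB_CMDLINE_LINUX=.*[\" ]${param}[\" ]" "$file" && return 0
    # Insert the parameter just before the closing quote.
    sed -i "s|^\(GRUB_CMDLINE_LINUX=\".*\)\"|\1 ${param}\"|" "$file"
}

# Usage (then regenerate grub.cfg and reboot, as in the steps above):
#   sudo cp /etc/default/grub /etc/default/grub.bak
#   sudo bash -c "$(declare -f add_kernel_param); add_kernel_param amdgpu.dpm=0 /etc/default/grub"
```

Keeping a .bak copy makes it simple to go back to the last known-booting configuration when a parameter leaves the system stuck in a boot loop.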

To do:
I haven’t tried all of the above yet. Still need to get a grasp of some things.

  • Learn how to turn off drivers that are not needed (Estimate to learn it: 1-2h)
  • Learn how to turn off APIC and preempt (Estimate to learn it: 1-2h)
  • Learn how to turn off modules (and which ones) (Estimate to learn it: 1h)

2) Learn more about how suspend works
Why: To identify what I need to look at / what exactly my system is doing when I send it to suspend state. Also to get the vocabulary needed to conduct google searches.

Learnings:

  • Suspend-to-ram (s2ram) seems to have been a very common problem on Linux for a long time. Even Linus wrote about it in his tutorial “How to get s2ram working” back in 2006. Graphics hardware in particular seems to be the biggest source of problems.
  • The Linux kernel knows various sleep states (check diagram below). It’s probably good to know which one you want to achieve in order to check your settings / narrow down the scope of your troubleshooting efforts. I want to achieve S3 (suspend to ram) and not hibernation (which would require additional configuration and I’m simply too lazy to do that).

Result: [Diagram: kernel sleep states]

Note: The diagram can be edited. Download the attachment and rename it from .log to .drawio, then you can import it in drawio.
Kernel_sleep_states.drawio.log

Based on what I learned I verified that /sys/power/mem_sleep is set to [deep] which should signal the kernel that we want to suspend to ram (S3) when we suspend.

cat /sys/power/mem_sleep
s2idle [deep] //brackets indicate which option is active
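
If you want to check this from a script instead of eyeballing the brackets, something like the following should do. It is a minimal sketch; the function name active_sleep_mode is made up for illustration, and it only assumes the "active entry is in square brackets" format shown above.

```shell
#!/usr/bin/env bash
# active_sleep_mode (hypothetical helper): print the bracketed entry from a
# mem_sleep-style string such as "s2idle [deep]".
# Prints nothing if the string contains no bracketed entry.
active_sleep_mode() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Usage:
#   active_sleep_mode "$(cat /sys/power/mem_sleep)"
```

If it reports s2idle instead of deep, the mode can be switched for the current boot with echo deep | sudo tee /sys/power/mem_sleep, which has the same effect as the mem_sleep_default=deep kernel parameter but does not survive a reboot.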

I guess suspend itself isn’t so much the issue (system goes to sleep as expected) but resuming is.

3) Next steps

  1. Do some research and see if I can find similar systems that have a desktop, could be used as home lab and support suspend out of the box (Path of least pain atm)
  2. Listen to inputs from the community
  3. Make it work on the current hardware platform: Learn how to set debug levels, gather logs and how to view/interpret them (Goal: Narrow down search scope. Figure out precisely where things go wrong. Estimate that I need 2h or so to read, understand, apply that)
  4. Make it work on the current hardware platform: Bruteforce my way through some kernels if I can’t figure it out myself (Trial & Error)
    Suspend-to-RAM issues and graphics card bugs seem to be very common. Why waste time chasing after a bug if somebody else might have already fixed it in a different kernel version?
  5. Make it work on a supported platform: Once I have enough $$$ to spare on some new lab hardware I might do that
  6. Make it work on the current hardware platform: Get up to speed with ACPI, DSDT, decompiling and patching things myself (Path of max. pain, I’d expect many many hours spent on that before I get somewhere. Not even sure if I could realistically achieve that, it’s not really my cup of tea)
  7. Something else entirely

Might update in case I made progress.
If this thread doesn’t get an update it means that I have spent more than 16h trying to get suspend to work on this particular system but didn’t succeed and moved on. I’m currently at around 8h in.

Cheers.

===========================================================
Some 2h later… (at around 10h trying to get this to work)

Decided to give step 4 above a go → Bruteforcing through some kernels. Here is how I went about that:

  1. Created a list of kernel versions that appeared to be good candidates for solving the suspend/resume problem. Did some research and came up with this:
5.15.94-1		//(Current) Doesn't work
5.16.3			//works perfectly fine for me 
6.0.7			//@BenT suggested: Install this kernel in Dom0 (works in both 4.1 and 4.2)
6.0.8			//works properly (tomz17)
6.0.12			//Downgrading to this version fixed the issue (Ideapad 3 15ALC6 - CPU: AMD Ryzen 5 5500U, with integrated graphics) 
6.1.1			//Breaks suspend on AMD laptop (Ideapad 3 15ALC6 - CPU: AMD Ryzen 5 5500U, with integrated graphics)
6.1.19-1-lts	//This (suspend problem) occurs on my pc recent weeks. have to reboot and all my tasks are being terminated.
6.1.22-1-lts	//Breaks hibernation on AMD Ryzen 4700U CPUs
6.2.2-arch2-1	//Breaks suspend

I ended up with a list of four kernels that seemed like good candidates. The idea is to save time by focusing on kernels that have a higher chance of solving the problem, and not waste time on kernels that are known for having a lot of suspend/resume problems related to my hardware.

5.16.3			//works perfectly fine for me 
6.0.7			//@BenT suggested: Install this kernel in Dom0 (works in both 4.1 and 4.2)
6.0.8			//works properly (tomz17)
6.0.12			//Downgrading to this version fixed the issue (Ideapad 3 15ALC6 - CPU: AMD Ryzen 5 5500U, with integrated graphics) 
  2. Created a folder for each kernel version that I was going to download
  3. Downloaded all .rpm files from the list above into the corresponding folder
    kernel-latest-*.qubes.x86_64.rpm
    kernel-latest-devel-*.qubes.x86_64.rpm
    kernel-latest-qubes-vm-*.qubes.x86_64.rpm
  4. Now that I had four folders with kernel files I packed them up into a good_kernels.tar archive
  5. Followed this guide to get the good_kernels.tar to dom0
  6. Unpacked the good_kernels.tar → now I had a folder structure with the good kernels
  7. Installed the lowest version (5.16.3) //which was slightly higher than the current kernel
    //cd’ed into the good_kernels/kernel_5.16.3 folder, then
    sudo dnf install -y kernel-latest-*.rpm
  8. Rebooted, once up checked the kernel version with uname -r and, once verified, suspended the system
  9. Resumed after 5sec or so, checked if the screen wakes up or the keyboard LED reacts → In case of failure, install the next higher version of the kernel (here 6.0.7), test again, repeat //Qubes automatically selects the latest kernel version, thus this approach only works if you go from low kernel version to high kernel version
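
Since the order matters (low to high), it helps to let the shell work out the order instead of trusting a plain alphabetical listing, which would put 6.0.12 before 6.0.7. A small sketch, assuming GNU sort with its -V (version sort) option; the function name kernel_install_order is made up for illustration.

```shell
#!/usr/bin/env bash
# kernel_install_order (hypothetical helper): read version strings (or
# folder names that contain them) on stdin, one per line, and print them
# in ascending version order.
# Assumes GNU sort, whose -V option compares version numbers numerically.
kernel_install_order() {
    sort -V
}

# Usage from inside the unpacked good_kernels folder:
#   ls -d kernel_* | kernel_install_order
```

Working through the output top to bottom gives the same sequence as the steps above: install, reboot, check uname -r, suspend, resume, and only then move on to the next line.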

Unfortunately no luck with the “good kernels” I picked.
Didn’t notice any different behavior in any of the kernels.

I’ll try to get my BIOS to the latest version and see if anything changes. It’s super restricted. Can’t do anything in it. No idea how to get into the “Advanced” section either, it’s hidden or completely disabled. Shame on HP.
I’ll leave that for some other day though. Need to get Windows on that machine first to update the BIOS.
