Create a Gaming HVM

1: I recommend that you make a backup using the Qubes Backup tools.
In theory you can recover from any mistake with the right option or command line.
But since you are new to this, make a backup first.

2: I will let someone else answer that; I have never tried with a laptop.
3: It is better to reboot into a live distro to gather the information. It can help you understand why something doesn’t work.
4: It is a recoverable situation. Once you reach the GRUB menu, press “e” to edit the boot command line and remove the parameters you added previously to hide the GPU.
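
For point 3, once the live distro is booted, the IOMMU groups can be read from sysfs. A minimal sketch, assuming the live kernel exposes the standard /sys/kernel/iommu_groups directory and that the IOMMU (VT-d / AMD-Vi) is enabled in the firmware; the function name is just for illustration:

```python
from pathlib import Path


def iommu_groups(root="/sys/kernel/iommu_groups"):
    """Return {group_number: [pci_address, ...]} as read from sysfs."""
    groups = {}
    base = Path(root)
    if not base.is_dir():
        # IOMMU disabled, or this kernel does not expose the directory.
        return groups
    for group_dir in base.iterdir():
        devices_dir = group_dir / "devices"
        if group_dir.name.isdigit() and devices_dir.is_dir():
            groups[int(group_dir.name)] = sorted(
                entry.name for entry in devices_dir.iterdir()
            )
    return groups


if __name__ == "__main__":
    # One line per group, e.g. "group 1: 0000:01:00.0 0000:01:00.1"
    for number, devices in sorted(iommu_groups().items()):
        print(f"group {number}: {' '.join(devices)}")
```

Run it with python3 in a terminal on the live system and compare the addresses with what lspci shows for the dGPU. An empty result usually means the IOMMU is not enabled.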


I believe you will need a laptop with a MUX switch, and to blacklist the dGPU as you described. If the HDMI port on your laptop is directly connected to your dGPU, you are good to go.

The issue is getting the patches (or now, the new xen version) to work on your laptop if you have very new hardware.

Mine was working fine on 4.2 rc3, but for some reason the passthrough no longer works on newer versions of Qubes, and it results in the VMs crashing at boot.

Thank you for your reply.

3: It is better if you reboot using a live distro to gather the information. It can help to understand why something doesn’t work.

I got the latest Kali live ISO but how do I boot it where I can enter those commands to see the information? Can I bother you for some step by step for dummies commands?

Can I bypass checking this IOMMU on reboot? I think everything is fine, I have a high end laptop built with virtualization and multiple monitors in mind, I don’t think my GPU is grouped with anything else.

What can happen if I proceed without checking this IOMMU that’s got me scratching my head for the past 2 days?

Or can I not proceed without the device ID that I need to get?

This will make a lot of sense to me after I see it working and analyze what I had to do for that, but right now I am very confused as to what my next steps are.

How to find IOMMU, how to hide from dom0. These are 2 big ones that I don’t seem to be able to get past.

I am willing to pay someone with BTC, Zelle, or PayPal if they can take 1 hr of their time to make this part as "for dummies" as possible. Name your price. I really want to get this done ASAP.

Thank you!

Thank you for your reply.

My HDMI is connected to the Nvidia directly and the laptop screen to the onboard one. The laptop has multiple modes but in hybrid, this is the configuration as described in this article:

https://device.report/manuals/precision-7770-external-display-connection-guide

The issue is getting the patches (or now, the new xen version) to work on your laptop if you have very new hardware.

I haven’t even gotten there. I will take it one step at a time. Right now I am still trying to get past the rebooting / figuring out IOMMU groups headache, then hiding the GPU from dom0, rebooting, and hopefully it will work.

But I am very confused if the hiding from dom0 part is done from Qubes booted up in a dom0 terminal, or from grub in preboot.

And step by step for dummies will be greatly appreciated and as I specified in my previous post, even monetarily rewarded if that is a motivator for anyone.

I am having a real life headache from trying to figure this out and reading everything for the past days and still having no clue how to begin.

Thank you again for your reply.

Your specs are similar to mine, so I think you are good to go.

Regarding your other questions, the guide above lays it out fairly well.
I would flash Ubuntu and ‘test’ it (boot it live) to get your IOMMU groups, although in my experience most devices on modern hardware have separate IOMMU groups, especially GPUs.
Regarding hiding the pci device, you need to modify the file /etc/default/grub in dom0 as outlined above.
Then after you’ve done that, you apply the stubroot patch until we get Qubes 4.2.1.
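
To make the /etc/default/grub step concrete, here is a rough sketch of the edit, written as a small Python helper so the result can be checked before touching the real file. It assumes rd.qubes.hide_pci is the kernel parameter the guide uses, and the addresses 01:00.0 and 01:00.1 are only example values; use the ones lspci reports for your dGPU. The helper name add_hide_pci is made up for this post.

```python
import re


def add_hide_pci(grub_text, devices):
    """Append rd.qubes.hide_pci=<devices> to the first GRUB_CMDLINE_LINUX line."""
    param = "rd.qubes.hide_pci=" + ",".join(devices)

    def patch(match):
        args = match.group(2).split()
        # Drop an earlier hide_pci entry so the parameter appears only once.
        args = [a for a in args if not a.startswith("rd.qubes.hide_pci=")]
        joined = " ".join(args + [param])
        return f'{match.group(1)}"{joined}"'

    # Only the first matching line is changed; other lines are left alone.
    return re.sub(
        r'^(GRUB_CMDLINE_LINUX=)"(.*)"$', patch, grub_text,
        count=1, flags=re.MULTILINE,
    )


example = 'GRUB_CMDLINE_LINUX="rhgb quiet"\n'
print(add_hide_pci(example, ["01:00.0", "01:00.1"]), end="")
# GRUB_CMDLINE_LINUX="rhgb quiet rd.qubes.hide_pci=01:00.0,01:00.1"
```

Note that this edit is made from a dom0 terminal with Qubes booted, not from the GRUB menu. After changing the file you still need to regenerate the GRUB configuration as the guide describes, then reboot.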

I mean every step I did in the following post up to the “no bootable device”:


Sure! And thanks for the help, btw!

This time I started off with Qubes 4.2-rc3 and tried the stubdomain patch (following this tip, which unfortunately didn’t work). If needed, I can try some other time with a fully updated 4.2 (with the testing Xen version) and without the patch, but I have tried that before and the results were the same.

Here’s the high-level detail of what I did:

  1. I installed Qubes 4.2 rc3 (without the graphics card)
  2. added grub boot with my xorg.conf workaround to get Xserver to start
  3. started Qubes with the graphics card attached (boots normally)
  4. Applied your script here which as I understand applies only to qubes starting with gpu_.
  5. Created a qube called gpu_manjaro and attached the graphics card (permissive=True and no-strict-reset=True)
  6. booted from the Manjaro (Arch Linux) ISO
  7. then the error: “no bootable device”
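
Step 5 above can be sketched as the following command builder. The qube name and the permissive / no-strict-reset options are the ones from the list; the device address 01:00.0 is an assumption (take yours from lspci), and the helper name is hypothetical. It only prints the command so it can be reviewed before being run in dom0:

```python
import shlex


def qvm_pci_attach_cmd(qube, bdf, backend="dom0"):
    """Build the qvm-pci call for a short PCI address such as '01:00.0'."""
    device = bdf.replace(":", "_")  # qvm-pci writes 01:00.0 as 01_00.0
    return [
        "qvm-pci", "attach", "--persistent",
        "--option", "permissive=True",
        "--option", "no-strict-reset=True",
        qube, f"{backend}:{device}",
    ]


print(shlex.join(qvm_pci_attach_cmd("gpu_manjaro", "01:00.0")))
# qvm-pci attach --persistent --option permissive=True --option no-strict-reset=True gpu_manjaro dom0:01_00.0
```

The same options can also be set from the qube's Devices tab in the settings window; the command form is just easier to paste into a report.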

Then I upgraded dom0 to the latest stable version and rebooted.

  1. tried to start gpu_manjaro from manjaro ISO and again the same error: no bootable device

Lastly I tried the testing packages, which include xen 4.17.2-8, which is supposedly patched. This is the version that I obtained the logs from.

  1. renamed gpu_manjaro to manjaro so the patch wouldn’t apply.
  2. tried to start manjaro from manjaro ISO and again the same error: no bootable device

The following are the logs with the loglvl=all and guest_loglvl=all (hopefully I applied it correctly).

xl-dmesg.log (114.0 KB)
guest-manjaro-dm.log (40.9 KB)
guest-manjaro.log (38 Bytes)
lspci.log (1.4 KB)
dnf-list.log (721 Bytes)


Isn’t it this issue with booting from ISO with PCI device?

Try to first install the OS from ISO in HVM without PCI devices and attach PCI devices after you’ve installed the OS.


Indeed. Maybe it’s not passthrough-related at all. I actually commented on that very issue 4 days ago.

I will try to install. But I suspect it won’t be able to boot after I plug it in. Booting from an ISO should be no different than booting from a disk.

I see two interesting lines in your logs:

[2024-01-12 14:55:15] pci 0000:00:00.0: can't claim BAR 6 [mem 0x000c0000-0x000dffff pref]: address conflict with Reserved [mem 0x000a0000-0x000fffff]
[2024-01-12 14:55:15] pcifront pci-0: Could not claim resource 0000:00:00.0/6! Device offline. Try using e820_host=1 in the guest config.

What is the list of PCI devices you are trying to pass to this HVM?

Thanks for taking a look. I have seen some e820_host references in xen.xml. Do you think I should explore enabling this?

It’s two: the ones listed in lspci.log

I do not know, but I would not look in the direction of e820_host. I will do some research out of curiosity.

If you want to try: I left a Qubes backup here: https://web.neowutran.ovh/deeplow_test
The password is “deeplow”. Import the backup (2 VMs: one template and one AppVM). Once the import is over, attach just the video function of your GPU, “01:00.0”, to deeplow_test (and don’t attach the audio function). Launch the deeplow_test VM.
It should work out of the box, as it works out of the box for me on Qubes R4.2, latest updates, with a 4080 instead of your 4090.
sha512sum: a393df617281f9e295bc0d95eb2bf2fc39bef586006f01fd4c502c149100a630fbb365c3c6bf754b318f909e8d7281728691164d27eeddd89ed8163a434ec253 www/deeplow_test
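
If you want to check the download before importing it, the checksum above can be verified with a few lines of Python. This is only a sketch: the local file name deeplow_test is an assumption about where you saved the backup, and the helper names are made up.

```python
import hashlib

# Value copied from the sha512sum line above.
EXPECTED = (
    "a393df617281f9e295bc0d95eb2bf2fc39bef586006f01fd4c502c149100a630"
    "fbb365c3c6bf754b318f909e8d7281728691164d27eeddd89ed8163a434ec253"
)


def sha512_of(path, chunk_size=1 << 20):
    """Hash the file in chunks so a large backup does not fill memory."""
    digest = hashlib.sha512()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED):
    """True when the file's SHA-512 equals the published value."""
    return sha512_of(path) == expected.strip().lower()


# Example call (the file name is an assumption):
# print(matches_expected("deeplow_test"))
```

Running sha512sum on the file and comparing by eye gives the same answer; the script just removes the risk of misreading a 128-character string.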

If it works, the issue is not directly linked to GPU passthrough, and it is back to Xen debugging.
If it doesn’t work, it is back to Xen debugging as well.


your setup:

[2024-01-12 14:55:14] Xen: [mem 0x00000000000a0000-0x00000000000fffff] reserved
...
[2024-01-12 14:55:14] pci 0000:00:00.0: reg 0x30: [mem 0x000c0000-0x000dffff pref]
...
pci 0000:00:00.0: can't claim BAR 6 [mem 0x000c0000-0x000dffff pref]: address conflict with Reserved [mem 0x000a0000-0x000fffff]

my setup:

[2024-01-11 00:09:23] Xen: [mem 0x00000000000a0000-0x00000000000fffff] reserved 
...
[2024-01-11 00:09:23] pci 0000:00:00.0: reg 0x30: [mem 0xfb000000-0xfb07ffff pref] 

It seems that, for some reason, on your setup Xen is assigning BAR 6 of your GPU to a memory range that is reserved and not allowed to be used.
I am nearly 100% sure this is the issue.
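
To see why that range is a problem, the two setups can be compared directly. A small sketch that pulls the [mem ...] ranges out of a kernel log line and tests whether they overlap; the helper names are made up, and the input lines are copied from the logs above:

```python
import re

RANGE = re.compile(r"\[mem 0x([0-9a-f]+)-0x([0-9a-f]+)")


def mem_ranges(line):
    """Return every [mem 0xSTART-0xEND] range in a log line as (start, end)."""
    return [(int(lo, 16), int(hi, 16)) for lo, hi in RANGE.findall(line)]


def overlaps(a, b):
    """True when two inclusive address ranges share at least one address."""
    return a[0] <= b[1] and b[0] <= a[1]


reserved = mem_ranges(
    "Xen: [mem 0x00000000000a0000-0x00000000000fffff] reserved")[0]
failing_bar = mem_ranges(
    "pci 0000:00:00.0: reg 0x30: [mem 0x000c0000-0x000dffff pref]")[0]
working_bar = mem_ranges(
    "pci 0000:00:00.0: reg 0x30: [mem 0xfb000000-0xfb07ffff pref]")[0]

print(overlaps(failing_bar, reserved))  # True
print(overlaps(working_bar, reserved))  # False
```

In the failing setup the BAR sits entirely inside the reserved 0xa0000-0xfffff window, while in the working one it is far above it.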


Thanks a lot for going the extra mile. I’ll give it a go probably Sunday.


Try compiling the stubdom without this line: https://github.com/QubesOS/qubes-vmm-xen-stubdom-linux/blob/main/qemu/patches/series#L21


Did a clean 4.2 install and I am still getting “No Bootable Device” on both Linux and Windows HVMs, unfortunately. Going to roll back to 4.1 for now!

@Cameron I’ve found that happens when I don’t include the “gpu_” prefix (after having patched the stubdom). Do you have this issue even when trying to start up the qube with 2 GB of RAM or less?


This is on 4.2 running without the stubdom patch, so no prefix should be required, but I did have it there nonetheless.

Applying the stubdom patch on a clean 4.2 install does get it to boot with 32GB of memory, but I’m getting the same blue screen shortly after boot as in “HVM no bootable device after gpu passthrough - #12 by Cameron”, and no PCI devices except the NVMe are functional.

Removing the prefix and setting the memory to 2GB allows it to boot part way but it throws the same bluescreen before it boots fully (nvlddmkm.sys). Removing the GPU allows this one to boot fully but again none of the PCI devices except for the NVMe drive appear to work.

The PCI devices all appear to not function as well, no network adapter, no USB, no GPU apart from the 4 sensors it briefly displays before BSODing.

I have tried with and without Above 4G Decoding (I had this on previously; it doesn’t seem to have any effect) and Resizable BAR (this prevents Qubes from displaying video at any point past GRUB for me).

Going to install 4.1 and see if that still works properly.

EDIT: Fresh 4.1 install with just the stubdom patch applied works perfectly so it’s definitely an issue with 4.2.


I have now tested this with similar results. I tested it on Qubes 4.2 with testing updates (and thus xen 4.17.2-8). Here are the logs:

dnf-list.log (721 Bytes)
guest-manjaro.log (38 Bytes)
guest-manjaro-dm.log (40.9 KB)
lspci.log (1.4 KB)
xl-dmesg.log (114.0 KB)

In particular, I still see the following messages:

[2024-01-14 15:16:15] pci 0000:00:00.0: can't claim BAR 6 [mem 0x000c0000-0x000dffff pref]: address conflict with Reserved [mem 0x000a0000-0x000fffff]
[2024-01-14 15:16:15] pcifront pci-0: Could not claim resource 0000:00:00.0/6! Device offline. Try using e820_host=1 in the guest config.

This seems to be similar to before. However, this VM didn’t seem to have a display output on my main screen (the one where dom0 is), so I couldn’t see the same “no bootable device” text I did in the previous setup. But I did see something similar in xl dmesg:

(d7) Booting from Hard Disk...
(d7) Boot failed: could not read the boot disk
(d7) 
(d7) enter handle_18:
(d7)   NULL
(d7) Booting from Floppy...
(d7) Boot failed: could not read the boot disk
(d7) 
(d7) enter handle_18:
(d7)   NULL
(d7) No bootable device.

So I guess we’re back to Xen debugging. Unfortunately my time for this project is running out, so I don’t know when I’ll be able to continue these tests. I’m hoping someone manages to figure it out.


Thank you! :)

If other people have the same issue, can dedicate a significant amount of time to this project, and have some ability to read and write basic C code, I can help try to debug Xen/QEMU.
But since I don’t have the hardware needed to reproduce this issue and test properly, I won’t try to solve it alone.
