Trying to set up a (preferably windows) gaming HVM with an nvidia GPU on a laptop. "No bootable device" when starting

daffy1234 · February 2, 2024, 5:34am

And yes, I’m aware the combination of “windows gaming HVM”, “nvidia”, and “laptop” is the stuff of nightmares.

I have a Lenovo LOQ gaming laptop (82XV002LUS). It has an Nvidia RTX 3050 6GB laptop GPU and an Intel Core i5-13420H with an integrated GPU.

What I have so far: I’ve created a StandaloneVM in HVM mode, installed Windows 10 on it, and applied all updates. I followed this guide to hide my Nvidia GPU (the IDs on my system are 01:00.0, plus 01:00.1 for a seemingly related audio device). I tried to follow the IOMMU group steps but couldn’t figure them out: I added the GRUB options before booting an Xubuntu live USB, but nothing showed up in /sys/kernel/iommu_group. Moving on from that, after I hid those two PCI IDs, I verified that both devices showed up in sudo lspci -v with Kernel driver in use: pciback
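For anyone repeating the IOMMU group check from a live Linux USB, here is a minimal sketch of one way to list the groups. It assumes the kernel exposes them under /sys/kernel/iommu_groups (note the plural), which is where current Linux kernels put them; the function name list_iommu_groups and the optional ROOT argument are made up for illustration. Inside Qubes dom0 the directory is usually empty because Xen owns the IOMMU, so it only makes sense on a plain Linux boot.

```shell
#!/bin/sh
# list_iommu_groups [ROOT]
# Print one "group <n>: <pci address>" line per device found under ROOT.
# ROOT defaults to /sys/kernel/iommu_groups; the argument exists only so
# the function can be pointed at another directory (hypothetical helper).
list_iommu_groups() {
    root="${1:-/sys/kernel/iommu_groups}"
    if [ ! -d "$root" ]; then
        echo "no IOMMU groups directory at $root" >&2
        return 1
    fi
    found=0
    for dev in "$root"/*/devices/*; do
        # An unmatched glob stays literal, so skip entries that do not exist.
        [ -e "$dev" ] || continue
        group="${dev#"$root"/}"   # strip the ROOT prefix
        group="${group%%/*}"      # keep only the group number
        echo "group $group: ${dev##*/}"
        found=1
    done
    if [ "$found" -ne 1 ]; then
        echo "no devices listed under $root (IOMMU off, or not visible here)" >&2
        return 1
    fi
}
```

Running list_iommu_groups with no argument and looking for the 01:00.0 and 01:00.1 entries shows whether the GPU and its audio function share a group with anything else.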

Finally, when attaching the devices with qvm-pci attach windows dom0:01_00.0 -o permissive=True -o no-strict-reset=True -o persistent=True, the windows qube no longer boots. It simply shows a black SeaBIOS screen that shows it attempted to boot to the hard drive, and then to the floppy disk drive, and couldn’t boot to either, as if they were erased. This happens when either or both of the 01:00.0 or 01:00.1 devices are added, and when both are removed, the qube boots normally again.
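The lspci address (01:00.0) and the name qvm-pci expects (dom0:01_00.0) differ only in the separator and the dom0: prefix. A small sketch of that mapping, assuming the two addresses from this post; the helper name to_qubes_pci_id is hypothetical, and the loop only prints the attach commands quoted above rather than running them.

```shell
#!/bin/sh
# to_qubes_pci_id ADDR
# Turn an lspci address such as "01:00.0" (or "0000:01:00.0") into the
# dom0:BB_DD.F form used with qvm-pci in this thread. Hypothetical helper.
to_qubes_pci_id() {
    bdf="${1#0000:}"   # drop a leading PCI domain if present
    case "$bdf" in
        [0-9a-fA-F][0-9a-fA-F]:[0-9a-fA-F][0-9a-fA-F].[0-7]) ;;
        *) echo "not a bus:device.function address: $1" >&2; return 1 ;;
    esac
    printf 'dom0:%s_%s\n' "${bdf%%:*}" "${bdf#*:}"
}

# Print (do not execute) the attach command for the GPU and its audio function.
for addr in 01:00.0 01:00.1; do
    echo "qvm-pci attach windows $(to_qubes_pci_id "$addr") -o permissive=True -o no-strict-reset=True -o persistent=True"
done
```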

Additionally, when doing the exact same steps but using a linux template as a standalone qube, no window appears, and after 60 seconds the qube is forcibly killed.

I’m out of ideas at this point, any help or advice would be appreciated.

Update: When installing Manjaro over the HVM qube, the same result happened. No bootable device after attaching any PCI device.

renehoj · February 2, 2024, 8:21am

How much memory are you trying to allocate?

If it works with 2 GB memory, it might be a problem with the current version of stubdom.

I got GPU passthrough working by using stubdom 4.2.9-1 from testing. I copied the files from a system running testing, but you can probably install the rpm from Index of /r4.2/current-testing/dom0/fc37/rpm/
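To see whether dom0 already has a stubdom at least as new as the 4.2.9 mentioned here, something like the sketch below could be used. The package name xen-hvm-stubdom-linux is an assumption (check what rpm -qa reports on your system), the function names are made up, and sort -V is only an approximation of rpm's own version comparison.

```shell
#!/bin/sh
# version_at_least HAVE WANT
# Succeed when HAVE sorts at or after WANT under GNU "sort -V".
version_at_least() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# check_stubdom [PACKAGE]
# Report whether the installed package version reaches 4.2.9.
# The default package name is an assumption, not taken from the thread.
check_stubdom() {
    pkg="${1:-xen-hvm-stubdom-linux}"
    if ! command -v rpm >/dev/null 2>&1; then
        echo "rpm not found; run this in dom0" >&2
        return 2
    fi
    if have=$(rpm -q --qf '%{VERSION}\n' "$pkg" 2>/dev/null); then
        if version_at_least "$have" "4.2.9"; then
            echo "$pkg $have is new enough"
        else
            echo "$pkg $have is older than 4.2.9"
            return 1
        fi
    else
        echo "$pkg is not installed" >&2
        return 2
    fi
}
```

If check_stubdom reports an older version, the usual way to pull a dom0 package from testing is sudo qubes-dom0-update --enablerepo=qubes-dom0-current-testing followed by the package name.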

daffy1234 · February 3, 2024, 12:43am

Thanks for the reply!

Installing that testing package got it to boot with the PCI device attached, though windows seems to not know how to use it. Windows Update failed to download the NVIDIA - Display update, and manually installing the driver from nvidia’s website bluescreens mentioning nvlddmkm.sys. At first, the gpu showed up correctly as “Nvidia RTX 3050 Laptop” (or something similar) with a yellow triangle, but now it shows up simply as “display” under unknown devices.

So far I’ve tried using 2GB and 8GB of allocated RAM, and tried attaching either/both of the devices with permissive=True and no-strict-reset=True. Every combination ends in that same BSOD error.

And now it won’t stay booted for more than a minute or two before crashing with the nvlddmkm.sys problem. I’m gonna do more testing to isolate exactly what makes it report the correct name in device manager; that might be a clue.

I understand if this is outside the scope of this topic at this point, but I appreciate help regardless.

Update: While typing out that message, it reported as “NVIDIA GeForce RTX 3050 6GB Laptop GPU” in device manager while booted to safe mode, with no yellow triangle too! It also says it can’t get the status of the device due to being in safe mode, so the lack of a yellow triangle may not be very special.

neowutran · February 3, 2024, 7:39am

The nvidia kernel driver doesn’t behave correctly when MSI-X interrupts are available in this context.
A solution would be to report the issue to nvidia and wait for a fix,
or to reverse engineer the nvidia kernel driver to understand how MSI-X interrupts are used.

A workaround would be to disable MSI-X support in the Xen Linux stubdom

deeplow · February 3, 2024, 9:53pm

Maybe try increasing the qrexec timeout. I have had to do this on mine. My GPU qube was taking like 4 mins to boot and qrexec would time out before that.

qvm-prefs <QUBE_NAME> qrexec_timeout 6000
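For reference, qrexec_timeout is given in seconds and defaults to 60, so 6000 is 100 minutes. A rough sketch for picking a smaller value from an observed boot time; the helper name and the 2x safety margin are assumptions, not anything from Qubes itself, and the last line only prints the command.

```shell
#!/bin/sh
# timeout_for_boot_minutes MINUTES
# Print a qrexec_timeout in seconds: the boot time doubled as a margin.
# The factor of 2 is an arbitrary assumption for illustration.
timeout_for_boot_minutes() {
    case "$1" in
        ''|*[!0-9]*) echo "expected a whole number of minutes: $1" >&2; return 1 ;;
    esac
    echo $(( $1 * 60 * 2 ))
}

# A qube that needs about 4 minutes to boot gets an 8-minute timeout.
echo "qvm-prefs <QUBE_NAME> qrexec_timeout $(timeout_for_boot_minutes 4)"
```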