The quick summary is that I can get GPU passthrough “working”, but it is extremely slow. The desktop takes minutes to load, if it loads at all, and is so slow that it’s unusable when it does load. I am able to get things working using QEMU/KVM with the same hardware. I’m not sure how to continue troubleshooting.
I’d greatly appreciate any help. Please let me know if there’s any more information I can get that’d be helpful for troubleshooting this issue.
Details
Qubes
I’ve mostly been following the Create a Gaming HVM guide. A few days ago I made a post there describing some of my issues and what I’ve tried.
As I mentioned in that post, I was able to get things working using an NVIDIA GPU without any apparent issues. I’m now trying to get things working with an RX 6750 XT. I’m currently testing on Ubuntu 23.10, but I don’t have a distribution preference right now; I’m just trying to get it working on any Linux distribution.
As a guest in Qubes, the behavior seems to have been fairly consistent across distributions. I get errors like:
amdgpu 0000:00:05.0: [drm] *ERROR* [PLANE:70:plane-5] commit wait timed out
amdgpu 0000:00:05.0: [drm] *ERROR* [CRTC:91:crtc-0] flip_done timed out
amdgpu 0000:00:05.0: [drm] *ERROR* flip_done timed out
amdgpu 0000:00:05.0: [drm] *ERROR* [CRTC:91:crtc-0] commit wait timed out
amdgpu 0000:00:05.0: [drm] *ERROR* flip_done timed out
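To compare how often these timeouts occur across boots or kernels, something like the following could tally them from a saved dmesg log. This is only a rough sketch: the function name `count_timeouts` and the regular expression are my own, written against the five lines quoted above, so other amdgpu message variants would need the pattern adjusted.

```python
import re
from collections import Counter

# Matches amdgpu DRM timeout lines of the form quoted above, e.g.
#   amdgpu 0000:00:05.0: [drm] *ERROR* [CRTC:91:crtc-0] flip_done timed out
# The bracketed object ([CRTC:...] or [PLANE:...]) is optional.
TIMEOUT_RE = re.compile(
    r"amdgpu (?P<dev>[0-9a-f:.]+): \[drm\] \*ERROR\* "
    r"(?:\[(?P<obj>[^\]]+)\] )?(?P<kind>flip_done|commit wait) timed out"
)


def count_timeouts(log_text):
    """Return a Counter keyed by (kind, object); object is '-' when absent."""
    counts = Counter()
    for line in log_text.splitlines():
        match = TIMEOUT_RE.search(line)
        if match:
            counts[(match.group("kind"), match.group("obj") or "-")] += 1
    return counts


# The five lines from the guest, copied from this post.
SAMPLE_LOG = """\
amdgpu 0000:00:05.0: [drm] *ERROR* [PLANE:70:plane-5] commit wait timed out
amdgpu 0000:00:05.0: [drm] *ERROR* [CRTC:91:crtc-0] flip_done timed out
amdgpu 0000:00:05.0: [drm] *ERROR* flip_done timed out
amdgpu 0000:00:05.0: [drm] *ERROR* [CRTC:91:crtc-0] commit wait timed out
amdgpu 0000:00:05.0: [drm] *ERROR* flip_done timed out
"""

for (kind, obj), n in sorted(count_timeouts(SAMPLE_LOG).items()):
    print(n, kind, obj)
```

Running it over the attached log files instead of `SAMPLE_LOG` would show whether the timeouts stay on one CRTC/plane or spread across several.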
I get output through the GPU, but it takes several minutes for the desktop to load, and it is so slow that I can’t really do anything. Mouse movements are smooth, but even doing something as simple as clicking and dragging on the desktop makes everything very choppy.
I’ve tried multiple kernel versions with Manjaro specifically, as well as the kernels that different distributions ship, but I haven’t noticed any of them making a difference.
I’ve tried a variety of kernel options, but unfortunately I haven’t recorded every one that I’ve tried. I’ve almost always used pci=nomsi, but I have tried a few times without it just in case that was the issue.
I’ve tried a few BIOS options:
- sr-iov on/off
- resizable bar on/off
- explicitly setting some virtualization options enabled instead of to auto
With the BIOS options, I also didn’t notice any changes, other than a message in dmesg complaining about resizable BAR being off whenever it was off.
Baremetal
To make sure that there wasn’t a hardware issue, I booted off a Fedora live CD using the GPU output. I didn’t notice any issues.
QEMU/KVM
To test whether this was an issue with this specific GPU being virtualized, I booted off a fresh Fedora install on a USB SSD, installed @virtualization, installed an Ubuntu 23.10 VM, and then passed through the GPU.
I was able to play a youtube video without any issues, but I did notice an issue on the desktop.
Clicking and dragging on the desktop made everything choppy, but not as bad as in the qubes guest. Clicking on a window to resize it, or releasing after also caused a split second hang, but actually resizing the window didn’t cause any issues. Watching system monitor while clicking and dragging around the desktop showed a single core CPU utilization spike to 100%, so it seems that for some reason the desktop is using software rendering despite the GPU being the only display output.
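One way to confirm or rule out the software-rendering suspicion would be to look at the OpenGL renderer string inside the guest, for example from `glxinfo -B` (part of mesa-utils on Ubuntu). The sketch below only classifies that string; the helper names are hypothetical, the two sample strings are made up for illustration, and the list of Mesa software rasterizer names (llvmpipe, softpipe, swrast) is the assumption it rests on.

```python
import re

# Names Mesa reports when rendering on the CPU rather than the GPU.
SOFTWARE_RENDERERS = ("llvmpipe", "softpipe", "swrast")


def renderer_from_glxinfo(text):
    """Extract the 'OpenGL renderer string' value from glxinfo output."""
    match = re.search(r"^OpenGL renderer string:\s*(.+)$", text, re.MULTILINE)
    return match.group(1).strip() if match else None


def is_software_renderer(renderer):
    """True if the renderer string names a known Mesa software rasterizer."""
    return any(name in renderer.lower() for name in SOFTWARE_RENDERERS)


# Illustrative (made-up) glxinfo excerpts, not taken from the actual guest.
SOFTWARE_SAMPLE = "OpenGL renderer string: llvmpipe (LLVM 16.0.6, 256 bits)\n"
HARDWARE_SAMPLE = "OpenGL renderer string: AMD Radeon RX 6750 XT (example)\n"

for sample in (SOFTWARE_SAMPLE, HARDWARE_SAMPLE):
    renderer = renderer_from_glxinfo(sample)
    print(renderer, "->", "software" if is_software_renderer(renderer) else "hardware")
```

If the guest reports llvmpipe even though amdgpu is loaded, that would point at the driver failing to initialize acceleration rather than at the display path.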
At some point the host went to sleep. After waking the host, the guest had no output and was showing flip_done timed out errors similar to the ones Qubes guests show at boot. I’m not sure if this is related to whatever is happening in Qubes. If it is, it seems like there might be some issue with guests waking the GPU from sleep, and something about the way the GPU is passed to the guest by Qubes or Xen is different from the way QEMU/KVM passes it, causing the issue to happen immediately with Qubes guests but only after a host sleep with the QEMU/KVM guest.
A few times when shutting off the QEMU/KVM guest, virtual machine manager froze, apparently because the GPU failed to reattach to the host. I’m not sure how to recover from this; even host shutdown hangs and the machine has to be power cycled.
USB controller
I am not sure if this is related or might help find the issue, I’m including it just in case.
I have a Renesas uPD720201 based USB controller that I’m also trying to pass through. To try to focus on one issue at a time, I have not been passing it to the qubes VM while trying to work on the GPU issue.
When I do try to pass it to a VM in Qubes, with or without a GPU, it fails. I get error -110 and lspci -k shows no driver in use. I tried to find the cause of this and on another forum someone suggested updating the firmware, which I did, but it did not fix the issue.
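For checking quickly which passed-through devices ended up without a bound driver, the output of lspci -k can be parsed as in the sketch below. The function name `drivers_in_use` is hypothetical and the sample text is invented for illustration (only the 00:05.0 address matches the guest's dmesg lines above); it assumes the usual layout where device lines start in column 0 and detail lines are indented.

```python
def drivers_in_use(lspci_k_output):
    """Map each PCI slot to its bound kernel driver, or None if unbound."""
    drivers = {}
    current = None
    for line in lspci_k_output.splitlines():
        if not line.strip():
            continue
        if not line[0].isspace():
            # Device header line: the slot address is the first field.
            current = line.split(" ", 1)[0]
            drivers[current] = None
        elif current is not None and "Kernel driver in use:" in line:
            drivers[current] = line.split(":", 1)[1].strip()
    return drivers


# Made-up sample in the shape of `lspci -k` output.
SAMPLE_LSPCI = (
    "00:05.0 VGA compatible controller: Example GPU\n"
    "\tKernel driver in use: amdgpu\n"
    "\tKernel modules: amdgpu\n"
    "00:06.0 USB controller: Example xHCI controller\n"
    "\tKernel modules: xhci_pci\n"
)

for slot, driver in drivers_in_use(SAMPLE_LSPCI).items():
    print(slot, driver or "(no driver in use)")
```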
The controller can be passed to the KVM VM without any issues.
Like the GPU, the controller is in its own IOMMU group.
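The IOMMU grouping can be double-checked on a Linux host (the Fedora/KVM install here, not Qubes dom0 under Xen) by walking /sys/kernel/iommu_groups. A minimal sketch, with a hypothetical function name and the sysfs root passed as a parameter so it can be pointed at a copy of the tree:

```python
from pathlib import Path


def iommu_groups(root="/sys/kernel/iommu_groups"):
    """Return {group_number: [pci_address, ...]} read from sysfs.

    Returns an empty dict if the directory does not exist, for example
    when the IOMMU is disabled or the path is not exposed.
    """
    groups = {}
    base = Path(root)
    if not base.is_dir():
        return groups
    for group_dir in base.iterdir():
        devices = group_dir / "devices"
        if devices.is_dir():
            groups[group_dir.name] = sorted(p.name for p in devices.iterdir())
    return groups


for group, devices in sorted(iommu_groups().items(), key=lambda kv: int(kv[0])):
    print(f"group {group}: {', '.join(devices)}")
```

A group that lists only the GPU's two functions (03:00.0 and 03:00.1 in the Qubes config below), or only the USB controller, would confirm the isolation described above.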
Configs and logs
Qubes
<domain type='xen'>
<name>ubuntu23</name>
<uuid>d32c3c8a-31e1-413d-89df-c20a144c036a</uuid>
<memory unit='KiB'>4096000</memory>
<currentMemory unit='KiB'>4096000</currentMemory>
<vcpu placement='static'>2</vcpu>
<os>
<type arch='x86_64' machine='xenfv'>hvm</type>
<loader type='rom'>hvmloader</loader>
<boot dev='cdrom'/>
<boot dev='hd'/>
</os>
<features>
<acpi/>
<apic/>
<pae/>
<viridian/>
<xen>
<e820_host state='on'/>
</xen>
</features>
<cpu mode='host-passthrough'>
<feature policy='disable' name='vmx'/>
<feature policy='disable' name='svm'/>
<feature policy='require' name='invtsc'/>
</cpu>
<clock offset='variable' adjustment='0' basis='utc'/>
<on_poweroff>destroy</on_poweroff>
<on_reboot>destroy</on_reboot>
<on_crash>destroy</on_crash>
<devices>
<emulator type='stubdom-linux' cmdline='-qubes-audio:audiovm_xid=0 -qubes-net:client_ip=10.137.0.60,dns_0=10.139.1.1,dns_1=10.139.1.2,gw=10.138.35.186,netmask=255.255.255.255'/>
<disk type='block' device='disk'>
<driver name='phy' type='raw'/>
<source dev='/dev/mapper/qubes_dom0-vm--ubuntu23--root--snap'/>
<script path='/etc/xen/scripts/qubes-block'/>
<target dev='xvda' bus='xen'/>
</disk>
<disk type='block' device='disk'>
<driver name='phy' type='raw'/>
<source dev='/dev/mapper/qubes_dom0-vm--ubuntu23--private--snap'/>
<script path='/etc/xen/scripts/qubes-block'/>
<target dev='xvdb' bus='xen'/>
</disk>
<disk type='block' device='disk'>
<driver name='phy' type='raw'/>
<source dev='/dev/mapper/qubes_dom0-vm--ubuntu23--volatile'/>
<script path='/etc/xen/scripts/qubes-block'/>
<target dev='xvdc' bus='xen'/>
</disk>
<controller type='xenbus' index='0'/>
<interface type='ethernet'>
<mac address='00:16:3e:5e:6c:00'/>
<ip address='10.137.0.60' family='ipv4'/>
<script path='vif-route-qubes'/>
<backenddomain name='sys-firewall'/>
</interface>
<console type='pty'>
<target type='xen' port='0'/>
</console>
<input type='tablet' bus='usb'/>
<input type='mouse' bus='ps2'/>
<input type='keyboard' bus='ps2'/>
<hostdev mode='subsystem' type='pci' managed='yes'>
<driver name='xen'/>
<source>
<address domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
</source>
</hostdev>
<hostdev mode='subsystem' type='pci' managed='yes'>
<driver name='xen'/>
<source>
<address domain='0x0000' bus='0x03' slot='0x00' function='0x1'/>
</source>
</hostdev>
<memballoon model='xen'/>
</devices>
</domain>
cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-6.5.0-9-generic root=UUID=bf61702b-336e-4dc1-ba38-e364f19271d5 ro pci=nomsi quiet splash vt.handoff=7
Output of sudo dmesg is attached as qubes-ubuntu-dmesg.log. qubes-ubuntu-usb-dmesg.log is for a boot with the USB controller attached.
qubes-ubuntu-dmesg.log (83.8 KB)
qubes-ubuntu-usb-dmesg.log (127.3 KB)
KVM
<domain type='kvm'>
<name>ubuntu23.10</name>
<uuid>decd318f-8d3e-491e-a290-e074fd2bc240</uuid>
<title>Ubuntu 23.10</title>
<metadata>
<boxes:gnome-boxes xmlns:boxes="https://wiki.gnome.org/Apps/Boxes">
<os-state>live</os-state>
<media-id>http://ubuntu.com/ubuntu/23.10:0</media-id>
<media>/home/user/Downloads/ubuntu-23.10.1-desktop-amd64.iso</media>
</boxes:gnome-boxes>
<libosinfo:libosinfo xmlns:libosinfo="http://libosinfo.org/xmlns/libvirt/domain/1.0">
<libosinfo:os id="http://ubuntu.com/ubuntu/23.10"/>
</libosinfo:libosinfo>
</metadata>
<memory unit='KiB'>4194304</memory>
<currentMemory unit='KiB'>4194304</currentMemory>
<vcpu placement='static'>24</vcpu>
<os>
<type arch='x86_64' machine='pc-q35-8.2'>hvm</type>
<boot dev='cdrom'/>
<boot dev='hd'/>
<bootmenu enable='yes'/>
</os>
<features>
<acpi/>
<apic/>
</features>
<cpu mode='host-passthrough' check='none' migratable='on'>
<topology sockets='1' dies='1' clusters='1' cores='12' threads='2'/>
</cpu>
<clock offset='localtime'>
<timer name='rtc' tickpolicy='catchup'/>
<timer name='pit' tickpolicy='delay'/>
<timer name='hpet' present='no'/>
</clock>
<on_poweroff>destroy</on_poweroff>
<on_reboot>destroy</on_reboot>
<on_crash>destroy</on_crash>
<pm>
<suspend-to-mem enabled='no'/>
<suspend-to-disk enabled='no'/>
</pm>
<devices>
<emulator>/usr/bin/qemu-system-x86_64</emulator>
<disk type='file' device='disk'>
<driver name='qemu' type='qcow2' cache='writeback' discard='unmap'/>
<source file='/home/user/.local/share/gnome-boxes/images/ubuntu23.10'/>
<target dev='vda' bus='virtio'/>
<address type='pci' domain='0x0000' bus='0x04' slot='0x00' function='0x0'/>
</disk>
<disk type='file' device='cdrom'>
<driver name='qemu' type='raw'/>
<source file='/home/user/Downloads/ubuntu-23.10.1-desktop-amd64.iso' startupPolicy='mandatory'/>
<target dev='hdc' bus='sata'/>
<readonly/>
<address type='drive' controller='0' bus='0' target='0' unit='2'/>
</disk>
<controller type='usb' index='0' model='qemu-xhci' ports='15'>
<address type='pci' domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
</controller>
<controller type='sata' index='0'>
<address type='pci' domain='0x0000' bus='0x00' slot='0x1f' function='0x2'/>
</controller>
<controller type='pci' index='0' model='pcie-root'/>
<controller type='pci' index='1' model='pcie-root-port'>
<model name='pcie-root-port'/>
<target chassis='1' port='0x10'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0' multifunction='on'/>
</controller>
<controller type='pci' index='2' model='pcie-root-port'>
<model name='pcie-root-port'/>
<target chassis='2' port='0x11'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x1'/>
</controller>
<controller type='pci' index='3' model='pcie-root-port'>
<model name='pcie-root-port'/>
<target chassis='3' port='0x12'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x2'/>
</controller>
<controller type='pci' index='4' model='pcie-root-port'>
<model name='pcie-root-port'/>
<target chassis='4' port='0x13'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x3'/>
</controller>
<controller type='pci' index='5' model='pcie-root-port'>
<model name='pcie-root-port'/>
<target chassis='5' port='0x14'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x4'/>
</controller>
<controller type='pci' index='6' model='pcie-root-port'>
<model name='pcie-root-port'/>
<target chassis='6' port='0x15'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x5'/>
</controller>
<controller type='virtio-serial' index='0'>
<address type='pci' domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
</controller>
<controller type='ccid' index='0'>
<address type='usb' bus='0' port='1'/>
</controller>
<interface type='user'>
<mac address='52:54:00:aa:98:d9'/>
<model type='virtio'/>
<address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
</interface>
<smartcard mode='passthrough' type='spicevmc'>
<address type='ccid' controller='0' slot='0'/>
</smartcard>
<serial type='pty'>
<target type='isa-serial' port='0'>
<model name='isa-serial'/>
</target>
</serial>
<console type='pty'>
<target type='serial' port='0'/>
</console>
<channel type='spicevmc'>
<target type='virtio' name='com.redhat.spice.0'/>
<address type='virtio-serial' controller='0' bus='0' port='1'/>
</channel>
<channel type='spiceport'>
<source channel='org.spice-space.webdav.0'/>
<target type='virtio' name='org.spice-space.webdav.0'/>
<address type='virtio-serial' controller='0' bus='0' port='2'/>
</channel>
<input type='tablet' bus='usb'>
<address type='usb' bus='0' port='2'/>
</input>
<input type='mouse' bus='ps2'/>
<input type='keyboard' bus='ps2'/>
<graphics type='spice'>
<listen type='none'/>
<image compression='off'/>
<gl enable='no'/>
</graphics>
<sound model='ich9'>
<address type='pci' domain='0x0000' bus='0x00' slot='0x1b' function='0x0'/>
</sound>
<audio id='1' type='spice'/>
<video>
<model type='qxl' ram='65536' vram='65536' vgamem='16384' heads='1' primary='yes'>
<acceleration accel3d='no'/>
</model>
<address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x0'/>
</video>
<redirdev bus='usb' type='spicevmc'>
<address type='usb' bus='0' port='3'/>
</redirdev>
<redirdev bus='usb' type='spicevmc'>
<address type='usb' bus='0' port='4'/>
</redirdev>
<redirdev bus='usb' type='spicevmc'>
<address type='usb' bus='0' port='5'/>
</redirdev>
<redirdev bus='usb' type='spicevmc'>
<address type='usb' bus='0' port='6'/>
</redirdev>
<watchdog model='itco' action='reset'/>
<memballoon model='virtio'>
<address type='pci' domain='0x0000' bus='0x05' slot='0x00' function='0x0'/>
</memballoon>
</devices>
</domain>
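To compare which PCI devices each domain definition actually passes through, the hostdev entries can be pulled out of the XML with Python's standard library. This is just a sketch with a hypothetical helper name; the snippet embedded in it is a trimmed copy of the two hostdev entries from the Qubes config above.

```python
import xml.etree.ElementTree as ET


def hostdev_pci_addresses(domain_xml):
    """List PCI hostdev source addresses as 'dddd:bb:ss.f' strings."""
    root = ET.fromstring(domain_xml)
    addresses = []
    for addr in root.findall("./devices/hostdev[@type='pci']/source/address"):
        # libvirt writes these attributes as hex strings such as '0x03'.
        domain = int(addr.get("domain"), 16)
        bus = int(addr.get("bus"), 16)
        slot = int(addr.get("slot"), 16)
        function = int(addr.get("function"), 16)
        addresses.append(f"{domain:04x}:{bus:02x}:{slot:02x}.{function:x}")
    return addresses


# Trimmed from the Qubes domain XML in this post.
QUBES_SNIPPET = """
<domain type='xen'>
  <devices>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='xen'/>
      <source>
        <address domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
      </source>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='xen'/>
      <source>
        <address domain='0x0000' bus='0x03' slot='0x00' function='0x1'/>
      </source>
    </hostdev>
  </devices>
</domain>
"""

print(hostdev_pci_addresses(QUBES_SNIPPET))
```

Note that the KVM XML as pasted above contains no hostdev element, so this would return an empty list for it.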
cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-6.5.0-44-generic root=UUID=ed527a6f-fa79-49ad-b36e-14ce77dde89b ro quiet splash vt.handoff=7
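To make the difference between the two guests' kernel command lines explicit, they can be diffed parameter by parameter. A small sketch, assuming simple whitespace-separated parameters with no quoted values; the function names are my own, and the two strings are the cmdlines quoted in this post.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into {name: value}; bare flags map to True."""
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else True
    return params


def differing_params(a, b, ignore=("BOOT_IMAGE", "root")):
    """Return {name: (value_in_a, value_in_b)} for parameters that differ."""
    pa, pb = parse_cmdline(a), parse_cmdline(b)
    keys = (set(pa) | set(pb)) - set(ignore)
    return {k: (pa.get(k), pb.get(k)) for k in sorted(keys) if pa.get(k) != pb.get(k)}


QUBES_CMDLINE = (
    "BOOT_IMAGE=/boot/vmlinuz-6.5.0-9-generic "
    "root=UUID=bf61702b-336e-4dc1-ba38-e364f19271d5 "
    "ro pci=nomsi quiet splash vt.handoff=7"
)
KVM_CMDLINE = (
    "BOOT_IMAGE=/boot/vmlinuz-6.5.0-44-generic "
    "root=UUID=ed527a6f-fa79-49ad-b36e-14ce77dde89b "
    "ro quiet splash vt.handoff=7"
)

print(differing_params(QUBES_CMDLINE, KVM_CMDLINE))
```

Apart from the kernel image and root UUID, which are ignored by default, the only difference is pci=nomsi on the Qubes guest. The kernel builds also differ (6.5.0-9 versus 6.5.0-44), which may matter when comparing the two dmesg logs.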
Output of sudo dmesg is attached as kvm-ubuntu-dmesg.log.
kvm-ubuntu-dmesg.log (67.3 KB)