VM with PCI passthrough refusing to restart

My sys-usb, after its template updated to Debian 12.11 fails to start. Template is debian-12, updated through qubes-update-gui, requesting restart from there (which I usually do not do to avoid risking losing my devices).

2025-05-25 17:05:14.950+0000: libxl: libxl_qmp.c:1837:qmp_ev_parse_error_messages: Domain 24:Could not open '/sys/bus/pci/devices/0000:00:00.3/config': No such file or directory
2025-05-25 17:05:14.950+0000: libxl: libxl_pci.c:1275:pci_add_qmp_device_add_cb: Domain 24:Retrying PCI add 10
2025-05-25 17:05:15.276+0000: libxl: libxl_pci.c:1864:device_pci_add_done: Domain 24:libxl__device_pci_add failed for PCI device 0:7:0.3 (rc -9)
2025-05-25 17:05:15.277+0000: libxl: libxl_pci.c:1864:device_pci_add_done: Domain 24:libxl__device_pci_add failed for PCI device 0:7:0.4 (rc -9)
2025-05-25 17:05:15.277+0000: libxl: libxl_create.c:2000:domcreate_attach_devices: Domain 24:unable to add pci devices

I was initially confused about this 0000:00:00.3 which never existed on the host… I guess is the PCI id in the stubdom, to which the host’s 0000:07:00.3 device is passed-through?

A similar post makes me look at swiotlb, and sure enough when the guest fails the guest subdom kernel complains:

[2025-05-25 18:51:49] xen:swiotlb_xen: Warning: only able to allocate 4 MB for software IO TLB
[2025-05-25 18:51:49] xen:swiotlb_xen: Lowering to 2MB
[2025-05-25 18:51:49] xen:swiotlb_xen: Lowering to 2MB
[2025-05-25 18:51:49] xen:swiotlb_xen: Lowering to 2MB
[2025-05-25 18:51:49] xen:swiotlb_xen: Failed to get contiguous memory for DMA from Xen!
[2025-05-25 18:51:49] You either: don't have the permissions, do not have enough free memory under 4GB, or the hypervisor memory is too fragmented! (rc:-12)
[2025-05-25 18:51:49] pcifront pci-0: Could not setup SWIOTLB!
[2025-05-25 18:51:49] pcifront pci-0: 12 Error setting up PCI Frontend

Sure enough this did not happen on previous VM boot:

[2025-05-13 10:36:23] xen:swiotlb_xen: Warning: only able to allocate 4 MB for software IO TLB
[2025-05-13 10:36:23] software IO TLB: mapped [mem 0x0000000003000000-0x0000000003400000] (4MB)

Am I correct that this pcifront failure to initialize is the reason why passthrough configuration will necessarily fail? I guess this could be tested early to produce a more understandable error to the user. Opened issue for that aspect.

I tried to set swiotlb=8192 as suggested in that other thread. Unlike that other post, I still have the same errors referring to a non-existent PCI device, and the Could not setup SWIOTLB! error is still there.

Note the stubdom logs also are a bit more verbose:

[2025-05-25 19:05:14] from-vchan: {"execute":"device_add","id":2020372746,"arguments":{"driver":"xen-pci-passthrough","id":"pci-pt-07_00.3","hostaddr":"0000:07:00.3"}}
[2025-05-25 19:05:14] 
[2025-05-25 19:05:14] from-unix: {"id": 2020372746, "error": {"class": "GenericError", "desc": "Could not open '/sys/bus/pci/devices/0000:00:00.3/config': No such file or directory"}}
[2025-05-25 19:05:14] 

So… maybe I do not have enough free memory under 4GB, or the hypervisor memory is too fragmented - do we have an easy way to check that?

I had this issue again at the start of next week, and just managed to do without any USB device until I could get a closer look. Now to my surprise, when attempting to launch sys-usb to get fresh logs… that time it did manage to launch.

The PCI initialisation of sys-usb-dm sure does include that device, and the 1022:1639 pci id matches that of the USB controllers that get passed through.

[2025-06-09 11:40:45] pcifront pci-0: Installing PCI frontend
[2025-06-09 11:40:45] xen:swiotlb_xen: Warning: only able to allocate 4 MB for software IO TLB
[2025-06-09 11:40:45] software IO TLB: mapped [mem 0x0000000003000000-0x0000000003400000] (4MB)
[2025-06-09 11:40:45] Invalid max_queues (4), will use default max: 1.
[2025-06-09 11:40:45] pcifront pci-0: Creating PCI Frontend Bus 0000:00
[2025-06-09 11:40:45] pcifront pci-0: PCI host bridge to bus 0000:00
[2025-06-09 11:40:45] pci_bus 0000:00: root bus resource [io  0x0000-0xffff]
[2025-06-09 11:40:45] pci_bus 0000:00: root bus resource [mem 0x00000000-0xfffffffffff]
[2025-06-09 11:40:45] pci_bus 0000:00: root bus resource [bus 00-ff]
[2025-06-09 11:40:45] pci 0000:00:00.3: [1022:1639] type 00 class 0x0c0330
[2025-06-09 11:40:45] pci 0000:00:00.3: reg 0x10: [mem 0xfe200000-0xfe2fffff 64bit]
[2025-06-09 11:40:45] pcifront pci-0: New device on 0000:00:00.3 found.
[2025-06-09 11:40:45] pci 0000:00:00.4: [1022:1639] type 00 class 0x0c0330
[2025-06-09 11:40:45] pci 0000:00:00.4: reg 0x10: [mem 0xfe100000-0xfe1fffff 64bit]
[2025-06-09 11:40:45] pcifront pci-0: New device on 0000:00:00.4 found.
[2025-06-09 11:40:45] pcifront pci-0: claiming resource 0000:00:00.3/0
[2025-06-09 11:40:45] pcifront pci-0: claiming resource 0000:00:00.4/0
[2025-06-09 11:40:45] tun: Universal TUN/TAP device driver, 1.6

aaand… 2 days later we are, I plug my usual USB audio headset, then assign it to a VM, and sys-usb bombs out with a GP fault:

[2025-06-11 14:03:19] [104971.619778] usb 4-2: new full-speed USB device number 5 using xhci_hcd
[2025-06-11 14:03:19] [104971.769098] usb 4-2: New USB device found, idVendor=12d1, idProduct=0010, bcdDevice= 1.00
[2025-06-11 14:03:19] [104971.769127] usb 4-2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[2025-06-11 14:03:19] [104971.769149] usb 4-2: Product: KT USB Audio
[2025-06-11 14:03:19] [104971.769161] usb 4-2: Manufacturer: KTMicro
[2025-06-11 14:03:19] [104971.769185] usb 4-2: SerialNumber: 2021-07-19-0000-0000-0000-
[2025-06-11 14:03:19] [104971.930942] input: KTMicro KT USB Audio Consumer Control as /devices/pci0000:00/0000:00:07.0/usb4/4-2/4-2:1.3/0003:12D1:0010.0009/input/input17
[2025-06-11 14:03:20] [104971.981820] input: KTMicro KT USB Audio as /devices/pci0000:00/0000:00:07.0/usb4/4-2/4-2:1.3/0003:12D1:0010.0009/input/input18
[2025-06-11 14:03:20] [104971.988529] hid-generic 0003:12D1:0010.0009: input,hiddev98,hidraw6: USB HID v1.10 Device [KTMicro KT USB Audio] on usb-0000:00:07.0-2/input3
[2025-06-11 14:03:22] [104974.438004] usb 3-1-port2: Cannot enable. Maybe the USB cable is bad?
[2025-06-11 14:03:26] [104978.541214] usb 3-1-port2: Cannot enable. Maybe the USB cable is bad?
[2025-06-11 14:03:30] [104982.647013] usb 3-1-port2: Cannot enable. Maybe the USB cable is bad?
[2025-06-11 14:03:34] [104986.749241] usb 3-1-port2: Cannot enable. Maybe the USB cable is bad?
[2025-06-11 14:03:38] [104990.853042] usb 3-1-port2: Cannot enable. Maybe the USB cable is bad?
[2025-06-11 14:03:42] [104994.957546] usb 3-1-port2: Cannot enable. Maybe the USB cable is bad?
[2025-06-11 14:03:47] [104999.061043] usb 3-1-port2: Cannot enable. Maybe the USB cable is bad?
[2025-06-11 14:03:49] [105001.638348] usbip-host 4-2: usbip-host: register new device (bus 4 dev 5)
[2025-06-11 14:03:49] [105001.652270] usbip-host 4-2: stub up
[2025-06-11 14:03:50] [105002.002853] usbip-host 4-2: endpoint 0 is stalled
[2025-06-11 14:03:50] [105002.004109] usbip-host 4-2: endpoint 0 is stalled
[2025-06-11 14:03:50] [105002.006163] usbip-host 4-2: endpoint 0 is stalled
[2025-06-11 14:03:50] [105002.040428] usbip-host 4-2: usb_set_interface done: inf 1 alt 0
[2025-06-11 14:03:50] [105002.044120] usbip-host 4-2: usb_set_interface done: inf 1 alt 1
[2025-06-11 14:03:50] [105002.054601] usbip-host 4-2: usb_set_interface done: inf 1 alt 0
[2025-06-11 14:03:50] [105002.057705] usbip-host 4-2: usb_set_interface done: inf 2 alt 0
[2025-06-11 14:03:50] [105002.061124] usbip-host 4-2: usb_set_interface done: inf 2 alt 1
[2025-06-11 14:03:50] [105002.071230] usbip-host 4-2: usb_set_interface done: inf 2 alt 0
[2025-06-11 14:03:50] [105002.075077] usbip-host 4-2: usb_set_interface done: inf 2 alt 2
[2025-06-11 14:03:50] [105002.084228] usbip-host 4-2: usb_set_interface done: inf 2 alt 0
[2025-06-11 14:03:50] [105002.488182] usbip-host 4-2: usb_set_interface done: inf 2 alt 1
[2025-06-11 14:03:50] [105002.510087] usbip-host 4-2: usb_set_interface done: inf 2 alt 0
[2025-06-11 14:03:50] [105002.518127] usbip-host 4-2: usb_set_interface done: inf 2 alt 1
[2025-06-11 14:03:50] [105002.526108] usbip-host 4-2: usb_set_interface done: inf 2 alt 0
[2025-06-11 14:03:50] [105002.559135] usbip-host 4-2: usb_set_interface done: inf 2 alt 1
[2025-06-11 14:03:50] [105002.577155] usbip-host 4-2: usb_set_interface done: inf 2 alt 0
[2025-06-11 14:03:50] [105002.582120] usbip-host 4-2: usb_set_interface done: inf 1 alt 1
[2025-06-11 14:03:50] [105002.603098] usbip-host 4-2: usb_set_interface done: inf 2 alt 1
[2025-06-11 14:03:50] [105002.611106] usbip-host 4-2: usb_set_interface done: inf 2 alt 0
[2025-06-11 14:03:50] [105002.616140] usbip-host 4-2: usb_set_interface done: inf 2 alt 1
[2025-06-11 14:03:50] [105002.623079] usbip-host 4-2: usb_set_interface done: inf 2 alt 0
[2025-06-11 14:03:50] [105002.627108] usbip-host 4-2: usb_set_interface done: inf 2 alt 1
[2025-06-11 14:03:50] [105002.635107] usbip-host 4-2: usb_set_interface done: inf 2 alt 0
[2025-06-11 14:03:50] [105002.639079] usbip-host 4-2: usb_set_interface done: inf 1 alt 0
[2025-06-11 14:03:50] [105002.644178] usbip-host 4-2: usb_set_interface done: inf 2 alt 2
[2025-06-11 14:03:50] [105002.651101] usbip-host 4-2: usb_set_interface done: inf 2 alt 0
[2025-06-11 14:03:50] [105002.655163] usbip-host 4-2: usb_set_interface done: inf 1 alt 1
[2025-06-11 14:03:50] [105002.663083] usbip-host 4-2: usb_set_interface done: inf 1 alt 0
[2025-06-11 14:03:50] [105002.724210] usbip-host 4-2: usb_set_interface done: inf 2 alt 2
[2025-06-11 14:03:51] [105003.165224] usb 3-1-port2: Cannot enable. Maybe the USB cable is bad?
[2025-06-11 14:03:55] [105007.269077] usb 3-1-port2: Cannot enable. Maybe the USB cable is bad?
[2025-06-11 14:03:57] [105009.299282] usbip-host 4-2: usb_set_interface done: inf 1 alt 1
[2025-06-11 14:03:57] [105009.398696] usbip-host 4-2: unlinked by a call to usb_unlink_urb()
[2025-06-11 14:03:57] [105009.398739] usbip-host 4-2: unlinked by a call to usb_unlink_urb()
[2025-06-11 14:03:57] [105009.398777] usbip-host 4-2: unlinked by a call to usb_unlink_urb()
[2025-06-11 14:03:57] [105009.398802] usbip-host 4-2: unlinked by a call to usb_unlink_urb()
[2025-06-11 14:03:57] [105009.398845] usbip-host 4-2: unlinked by a call to usb_unlink_urb()
[2025-06-11 14:03:57] [105009.398882] usbip-host 4-2: unlinked by a call to usb_unlink[2025-06-11 14:03:57] _urb()
[2025-06-11 14:03:57] [105009.398991] usbip-host 4-2: unlinked by a call to usb_unlink_urb()
[2025-06-11 14:03:57] [105009.399088] usbip-host 4-2: unlinked by a call to usb_unlink_urb()
[2025-06-11 14:03:57] [105009.399165] usbip-host 4-2: unlinked by a call to usb_unlink_urb()
[2025-06-11 14:03:57] [105009.399257] usbip-host 4-2: unlinked by a call to usb_unlink_urb()
[2025-06-11 14:03:57] [105009.399335] usbip-host 4-2: unlinked by a call to usb_unlink_urb()
[2025-06-11 14:03:57] [105009.399418] usbip-host 4-2: unlinked by a call to usb_unlink_urb()
[2025-06-11 14:03:57] [105009.403313] usbip-host 4-2: usb_set_interface done: inf 1 alt 0
[2025-06-11 14:03:57] [105009.404205] usbip-host 4-2: unlinked by a call to usb_unlink_urb()
[2025-06-11 14:03:57] [105009.404243] usbip-host 4-2: unlinked by a call to usb_unlink_urb()
[2025-06-11 14:03:57] [105009.404270] usbip-host 4-2: unlinked by a call to usb_unlink_urb()
[105009.409159] usbip-host 4-2: usb_set_interface done: inf 2 alt 0
[2025-06-11 14:03:57] [105009.414079] usbip-host 4-2: usb_set_interface done: inf 1 alt 1
[2025-06-11 14:03:57] [105009.422149] usbip-host 4-2: usb_set_interface done: inf 2 alt 2
[2025-06-11 14:03:57] [105009.439860] usbip-host 4-2: unlinked by a call to usb_unlink_urb()
[2025-06-11 14:03:57] [105009.439898] usbip-host 4-2: unlinked by a call to usb_unlink_urb()
[2025-06-11 14:03:57] [105009.439918] usbip-host 4-2: unlinked by a call to usb_unlink_urb()
[2025-06-11 14:03:57] [105009.444157] usbip-host 4-2: usb_set_interface done: inf 2 alt 0
[2025-06-11 14:03:57] [105009.445098] usbip-host 4-2: unlinked by a call to usb_unlink_urb()
[2025-06-11 14:03:57] [105009.445137] usbip-host 4-2: unlinked by a call to usb_unlink_urb()
[2025-06-11 14:03:57] [105009.445157] usbip-host 4-2: unlinked by a call to usb_unlink_urb()
[2025-06-11 14:03:57] [105009.445191] usbip-host 4-2: unlinked by a call to usb_unlink_urb()
[2025-06-11 14:03:57] [105009.445212] usbip-host 4-2: unlinked by a call to usb_unlink_urb()
[2025-06-11 14:03:57] [105009.445237] usbip-host 4-2: unlinked by a call to usb_unlink_urb()
[2025-06-11 14:03:57] [105009.445258] usbip-host 4-2: unlinked by a call to usb_unlink_urb()
[2025-06-11 14:03:57] [105009.445283] usbip-host 4-2: unlinked by a call to usb_unlink_urb()
[2025-06-11 14:03:57] [105009.445304] usbip-host 4-2: unlinked by a call to usb_unlink_urb()
[2025-06-11 14:03:57] [105009.445329] usbip-host 4-2: unlinked by a call to usb_unlink_urb()
[2025-06-11 14:03:57] [105009.445350] usbip-host 4-2: unlinked by a call to usb_unlink_urb()
[2025-06-11 14:03:57] [105009.445378] usbip-host 4-2: unlinked by a call to usb_unlink_urb()
[2025-06-11 14:03:57] [105009.450104] usbip-host 4-2: usb_set_interface done: inf 1 alt 0
[2025-06-11 14:03:57] [105009.455086] usbip-host 4-2: usb_set_interface done: inf 2 alt 2
[2025-06-11 14:03:57] [105009.462295] usbip-host 4-2: usb_set_interface done: inf 1 alt 1
[2025-06-11 14:03:57] [105009.476563] usbip-host 4-2: unlinked by a call to usb_unlink_urb()
[2025-06-11 14:03:57] [105009.476595] usbip-host 4-2: unlinked by a call to usb_unlink_urb()
[2025-06-11 14:03:57] [105009.476640] usbip-host 4-2: unlinked by a call to usb_unlink_urb()
[2025-06-11 14:03:57] [105009.476672] usbip-host 4-2: unlinked by a call to usb_unlink_urb()
[2025-06-11 14:03:57] [105009.476697] usbip-host 4-2: unlinked by a call to usb_unlink_urb()
[2025-06-11 14:03:57] [105009.476727] usbip-host 4-2: unlinked by a call to usb_unlink_urb()
[2025-06-11 14:03:57] [105009.476752] usbip-host 4-2: unlinked by a call to[2025-06-11 14:03:57]  usb_unlink_urb()
[2025-06-11 14:03:57] [105009.499221] Oops: general protection fault, probably for non-canonical address 0x431116262faede32: 0000 [#1] PREEMPT SMP NOPTI
[2025-06-11 14:03:57] [105009.499282] Hardware name: Xen HVM domU, BIOS 4.17.5 12/11/2024
[2025-06-11 14:03:57] [105009.499300] RIP: 0010:stub_rx_pdu+0x22c/0x270 [usbip_host]
[2025-06-11 14:03:57] [105009.499320] Code: c3 01 48 8b 3c e8 e8 a3 d3 b1 d8 41 89 c1 83 f8 8d 74 2b 49 8b 44 24 20 41 8b 4c 24 30 89 da 48 c7 c6 20 e7 bf c0 4d 8b 04 24 <48> 8b 04 e8 48 8b 78 40 48 81 c7 a8 00 00 00 e8 c0 4e a2 d8 41 39
[2025-06-11 14:03:57] [105009.499365] RSP: 0018:ffffb60642807e80 EFLAGS: 00010216
[2025-06-11 14:03:57] [105009.499381] RAX: 431116262faede32 RBX: 0000000000000001 RCX: 0000000000000000
[2025-06-11 14:03:57] [105009.499402] RDX: 0000000000000001 RSI: ffffffffc0bfe720 RDI: ffffffff9ba3bb10
[2025-06-11 14:03:57] [105009.499423] RBP: 0000000000000000 R08: 0000000000000000 R09: 00000000ffffffd5
[2025-06-11 14:03:57] [105009.499444] R10: 0000000000000000 R11: ffff910d8f137a00 R12: ffff910d876c0980
[2025-06-11 14:03:57] [105009.499467] R13: ffff910d84026c00 R14: ffff910d84026c94 R15: ffffffffc0bf5890
[2025-06-11 14:03:57] [105009.499491] FS:  0000000000000000(0000) GS:ffff910d90c00000(0000) knlGS:0000000000000000
[2025-06-11 14:03:57] [105009.499513] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[2025-06-11 14:03:57] [105009.499531] CR2: 000075f50faafe38 CR3: 0000000007390000 CR4: 0000000000350ef0
[2025-06-11 14:03:57] [105009.499555] Call Trace:
[2025-06-11 14:03:57] [105009.499566]  <TASK>
[2025-06-11 14:03:57] [105009.499577]  stub_rx_loop+0x23/0x40 [usbip_host]
[2025-06-11 14:03:57] [105009.499595]  kthread+0xd2/0x100
[2025-06-11 14:03:57] [105009.499614]  ? __pfx_kthread+0x10/0x10
[2025-06-11 14:03:57] [105009.499629]  ret_from_fork+0x34/0x50
[2025-06-11 14:03:57] [105009.499643]  ? __pfx_kthread+0x10/0x10
[2025-06-11 14:03:57] [105009.499656]  ret_from_fork_asm+0x1a/0x30
[2025-06-11 14:03:57] [105009.499672]  </TASK>
[2025-06-11 14:03:57] [105009.499681] Modules linked in: usbip_host usbip_core snd_usb_audio snd_usbmidi_lib snd_ump snd_hwdep snd_rawmidi snd_ctl_led snd_seq_dummy snd_hrtimer snd_seq snd_seq_device rfcomm ftdi_sio uvcvideo uvc videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_common videodev mc nft_nat xen_netback bnep nft_reject_ipv6 nf_reject_ipv6 btusb btrtl btintel btbcm nft_reject_ipv4 btmtk nf_reject_ipv4 nft_reject bluetooth nft_ct rfkill nft_masq joydev nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xenfs nf_tables nfnetlink binfmt_misc intel_rapl_msr intel_rapl_common crct10dif_pclmul crc32_pclmul crc32c_intel snd_pcm polyval_clmulni polyval_generic snd_timer ghash_clmulni_intel xhci_pci sha512_ssse3 snd sha256_ssse3 drm_vram_helper sha1_ssse3 soundcore ehci_pci drm_ttm_helper xhci_hcd pcspkr ttm ehci_hcd ata_generic floppy pata_acpi i2c_piix4 i2c_smbus serio_raw xen_privcmd xen_gntdev xen_gntalloc xen_blkback xen_evtchn parport_pc ppdev lp parport fuse loop ip_tables overlay xen_blkfront
[2025-06-11 14:03:57] [105009.499962] ---[ end trace 0000000000000000 ]---
[2025-06-11 14:03:57] [105009.499978] RIP: 0010:stub_rx_pdu+0x22c/0x270 [usbip_host]
[2025-06-11 14:03:57] [105009.500002] Code: c3 01 48 8b 3c e8 e8 a3 d3 b1 d8 41 89 c1 83 f8 8d 74 2b 49 8b 44 24 20 41 8b 4c 24 30 89 da 48 c7 c6 20 e7 bf c0 4d 8b 04 24 <48> 8b 04 e8 48 8b 78 40 48 81 c7 a8 00 00 00 e8 c0 4e a2 d8 41 39
[2025-06-11 14:03:57] [105009.500054] RSP: 0018:ffffb60642807e80 EFLAGS: 00010216
[2025-06-11 14:03:57] [105009.500077] RAX: 431116262faede32 RBX: 0000000000000001 RCX: 0000000000000000
[2025-06-11 14:03:57] [105009.500112] RDX: 0000000000000001 RSI: ffffffffc0bfe720 RDI: ffffffff9ba3bb10
[2025-06-11 14:03:57] [105009.500138] RBP: 0000000000000000 R08: 0000000000000000 R09: 00000000ffffffd5
[2025-06-11 14:03:57] [105009.500164] R10: 0000000000000000 R11: ffff910d8f137a00 R12: ffff910d876c0980
[2025-06-11 14:03:57] [105009.500185] R13: ffff910d84026c00 R14: ffff910d84026c94 R15: ffffffffc0bf5890
[2025-06-11 14:03:57] [105009.500213] FS:  0000000000000000(0000) GS:ffff910d90c00000(0000) knlGS:0000000000000000
[2025-06-11 14:03:57] [105009.500241] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[2025-06-11 14:03:57] [105009.500264] CR2: 000075f50faafe38 CR3: 0000000007390000 CR4: 0000000000350ef0
[2025-06-11 14:03:57] [105009.500286] Kernel panic - not syncing: Fatal exception
[2025-06-11 14:03:57] [105009.500535] Kernel Offset: 0x18800000 from 0xffffffff80200000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)

On sys-usb-dm-side:

[2025-06-11 14:03:57] {"timestamp": {"seconds": 1749643434, "microseconds": 302222}, "event": "RESET", "data": {"guest": true, "reason": "guest-reset"}}
[2025-06-11 14:03:57] pcifront pci-0: Rescanning PCI Frontend Bus 0000:00
[2025-06-11 14:03:57] pcifront pci-0: Rescanning PCI Frontend Bus 0000:00
[2025-06-11 14:03:57] pci_bus 0000:00: busn_res: [bus 00-ff] is released
[2025-06-11 14:03:57] ------------[ cut here ]------------
[2025-06-11 14:03:57] sysfs group 'power' not found for kobject '0000:00'
[2025-06-11 14:03:57] WARNING: CPU: 0 PID: 10 at 0xffffffff810e970b
[2025-06-11 14:03:57] CPU: 0 PID: 10 Comm: xenwatch Tainted: G                T 5.10.200-xen-stubdom #1
[2025-06-11 14:03:57] RIP: e030:0xffffffff810e970b
[2025-06-11 14:03:57] Code: f6 74 2e 49 89 fc 48 89 df e8 ab fc ff ff 48 89 c3 48 85 c0 75 23 49 8b 14

If after that I restart sys-usb it does start, but does properly start up, no USB device show up. Apparently the PCI devices are seen but… is the emulated HCD just refusing the registration of the passed-through devices?

[2025-06-11 14:22:14] [    2.769335] xhci_hcd 0000:00:07.0: new USB bus registered, assigned bus number 5
[2025-06-11 14:22:14] [    2.769372] xhci_hcd 0000:00:07.0: Host supports USB 3.1 Enhanced SuperSpeed
[2025-06-11 14:22:14] [    2.770233] usb usb4: New USB device found, idVendor=1d6b, idProduct=0002, bcdDevice= 6.12
[2025-06-11 14:22:14] [    2.770272] usb usb4: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[2025-06-11 14:22:14] [    2.770308] usb usb4: Product: xHCI Host Controller
[2025-06-11 14:22:14] [    2.770324] usb usb4: Manufacturer: Linux 6.12.25-1.qubes.fc37.x86_64 xhci-hcd
[2025-06-11 14:22:14] [    2.770358] usb usb4: SerialNumber: 0000:00:07.0
[2025-06-11 14:22:14] [    2.776534] hub 4-0:1.0: USB hub found
[2025-06-11 14:22:14] [    2.776569] hub 4-0:1.0: 4 ports detected
[2025-06-11 14:22:14] [    2.776696] usb usb5: We don't know the algorithms for LPM for this host, disabling LPM.
[2025-06-11 14:22:14] [    2.779356] usb usb5: New USB device found, idVendor=1d6b, idProduct=0003, bcdDevice= 6.12
[2025-06-11 14:22:14] [    2.779386] usb usb5: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[2025-06-11 14:22:14] [    2.779408] usb usb5: Product: xHCI Host Controller
[2025-06-11 14:22:14] [    2.779425] usb usb5: Manufacturer: Linux 6.12.25-1.qubes.fc37.x86_64 xhci-hcd
[2025-06-11 14:22:14] [    2.779447] usb usb5: SerialNumber: 0000:00:07.0
[2025-06-11 14:22:14] [    2.779560] hub 5-0:1.0: USB hub found
[2025-06-11 14:22:14] [    2.779586] hub 5-0:1.0: 2 ports detected
[2025-06-11 14:22:14] [    2.807668] Error: Driver 'pcspkr' is already registered, aborting...
[2025-06-11 14:22:14] [    2.854255] usb 1-1: new high-speed USB device number 2 using ehci-pci
...
[2025-06-11 14:22:14] [    3.005741] usb 1-1: New USB device found, idVendor=0627, idProduct=0001, bcdDevice= 0.00
[2025-06-11 14:22:14] [    3.005774] usb 1-1: New USB device strings: Mfr=1, Product=3, SerialNumber=10
[2025-06-11 14:22:14] [    3.005796] usb 1-1: Product: QEMU USB Tablet
[2025-06-11 14:22:14] [    3.005812] usb 1-1: Manufacturer: QEMU
[2025-06-11 14:22:14] [    3.005826] usb 1-1: SerialNumber: 42
[2025-06-11 14:22:14] [.[0;32m  OK  .[0m] Finished .[0;1;39mqubes-network-upl…e.[0m - Qubes network uplink wait.
[2025-06-11 14:22:14] [    3.025918] input: QEMU QEMU USB Tablet as /devices/pci0000:00/0000:00:05.0/usb1/1-1/1-1:1.0/0003:0627:0001.0001/input/input6
[2025-06-11 14:22:14] [    3.026154] hid-generic 0003:0627:0001.0001: input,hidraw0: USB HID v0.01 Mouse [QEMU QEMU USB Tablet] on usb-0000:00:05.0-1/input0
...
[2025-06-11 14:22:16] sys-usb login: [    9.957256] xhci_hcd 0000:00:07.0: Error while assigning device slot ID: Command Aborted
[2025-06-11 14:22:21] [    9.957260] xhci_hcd 0000:00:06.0: Error while assigning device slot ID: Command Aborted
[2025-06-11 14:22:21] [    9.957286] xhci_hcd 0000:00:07.0: Max number of devices this xHCI host supports is 64.
[2025-06-11 14:22:21] [    9.957307] xhci_hcd 0000:00:06.0: Max number of devices this xHCI host supports is 64.
[2025-06-11 14:22:21] [    9.957327] usb usb4-port2: couldn't allocate usb_device
[2025-06-11 14:22:21] [    9.957348] usb usb2-port1: couldn't allocate usb_device
[2025-06-11 14:22:33] [   22.245254] xhci_hcd 0000:00:06.0: Error while assigning device slot ID: Command Aborted
[2025-06-11 14:22:33] [   22.245285] xhci_hcd 0000:00:06.0: Max number of devices this xHCI host supports is 64.
[2025-06-11 14:22:33] [   22.245309] usb usb2-port3: couldn't allocate usb_device
[2025-06-11 14:22:33] [   22.245324] xhci_hcd 0000:00:07.0: Error while assigning device slot ID: Command Aborted
[2025-06-11 14:22:33] [   22.245396] xhci_hcd 0000:00:07.0: Max number of devices this xHCI host supports is 64.
[2025-06-11 14:22:33] [   22.245445] usb usb4-port3: couldn't allocate usb_device
[2025-06-11 14:22:45] [   34.533273] xhci_hcd 0000:00:06.0: Error while assigning device slot ID: Command Aborted
[2025-06-11 14:22:45] [   34.533304] xhci_hcd 0000:00:06.0: Max number of devices this xHCI host supports is 64.
[2025-06-11 14:22:45] [   34.533328] usb usb3-port1: couldn't allocate usb_device
[2025-06-11 14:22:58] [   46.821240] xhci_hcd 0000:00:06.0: Error while assigning device slot ID: Command Aborted
[2025-06-11 14:22:58] [   46.821269] xhci_hcd 0000:00:06.0: Max number of devices this xHCI host supports is 64.
[2025-06-11 14:22:58] [   46.821293] usb usb2-port4: couldn't allocate usb_device

Hi there.
I noticed a lot of “Maybe the USB cable is bad?” messages.
Could it be… a hardware problem?

1 Like

I thought about this, and indeed the way the cable pulls sideways on the plug has already caused device disconnections so i’m quite careful with it.
But even then, if a bad cable/whatever causes a GP fault that would rather hint that a bit more of fuzzing is going to open the gates here - luckily this is all in sys-usb precisely to mitigate that kind of issues

Oops, I missed that it was a real GPF and kernel panic… Even the worst cable should not do that.

Not wanting to encourage any complacency, but…

On the positive, it looks (to my very inexperienced eyes) like this bit of kernel code is on the host side, only dealing with the USB device, and not with packets from the client VM. This should probably not be very reassuring, but it suggests to me that Qubes kept the damage localised. Not helpful if USB mouse and keyboard have disappeared, though.

For the restart of sys-usb: does the usb controller have 'no strict reset" enabled? If it supports device reset, I would expect it to be “as good as new” for a restart.

For the memory suspicions and swiotlb… I realise I have no real idea how DMA is handled in xen/qubes.

Can the DMA setup fail because the stubdom has insufficient total memory allocated? Or must it be the problem of insufficient contiguous memory accessible to xen?

(My first response to any unexpected VM crash is “more memory”, on the basis of 2 or 3 cases where it helped. I seem to remember there is a default memory setting which is allocated to the stubs. Sys-usb has its own setting, of course. I guess it’s more complicated here)

One parameter which I cannot see in your messages is the kernel version.

I saw discussion of DMA/audio problems in some 6.4 kernels on the ArchLinux forums, here.

It may be useful to try kernel-latest, if you did not already, although I do not immediately see how that problem would prevent the VM from starting. I would probably also consider increasing memory given to the sys-usb, and maybe to the stubdom…

It first occurred with dom0 running kernel-latest-1000:6.12.11-1.qubes.fc37 (at that time kernel-latest-1000:6.14.4-1.qubes.fc37 had already been installed and has been used ever since, thus featuring the second occurrence),

What makes you think of a link with that discussion?

Well, I’d rather find out what happens, just increasing memory here looks like wishful thinking :wink:

I’m sorry, I think that is not the link I meant to post. Somewhere, I’m fairly sure I came across a report with multiple usb connections seeming to lead to fragmentation and allocation problems. I will try to hunt for it.

Yes, there is some of that, but I base it on my model for the memory allocation - although that is probably incorrect. I believe that when the kernel needs a region of contiguous memory for DMA, the zone it takes it from is not reserved only for DMA, and can be used for general allocation if there is memory pressure. For relocatable pages to be moved away to make space for DMA, there must be somewhere for them to go.If the fragmentation is caused by non-relocatable pages, then there’s nothing to be done.

Uninformed experiments and speculation...

I tried watching /proc/buddyinfo in dom0 while plugging and unplugging a usb headset, and it shows creation over 1000 of the smallest fragments (4kb, I think) in the DMA32 zone. Disks and other peripherals seemed to give fewer.
Normally these pages could /should be recombined I think, unless they get interleaved with other small allocations, such as what i see -for example- created just by launching and closing a fedora disposable.

I was quite surprised to see so much activity in DMA32 for fully virtualised VMs. I thought the magic of iommu would avoid that, and allow vms to stay out of the way, above the 4Gb limit, except if their device drivers are not so smart.
It’s making me think that dom0 might be where extra memory could be useful, unless it is possible to set aside more for contiguous allocation (I have read about CMA for this - not used in Qubes- and there seem to be some iommu-related kernel parameters which might contribute.)
OTOH it’s possible that my rather old AMD platform might be influencing the behaviour…

In fact those are caused by a USB hub *even with all ports disabled), but the headset is not plugged on that hub. It’s not impossible it would add some stress onto the USB stack, but I assume that should be directly liked to the main issue. Could not reproduce any error by playing with the USB headset plug, I guess that part can be ignored here.

Now I just got a new occurrence, right after cold-restarting qubes following strange events (after updating templates and restarting sys-net, sys-usb and friends, my container inside a given VM lost access to their DNS (not willing to dig into that at this time, possible interaction between sys-net/firewall restarting and the wireguard configuration, it was time to reboot anyway); then the first boot attempt met the transient display of a “kernel panic” screen with no details (and nothing left in the logs); then the next boot attempt is that, sys-usb refusing to start.

Starting to wonder whether restarting the sys-* qubes might have a bigger impact than they should.

As for understanding the situation:

  • logs are pretty much identical, sys-usb-dm fails to setup pcifront because swiotlb fails to setup, because “You either: don’t have the permissions, do not have enough free memory under 4GB, or the hypervisor memory is too fragmented”
  • not sure how the “don’t have the permissions” could happen?
  • as for available DMA32 memory… /proc/zoneinfo may hint that there is low available memory in there, actually:
Node 0, zone    DMA32
  pages free     191059
        boost    0
        min      1305
        low      1950
        high     2595
        promo    3240
        spanned  1044480
        present  693406
        managed  506608
        cma      0
        protection: (0, 0, 1268, 1268, 1268)
      nr_free_pages 191059
      nr_zone_inactive_anon 0
      nr_zone_active_anon 108932
      nr_zone_inactive_file 9082
      nr_zone_active_file 134686
      nr_zone_unevictable 5526
      nr_zone_write_pending 1
Node 0, zone      DMA      0      0      0      0      1      1      1      1      0      0      0 
Node 0, zone    DMA32    108     66    228     94    129     91     91     26     11      7    152 
Node 0, zone   Normal    406    373    308     54     18      3      0      0      0      0      0 

Now I’m not even sure those numbers are pertinent, as I took time before collecting them, and then as happened previously a new attempt at starting sys-usb did succeed. No success yet getting sys-usb to panic by forwarding my headset to a few qubes…
Damned that’s so frustrating.

Ah it took more time but the sys-usb GPF is there too, slightly different. This time occurred under different conditions: I was simply using the devices (with audio in the headset, though something (else?) was interfering with the audio on the system (also impacting the qubes-audio path for all VMs, not only the usb headset forwarded to one VM specifically), and suddenly sys-usb popped out.

Except this time I am able to restart sys-usb and use the devices again.

[2025-07-02 16:38:43] [106093.440908] usbip-host 4-2: unlinked by a call to usb_unlink_urb()
[2025-07-02 16:38:43] [106093.440948] usbip-host 4-2: unlinked by a call to usb_unlink_urb()
[2025-07-02 16:38:43] [106093.497227] xhci_hcd 0000:00:07.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 1
[2025-07-02 16:38:43] [106093.497266] xhci_hcd 0000:00:07.0: Looking for event-dma 000000000d596910 trb-start 000000000d596920 trb-end 000000000d596920 seg-start 000000000d596000 seg-end 000000000d596ff0
[2025-07-02 16:38:43] [106093.606667] Oops: general protection fault, probably for non-canonical address 0xa8cdfe6aeb0b71c5: 0000 [#1] PREEMPT SMP NOPTI
[2025-07-02 16:38:43] [106093.606708] CPU: 0 UID: 0 PID: 4292 Comm: stub_rx Not tainted 6.12.25-1.qubes.fc37.x86_64 #1
[2025-07-02 16:38:43] [106093.606735] Hardware name: Xen HVM domU, BIOS 4.17.5 12/11/2024
[2025-07-02 16:38:43] [106093.606755] RIP: 0010:__kmalloc_noprof+0x140/0x410
[2025-07-02 16:38:43] [106093.606775] Code: 83 78 10 00 48 8b 38 0f 84 fd 01 00 00 48 85 ff 0f 84 f4 01 00 00 41 8b 44 24 28 49 8b 9c 24 b8 00 00 00 49 8b 34 24 48 01 f8 <48> 33 18 48 89 c1 48 89 f8 48 0f c9 48 31 cb 48 8d 8a 00 20 00 00
[2025-07-02 16:38:43] [106093.606824] RSP: 0018:ffffae2ac4ad7da0 EFLAGS: 00010286
[2025-07-02 16:38:43] [106093.606844] RAX: a8cdfe6aeb0b71c5 RBX: 47f70461826f71e5 RCX: 0000000000000000
[2025-07-02 16:38:43] [106093.606867] RDX: 0000000180d50000 RSI: 000000000003c860 RDI: a8cdfe6aeb0b7195
[2025-07-02 16:38:43] [106093.606890] RBP: ffffae2ac4ad7df0 R08: ffff9a96c34aa098 R09: 0000000000000000
[2025-07-02 16:38:43] [106093.606913] R10: ffffae2ac4ad7e08 R11: 0000000000000000 R12: ffff9a96c1046300
[2025-07-02 16:38:43] [106093.606936] R13: 0000000000000060 R14: 0000000000000dc0 R15: ffffffffc0e3340e
[2025-07-02 16:38:43] [106093.606963] FS:  0000000000000000(0000) GS:ffff9a96d0c00000(0000) knlGS:0000000000000000
[2025-07-02 16:38:43] [106093.606986] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[2025-07-02 16:38:43] [106093.607007] CR2: 000077eee98ff4e0 CR3: 00000000034b6000 CR4: 0000000000350ef0
[2025-07-02 16:38:43] [106093.607032] Call Trace:
[2025-07-02 16:38:43] [106093.607043]  <TASK>
[2025-07-02 16:38:43] [106093.607056]  ? stub_recv_cmd_submit+0x4ee/0x6e0 [usbip_host]
[2025-07-02 16:38:43] [106093.607079]  stub_recv_cmd_submit+0x4ee/0x6e0 [usbip_host]
[2025-07-02 16:38:43] [106093.607099]  ? __pfx_stub_rx_loop+0x10/0x10 [usbip_host]
[2025-07-02 16:38:43] [106093.607118]  stub_rx_pdu+0x15f/0x270 [usbip_host]
[2025-07-02 16:38:43] [106093.607138]  stub_rx_loop+0x23/0x40 [usbip_host]
[2025-07-02 16:38:43] [106093.607158]  kthread+0xd2/0x100
[2025-07-02 16:38:43] [106093.607174]  ? __pfx_kthread+0x10/0x10
[2025-07-02 16:38:43] [106093.607189]  ret_from_fork+0x34/0x50
[2025-07-02 16:38:43] [106093.607205]  ? __pfx_kthread+0x10/0x10
[2025-07-02 16:38:43] [106093.607219]  ret_from_fork_asm+0x1a/0x30
[2025-07-02 16:38:43] [106093.607237]  </TASK>
[2025-07-02 16:38:43] [106093.607247] Modules linked in: usbip_host usbip_core snd_usb_audio snd_usbmidi_lib snd_ump snd_hwdep snd_rawmidi snd_ctl_led snd_seq_dummy snd_hrtimer snd_seq snd_seq_device rfcomm uvcvideo uvc videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_common videodev mc bnep btusb btrtl btintel nft_nat btbcm btmtk bluetooth rfkill xen_netback nft_reject_ipv6 nf_reject_ipv6 joydev nft_reject_ipv4 nf_reject_ipv4 nft_reject intel_rapl_msr intel_rapl_common nft_ct crct10dif_pclmul crc32_pclmul snd_pcm nft_masq crc32c_intel polyval_clmulni polyval_generic snd_timer ghash_clmulni_intel nft_chain_nat nf_nat xenfs nf_conntrack sha512_ssse3 nf_defrag_ipv6 nf_defrag_ipv4 sha256_ssse3 snd soundcore sha1_ssse3 nf_tables pcspkr nfnetlink xhci_pci floppy binfmt_misc ehci_pci xhci_hcd drm_vram_helper ehci_hcd drm_ttm_helper i2c_piix4 ata_generic ttm i2c_smbus pata_acpi serio_raw xen_privcmd xen_gntdev xen_gntalloc xen_blkback xen_evtchn parport_pc ppdev lp parport loop fuse ip_tables overlay xen_blkfront
[2025-07-02 16:38:43] [106093.607798] ---[ end trace 0000000000000000 ]---
[2025-07-02 16:38:43] [106093.607815] RIP: 0010:__kmalloc_noprof+0x140/0x410
[2025-07-02 16:38:43] [106093.607833] Code: 83 78 10 00 48 8b 38 0f 84 fd 01 00 00 48 85 ff 0f 84 f4 01 00 00 41 8b 44 24 28 49 8b 9c 24 b8 00 00 00 49 8b 34 24 48 01 f8 <48> 33 18 48 89 c1 48 89 f8 48 0f c9 48 31 cb 48 8d 8a 00 20 00 00
[2025-07-02 16:38:43] [106093.607880] RSP: 0018:ffffae2ac4ad7da0 EFLAGS: 00010286
[2025-07-02 16:38:43] [106093.641757] RAX: a8cdfe6aeb0b71c5 RBX: 47f70461826f71e5 RCX: 0000000000000000
[2025-07-02 16:38:43] [106093.641757] RDX: 0000000180d50000 RSI: 000000000003c860 RDI: a8cdfe6aeb0b7195
[2025-07-02 16:38:43] [106093.641757] RBP: ffffae2ac4ad7df0 R08: ffff9a96c34aa098 R09: 0000000000000000
[2025-07-02 16:38:43] [106093.641757] R10: ffffae2ac4ad7e08 R11: 0000000000000000 R12: ffff9a96c1046300
[2025-07-02 16:38:43] [106093.641757] R13: 0000000000000060 R14: 0000000000000dc0 R15: ffffffffc0e3340e
[2025-07-02 16:38:43] [106093.641757] FS:  0000000000000000(0000) GS:ffff9a96d0c00000(0000) knlGS:0000000000000000
[2025-07-02 16:38:43] [106093.641757] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[2025-07-02 16:38:43] [106093.641757] CR2: 000077eee98ff4e0 CR3: 00000000034b6000 CR4: 0000000000350ef0
[2025-07-02 16:38:43] [106093.641757] Kernel panic - not syncing: Fatal exception
[2025-07-02 16:38:43] [106093.641788] Kernel Offset: 0x30e00000 from 0xffffffff80200000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)

Not sure it will help you but i had similar issues after i plugged a new PCI NIC in the systemBoard. Only by taking off i managed to start the OS ! When i have time,i will try switching the PCI slots to see if is the card,slot or software.