Preventing implicit FLR when using sys-usb + USB keyboard

Churros · August 31, 2024, 1:49pm

My question: what is the recommended way to prevent FLRs from being sent to USB controllers during early boot? (The emphasis on early boot is because USB controllers are treated differently at boot when using sys-usb; details below.)

I should add: this is Qubes 4.2, with a functional USB keyboard and USB boot on a different controller, on a different bus and IOMMU group. They’re not the cause, but their presence complicates some solutions. 3 of 5 USB controllers work fine in sys-usb. And spoiler: I have a workaround, inspired by this proxmox post. Also, as far as I could tell, the issue with the controllers is not caused by IOMMU groups; I checked into that already

The Problem

I have two misbehaving onboard PCI USB controllers that are not amenable to FLR, despite advertising the capability
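
For anyone who wants to check what their own controller advertises, here is a minimal sketch. It assumes a kernel recent enough to expose the per-device reset_method attribute in sysfs; the function name and the SYSFS_PCI override are made up for illustration, and 00:14.0 is a placeholder BDF.

```shell
# Print the reset methods the kernel would try for a PCI function, in order.
# SYSFS_PCI is a hypothetical override so the function can be pointed at a
# test directory; by default it reads the real sysfs tree.
reset_methods() {
    bdf="0000:$1"
    f="${SYSFS_PCI:-/sys/bus/pci/devices}/$bdf/reset_method"
    if [ -r "$f" ]; then
        methods=$(cat "$f")
        echo "$bdf: ${methods:-none}"
    else
        echo "$bdf: no reset_method attribute" >&2
        return 1
    fi
}

# Example (placeholder BDF):
#   reset_methods 00:14.0
# The capability bit itself shows up in the DevCap line of lspci (as root):
#   lspci -vv -s 00:14.0 | grep -i flreset
```

A function that lists flr here is only claiming support; as this thread shows, that does not guarantee it survives the reset.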

Once FLR is received, the controller goes into a bad state that can’t be recovered from without a host reboot. Unbind, reset, assigning to pci-stub, pciback, etc. didn’t seem to help get it back into a working state

Maybe a bus-level reset would work, but I’m not sure of that, and would like to avoid it. There’s no need for a reset in my case, I’m not worried about the window of time between boot and assignment to pciback/xhci_pci and startup of sys-usb

Ditto with reset via the bridge using setpci, as mentioned here

Sending FLR to try to restore functionality obviously doesn’t help, and though I can bind it to the pciback driver when in the bad state, lspci shows output suggestive of a corrupt state. More importantly, starting a VM with it attached causes libvirt to choke when it sees the data that I see plainly in lspci

I can add the exact error later from libvirt and the output from lspci, though I don’t know that I want to go back down the rabbit-hole of finding and solving the low-level quirks at the root of the issue

Edit: The error is the same as mentioned here, invalid PCI header ‘127’
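
A quick way to tell whether a function is in this state, as a sketch. My understanding (an assumption, not something confirmed in this thread) is that 127 is what is left of a header-type byte of 0xff once the multi-function bit is masked off, meaning config space reads are returning all ones. The function name and SYSFS_PCI override are hypothetical.

```shell
# Return 0 if the first two config-space bytes (the vendor ID) read as ff ff,
# which usually means the function no longer answers config reads.
# Returns 2 if the config file cannot be read at all.
# SYSFS_PCI is a hypothetical override for testing.
config_space_dead() {
    cfg="${SYSFS_PCI:-/sys/bus/pci/devices}/0000:$1/config"
    [ -r "$cfg" ] || return 2
    vendor=$(od -An -tx1 -N2 "$cfg" | tr -d ' \n')
    [ "$vendor" = "ffff" ]
}

# Example (placeholder BDF):
#   config_space_dead 00:14.0 && echo "00:14.0 looks dead, reboot needed"
```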

I spent many, many hours reading Qubes and Xen documentation the past 2 weeks and can’t do it anymore

Not The Solution

The solution is not no-strict-reset, because the problem manifests in the initramfs stage, during the time that Qubes is assigning USB controllers to the pciback driver. As far as I can tell, the sysfs unbind operation (or maybe the operation to bind it to a different driver) implicitly causes the FLRs to be sent. And that is, as Joanna would say, “game over” for that controller in this case
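
For reference, this is roughly what the ruled-out option looks like. A minimal sketch, assuming the Qubes 4.2 naming in which dom0 PCI devices are addressed as dom0:BB_DD.F; the helper name is made up and 00:14.0 is a placeholder.

```shell
# Convert a short BDF such as 00:14.0 into the identifier qvm-pci expects
# (dom0:00_14.0). Helper name is hypothetical; pass the BDF without the
# leading 0000: domain.
bdf_to_qvm_id() {
    echo "dom0:$(echo "$1" | tr ':' '_')"
}

# Attach with strict reset disabled. This only matters once the system has
# booted; it does not change what happens in the initramfs:
#   qvm-pci attach --persistent --option no-strict-reset=True \
#       sys-usb "$(bdf_to_qvm_id 00:14.0)"
```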

The Solution / Workaround

If there’s one good thing to come of this frustrating experience, it’s that I learned exactly how the Qubes PCI hiding works

It boils down to a simple shell script interacting with sysfs (in initramfs, as implied by the “rd” in “rd.qubes.hide”):

QubesOS/qubes-core-admin-linux/blob/3f0afb7030276f7afc0212bbaec187abdab72860/dracut/modules.d/90qubes-pciback/qubes-pciback.sh

#!/bin/bash --

type getarg >/dev/null 2>&1 || . /lib/dracut-lib.sh
unset re HIDE_PCI usb_in_dom0 dev skip exposed

usb_in_dom0=false

if getargbool 0 rd.qubes.hide_all_usb; then
    # Select all networking and USB devices
    re='0(2|c03)'
elif ! getargbool 1 usbcore.authorized_default; then
    # Select only networking devices, but enable USBguard
    re='02'
    usb_in_dom0=true
else
    re='02'
    warn 'USB in dom0 is not restricted. Consider rd.qubes.hide_all_usb or usbcore.authorized_default=0.'
fi

HIDE_PCI=$(set -o pipefail; { lspci -mm -n | awk "/^[^ ]* \"$re/ {print \$1}";}) ||

(This file has been truncated.)
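
To see which devices the class regex in that script would pick up, the selection step can be tried on its own. This is a sketch: the function name is made up, and it takes `lspci -mm -n` output on stdin so it can be run against saved output as well as a live system.

```shell
# Read `lspci -mm -n` output on stdin and print the BDFs whose class code
# starts with the given regex fragment, in the same spirit as the HIDE_PCI
# line in the quoted script. '0(2|c03)' selects network controllers (02xx)
# and USB controllers (0c03); '02' selects network controllers only.
select_by_class() {
    awk -v re="$1" '$0 ~ ("^[^ ]* \"" re) {print $1}'
}

# Example:
#   lspci -mm -n | select_by_class '0(2|c03)'
```

Every BDF printed for '0(2|c03)' is one the script will unbind and hand to pciback when rd.qubes.hide_all_usb is set.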

With the following modification to that script, the kernel is led to believe that the function doesn’t have FLR (or any) reset mechanisms

# Fool the kernel into thinking there are
# no reset mechanisms available for
# the specified BDFs, to prevent implicit
# FLR requests from being sent during
# bind/unbind to/from a driver via sysfs
echo "" > /sys/bus/pci/devices/0000:$BDF1/reset_method
echo "" > /sys/bus/pci/devices/0000:$BDF2/reset_method

With those two lines placed before any sysfs operations occur, the problematic devices are successfully given to the pciback driver via the sysfs operations on each BDF without causing an FLR. The boot completes as normal, and the devices can be handed over to sys-usb and the peasants rejoice

Edit: you also need a dracut -f after the changes, to rebuild the initramfs with the modifications made
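
A sketch of the rebuild and a post-reboot check, for anyone repeating this. The helper name and the SYSFS_PCI override are made up, and 00:14.0 stands in for the affected BDF.

```shell
# Print the name of the driver currently bound to a PCI function, or "none".
# SYSFS_PCI is a hypothetical override for testing.
bound_driver() {
    link="${SYSFS_PCI:-/sys/bus/pci/devices}/0000:$1/driver"
    if [ -L "$link" ]; then
        basename "$(readlink "$link")"
    else
        echo "none"
    fi
}

# After editing the script (as root in dom0), rebuild the initramfs:
#   dracut -f
# After rebooting, for each affected BDF:
#   bound_driver 00:14.0                                # should be pciback
#   cat /sys/bus/pci/devices/0000:00:14.0/reset_method  # should be empty
```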

It would be nice if there was a rd.qubes.no_flr=bdf1[,bdf2]… option that made this cleaner out of the box, and by its existence documented this problem as “a thing”. I’m not going to send a PR with that until I’m sure there’s not some simple solution that I simply failed to find or use correctly

As I mentioned, I spent a significant amount of time trying to find “proper” solutions, that didn’t involve modifying the pciback script - mostly in the form of Xen or kernel command line options. I didn’t have any success with any of them

I have a few other general ideas about how to solve this, but I suspect someone here can immediately give me the best way to do so without much thought to it

Forcing these controllers to pciback or pci-stub by BDF, before the referenced pciback script runs, may be what I want?

I can’t simply blacklist the driver that claims these (xhci_pci) because one of my USB controllers needs to be claimed by xhci_pci to operate properly. Normally I use udev for things related to driver timing and conflicts, but udev is too late for this case

I considered adding the problematic controllers to rd.qubes.dom0_usb, but I’m not sure that will actually help. I am burned out and need to read the script again

tl;dr: as initially stated, what’s the best way to “protect” buggy USB controllers from FLRs caused by the Qubes pciback initramfs script? There should be a clean solution offered by Qubes, in my opinion. The workaround is good for now; otherwise, I give up

EDIT: For those curious as to what controllers these may be, to work towards the true root cause (the hardware issue itself): they’re AMD controllers on the WRX90 chipset. I suspect the issue has something to do with the onboard IPMI/BMC. I’ve tried hardware toggling and software toggling (via UEFI) both the BMC functionality in its entirety and the onboard VGA device associated with it, but it hasn’t helped the controllers survive FLRs. I’m happy to try specific things suggested by users, but I don’t have time to research further, especially as reboots are expensive time-wise, and toggling via hardware or double-checking IOMMU groups is also expensive

Churros · August 31, 2024, 3:43pm

To elaborate on why I believe it might be reasonable for Qubes to offer something to accommodate this…

I understand that FLR should not break a controller, especially if it advertises FLR (these controllers do)

However, we have no-strict-reset for qvm-pci which, while technically mapping to existing Xen features, is deliberately exposed via qvm-pci and documented by Qubes. I consider it a Qubes feature offered to users to work around situations similar to this one

It seems that there should be an additional command-line parameter, rd.qubes.pci_no_flr=bdf1[,bdf2]…, that could be handled either in the same pciback script I modified or in a separate script invoked prior to that script

I don’t have a GitHub account so I won’t be creating an issue. Regardless, I would like to wait and see what other fixes may be available as an alternative to changes to Qubes. If there’s nothing better than the “solution” I used, maybe some kind soul could create an issue and a PR

apparatus · September 1, 2024, 8:16am

Maybe you can use softdep like this:

Blacklist xhci_hcd with modprobe.blacklist=xhci_hcd and add /etc/modprobe.d/01-pciback.conf:

softdep xhci_hcd pre: pciback
options pciback ids=VID:PID

Where VID:PID is a VID:PID of your USB controller that you want to hide.

But I’m not sure if it’ll work for pciback or it’s specific to vfio-pci.
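
If you try this, the VID:PID of each USB controller can be pulled out of lspci's machine-readable output. A small sketch; the function name is made up.

```shell
# Read `lspci -mm -n` output on stdin and print "BDF VID:PID" for every
# USB controller (class 0c03).
usb_controller_ids() {
    awk '$2 == "\"0c03\"" {
        gsub(/"/, "", $3); gsub(/"/, "", $4)
        print $1, $3 ":" $4
    }'
}

# Example:
#   lspci -mm -n | usb_controller_ids
```

Keep in mind that an ID match applies to every controller with that VID:PID, so it cannot single out one of several identical controllers the way a BDF can.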

Churros · September 1, 2024, 2:49pm

I will give it a shot, thank you for reading my lengthy post!

I knew there were a lot more options/directives/parameters supported in the modprobe configuration, I ought to grok through the docs (or source) at some point, it seems

EDIT: I’m not sure this will actually prevent Qubes pciback script from resetting the device (because it uses lspci) but it’s something I was interested in figuring out how to do with modprobe configs, so it’s a win either way

Churros · September 1, 2024, 3:22pm

Looks like the pciback module only supports a single option, which is “permissive” (and not too useful, it seems)

However, I think you’re on the right track with investigating lesser used modprobe features

There goes my afternoon!

Thanks again

Churros · September 1, 2024, 3:32pm

It looks like pci-stub, however, does support the ids parameter

I know Qubes has pci-stub but I’m not certain if it’s in module form. I have used it via sysfs but not via modprobe

Thinking about it now, I’m wondering…

If a device is claimed by pciback (or pci-stub), what happens for an unbind/remove?

I think only the kernel source knows this for sure, but if those two modules don’t cause FLRs when claiming or releasing a device, then I think what you suggested (with pci-stub in place of pciback) may do the trick, even if Qubes insists on unbinding devices that are already seized by pciback

Only one way to find out

Churros · September 30, 2024, 11:33am

Forgot to update here

No luck doing this with pciback parameters. Sadly, unlike vfio, it isn’t as configurable at load time

What I ended up doing was modifying the aforementioned initramfs script to add a “proper” option that more cleanly facilitates what I hacked in

It uses the same syntax as the other qubes initramfs options that accept BDFs, so I didn’t have to deal with parsing

Using it looks like this

rd.qubes.pci_noflr=bdf1[,bdf2,…]

I’ll post the diff here in case anyone has a use for it, or would like to send a PR upstream. Having it upstream would be really great, I wouldn’t need to ensure that the changes aren’t overwritten by a dom0 update (happened once already, I need to look into how to add a dnf post-update hook, I’ve only done this with apt in Debian before)

Thanks again for the help and discussion

… and to anyone who may come along with a less invasive way to do this, please post it here. I would love to be rid of this hack, even though it’s more cleanly implemented

q1w2e3 · October 9, 2024, 2:00am

It seems that, for USB, a patch to the pciback shell script may not have been necessary after all, though I haven’t yet tested this:


usbcore.quirks=
			[USB] A list of quirk entries to augment the built-in
			usb core quirk list. List entries are separated by
			commas. Each entry has the form
			VendorID:ProductID:Flags. The IDs are 4-digit hex
			numbers and Flags is a set of letters. Each letter
			will change the built-in quirk; setting it if it is
			clear and clearing it if it is set. The letters have
			the following meanings:
				a = USB_QUIRK_STRING_FETCH_255 (string
					descriptors must not be fetched using
					a 255-byte read);
				b = USB_QUIRK_RESET_RESUME (device can't resume
					correctly so reset it instead);
				c = USB_QUIRK_NO_SET_INTF (device can't handle
					Set-Interface requests);
				d = USB_QUIRK_CONFIG_INTF_STRINGS (device can't
					handle its Configuration or Interface
					strings);
				e = USB_QUIRK_RESET (device can't be reset
					(e.g morph devices), don't use reset);

The USB_QUIRK_RESET looks useful …

This is limited to USB devices, of course, so you can’t use it to prevent resets from being sent to other PCI devices. But it might have been good enough for my issue
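
A sketch of what building such an entry could look like, with basic checks on the format described in the quoted documentation. The function names are made up and the IDs in the example are placeholders. As pointed out later in this thread, this targets USB devices plugged into a controller, not the PCI controller itself.

```shell
# Succeed only for a 4-digit hex number.
is_hex4() {
    case "$1" in
        [0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F]) return 0 ;;
        *) return 1 ;;
    esac
}

# Build one usbcore.quirks kernel parameter (VendorID:ProductID:Flags).
# Flags are only checked for being lowercase letters, not for being
# letters the running kernel actually defines.
usb_quirk_entry() {
    if ! is_hex4 "$1" || ! is_hex4 "$2"; then
        echo "IDs must be 4-digit hex" >&2
        return 1
    fi
    case "$3" in
        ''|*[!a-z]*) echo "flags must be lowercase letters" >&2; return 1 ;;
    esac
    echo "usbcore.quirks=$1:$2:$3"
}

# Example (placeholder IDs, flag e = USB_QUIRK_RESET):
#   usb_quirk_entry 0781 5581 e
```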

Maybe I did work for nothing, but at least something was learned?

(posting from different account)

q1w2e3 · October 9, 2024, 2:09am

This is the patch I had hacked in. A few caveats:

It doesn’t do any sanity checking on the values
It doesn’t save and then restore the reset_method before/after Qubes does its reset
It’s not tested by anyone other than me, on my machine
It will get blown away by some dom0 updates eventually

The file is at /usr/lib/dracut/modules.d/90qubes-pciback/qubes-pciback.sh

--- qubes-pciback.sh	2024-08-30 18:22:04.117005953 -0400
+++ qubes-pciback.sh.latest	2024-10-08 22:01:23.715828082 -0400
@@ -20,6 +20,20 @@
 HIDE_PCI=$(set -o pipefail; { lspci -mm -n | awk "/^[^ ]* \"$re/ {print \$1}";}) ||
     die 'Cannot obtain list of PCI devices to unbind.'
 
+# --- pci_noflr hack ---
+noflr_devs=$(getarg rd.qubes.pci_noflr)
+NOFLR_PCI="${noflr_devs//,/ }"
+for dev in $NOFLR_PCI; do
+  BDF=0000:$dev
+  if [ -f "/sys/bus/pci/devices/$BDF/reset_method" ]; then
+    warn "Disabling reset for non-conformant device @ $BDF ..."
+    echo "" > "/sys/bus/pci/devices/$BDF/reset_method"
+  else
+    warn "Unable to disable reset for non-conformant device @ $BDF, no reset_method file present ..."
+  fi
+done
+# --- end pci_noflr hack ---
+
 manual_pcidevs=$(getarg rd.qubes.hide_pci)
 case $manual_pcidevs in
 (*[!0-9a-f.:,]*) warn 'Bogus rd.qubes.hide_pci option - fix your kernel command line!';;

To use it, just add rd.qubes.pci_noflr=00:00.0,00:01.0,..., same syntax as rd.qubes.hide_pci
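
The first two caveats listed with the patch could be addressed along these lines. This is an untested sketch, not part of the patch: the function names, STATE_DIR and SYSFS_PCI are all made up, and whether restoring the reset methods is even desirable for a device that cannot survive FLR is a separate question.

```shell
# Accept only characters that can appear in a comma-separated BDF list,
# mirroring the check the script applies to rd.qubes.hide_pci.
# An empty list counts as invalid here.
valid_bdf_list() {
    case "$1" in
        ''|*[!0-9a-f.:,]*) return 1 ;;
        *) return 0 ;;
    esac
}

# Save the current reset methods of a function, then disable them.
disable_reset() {
    dev="${SYSFS_PCI:-/sys/bus/pci/devices}/0000:$1"
    state="${STATE_DIR:-/run/pci_noflr}"
    [ -f "$dev/reset_method" ] || return 1
    mkdir -p "$state"
    cat "$dev/reset_method" > "$state/$1"
    echo "" > "$dev/reset_method"
}

# Write the saved methods back once the device has been rebound.
restore_reset() {
    dev="${SYSFS_PCI:-/sys/bus/pci/devices}/0000:$1"
    saved="${STATE_DIR:-/run/pci_noflr}/$1"
    [ -f "$saved" ] || return 1
    cat "$saved" > "$dev/reset_method"
}
```

In the patched loop, disable_reset would take the place of the plain echo, and restore_reset would run after the bind to pciback has completed.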

I don’t necessarily suggest anyone uses this, though. The USB quirks option seems the more appropriate solution

apparatus · October 9, 2024, 7:35am

I think usbcore.quirks is for USB devices and not for PCI devices (e.g. not for PCI USB controllers).

Churros · October 9, 2024, 8:39pm

Ahhhhh you are 100% correct. Should have known this as I used it recently with a USB mass storage device

Schuwi · November 24, 2024, 12:49pm

Thank you so much for sharing this patch!

I was having issues with the MEDIATEK MT7922 Wi-Fi card on my ThinkPad L15 Gen 3.
The bootup took > 1min to get to the disk encryption passphrase screen and another > 1min to get to the login screen.
After some debugging I found out that the cause of the delay seems to be a timeout being reached while trying to execute a Function Level Reset (FLR) on the Wi-Fi card.

Kernel log - FLR issue

Nov 24 09:17:22 dom0 kernel: pciback 0000:06:00.0: enabling device (0000 -> 0002)
Nov 24 09:17:22 dom0 kernel: xen: registering gsi 40 triggering 0 polarity 1
Nov 24 09:17:22 dom0 kernel: xen: --> pirq=40 -> irq=40 (gsi=40)
Nov 24 09:17:23 dom0 kernel: pciback 0000:06:00.0: not ready 1023ms after FLR; waiting
Nov 24 09:17:24 dom0 kernel: pciback 0000:06:00.0: not ready 2047ms after FLR; waiting
Nov 24 09:17:26 dom0 kernel: pciback 0000:06:00.0: not ready 4095ms after FLR; waiting
Nov 24 09:17:31 dom0 kernel: pciback 0000:06:00.0: not ready 8191ms after FLR; waiting
Nov 24 09:17:39 dom0 kernel: pciback 0000:06:00.0: not ready 16383ms after FLR; waiting
Nov 24 09:17:56 dom0 kernel: pciback 0000:06:00.0: not ready 32767ms after FLR; waiting
Nov 24 09:18:31 dom0 kernel: pciback 0000:06:00.0: not ready 65535ms after FLR; giving up

Using your patch I could circumvent the issue by disabling the FLR being invoked and thus the system now boots in a normal time.
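
For others checking whether they are hit by the same thing, the affected devices can be pulled out of the kernel log. A small sketch; the function name is made up, and it prints the short BDF form that rd.qubes.pci_noflr takes.

```shell
# Read kernel log lines on stdin and print, once each, the short BDF of
# every device that logged "not ready ...ms after FLR".
flr_timeout_bdfs() {
    sed -n 's/.* [0-9a-f]\{4\}:\([0-9a-f]\{2\}:[0-9a-f]\{2\}\.[0-7]\): not ready [0-9]*ms after FLR.*/\1/p' | sort -u
}

# Example:
#   journalctl -k -b | flr_timeout_bdfs
```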

Thank you for your previous work investigating this potentially problematic behaviour and even sharing a workaround for the issue!

q1w2e3 · November 29, 2024, 2:48pm

Happy to help someone after all of the help I’ve received from others!

You may consider opening an issue in the qubes-issues repository on GitHub, to see if there’s any interest in accepting it upstream as part of Qubes. It’s not the most elegant solution to the problem, but as far as I was able to see, there’s really no other way to do it than to make changes in this script

Unless/until it is part of upstream Qubes, you will have to check after updates to see if the modified version was overwritten by the update. And if it was, you’ll have to copy the patched version into place and run dracut -f again. That’s the reason I suggest seeing if upstream may be interested
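
The check after an update can be scripted. A sketch under assumptions: the function names and PATCHED_COPY location are made up, and the marker string is the option name from the diff posted earlier in this thread.

```shell
# Return 0 if the given script still contains the pci_noflr changes.
patch_present() {
    grep -qF 'rd.qubes.pci_noflr' "$1"
}

# Copy a saved patched script back into place and rebuild the initramfs if
# an update replaced it. PATCHED_COPY is a hypothetical location.
reapply_if_needed() {
    target=/usr/lib/dracut/modules.d/90qubes-pciback/qubes-pciback.sh
    patched="${PATCHED_COPY:-/root/qubes-pciback.sh.patched}"
    if ! patch_present "$target"; then
        cp "$patched" "$target" && dracut -f
    fi
}

# Example (as root in dom0, after a dom0 update):
#   reapply_if_needed
```

Copying a whole file back also discards whatever the update changed in that script, so re-applying the diff with patch and reviewing the result may be the safer route.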

It seems very few people need it - or maybe those people just didn’t care enough about the impacted device to go through the trouble?