`no-strict-reset` inner workings

Back at my quest to passthrough an AMD RENOIR iGPU to sys-gui-gpu

In /var/log/libvirt/libxl/libxl-driver.log I see:

2021-11-13 14:49:01.060+0000: libxl: libxl_pci.c:1488:libxl__device_pci_reset: write to /sys/bus/pci/devices/0000:07:00.0/reset returned -1: Inappropriate ioctl for device

This looks like an attempt to reset my GPU through sysfs, which sounds fishy for a PCI device declared with no-strict-reset=True, so let’s dig…

In libxl__device_pci_reset() in libxl_pci.c we see that libxl, before the above device reset attempt, first tries to write to /sys/bus/pci/drivers/pciback/do_flr if it exists - it’s not the case here on 5.14.10, but still that sounds like trying to mess with Function-Level Reset, which is (if I understand correcty) precisely what no-strict-reset=True is supposed to avoid. If anything, that seems to confirm this code is not supposed to run for this device.

Up one call level, libxl__device_pci_add(), which looks pretty legal here… does call libxl__device_pci_reset() unconditionally. That does not feel right… but then, how is no-strict-reset=True supposed to work ? Let’s dig further…

strict-reset appears as an XML attribute in the libvirt xml definition for our VM. Its support in qubes comes from a Qubes-specific patch. If we omit all the flag-handling logic, there seems to be a single effect; the resulting virPCIDeviceReset() code is reproduced below for clarity.

virPCIDeviceReset()
int
virPCIDeviceReset(virPCIDevicePtr dev,
                  virPCIDeviceList *activeDevs,
                  virPCIDeviceList *inactiveDevs)
{
    g_autofree char *drvPath = NULL;
    g_autofree char *drvName = NULL;
    int ret = -1;
    int fd = -1;
    int hdrType = -1;

    if (virPCIGetHeaderType(dev, &hdrType) < 0)
        return -1;

    if (hdrType != VIR_PCI_HEADER_ENDPOINT) {
        virReportError(VIR_ERR_INTERNAL_ERROR,
                       _("Invalid attempt to reset PCI device %s. "
                         "Only PCI endpoint devices can be reset"),
                       dev->name);
        return -1;
    }

    if (activeDevs && virPCIDeviceListFind(activeDevs, dev)) {
        virReportError(VIR_ERR_INTERNAL_ERROR,
                       _("Not resetting active device %s"), dev->name);
        return -1;
    }

    /* If the device is currently bound to vfio-pci, ignore all
     * requests to reset it, since the vfio-pci driver will always
     * reset it whenever appropriate, so doing it ourselves would just
     * be redundant.
     */
    if (virPCIDeviceGetDriverPathAndName(dev, &drvPath, &drvName) < 0)
        goto cleanup;

    if (virPCIStubDriverTypeFromString(drvName) == VIR_PCI_STUB_DRIVER_VFIO) {
        VIR_DEBUG("Device %s is bound to vfio-pci - skip reset",
                  dev->name);
        ret = 0;
        goto cleanup;
    }
    VIR_DEBUG("Resetting device %s", dev->name);

    if ((fd = virPCIDeviceConfigOpenWrite(dev)) < 0)
        goto cleanup;

    if (virPCIDeviceInit(dev, fd) < 0)
        goto cleanup;

    /* KVM will perform FLR when starting and stopping
     * a guest, so there is no need for us to do it here.
     */
    if (dev->has_flr) {
        ret = 0;
        goto cleanup;
    }

    /* If the device supports PCI power management reset,
     * that's the next best thing because it only resets
     * the function, not the whole device.
     */
    if (dev->has_pm_reset)
        ret = virPCIDeviceTryPowerManagementReset(dev, fd);

    /* Bus reset is not an option with the root bus */
    if (ret < 0 && dev->address.bus != 0)
        ret = virPCIDeviceTrySecondaryBusReset(dev, fd, inactiveDevs);

    if (ret < 0) {
        virErrorPtr err = virGetLastError();
        virReportError(VIR_ERR_INTERNAL_ERROR,
                       _("Unable to reset PCI device %s: %s"),
                       dev->name,
                       err ? err->message :
                       _("no FLR, PM reset or bus reset available"));
        if (!dev->strictreset)
            /* do not fail */
            ret = 0;
    }

 cleanup:
    virPCIDeviceConfigClose(dev, fd);
    return ret;
}

As I read it, the flag currently does not actually prevent any reset, but rather avoid to return in error if all reset attempts fail. And in fact, the message for commit which added this patch (as well as the added doc) is pretty clear: Make it possible to assign PCI device which does not support reset .

@marmarek, does it mean that we never really had the option to prevent FLR attempts ? Do you see any problems with trying to prevent them altogether ? With a different flag maybe to avoid confusion ?

Or maybe here we just have a harmless error, which maybe should just be silenced when no-strict-reset=True ? Looks like we would need to pass the necessary information from libvirt down to libxl.

Yes, exactly.

Generally, not resetting the device when switching it between VMs is a security risk (see qvm-pci man page). The flag is added to accommodate cases when you are not planning to do it, but it still attempts to reset, just to be on the safe side. If device does not support it, it exits early and should not have negative consequences.

Maybe. Or maybe log another message (from libvirt) that reset failure was ignored because of no-strict-reset flag.

I’m not a bit fan of the “please ignore the other error” approach in geneneral, but regardless of this, knowing the flag was used would be a useful piece of information in its own right when trying to get a proper passthrough setup, will do that.

Still, in my case I don’t have this Unable to reset PCI device from libvirt that I should have if no-strict-reset would get triggered (in whole /var/log/), I’m supecting another problem behind that.

Another thing I forgot: while digging for "who does what with the no-strict-reset flag, I noticed the following in qubes-core-admin, which rather looks like a remnant for a former incarnation of the feature – should I open a ticket to get the test updated ?

qubes/tests/integ/devices_pci.py:            self.vm.features['pci-no-strict-reset/' + pcidev] = True