USBIP protocol errors

I’m trying to use a signet (which is a USB hardware password manager) in Qubes 4.2. It worked fine in 4.1, but after I did a fresh install of 4.2, it doesn’t work. The device passes through and shows up in the target qube, but doesn’t receive any commands sent to /dev/hidraw0 (which I verified with a UART from the device itself).

I believe this is an issue with Qubes 4.2, but I wanted to ask here first in case anyone has seen it. My main two questions are:

  1. Does anyone know about any USB packet filtering for hidraw devices that started happening in 4.2?
  2. Does anyone know where I should look next to get more information?

Debugging done so far

When the device is first plugged in, I see it in the syslogs of sys-usb using sudo journalctl -f and everything looks fine.

When I pass it through to a qube, it seems to attach just fine. It says “Attaching device” and no errors pop up.

However, when I attach it, sys-usb immediately starts printing out errors related to usbip:

Feb 05 13:49:06 sys-usb kernel: usbip-host 2-6.3: urb completion with non-zero status -71
Feb 05 13:49:06 sys-usb kernel: usbip-host 2-6.3: urb completion with non-zero status -71
Feb 05 13:49:06 sys-usb kernel: usbip-host 2-6.3: urb completion with non-zero status -71
Feb 05 13:49:07 sys-usb kernel: usbip-host 2-6.3: urb completion with non-zero status -71

Looking at the kernel version, we see this:

Linux sys-usb 6.1.62-1.qubes.fc37.x86_64 #1 SMP PREEMPT_DYNAMIC Tue Nov 14 06:16:38 GMT 2023 x86_64 GNU/Linux

Looking at the kernel source for that kernel version, we see that -71 is -EPROTO.
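
As a quick cross-check of that lookup, here is a small sketch that pulls the status out of a usbip-host log line like the ones above and maps it to a symbolic name. The table is hard-coded from the generic Linux errno values (71 is EPROTO, 104 is ECONNRESET) rather than read from the running system, and the function name `decode_urb_status` is only illustrative.

```python
import re

# Generic Linux errno values (include/uapi/asm-generic/errno.h), hard-coded so
# the result does not depend on the platform this script runs on.
LINUX_ERRNO = {71: "EPROTO", 104: "ECONNRESET"}

# Matches the tail of "urb completion with non-zero status -71".
STATUS_RE = re.compile(r"non-zero status (-?\d+)")


def decode_urb_status(line):
    """Return (status, name) for a usbip-host log line, or None if no status is found."""
    match = STATUS_RE.search(line)
    if match is None:
        return None
    status = int(match.group(1))
    return status, LINUX_ERRNO.get(abs(status), "unknown")


line = ("Feb 05 13:49:06 sys-usb kernel: usbip-host 2-6.3: "
        "urb completion with non-zero status -71")
print(decode_urb_status(line))
```

Feeding it the output of journalctl line by line would show whether anything other than -71 ever turns up.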

Meanwhile, in the qube, the syslog has errors related to usb:

Feb 05 13:33:18 adam-hax0rbana-no-tor kernel: usb 1-1: reset full-speed USB device number 2 using vhci_hcd
Feb 05 13:33:18 adam-hax0rbana-no-tor kernel: vhci_hcd: vhci_device speed not set
Feb 05 13:33:18 adam-hax0rbana-no-tor kernel: usb 1-1: SetAddress Request (2) to port 0
Feb 05 13:33:18 adam-hax0rbana-no-tor kernel: cdc_acm 1-1:1.0: ttyACM0: USB ACM device
Feb 05 13:33:19 adam-hax0rbana-no-tor kernel: vhci_hcd: vhci_device speed not set
Feb 05 13:33:19 adam-hax0rbana-no-tor kernel: usb 1-1: reset full-speed USB device number 2 using vhci_hcd
Feb 05 13:33:19 adam-hax0rbana-no-tor kernel: vhci_hcd: vhci_device speed not set
Feb 05 13:33:19 adam-hax0rbana-no-tor kernel: usb 1-1: SetAddress Request (2) to port 0
Feb 05 13:33:20 adam-hax0rbana-no-tor kernel: cdc_acm 1-1:1.0: ttyACM0: USB ACM device

Now, if we just ignore these errors, we see that /dev/hidraw0 and /dev/hidraw1 both appear in the target qube, which is the expected behavior. The signet client software opens /dev/signet (which is symlinked to /dev/hidraw0 via a udev rule). It successfully opens the handle and sends the startup message to that file handle, but it never gets a response.

Using a UART port on the USB device itself, I confirmed that the message never makes it to the device.

Between the errors in the logs and the message not getting from the target qube to the device, it seems like the attach command is failing in some way, which is keeping data from making it through from the qube to the device.

Variations

I’ve tried varying some things to see if I can narrow down the issue, but it persists in all of the cases below:

  • Plug directly into the computer instead of using a USB hub
  • Try a different signet device (same model, different physical instance)
  • Attach it to a different qube
  • Try the device in another computer (it works fine on a Debian 12 machine)
  • Run the signet client from sys-usb; it acted just like in the target qube where it could open the device and send data, but then it never gets a reply

Since it has the same issue in sys-usb, it’s not related to USBIP.
Maybe try a newer kernel from the dom0 kernel-latest-qubes-vm package in sys-usb, or try the kernel installed in the VM:


Thanks for the suggestion. I tried the “provided by qube” option for the kernel, which got me 6.6.14, but it behaved the same way.

Linux sys-usb 6.6.14-100.fc38.x86_64 #1 SMP PREEMPT_DYNAMIC Fri Jan 26 20:10:55 UTC 2024 x86_64 GNU/Linux

There aren’t any errors in the syslogs when I attempt to use the device in sys-usb, so that part is different, but the client sending off a packet and never getting a response is the same.

If someone in the US is willing to dig into this with me, I’d be willing to ship you a signet device to whatever name and address you give me. I really want the device to work with Qubes again! (but international shipping is really expensive!)


I have more information, but no real answers yet.

  1. If I boot into Tails, I also have problems with Signet (I did not do as detailed debugging to confirm that it’s exactly the same thing, but it acted the same when using the release version of the Signet client). This seems to point to a problem with the USB controller, but I’ve already tried two USB controllers on the motherboard and…
  2. Installing a PCI → USB adapter, attaching it to sys-usb, and passing it through to a target qube has the exact same problem! I can see the client send the message, but then it just hangs.

So this one really has me stumped. I did change motherboards when I switched to Qubes 4.2, but other USB devices seem to work fine, and I chose my motherboard based on the Qubes HCL, which had multiple reports that the ASRock B450M Pro4 was going to be solid.

My next steps are to test this machine using the PCI card in Tails and try to find another machine where I can install Qubes 4.2. If Signet works fine on another machine with Qubes 4.2, then I know this is a “me problem” and I can figure out if it’s something weird with the motherboard, or maybe a reinstall would fix it, or maybe I’ll figure out something else I can try.

Unfortunately I don’t have the money to just buy another computer consisting of the same components in order to test this. If I still had a high paying tech job, I’d certainly consider that option though! :face_with_monocle: :thinking:


I went out and bought a new motherboard (Asus ProArt B550-creator, based on the HCL), and it has this same problem with Qubes 4.2! This is such a mystery!

I’m going to continue testing (the new motherboard with Tails, with Qubes 4.1) to see whether I can get it to work and, if so, what the difference is. Just wanted to post an update so people know I’m still very much working on this and trying to find an answer so I can report it back here. The idea of some unknown component of the system messing with hidraw devices in a way that is not understood is… concerning.

After much testing, I found that this is an issue with the USB passthrough system (which may be an upstream bug). I no longer believe it is related to any compile time kernel options or kernel versions.

The problem is exhibited the first time the hidraw device is attached to a qube. Disconnecting the device from the qube and reconnecting it causes the problem to disappear. If the physical hardware is removed and re-inserted, another cycle of attach/detach/attach will be required. Also, the first attachment does not need to be to the same qube as the second attachment.
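
For anyone who wants to script the attach/detach/attach cycle from dom0, a rough sketch follows. It only wraps the `qvm-usb attach` and `qvm-usb detach` commands. The qube name `work` and the device ID `sys-usb:2-6.3` are placeholders (the ID is a guess based on the bus path in the logs above), and whether a pause is needed between steps is untested.

```python
import subprocess


def reattach_commands(qube, device):
    """Build the attach/detach/attach sequence as qvm-usb argument lists."""
    attach = ["qvm-usb", "attach", qube, device]
    detach = ["qvm-usb", "detach", qube, device]
    return [attach, detach, attach]


def run_workaround(qube, device, dry_run=True):
    """Print the commands (dry_run=True) or execute them in order from dom0."""
    for cmd in reattach_commands(qube, device):
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


# Placeholder names: substitute the real qube and the ID shown by "qvm-usb list".
run_workaround("work", "sys-usb:2-6.3")
```

With `dry_run=True` it only prints the three commands, so it is safe to try outside dom0 first.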

I can also say that my testing methodology has improved. I can reliably recreate the USB connection failure problem in Qubes 4.2 on multiple computers (I wasn’t able to test Qubes 4.1, but that’ll be EOL soon enough anyway).

I wrote up repro steps on the Signet issue tracker but they won’t be terribly useful unless you have a signet device (or possibly any other USB hidraw device?)

Since I can now reliably repro the problem, I re-tested this on sys-usb itself. The Signet works fine there (no workaround required).

I’m unable to reproduce the problem on any version of Tails, so I’m not sure what was going on there.

My next steps are to write a minimal command line program to demonstrate the issue. Right now I can only demonstrate it with the Signet client, which is a multithreaded QT program that is really gnarly to debug. After I have the minimal example, I’ll see if I can figure out a root cause and open tickets on whatever bug trackers are appropriate.
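
A starting point for such a program might look like the sketch below: open the hidraw node, write one report, and wait a bounded time for a reply. This is not the Signet protocol. The payload has to be supplied by the caller, and the timeout and reply size are arbitrary assumptions. One thing I am fairly sure of from the kernel hidraw documentation is that the first byte written is the report number (0 if the device does not use numbered reports).

```python
import os
import select


def send_and_wait(fd, payload, timeout=2.0, max_reply=64):
    """Write payload to fd, then wait up to timeout seconds for a reply.

    Returns the reply bytes, or None if nothing arrived in time.
    """
    os.write(fd, payload)
    readable, _, _ = select.select([fd], [], [], timeout)
    if not readable:
        return None
    return os.read(fd, max_reply)


def probe(path, payload, timeout=2.0):
    """Open a device node such as /dev/hidraw0, send one report, and report the reply."""
    fd = os.open(path, os.O_RDWR)
    try:
        return send_and_wait(fd, payload, timeout)
    finally:
        os.close(fd)
```

Calling `probe("/dev/hidraw0", payload)` right after the first attach and again after the reconnect should return None in the broken state and some bytes in the working one, if the failure really is below the client.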

Hello @hax0rbana_adam, thank you for all this testing. I’ve been able to replicate this issue with Trezor Suite using a Trezor Safe and usb passthrough, and your given workaround works as well (starting from Trezor Suite v24.11.1, previous versions didn’t work even after an attach/detach cycle).

I also confirm that doing everything directly in the qube with the usb device doesn’t give any issues.

So I’m not crazy after all!

I knew that Trezors did not work with USB passthrough before (though I never dug into the details of exactly why), so I’m glad to hear they’ve fixed their side of the issue. Having to attach/detach/attach is better than not working at all, and if someone can figure out and fix this bug in Qubes, we should be all set.

I’m going to update the ticket I opened on the issue tracker, which apparently I failed to link to until now.

Everyone who is affected should feel free to give the issue a thumbs up! :slightly_smiling_face:

Seems that after further testing it was a cache issue, as I used the same qube for both trezor versions. Now that I’ve tried to replicate it using a fresh qube for each trezor version, I conclude that versions before 24.11.1 don’t work under any attach/detach cycle, while 24.11.1 works out of the box without doing anything additional, so I’m not sure if Trezors are linked to this issue (unless that new version fixed this on their end).

Well, your post at least got me a workaround for the hidraw issue.

Straying a bit off topic here, but how did you get your Trezor to work? When I go to the webpage or start the latest desktop app, that qube requests access to sys-usb once per second, forever, and all of those requests are denied by the default policies, of course.

Did you add some policy in dom0 which allows the Trezor to access everything in sys-usb instead of just using the device that has already been attached to that qube?

The web page is using v24.10.1, which won’t work. Did you download the v24.11.1 desktop app? Currently it’s only published on GitHub. No additional policy is needed: just attach the Trezor to sys-usb, then do USB passthrough to the target qube running trezor suite. This is using stock Qubes 4.2.

Update: While Trezor works correctly initially, after doing some operations it stops working and a detach/attach cycle is needed, so there’s something breaking the communication.

The issue with the Trezor appears to be independent of the issue with the hidraw I saw with Signet.

I can confirm that I’ve had intermittent success with Trezor using USB passthrough on Qubes 4.2.4 before I upgraded the Trezor to firmware 2.8.8.

After upgrading the Trezor to firmware v2.8.8, it appears to attach fine (in Chromium and the AppImage) but then fails to load. When I mouse over the device icon in the upper left, it gives the tooltip “Unavailable while loading”. Interestingly, I can see the info about the firmware on the settings side. This same device works when plugged into a Debian 12 machine, so I don’t think it is an issue with the firmware being broken. :relieved:

I also tried this on a test machine using Qubes-4.3.202502190427-x86_64.iso and found v2.8.8 fails the same way there. It attaches fine, but then doesn’t ever finish loading. I wish I had another device with the older firmware to test it on Qubes 4.3 to see if it works there like it did in 4.2.4.

At this point, it seems like something Trezor changed in their firmware caused this incompatibility (again).

@equbes, if you have any suggestions on what to try next, I’d be happy to run additional tests on both the current and upcoming versions of Qubes. If we can nail this down, there’s a chance we might still be able to get it fixed in the 4.3 release (assuming it’s even an issue on the Qubes side).

Just tested with a device running firmware v2.8.7 and can confirm it no longer works. It’s detected by trezor suite, but after attempting any operation it just hangs. This device was working fine a week or so ago with the workaround, so I assume some package updated in either dom0 or the fedora template broke the functionality.

Tested in an archlinux appvm and the issue persists.

Also installed a fresh fedora-40 template (which the ftp listing shows with a September build date) and set sys-usb to use this template, and the issue persists.

So the issue is most likely caused by a dom0 package that got updated in the last 2 weeks. Using dnf history info shows qubes-core-dom0, qubes-core-qrexec, qubes-core-qrexec-dom0 & qubes-core-qrexec-libs. Maybe @marmarek can hint at what changes could have affected usb passthrough.

In the meantime, the only workaround is to avoid usb passthrough completely and attach the trezor directly to the trezor-suite appvm (it’s recommended to disconnect all other usb devices to limit the risk exposure).

Have you tried the change described in USBIP connection drop on initial hidraw attachment · Issue #9367 · QubesOS/qubes-issues · GitHub ?

Yes, I’ve tried that however it doesn’t change anything.

I also went a bit further and downgraded other packages besides the qubes-core-x ones, but the result is the same, so the issue may not be there. I’ll check later whether a fresh 4.2.3 qubes install has this issue, to narrow down the cause.

Some more tests:

Qubes OS 4.2.3 & 4.2.4 iso with no updates
Fedora 40 stock template & Fedora 41 template
Trezor suite both 24.11.1 & 25.2.2 tried
One device running 2.8.3 and the other 2.8.7 firmware

The connection doesn’t work at all, apart from the suite detecting the trezor upon initial usb passthrough; after any operation it just hangs.

On the appvm running trezor suite, the log only shows:

vhci_hcd: vhci_device speed not set
usb 2-1: reset full-speed USB device number 2 using vhci_hcd
vhci_hcd: vhci_device speed not set
usb 2-1: SetAddress Request (2) to port 0
vhci_hcd: unlink->seqnum 35
vhci_hcd: urb->status -104

Also, I confirm that 2.8.7 is also borked, so I must have updated it after my last working operation.

I updated to 2.8.9 and now I can get it to work the same way as before on my daily driver (with the reconnect workaround). However, I couldn’t ever get it to work on a fresh install for some reason (neither 4.2.3 nor 4.2.4, and neither with the 2.8.3 device nor with the updated 2.8.9 one, both of which work on my daily driver), though that was a different machine.

One thing worth noting is that electrum works fine under any setup out of the box (maybe it’s not using USBIP?)

@hax0rbana_adam I assume you’re running a model T, as 2.8.8 is only available for that device. I don’t have that model so I can’t check, but you should try updating to 2.8.9 (only available in the github repo for now) and see if you can get it to work as before.

Managed to borrow a trezor one (firmware 1.12.1) for testing, and it just works out of the box under the fresh install without any extra step/workaround (not even the need for attach/detach cycle) under trezor suite. So this trims down the issue to firmware on newer trezor devices, as trezor suite alone is not the culprit.

Also the same log errors are shown in journalctl for this device so these shouldn’t be related to the issue.

Note: Went ahead and updated to 1.13.0 (latest) and it still works correctly.

Thanks for the additional test results. I can say that my model T (v2.8.8 firmware) also worked fine with electrum on Qubes 4.2.4 (and I didn’t know that worked with hardware wallets, so thank you for that).

I agree with you that the Trezor issue is not with the USBIP. The issue with the Signet passthrough was on the Qubes side, but that’s been fixed with 4.3-alpha.

Since the v2.8.8 Trezor firmware works fine with electrum, it’s clearly not just a problem with the firmware, but it also doesn’t seem to just be an issue with the client either. If there isn’t already an issue opened about this with the Trezor project, someone should do so and link it here for reference.

I guess the only thing we have left to do in this thread is to document which combinations of hardware/firmware/client software are still broken and which ones are working (with or without the reconnect workaround, on Qubes 4.2.x and 4.3-alpha). If we can get this same matrix for a non-Qubes machine and post that on the Trezor bug tracker, that would be helpful to compare to what we’re seeing on Qubes and confirm that everything will be 100% fixed in the upcoming 4.3 release.
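
To get that matrix started, here is one possible way to record the results as data. The field names and the `Result` class are made up for illustration, and the rows are only the combinations already reported in this thread as I read them, so please correct anything that doesn’t match your own tests.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    device: str
    firmware: str
    client: str
    qubes: str
    outcome: str  # free text, e.g. "works" or "fails to load"


# Rows transcribed from posts in this thread; all are with USB passthrough.
RESULTS = [
    Result("Trezor One", "1.12.1", "Trezor Suite", "4.2.x fresh install", "works"),
    Result("Trezor One", "1.13.0", "Trezor Suite", "4.2.x fresh install", "works"),
    Result("Model T", "2.8.8", "Electrum", "4.2.4", "works"),
    Result("Model T", "2.8.8", "Trezor Suite", "4.2.4", "fails to load"),
]


def broken(results):
    """Return the combinations that did not simply work."""
    return [r for r in results if r.outcome != "works"]


for r in RESULTS:
    print(f"{r.device:<11} {r.firmware:<7} {r.client:<13} {r.qubes:<20} {r.outcome}")
```

Adding a column for whether the reconnect workaround was needed would probably be the next step.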

When you say non-Qubes machine, do you mean another type of virtualization? Trezor works fine under all setups without usb passthrough, so that wouldn’t be of much help. Besides that, I can start working on a compatibility matrix (you can help me with the model T).