Kcma-d8 sys-usb working 50% of the time

So the boot at 12:38 (the last one), was with mouse? I can see:

Nov 17 12:38:15 sys-usb kernel: hid-generic 0003:046D:C077.0002: input,hidraw1: USB HID v1.11 Mouse [Logitech USB Optical Mouse] on usb-0000:00:06.0-2/input0

So it seems like it has detected it successfully, but i think that is the last boot. Did that work?

The two boots before, sys-usb cannot even find the controller usb-4-2.

well ok, your log confuses me, the first two boots do not have any mention of sys-usb at all. Does this qube even start if you have no mouse?

yes, the last boot works.

I don’t know if it starts without a mouse. I’ll reboot and see.

That was the log of sys-usb right?
So basically it must have booted. But somehow the sys-usb stuff is not working. Maybe in the dom0s journal is some stuff?

Boot 1: Mouse disconnected during boot. Mouse reconnected after login and works.
Boot2: Mouse connected during boot. Mouse not working.
Boot3 (Currently running): Mouse disconnected during boot. Mouse reconnected after login and works.

Wait a sec… sys-usb is disposable on this install. Crap, only the current one is visible.

Now I need to figure out how to get the log when the mouse isn’t working.

That explains a lot about the weirdness of the logs.

You could try to open a dom0 terminal and copy out the logs from the running disposable. A bit cumbersome, but certainly possible if you have a dom0-terminal shortcut.

Log4: boot with mouse → mouse not working
Log5: boot without mouse, then plugged in after login → mouse not working
Log6: boot with mouse → mouse working

Mouse light does not turn on when it doesn’t work.
Mouse light is on when it works.

log4.log (118.3 KB)
log5.log (120.3 KB)
log6.log (123.8 KB)

Yes, it seems like your machine does not recognize the whole usb 4-2 controller sometimes.
But unfortunately i am not that into debugging USB stuff…

You tried giving it more RAM and disabled memory balancing, which is what i would suggest you should do, so i am pretty much out of ideas. Maybe there is something in the dom0 logs.

Sorry i could not help much further :frowning:

1 Like

Not sure of which commit this corresponds from a coreboot perspective, would have to dig libreboot to find out if necessary…

A quick search didn’t provide direct bug report neither from coreboot nor libreboot for usb controller.

Logs from appvms (console logs, which includes dmesg report from qubes) are present, and rotated, under dom0’s /var/log/xen/console directory (from memory here).

You might want to provide clear logs from there, bringing them to qubes through qvm-copy-to-vm calls for working/non-working scenarios.

Insight is that coreboot is not initializing usb controllers properly/iommu is not setuped properly in firmware.

2016 firmware might be before cbmem was integrated into coreboot so that coreboot logs are reported to OS (to be figured out if debugging coreboot is needed).

The best place to discuss that might be on libreboot mailing list. From recent merge of osboot into libreboot, I expect more people to have interest into libreboot for their Qubes supported boards.

Some kind of behavior isolation might be needed here to make sure of the nature of the problem. We are actually talking about a pci passthrough of a usb controller into a HVM (sys-usb) so we are talking about iommu reliable support from firmware, ACPI enumaration from firmware and then Xen passthrough to sys-usb under qubes use case.

A good first step would be to confirm that booting for example Tails LiveCD results in the same 50% usb contoller discovery/usage rate? If its 100% success, then going into digging IOMMU issues and CPU version/ microcode applied versions and then digging into IOMMU issues might be helpful to resolve the issue. Goal here is to isolate causes.

Next steps:

  • expose failing/success logs from dom0 for sys-usb attempting to get that usb controller passethrough
  • confirmation that a normal (non-xen, no passthrough) is able to deal with usb controller 100%of the time.

And then go from there?

2 Likes

Got curious and took your two log6 (working) and log4(not-working) for comparison.

Applied some filtering to not diff on timestamps with
user@Insurgo:~$ meld <(grep pci ~/Downloads/log6.log | cut -b17- ) <(grep pci ~/Downloads/log4.log| cut -b17- )

To filter only pci related messages, and then cut everything before the 17th caracter.

With the following showing interesting behavior difference for pci controllers:

On the left we see log6, on right log4.

We see on the left: ohci-pci 0000:00:06.0: OHCI PCI host controller where on the right ehci-pci 0000:00:06.0: EHCI Host Controller

Doing a simple meld ~/Downloads/log6.log ~/Downloads/log4.log also shows that when not working (log4), the kernel is throwing at us:

Nov 17 17:11:27 sys-usb kernel: irq 44: nobody cared (try booting with the "irqpoll" option)
Nov 17 17:11:27 sys-usb kernel: CPU: 3 PID: 0 Comm: swapper/3 Not tainted 5.15.76-1.fc32.qubes.x86_64 #1
Nov 17 17:11:27 sys-usb kernel: Hardware name: Xen HVM domU, BIOS 4.14.5 10/31/2022
Nov 17 17:11:27 sys-usb kernel: Call Trace:
Nov 17 17:11:27 sys-usb kernel:  <IRQ>
Nov 17 17:11:27 sys-usb kernel:  dump_stack_lvl+0x46/0x5e
Nov 17 17:11:27 sys-usb kernel:  __report_bad_irq+0x35/0xa7
Nov 17 17:11:27 sys-usb kernel:  note_interrupt.cold+0x8/0x58
Nov 17 17:11:27 sys-usb kernel:  handle_irq_event+0xa5/0xb0
Nov 17 17:11:27 sys-usb kernel:  handle_fasteoi_irq+0x78/0x1d0
Nov 17 17:11:27 sys-usb kernel:  handle_irq_desc+0x3a/0x50
Nov 17 17:11:27 sys-usb kernel:  consume_one_event+0x104/0x110
Nov 17 17:11:27 sys-usb kernel:  __evtchn_fifo_handle_events+0x72/0xb0
Nov 17 17:11:27 sys-usb kernel:  __xen_evtchn_do_upcall+0x66/0xb0
Nov 17 17:11:27 sys-usb kernel:  __sysvec_xen_hvm_callback+0x22/0x30
Nov 17 17:11:27 sys-usb kernel:  sysvec_xen_hvm_callback+0x6d/0x90
Nov 17 17:11:27 sys-usb kernel:  </IRQ>
Nov 17 17:11:27 sys-usb kernel:  <TASK>
Nov 17 17:11:27 sys-usb kernel:  asm_sysvec_xen_hvm_callback+0x16/0x20
Nov 17 17:11:27 sys-usb kernel: RIP: 0010:native_safe_halt+0xb/0x10
Nov 17 17:11:27 sys-usb kernel: Code: 0b 34 76 01 48 03 04 d5 c0 ba 6e b9 0f b6 57 09 8b 74 d0 04 8b 3c d0 e9 e3 8a 32 ff cc cc cc eb 07 0f 00 2d 09 c8 4c 00 fb f4 <e9> d0 51 2c 00 eb 07 0f 00 2d f9 c7 4c 00 f4 e9 c1 51 2c 00 cc 0f
Nov 17 17:11:27 sys-usb kernel: RSP: 0018:ffff9e64800b7e70 EFLAGS: 00000246
Nov 17 17:11:27 sys-usb kernel: RAX: 0000000000004000 RBX: 0000000000000001 RCX: 00000000ffffffff
Nov 17 17:11:27 sys-usb kernel: RDX: ffff8f8abbec0000 RSI: ffff8f8a8127bc00 RDI: ffff8f8a81c55000
Nov 17 17:11:27 sys-usb kernel: RBP: ffff8f8a8127bc64 R08: ffffffffb9a6a940 R09: 000000000000005e
Nov 17 17:11:27 sys-usb kernel: R10: ffff8f8abbeefda4 R11: 00000000000000e0 R12: 0000000000000001
Nov 17 17:11:27 sys-usb kernel: R13: ffffffffb9a6a9c0 R14: 0000000000000001 R15: 0000000000000000
Nov 17 17:11:27 sys-usb kernel:  acpi_safe_halt+0x1f/0x30
Nov 17 17:11:27 sys-usb kernel:  acpi_idle_enter+0xde/0x120
Nov 17 17:11:27 sys-usb kernel:  cpuidle_enter_state+0x89/0x350
Nov 17 17:11:27 sys-usb kernel:  cpuidle_enter+0x29/0x40
Nov 17 17:11:27 sys-usb kernel:  cpuidle_idle_call+0x13a/0x1e0
Nov 17 17:11:27 sys-usb kernel:  do_idle+0x81/0xf0
Nov 17 17:11:27 sys-usb kernel:  cpu_startup_entry+0x19/0x20
Nov 17 17:11:27 sys-usb kernel:  secondary_startup_64_no_verify+0xc2/0xcb
Nov 17 17:11:27 sys-usb kernel:  </TASK>
Nov 17 17:11:27 sys-usb kernel: handlers:
Nov 17 17:11:27 sys-usb kernel: [<000000009d4f1bfe>] usb_hcd_irq
Nov 17 17:11:27 sys-usb kernel: Disabling IRQ #44

Seems like something fishy is happening

2 Likes

@Litter_Box Insurgo mention my speculation (concern). Not sure libreboot cause as kgpe-d16 forum hcl report use oem bios and appear similar. If you willing to continue troubleshoot I very much appreciate. Think this could benefit qubes community and in my opinion kcma-d8 underappreciated qubes desktop option. Either way thank you!

Might have been too technical here, but OHCI and EHCI is different. USB controller being detected as OHCI controller works, where when it is detected as EHCI it doesn’t, saying that the IRQ in question was never polled, and stating that a kernel option should be specified to make it polled manually, otherwise that IRQ is being deactivated, which seems to result into your mouse to go undetected. Makes sense?

@confused has a point. Kcma-d8 and kgpe-d16 being really similar, looking at HCL reports, with coreboot/libreboot tells us that they were functional with libreboot release. I do not see HCL reports with Asus firmware there. They were both reported with 2016 release of libreboot, either by Timothy or Vikings. The new HCL report is on a different version of Qubes as well, meaning its a whole different test case (Xen and kernel being different).

You might want to add irqpoll kernel option as suggested. But as of now, the reason why that USB controller is detected non consistently as OHCI/EHCI seems to be the visible cause of the behavior observed.

Addressing past misleading replies:

Memory balancing is not possible on HVMs.

Adding more memory would doubtfully have any impact here, where this can change something for high throughput devices (camera and usb proxy, network drivers) on really tight HVMs having to keep cache to pass that data to other VMs.

I’m not sure if I understand the above statement. I quicly reviewed past HCL reports which are either reported by Timothy for a really old version of Qubes (kgpe-d16 on libreboot) or Vikings (libreboot). I haven’t read a HCL that was reporting using stock firmware? Maybe I missed it.

bios: | 3309 in https://forum.qubes-os.org/uploads/short-url/zoBa1OTH57kJekfXmFSfJhKKq08.yml from Asus KGPE-D16 (6386SE) make think stock firmware. Sorry for confusion.

1 Like

Added irqpoll to sys-usb template. EDIT: nope these are normal logs.

log8.log → usb mouse works
log9.log → usb mouse not working, but works when moved to an external usb hub connected to different usb port.

log8.log (119.9 KB)
log9.log (122.1 KB)

Here are the logs with irqpoll added.

log11.log → usb mouse works
log12.log → usb mouse not working, but works when moved to an external usb hub connected to different usb port.

log11.log (125 KB)
log12.log (120.7 KB)

Command line: root=/dev/mapper/dmroot ro nomodeset console=hvc0 rd_NO_PLYMOUTH rd.plymouth.enable=0 plymouth.enable=0 xen_scrub_pages=0

looks like it’s not seeing the irqpoll argument.

KCMA-D8 protip: buy an external usb hub and use that, it seems to work everytime.

1 Like

Confused is correct. That Qubes 4.0 HCL report was running stock firmware.

Did you find fix?

To my knowledge, this is the best resource one can follow for hardware/stock firmware/libreboot/coreboot documented issues.

Qubes usb controller passthrough not related though:
https://wiki.vikings.net/KCMA-D8

The external usb hub is still working great. I sometimes get two prompts to attach my mouse to dom0. If two prompts come up, it’s the second one that is real.

Have you tried upgrading to Qubes 4.2 yet? Your descriptions above sound just like my experience with the D16. However, since installing 4.2 my onboard USB never works for my mouse (I have to boot with it in the onboard USB and switch to a PCI USB card after boot to use my mouse). And any USB connections for powered devices cannot be shared outside of sys-usb.

I have more details here: USB device and PCI Sound card issues persistent with 4.2-rc4

I will give it a shot since 4.2 dropped recently.