Big update on Arrow Lake running on 4.3 and some more questions

The previous topics got out of hand so I will summarize:
I’ve been playing with an Asus Z890 with a 265K.

There are 3 USB controllers on the mainboard.
Meteor Lake TB4 USB/NHI
JHL9580 TB5 USB/NHI
7f6e USB at 80:14.0
Note: NHI is not a USB controller, it loads the thunderbolt driver. It’s complicated because the NHI and USB share the port depending on what you put in it - TB, USB, DisplayPort alt, etc.

The TB USB controllers
These seem pretty useless, if you ask me: whatever I plug into those ports somehow gets tunnelled to the 80:14.0 controller. So even if you give them to a VM, it doesn’t help. I bought a Type-C to Type-A hub to test this. When I plug it into one of the TB ports and attach devices to it, they show up on the 80:14.0 USB controller.

The 80:14.0 MB USB controller
Out of the box, the USB controller on the mainboard can’t be passed through.
@Bloged made a brilliant fix which I finally got to use today. Apparently the architecture of the Arrow Lake is different so there are additional PCI bridges.
The patch basically treats the second bridge as PCI root too. Makes sense but it hasn’t gotten approval from @marmarek yet.

So, after applying the patch, I still have to allow non-strict reset. Otherwise it won’t work.

Curiously, I get the Aura RGB controller plus two other USB devices (Type-C to DisplayPort) inside my sys-usb. I wonder if there’s any security risk here. What would happen if the Type-C to DisplayPort USB devices were compromised? Could they interact with my monitors?
Technically speaking, Type-C to DisplayPort cables contain these devices, but they don’t seem to be needed (even if I keep my main USB controller bound to the pciback driver in dom0 and ignored, my monitors still work).

Getting a second USB controller
I’ve added a USB controller via a PCIe slot.
I’ve tried both the PCIe 4.0 x4 and the PCIe 5.0 x16 slots. Both work. No need to relax strict reset or to use permissive mode. It’s an ASM3142, if anyone is curious.

Ethernet ports working during suspend or shutdown
The ErP setting in the BIOS cuts power to them.
Checking traffic, nothing seems to be happening, and there are no open ports either.

Some errors I see in dmesg when the ASM3142 is passed through
Will post later but it works.

On usb authorized kernel flag
I can’t use the authorized=0 USB flag. My keyboard is connected through a USB hub, so it’s not detected. Probably a common issue.

On the topic of monitors
To get my monitors out, I’ve tried three strategies:
a) use MST and connect directly via a Type-C to Type-C cable (should be using Display Alt)
b) use a TB4 dock in whatever port

  • if I disable the TB drivers, BIOS functions, etc., it works identically, using Display Alt
  • can be finicky if you replug
  • could be a security risk: the dock has other things in it (USB devices and so on), which would appear in dom0

Notes for a) and b):
Since my monitors have built-in hubs and other things, these do actually appear in whichever VM I pass the TB controllers to. But the monitor outputs always stay in dom0. Passing these things through won’t ever give your VMs monitors (unless you use an eGPU).

That gets us to the most secure option.
c) Type C to DisplayPort cables
Essentially one per monitor, allowing you to bypass TB, use Display Alt, and keep the stuff in your monitors from connecting to dom0.

Audio
Type-C to DisplayPort cables do support audio too, so if your monitor has a 3.5mm jack, you’re all set.

On the topic of TB things getting into Dom0
Perhaps giving the main USB controller at 80:14.0 to sys-usb actually fixes this, since then those devices would probably get inside sys-usb. I will later try and confirm.

APIC errors
I am getting APIC ID mismatch errors inside Dom0 on boot. Similar to what @Bloged is getting. I wonder if these would be trouble when I get to do a GPU passthrough.
Please advise, everyone.
I haven’t updated the BIOS yet btw.

[Firmware Bug]: CPU   .: APIC ID mismatch. CPUID: ... APIC: ...
[Firmware Bug]: CPU    .: APIC ID mismatch. Firmware: ... APIC: ...
cpu ... spinlock event irq 2..

NPU/VPU errors
I noticed the CPU has an NPU for AI stuff. I’ve tried passing it through, but inside the VM I get a power state error (the log says D0 to D3hot; see the line below). Basically it’s like the device can’t start.

intel_vpu 0000:00:07.0: Refused to change power state from D0 to D3hot

TB errors
Now that I think about it, I think I saw similar D3 cold/hot errors. I will look into it more.
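
To compare the NPU and TB cases side by side, the relevant lines can be pulled out of the kernel log. A minimal Python sketch, assuming the messages follow the same "Refused to change power state from X to Y" wording as the intel_vpu line above; the function name is made up for this example:

```python
import re

# Matches lines such as:
#   intel_vpu 0000:00:07.0: Refused to change power state from D0 to D3hot
# The exact wording is an assumption based on the single line quoted above.
POWER_RE = re.compile(
    r"(?P<driver>\S+) "
    r"(?P<bdf>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): "
    r"Refused to change power state from (?P<src>\S+) to (?P<dst>\S+)"
)


def find_power_state_errors(log_text):
    """Return (driver, device, from_state, to_state) for each matching log line."""
    hits = []
    for line in log_text.splitlines():
        m = POWER_RE.search(line)
        if m:
            hits.append((m.group("driver"), m.group("bdf"),
                         m.group("src"), m.group("dst")))
    return hits
```

Feeding it the saved output of dmesg from the VM should show whether the TB controllers report the same transition as the NPU.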

Fans/PWM
I am pleased with the nct6775 driver. Inside dom0, I can get all temps, RPMs and control my fans too. It’s really nice since you can actually bypass the BIOS. Note: the CPU temp is a bit different from the one you’d get outside Xen if you used sensors. I think it’s fine though. No per core temp.

OpenRGB
I did try using it to control the RAM’s RGB lights. Bad idea. I got errors, and on reboot the machine got stuck until a hard reset. On second thought, SPD modification is disabled in the BIOS, so perhaps it would have worked with that enabled. Too risky anyway. My advice is to stay away unless you know what you’re doing.

Kernels
6.17 seems to behave the same as 6.12.

Help needed:
APIC ID mismatch errors, NPU/TB power state errors (D0/D3), and errors in the kernel log related to the ASM3142 passthrough

Hello,

I have the same behaviour. It’s so strange! As I did my tests only with USB2 devices, maybe there is a difference between USB2 routing and USB3 and above.

A PCIe Root Complex in the PCH yes, in addition to the one in the CPU.

I’m pretty sure @marmarek cannot accept this patch as it is. At the least, we should add code to list the PCI root complexes and, in the test if (dev->address.bus == 0x80), replace the 0x80 with a lookup in that list of PCI/PCIe buses served directly by a PCIe root complex. That would be correct and would not assume that the 0x80 value holds on all platforms.
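
To illustrate the lookup idea only (this is not the patch’s code), here is a minimal Python sketch. It assumes the root buses show up as pciDDDD:BB directories under /sys/devices, which is how Linux usually exposes them; the function names are made up for this example.

```python
import re
from pathlib import Path

# Root buses normally appear as /sys/devices/pciDDDD:BB (domain:bus, in hex).
ROOT_DIR_RE = re.compile(r"^pci([0-9a-fA-F]{4}):([0-9a-fA-F]{2})$")


def parse_root_buses(names):
    """Turn directory names like 'pci0000:80' into a set of (domain, bus)."""
    roots = set()
    for name in names:
        m = ROOT_DIR_RE.match(name)
        if m:
            roots.add((int(m.group(1), 16), int(m.group(2), 16)))
    return roots


def list_root_buses(sysfs_devices="/sys/devices"):
    """Discover the root buses on the running system (empty set if unavailable)."""
    base = Path(sysfs_devices)
    if not base.is_dir():
        return set()
    return parse_root_buses(p.name for p in base.iterdir())


def is_on_root_bus(bdf, roots):
    """True if a device such as '0000:80:14.0' sits directly on a root bus."""
    domain, bus, _dev_fn = bdf.split(":")
    return (int(domain, 16), int(bus, 16)) in roots
```

With this, the hard-coded comparison against 0x80 becomes is_on_root_bus("0000:80:14.0", list_root_buses()), which on this board would presumably find both bus 0x00 and bus 0x80.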

Same for me…

As I understand it, the routing of DisplayPort signals through USB-C is totally independent of the actual USB packets and communication. I think the reason is that the DisplayPort signals come from the graphics card and not from the USB bridge.

What is MST and what is Display Alt, please?

Thanks
Bertrand

Multi-Stream Transport (MST) is a protocol introduced in DisplayPort 1.2 that uses multiplexing to packetize and transmit multiple independent video and audio streams over a single high-speed main link. You either daisy-chain monitors or connect them to an MST hub (which often looks like a TB dock).

Solving network manager always enabling WiFi
Add a conf to /rw/config/qubes-bind-dirs.d.

DisplayPort Alternate Mode (DP Alt Mode) is a VESA-standardized functional extension of the USB Type-C interface that leverages the USB Power Delivery (PD) protocol to repurpose physical SuperSpeed data lanes for the direct transmission of native DisplayPort signals

So you’re correct: essentially, DP Alt Mode bypasses everything and directly connects the monitor to the GPU inside the CPU.

I did test with USB 3 devices, no change. Same behavior as USB 2 tests you did.

Anyway, meanwhile I played with ASPM.
I’ve noticed a few interesting things:

  • no errors from the ASM3142 controller if ASPM is disabled (the error seems irrelevant anyway)
  • no change for the VPU (NPU) either way, same hot/cold error

I also noticed another USB problem:
The Bluetooth USB device (that I’ve disabled in the BIOS) gets reenabled after suspend/resume.
Worse yet, dom0 doesn’t block USB devices individually. If you set authorized=0, it blocks everything besides HID; otherwise it loads the Bluetooth driver. Adding a simple modprobe.d entry doesn’t work. Working on blocking it.

I did try using udev rules but failed… Trying more ideas atm.

usbguard generate-policy should help. Update: yes, that solved it. Run dracut -f at the end.
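
Before writing a targeted rule, it helps to know which USB device is the Bluetooth adapter. A minimal Python sketch, assuming the usual sysfs layout under /sys/bus/usb/devices and relying on the standard USB interface class triple for Bluetooth (e0:01:01); the function names are made up for this example:

```python
from pathlib import Path

# USB interface class e0 (wireless controller), subclass 01, protocol 01
# is the standard triple for a Bluetooth adapter.
BLUETOOTH_TRIPLE = ("e0", "01", "01")


def is_bluetooth_interface(cls, sub, proto):
    """Compare raw sysfs values (hex strings, possibly with a newline)."""
    triple = tuple(v.strip().lower() for v in (cls, sub, proto))
    return triple == BLUETOOTH_TRIPLE


def find_bluetooth_devices(sysfs_usb="/sys/bus/usb/devices"):
    """Return a set of (port, 'vendor:product') for Bluetooth adapters found."""
    base = Path(sysfs_usb)
    found = set()
    if not base.is_dir():
        return found
    # Interface entries are named like '1-14:1.0'; the device is '1-14'.
    for intf in base.glob("*:*"):
        try:
            triple = [(intf / name).read_text() for name in
                      ("bInterfaceClass", "bInterfaceSubClass", "bInterfaceProtocol")]
            if is_bluetooth_interface(*triple):
                dev = base / intf.name.split(":")[0]
                vendor = (dev / "idVendor").read_text().strip()
                product = (dev / "idProduct").read_text().strip()
                found.add((dev.name, f"{vendor}:{product}"))
        except OSError:
            continue  # entry vanished or attribute missing; skip it
    return found
```

Running find_bluetooth_devices() in dom0 after a suspend/resume cycle should show whether the adapter has come back and under which ID.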

Thoughts on the VPU/NPU and the APIC errors?

Meanwhile I decided to try something else that’s interesting - SR-IOV with the stock dom0 kernel.

I’ve enabled it inside the BIOS and prepared the Grub params:
intel_iommu=true iommu=pt xe.force_probe=7d67 i915.force_probe=!7d67

Curiously enough, loading XFCE takes much longer like this.

But otherwise it functions normally.
However, dmesg reported no SR-IOV with xe for me,
even though lspci says the card is capable.

I believe it’s not supported yet. Perhaps under 6.19 or with a custom kernel.

Another interesting note:
Enabling SR-IOV added more error messages
ACPI: Unable to map lapic to logical cpu number
in dmesg

Also, if you enable xe without the intel_iommu and iommu=pt flags, the UI doesn’t work right.
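
Since a missing flag breaks the UI, a quick check of the kernel command line can save a reboot or two. A minimal Python sketch that compares a command line against the parameters I listed above; the values are simply the ones from my setup, the function names are made up, and quoted values containing spaces are not handled:

```python
# Parameters and values exactly as used in this thread; adjust for your GPU ID.
EXPECTED = {
    "intel_iommu": "true",
    "iommu": "pt",
    "xe.force_probe": "7d67",
    "i915.force_probe": "!7d67",
}


def parse_cmdline(cmdline):
    """Split a kernel command line into {key: value}; bare flags map to None."""
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


def missing_params(cmdline, expected=EXPECTED):
    """Return the sorted names of expected parameters that are absent or differ."""
    params = parse_cmdline(cmdline)
    return sorted(k for k, v in expected.items() if params.get(k) != v)
```

On a running system the input would come from /proc/cmdline, for example missing_params(open("/proc/cmdline").read()); an empty list means all four parameters are present with the expected values.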

I am optimistic that it will work one day.