QCNFA765 (wcn6855) WiFi6 controller not working

On my new laptop (HP 845 G9) the WiFi controller is not working. The device has been forwarded to sys-net but it fails to configure the device and add a network interface for it.

Not sure if this is relevant to the issue, but in order to get Qubes running at all on this laptop I had to add x2apic=false to the Xen boot parameters.

Output of lspci -v -nn for the device:

00:09.0 Network controller [0280]: Qualcomm QCNFA765 Wireless Network Adapter [17cb:1103] (rev 01)
        Subsystem: Foxconn International, Inc. Device [105b:e0c4]
        Physical Slot: 9
        Flags: fast devsel
        Memory at f2000000 (64-bit, non-prefetchable) [size=2M]
        Capabilities: [40] Power Management version 3
        Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit-
        Capabilities: [70] Express Endpoint, MSI 00
        Kernel modules: ath11k_pci

dmesg shows the following error messages related to it:

[user@sys-net ~]$ sudo dmesg|grep -v audit|grep  -iP '(ath11|mhi)'
[    2.693245] ath11k_pci 0000:00:09.0: BAR 0: assigned [mem 0xf2000000-0xf21fffff 64bit]
[    2.697604] ath11k_pci 0000:00:09.0: MSI vectors: 1
[    2.698198] ath11k_pci 0000:00:09.0: wcn6855 hw2.1
[    2.903740] mhi mhi0: Requested to power ON
[    2.903757] mhi mhi0: Power on setup success
[   95.397747] mhi mhi0: MHI did not load image over BHI, ret: -5
[  192.609086] mhi mhi0: Device failed to clear MHI Reset
[  192.609142] mhi mhi0: Error moving from PM state: Firmware Download Error to: DISABLE
[  192.609385] ath11k_pci 0000:00:09.0: failed to power up mhi: -110
[  192.609416] ath11k_pci 0000:00:09.0: failed to start mhi: -110
[  192.609440] ath11k_pci 0000:00:09.0: failed to power up :-110
[  192.615038] ath11k_pci 0000:00:09.0: failed to create soc core: -110
[  192.615078] ath11k_pci 0000:00:09.0: failed to init core: -110
[  192.687477] ath11k_pci: probe of 0000:00:09.0 failed with error -110

I've already upgraded sys-net to kernel 6.0.2-2.fc32.qubes.x86_64 (via the kernel-latest-qubes-vm package in dom0, with the kernel selected in the Qubes settings of sys-net).

Any hints on how to get WiFi working on this device? I already tried "Configure strict reset for PCI devices" in the settings, without success.

Is sys-net based on Debian or Fedora?
Does the other template work? (I.e. try Fedora if you're using Debian.)

It's using the default (Fedora 36). I will try again with Debian tonight, but I don't really expect the distro to change anything if the kernel remains the same (from kernel-latest-qubes-vm).

To me it looks like you are missing the driver, or you forgot to paste it.

Also this line looks bad:

        Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit-

which should mean that MSI is supported, but it is disabled, and you should (try to) enable it.
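The flag in question is the `MSI: Enable-` field on the `Capabilities: [50]` line of the lspci output. A minimal sketch for reading that field out of `lspci -vv` text follows; the function name `msi_state` is made up for illustration, and it only reports what lspci shows at that moment, not why.

```shell
# Report whether the MSI capability in `lspci -vv` output (read from stdin)
# is currently enabled ("Enable+") or only advertised ("Enable-").
# msi_state is a hypothetical helper name, not an existing tool.
msi_state() {
    awk '
        /MSI: Enable\+/ { print "enabled";  found = 1; exit }
        /MSI: Enable-/  { print "disabled"; found = 1; exit }
        END { if (!found) print "no MSI capability found" }
    '
}

# Intended use inside sys-net (device address taken from this thread):
#   sudo lspci -vv -s 00:09.0 | msi_state
```

Running it once right after boot and once after the driver has given up would show whether the flag changes over time.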

The modules ath11k, ath11k_pci and mhi are definitely loaded:

[user@sys-net ~]$ lsmod|grep -iP 'ath11k|mhi'
ath11k_pci             24576  0
ath11k                471040  1 ath11k_pci
qmi_helpers            36864  1 ath11k
mac80211             1318912  1 ath11k
cfg80211             1134592  2 ath11k,mac80211
mhi                    98304  1 ath11k_pci

What may be related as well: modinfo ath11k_pci only shows a hw2.0 firmware but I have hw2.1:

[user@sys-net ~]$ modinfo ath11k_pci
filename:       /lib/modules/6.0.2-2.fc32.qubes.x86_64/kernel/drivers/net/wireless/ath/ath11k/ath11k_pci.ko
firmware:       ath11k/QCA6390/hw2.0/m3.bin
firmware:       ath11k/QCA6390/hw2.0/amss.bin
firmware:       ath11k/QCA6390/hw2.0/board-2.bin
[...], no more firmware entries

The hw2.1 firmware is available in /usr/lib/firmware/ath11k/WCN6855/hw2.1, maybe the driver is having trouble finding it.

MSI is enabled in the kernel config (both on dom0 and in sys-net):

[user@sys-net ~]$ zcat /proc/config.gz |grep MSI
CONFIG_GENERIC_MSI_IRQ=y
CONFIG_GENERIC_MSI_IRQ_DOMAIN=y
CONFIG_IRQ_MSI_IOMMU=y
CONFIG_HAVE_KVM_MSI=y
CONFIG_PCI_MSI=y
CONFIG_PCI_MSI_IRQ_DOMAIN=y

The MSI items in the kernel config look similar in dom0, but there I'm still using kernel 5.15 due to "Reboot loop triggered by auto-starting `sys-net` with `kernel-latest` in dom0" (Issue #7918, QubesOS/qubes-issues, GitHub).

Not sure if there is anything else where MSI could be manually enabled.

You can try and use the latest firmware

https://mirrors.edge.kernel.org/pub/linux/kernel/firmware/

I already have the package linux-firmware-20221109-144.fc36.noarch installed in sys-net, and the latest version on kernel.org is also 20221109, so it should be exactly the same version.

Does it matter that modinfo only lists the hw2.0 firmware for ath11k_pci (and not the hw2.1 firmware)?

Don't know if the hw version matters; normally the driver would know which firmware to load.

Is the modem host interface and the ath11k the same driver/firmware?

All drivers are from the 6.0.2 Qubes kernel. As far as I can tell, only ath11k_pci needs firmware, not the mhi driver.

https://groups.google.com/g/linux.debian.bugs.dist/c/V4UgRZUYrRQ

Seems like you could be right about the hw version being the issue.

The symlinks from hw2.1 to hw2.0 are already there by default:

/usr/lib/firmware/ath11k/WCN6855/hw2.0/:
total 1764
-rw-r--r-- 1 root root   12000 Nov 15 23:50 Notice.txt.xz
-rw-r--r-- 1 root root 1665096 Nov 15 23:50 amss.bin.xz
-rw-r--r-- 1 root root   15408 Nov 15 23:50 board-2.bin.xz
-rw-r--r-- 1 root root  105872 Nov 15 23:50 m3.bin.xz
-rw-r--r-- 1 root root    2132 Nov 15 23:50 regdb.bin.xz

/usr/lib/firmware/ath11k/WCN6855/hw2.1/:
total 0
lrwxrwxrwx 1 root root 20 Nov 15 23:50 amss.bin.xz -> ../hw2.0/amss.bin.xz
lrwxrwxrwx 1 root root 23 Nov 15 23:50 board-2.bin.xz -> ../hw2.0/board-2.bin.xz
lrwxrwxrwx 1 root root 18 Nov 15 23:50 m3.bin.xz -> ../hw2.0/m3.bin.xz
lrwxrwxrwx 1 root root 21 Nov 15 23:50 regdb.bin.xz -> ../hw2.0/regdb.bin.xz
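To rule out a dangling link in the hw2.1 directory, the listing above can be checked mechanically. This is only a sketch; `broken_links` is a hypothetical helper name, and it only tests that each symlink target exists, not that the firmware content is right for the chip.

```shell
# List symlinks in a directory whose targets do not exist.
# Prints nothing when every link resolves.
# broken_links is a hypothetical helper name.
broken_links() {
    for f in "$1"/*; do
        # -L: the entry is a symlink; ! -e: the file it points to is missing
        if [ -L "$f" ] && [ ! -e "$f" ]; then
            echo "$f"
        fi
    done
}

# Intended use inside sys-net:
#   broken_links /usr/lib/firmware/ath11k/WCN6855/hw2.1
```

Empty output means all four links resolve to files in hw2.0.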

Also, the logs don't really look like the problem is a missing firmware file. That shouldn't lead to a 90s delay between these two lines:

[    2.903757] mhi mhi0: Power on setup success
[   95.397747] mhi mhi0: MHI did not load image over BHI, ret: -5
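The size of that gap can be computed directly from the bracketed timestamps. A small sketch, with `dmesg_gap` as a made-up helper name:

```shell
# Print the gap in seconds between the first and last timestamped dmesg
# lines on stdin, using the "[ seconds.micros]" prefix of each line.
# dmesg_gap is a hypothetical helper name.
dmesg_gap() {
    awk '
        match($0, /^\[ *[0-9]+\.[0-9]+\]/) {
            # Strip the surrounding brackets; "+ 0" forces a numeric value
            ts = substr($0, RSTART + 1, RLENGTH - 2) + 0
            if (!seen) { first = ts; seen = 1 }
            last = ts
        }
        END { if (seen) printf "%.3f\n", last - first }
    '
}

# Intended use inside sys-net:
#   sudo dmesg | grep 'mhi mhi0' | head -n 3 | dmesg_gap
```

Fed the two lines quoted above, it prints 92.494, i.e. a wait of roughly a minute and a half before the BHI load is reported as failed.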

You may check in your sys-net's msi_bus if it's enabled.

Can you please give me some more details/commands on how to check this? (Sorry, I'm still new to Qubes and I don't have much experience with debugging this kind of driver issue.) I didn't find anything related to MSI in the Qube Manager, and a Google search for qubes "msi_bus" also didn't return anything useful.

In the meantime I've also booted an FC37 live ISO (also using kernel 6.0), and there it is working. The first notable difference in dmesg is that the live system says MSI vectors: 32, while sys-net in Qubes only says MSI vectors: 1. This may confirm that it is indeed related to MSI.
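To compare the two systems without reading the whole log, the vector count can be pulled out of the ath11k_pci line. A sketch only; `msi_vectors` is a hypothetical helper name, and the pattern assumes the message format shown in the sys-net dmesg earlier in this thread.

```shell
# Extract the number from ath11k_pci's "MSI vectors: N" dmesg line (stdin).
# Prints nothing if no such line is present.
# msi_vectors is a hypothetical helper name.
msi_vectors() {
    sed -n 's/.*ath11k_pci.*MSI vectors: \([0-9][0-9]*\).*/\1/p'
}

# Intended use, on both the live system and sys-net:
#   sudo dmesg | msi_vectors
```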

Of course. Just check in /sys/bus/pci/devices/0000:00:09.0/msi_bus

if you find 1 there it is definitely enabled and vice versa.
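As a sketch, the check can be wrapped so the value is spelled out; `msi_bus_state` is a made-up name. One caveat, as far as I understand the attribute: msi_bus only says whether MSI is allowed for the device, not whether the driver actually enabled it or how many vectors it was granted.

```shell
# Read a msi_bus attribute file and translate its value.
# The path is an argument so the function can be tried on any file.
# msi_bus_state is a hypothetical helper name.
msi_bus_state() {
    value=$(cat "$1") || return 1
    case "$value" in
        1) echo "MSI allowed" ;;
        0) echo "MSI disallowed" ;;
        *) echo "unexpected value: $value" ;;
    esac
}

# Intended use inside sys-net:
#   msi_bus_state /sys/bus/pci/devices/0000:00:09.0/msi_bus
```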

There is a 1 there, so it is enabled.

Well, I'm sorry, but I can't be of more help then. I'm still confused that there's no "Kernel driver in use" line in your lspci output, and "Enable-" should definitely indicate that MSI is disabled. I have never heard of the opposite.

I got another lspci output within the first 90 seconds, and there the ath11k_pci driver is connected, so I assume that the driver only disconnects due to an error:

00:08.0 0280: 17cb:1103 (rev 01)
        Subsystem: 105b:e0c4
        Physical Slot: 8
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0
        Interrupt: pin ? routed to IRQ 75
        Region 0: Memory at f2000000 (64-bit, non-prefetchable) [size=2M]
        Capabilities: [40] Power Management version 3
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit-
                Address: fee51000  Data: 0700
        Capabilities: [70] Express (v2) Endpoint, MSI 00
                DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s unlimited, L1 unlimited
                        ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 75.000W
                DevCtl: CorrErr- NonFatalErr- FatalErr- UnsupReq-
                        RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
                        MaxPayload 128 bytes, MaxReadReq 512 bytes
                DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr+ TransPend-
                LnkCap: Port #0, Speed 8GT/s, Width x1, ASPM L0s L1, Exit Latency L0s <1us, L1 <64us
                        ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
                LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk-
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 8GT/s (ok), Width x1 (ok)
                        TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
                DevCap2: Completion Timeout: Range ABCD, TimeoutDis+ NROPrPrP- LTR+
                         10BitTagComp- 10BitTagReq- OBFF Not Supported, ExtFmt- EETLPPrefix-
                         EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
                         FRS- TPHComp+ ExtTPHComp-
                         AtomicOpsCap: 32bit- 64bit- 128bitCAS-
                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR- OBFF Disabled,
                         AtomicOpsCtl: ReqEn-
                LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete+ EqualizationPhase1+
                         EqualizationPhase2+ EqualizationPhase3+ LinkEqualizationRequest-
                         Retimer- 2Retimers- CrosslinkRes: unsupported
        Kernel driver in use: ath11k_pci
        Kernel modules: ath11k_pci

This looks better, but still

Kernel driver in use: ath11k_pci

and from the beginning

[user@sys-net ~]$ lsmod|grep -iP 'ath11k|mhi'
ath11k_pci 24576 0

Does this produce anything meaningful?

$ sudo modprobe ath11k_pci

In the beginning lsmod says that ath11k_pci is in use; only later, after a timeout, does it become unused (and the "Kernel driver in use:" line in lspci disappears as well).

Removing and reloading the ath11k_pci module in sys-net freezes the whole system (including dom0), so that it no longer reacts to keyboard or mouse input and needs a hard power-off to recover. But it isn't a kernel panic: the clock in the Xfce panel is still updating, so something is still running.
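The moment the driver lets go of the device can also be watched through sysfs, where a bound device has a `driver` symlink in its directory. A rough sketch, with `bound_driver` as a hypothetical helper name:

```shell
# Print the name of the driver currently bound to a PCI device, given the
# device's sysfs directory, or "none" if no driver is bound.
# bound_driver is a hypothetical helper name.
bound_driver() {
    if [ -L "$1/driver" ]; then
        # The symlink points at the driver's directory; keep the last part
        basename "$(readlink "$1/driver")"
    else
        echo "none"
    fi
}

# Intended use inside sys-net, polling every 5 seconds:
#   while sleep 5; do bound_driver /sys/bus/pci/devices/0000:00:09.0; done
```

This only reads sysfs and does not unload or reload the module, so it should not trigger the freeze.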

No idea indeed (modprobe in dom0?). Maybe starting to research from here would help:

https://wireless.wiki.kernel.org/en/users/drivers/ath11k

Supported Devices
IPQ8074 hw2.0 (v5.6)
IPQ6018 hw1.0 (v5.10)
QCA6390 hw2.0 (v5.10)
QCN9074 hw1.0 (v5.14)
WCN6855 hw2.0 (v5.17)
WCN6855 hw2.1 (v5.17)

https://bugzilla.kernel.org/show_bug.cgi?id=210923