AMD iGPU passthrough attempt

yann · October 6, 2021, 10:30pm

My current status (with the setup described in these salt recipes) shows in the VM logs:

[2021-10-06 22:38:28] [    3.292678] [drm] BIOS signature incorrect 0 0
[2021-10-06 22:38:28] [    3.292699] amdgpu 0000:00:05.0: Invalid PCI ROM data signature: expecting 0x52494350, got 0xcb03aa55
[2021-10-06 22:38:28] [    3.342064] [drm] BIOS signature incorrect 0 0
[2021-10-06 22:38:28] [    3.342169] [drm:amdgpu_get_bios [amdgpu]] *ERROR* Unable to locate a BIOS ROM
[2021-10-06 22:38:28] [    3.342209] amdgpu 0000:00:05.0: amdgpu: Fatal error during GPU init
[2021-10-06 22:38:28] [    3.342284] amdgpu 0000:00:05.0: amdgpu: amdgpu: finishing device.

the stubdom log shows:

[2021-10-06 21:29:01] pcifront pci-0: Installing PCI frontend
[2021-10-06 21:29:01] xen:swiotlb_xen: Warning: only able to allocate 4 MB for software IO TLB
[2021-10-06 21:29:01] software IO TLB: mapped [mem 0x04c00000-0x05000000] (4MB)
[2021-10-06 21:29:01] written 110 bytes to vchan
[2021-10-06 21:29:01] pcifront pci-0: Creating PCI Frontend Bus 0000:00
[2021-10-06 21:29:01] pcifront pci-0: PCI host bridge to bus 0000:00
[2021-10-06 21:29:01] pci_bus 0000:00: root bus resource [io  0x0000-0xffff]
[2021-10-06 21:29:01] pci_bus 0000:00: root bus resource [mem 0x00000000-0xffffffffffff]
[2021-10-06 21:29:01] pci_bus 0000:00: root bus resource [bus 00-ff]
[2021-10-06 21:29:01] pci 0000:00:00.0: [1002:1636] type 00 class 0x030000
[2021-10-06 21:29:01] pci 0000:00:00.0: reg 0x10: [mem 0xb0000000-0xbfffffff 64bit pref]
[2021-10-06 21:29:01] pci 0000:00:00.0: reg 0x18: [mem 0xc0000000-0xc01fffff 64bit pref]
[2021-10-06 21:29:01] pci 0000:00:00.0: reg 0x20: [io  0xe000-0xe0ff]
[2021-10-06 21:29:01] pci 0000:00:00.0: reg 0x24: [mem 0xfe400000-0xfe47ffff]
[2021-10-06 21:29:01] pci 0000:00:00.0: reg 0x30: [mem 0x000c0000-0x000dffff pref]
[2021-10-06 21:29:01] pcifront pci-0: claiming resource 0000:00:00.0/0
[2021-10-06 21:29:01] pcifront pci-0: claiming resource 0000:00:00.0/2
[2021-10-06 21:29:01] pcifront pci-0: claiming resource 0000:00:00.0/4
[2021-10-06 21:29:01] pcifront pci-0: claiming resource 0000:00:00.0/5
[2021-10-06 21:29:01] pcifront pci-0: claiming resource 0000:00:00.0/6
[2021-10-06 21:29:01] pci 0000:00:00.0: can't claim BAR 6 [mem 0x000c0000-0x000dffff pref]: address conflict with Reserved [mem 0x000a0000-0x000fffff]
[2021-10-06 21:29:01] pcifront pci-0: Could not claim resource 0000:00:00.0/6! Device offline. Try using e820_host=1 in the guest config.

Could it be that the ROM should be accessible from BAR 6 ?

Is the Try using e820_host=1 in the guest config suggestion useful for us ? From the code it looks like the pci-e820-host feature defaults to 1 already (but then, seeing the libvirt config for sys-gui-gpu would help to confirm where we stand).

Since the ROM apparently can’t be read from the GPU, I booted on a Debian Live stick, and was able to extract it from /sys (though I’m not yet sure it is a pristine ROM image and not a shadow RAM copy that would have been patched, eg. by the EFI driver). It does have the proper signature where the kernel’s pci/rom.c is looking for it. To use this ROM I tried this patch to the pci.xml template:

--- pci.xml.orig	2021-10-05 00:47:56.599213557 +0200
+++ pci.xml	2021-10-06 21:48:24.315969520 +0200
@@ -12,6 +12,9 @@
             slot="0x{{ device.device }}"
             function="0x{{ device.function }}" />
     </source>
+{% if options.get('vga-rom', False) %}
+    <rom bar="on" file="{{ options.get('vga-rom', '') }}" />
+{% endif %}
 </hostdev>
 
 {# vim : set ft=jinja ts=4 sts=4 sw=4 et : #}

… and hacked by hand a vga-rom option in qubes.xml, with:

<option name='vga-rom'>/path/to/renoir.rom</option>

It looks like the <rom> element is indeed parsed (if I place it in a wrong place, eg. inside <source> I do see an error through journalctl)… but that does not change the system’s behaviour.
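For anyone wanting to repeat the extraction from /sys mentioned above, here is a minimal Python sketch. It assumes the usual sysfs convention for the `rom` attribute (write `1` to enable it, read it, write `0` to disable it again) and has to run as root. The helper names and the `sysfs_root` parameter are made up for illustration, and as said above the result may well be a shadow copy rather than a pristine image.

```python
from pathlib import Path


def sysfs_rom_path(bdf, sysfs_root="/sys"):
    """Path of the sysfs 'rom' attribute for a PCI device such as 0000:07:00.0."""
    return Path(sysfs_root) / "bus" / "pci" / "devices" / bdf / "rom"


def dump_pci_rom(bdf, out_file, sysfs_root="/sys"):
    """Enable the ROM attribute, copy its contents to out_file, disable it again."""
    rom = sysfs_rom_path(bdf, sysfs_root)
    rom.write_text("1")  # the attribute must be enabled before it can be read
    try:
        data = rom.read_bytes()
    finally:
        rom.write_text("0")  # always disable again, even if the read failed
    Path(out_file).write_bytes(data)
    return len(data)


if __name__ == "__main__":
    import sys

    # Hypothetical invocation, as root:
    #   python3 dump_rom.py 0000:07:00.0 renoir.rom
    if len(sys.argv) == 3:
        size = dump_pci_rom(sys.argv[1], sys.argv[2])
        print(f"wrote {size} bytes to {sys.argv[2]}")
```

The `sysfs_root` argument is only there so the function can be tried against a fake directory tree before pointing it at the real /sys.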

Anyone with a clue ?

marmarek · October 6, 2021, 11:42pm

yann:

It looks like the <rom> element is indeed parsed (if I place it in a wrong place, eg. inside <source> I do see an error through journalctl )… but that does not change the system’s behaviour.

I’m pretty sure the option+file is not transferred to the qemu in stubdomain. And indeed the “rom” option is documented as QEMU/KVM only.

marmarek · October 7, 2021, 12:02am

This is about the stubdom’s address space, not target domain’s one. So, the e820_host=1 should be added to the stubdom’s “config”. I quote “config”, because it doesn’t really exist, it gets dynamically created, and e820_host setting is not there.
If you are ok with rebuilding xen(-libs) package, you can add libxl_defbool_set(&dm_config->b_info.u.pv.e820_host, true) somewhere there.

Take a look also at IGD passthrough fix by gorbak25 · Pull Request #29 · QubesOS/qubes-vmm-xen-stubdom-linux · GitHub - it’s about (I think) very similar issue for Intel graphics.

yann · October 7, 2021, 7:51pm

Ah, the case of the file should have been pretty clear, in fact

That sounds interesting, indeed. This may shed some light on a fact for which I didn’t have an explanation yet: this video card in Qubes does not expose its ROM through sysfs, although the Debian Live kernel does, and the dom0 kernel does see it:

[    1.201712] pci 0000:07:00.0: Video device with shadowed ROM at [mem 0x000c0000-0x000dffff]
...
[    3.640623] amdgpu 0000:07:00.0: amdgpu: Fetched VBIOS from ROM BAR
...
[  105.927621] ACPI: video: Video Device [VGA] (multi-head: yes  rom: no  post: no)

(which BTW confirms BAR 6 points to the ROM, and the stubdom error does explain the amdgpu one)

On Debian I get:

[    1.244578] ACPI: Video Device [VGA] (multi-head: yes  rom: no  post: no)
...
[    2.370854] amdgpu 0000:07:00.0: amdgpu: Fetched VBIOS from VFCT

(whereas for the dGPU it does fetch the VBIOS from ROM BAR, claiming ACPI VFCT table present but broken. On Qubes I blacklisted it for now to avoid a boot loop, so no comparison here)

Not really sure why we have a difference here (there is anyway no explicit mention of a reason to expose the ROM in sysfs or not), and what the impact is. Probably useful to dig, as it could be the key to get (next) the dGPU to work…

Seems worth it

yann · October 7, 2021, 10:08pm

Added this (to the current 4.14.2, so the impacted file is in a different place, but that function looked identical enough), and then, seeing no change, added a LOGED() trace, which ends up in libxl-driver.log and shows that it gets applied to domains 1, 3 and 5 (since 5 does not appear once sys-gui-gpu is disabled, I guess it’s the one). But well, pcifront does not seem to check whether we did that before making the suggestion.

Following the lead of the behaviour difference from the Debian kernel with VBIOS fetching, I see that VFCT is skipped if it does not appear in ACPI table. Sure enough it is not reported by the 5.13.13 kernel in dom0 (in “standard” boot, not sys-gui-gpu) … but it did appear in a 5.10.47 log captured on Sep 11th (which was a “standard” boot too).
So I tried to boot that 5.10.47 kernel with sys-gui-gpu and “normal” modes, and sadly it gets no VFCT table either. Has there been a change in Xen since Sep 11th, which would account for that ? At first sight, vmm-xen last changed on Aug 25th, but then maybe it was still only in testing ? There were some AMD/IOMMU and ACPI related changes for XSA-378… I could try to revert those patches.

marmarek · October 7, 2021, 11:20pm

One more idea: see https://github.com/QubesOS/qubes-vmm-xen/blob/xen-4.14/patch-libxl-automatically-enable-gfx_passthru-if-IGD-is-as.patch. It enables “gfx_passthru” option for Intel graphics. It (among other things) grants stubdom access to 0xa0000-0xc0000 address ranges - see libxl__grant_vga_iomem_permission() function in libxl_pci.c file. Maybe a similar thing is needed for AMD too?
A quick and dirty way to test this hypothesis would be applying a patch like this:
https://gist.github.com/marmarek/3b65652bbfc58615d2b880643f24d93a
(totally untested, things may explode, don’t blame me for velociraptors attack)

yann · October 8, 2021, 6:01pm

This seems to be against mainline Xen, Qubes already has a patch that handles vbios at 0xc0000, so I felt that part was not needed and only applied the remaining 2. The libxl log does show gfx_passthru and gfx_passthru_kind set to igd, and the dm log still reports the address conflict.

Reading the libxl_dm code, it looks like it does nothing special with gfx_passthru, and with this patch we neutralize all it’s doing for igd. On the qemu side I can’t find any special flag for non-igd GPUs. I feel something is missing.

Reading the gfx_passthru stuff in xl.cfg.5, I understand that when a GPU can be passed through without it, it will be secondary in the VM - I’d think that will not work as expected for sys-gui-gpu, so it would have to be set whatever the GPU vendor/model, right ?

yann · October 9, 2021, 9:11am

Trying to step back a little, IIUC the core issue is the GPU driver in the guest not having access to the VBIOS ROM. It has several ways of accessing it, among which it prefers the ACPI ATRM and VFCT tables (but then the guest gets a Xen-forged ACPI table), and reading the ROM BAR (which is thus our focus here).

Now qemu seems to have several ways of exposing the vbios through the ROM BAR. Notably, pci_assign_dev_load_option_rom(), which is one of the focuses of the IGD patch you mentioned, reads it from /sys; using the romfile= parameter seems to be an alternative, and since we’re talking about code in xen_pt_load_rom.c it may have a chance of being useful even outside the KVM case.

It looks like we have several families of solutions, including:

- full mmap-like passthrough, causing reads of the ROM BAR in the guest to result in reads of the ROM BAR in dom0 (which is what we’re trying to do right now)
- providing the ROM data to qemu, so it can emulate the ROM BAR (which is possibly simpler than using passthrough, and could provide a viable long-term solution), where we have 2 distinct problems:
1. get the VBIOS ROM data, with 2 options:
  - get it from /sys in dom0 (which in my particular case could be linked to ACPI VFCT not being visible in dom0, which looks like a problem we could overcome, as the Qubes kernel was seeing this VFCT table a couple of weeks ago)
  - get it from a file
2. make it available to qemu in stubdom
  - provide to qemu with romfile=
  - get qemu to see the rom in /sys, which I’m not sure would provide any advantage over romfile=

A low-hanging fruit, which would provide a fallback for the (apparently many) cases where reading the ROM requires more work, would be to pass romfile= from a file, and we could fix the more difficult problem from there.

yann · October 10, 2021, 5:20pm

Diving more in the qemu hw/xen code, the pci_assign_dev_load_option_rom() code will only run, through get_vgabios(), if igd-passthru has been set, so I’m trying without your patch commenting it out.

Unfortunately when not set, even though an error seems to be sent through QAPI, none gets to stderr and we can’t see this in the logs, so I’m adding a couple of XEN_PT_LOG calls there.

Iterating on this to assert what path in qemu is actually taken is quite painful though, with all stubdom being rebuilt on each make vmm-xen-stubdom-linux-dom0 (like everything gets rebuilt when asking make qubes-dom0). Isn’t there a simple “rebuild only changed stuff” feature in the builder ?

marmarek · October 10, 2021, 8:41pm

qubes-builder doesn’t give you this option, but to iterate quickly, you can easily clone https://github.com/qubesos/qubes-vmm-xen-stubdom-linux directly and build from there (see README).

yann · October 18, 2021, 10:04pm

Hm, time flies and I still have not found enough of it to do all the tests I had in mind for this answer… so it may feel a bit incomplete…

By hacking the xen_pt_realize() test that checks for a hardcoded PCI address (bus/device/function) for the IGD, preventing access to xen_pt_setup_vga() to anything not on 0000:00:02.0 (apparently compared with the address in the stubdom, where my iGPU is on 0000:00:00.0 - and I’m wondering why an IGP would get such special treatment that it would not appear as 0000:00:00.0 too in a stubdom), I can see qemu (expectedly) failing to get the vbios from sysfs, and then happily copying it from memory, getting to the Legacy VBIOS registered trace from xen_pt_direct_vbios_copy().

I find that slightly disturbing, after the can't claim BAR 6 message - but then, it (surprisingly?) does not bother to check for any magic number (nor does the /sys/ code path, though in this case modern kernels do their own checks, IIRC).

As for the if(dev->romfile in pci_assign_dev_load_option_rom() I cannot see how it could result in the relevant pci_register_bar() call. So I went forward with hardcoding my video rom in the code for a test… and it turns out the amdgpu driver still prints the same Invalid PCI ROM data signature (with the same got 0xcb03aa55 which in memory spells starting with 0x55 0xaa … which happens to be the 2-byte magic for the BIOS ROM … which I find disturbing but could not make anything of it for now).

To make sure of what gets read from /dev/mem I added a check for the 0x55 0xaa magic number, and it indeed catches what appears not to be a BIOS ROM, starting with 0x0000 - obviously I’ll have to double-check this, dump more memory, and see how this results in amdgpu finding that signature.

Slow progress, and I again won’t have any time for this until next weekend

yann · October 18, 2021, 10:17pm

Well, the README does not tell about make full, whereas the images generated by make all do not appear to be used (at least the xen.xml template references the “full” version). Maybe this README would benefit from a bit more info ?

Also, building such packages separately, although it avoids a full rebuild of everything, requires installing specific qubes devel packages, which ideally are only installed in a chroot to make sure they don’t pollute - or in a separate VM, but having separate temporary VMs to build each such package separately is starting to be heavy.
Maybe I’ll end up resuming my experiments with ISAR first … sooo many nested projects and soo little time

marmarek · October 19, 2021, 12:54am

It is used by default. The reference to “full” version you’ve found is an alternative path (overriding the default) that is used only for very specific configs (with USB or audio passthrough via stubdom).

yann · November 15, 2021, 8:16pm

Finally I settled on disabling the build of the full version to cut build time in half… and enabling parallel stubdom builds to divide it further by the number of vcpus. At 3 minutes per build, iterations started to be reasonable again.

yann · November 15, 2021, 8:58pm

As a first PoC I started with compiling my extracted ROM as static data…

… but to get it loaded at all I also had to revert this patch hunk which assumes that previous code creates a proper shadow copy, which is probably not the case here (or is it ?).

Now my stubdom seems to expose a VGA device with rombar, showing…

[2021-11-14 16:47:43] [00:05.0] xen_pt_realize: Assigning real physical device 07:00.0 to devfn 0x28
[2021-11-14 16:47:43] [00:05.0] xen_pt_realize:  real_device = 0000:07:00.0
[2021-11-14 16:47:43] [00:05.0] xen_pt_realize: Assigning VGA (passthru=1)...
[2021-11-14 16:47:43] [00:05.0] xen_pt_setup_vga: Legacy VBIOS imported
[2021-11-14 16:47:43] [00:05.0] xen_pt_register_regions: IO region 0 registered (size=0x10000000 base_addr=0xb0000000 type: 0x4)
[2021-11-14 16:47:43] [00:05.0] xen_pt_register_regions: IO region 2 registered (size=0x00200000 base_addr=0xc0000000 type: 0x4)
[2021-11-14 16:47:43] [00:05.0] xen_pt_register_regions: IO region 4 registered (size=0x00000100 base_addr=0x0000e000 type: 0x1)
[2021-11-14 16:47:43] [00:05.0] xen_pt_register_regions: IO region 5 registered (size=0x00080000 base_addr=0xfe400000 type: 0)
[2021-11-14 16:47:43] [00:05.0] xen_pt_register_regions: Expansion ROM registered (size=0x00020000 base_addr=0x000c0000)
[2021-11-14 16:47:43] [00:05.0] xen_pt_config_reg_init: Offset 0x0010 mismatch! Emulated=0x0000, host=0xb000000c, syncing to 0xb000000c.
[2021-11-14 16:47:43] [00:05.0] xen_pt_config_reg_init: Offset 0x0018 mismatch! Emulated=0x0000, host=0xc000000c, syncing to 0xc000000c.
[2021-11-14 16:47:43] [00:05.0] xen_pt_config_reg_init: Offset 0x0020 mismatch! Emulated=0x0000, host=0xe001, syncing to 0xe001.
[2021-11-14 16:47:43] [00:05.0] xen_pt_config_reg_init: Offset 0x0024 mismatch! Emulated=0x0000, host=0xfe400000, syncing to 0xfe400000.
[2021-11-14 16:47:43] [00:05.0] xen_pt_config_reg_init: Offset 0x0030 mismatch! Emulated=0x0000, host=0xc0002, syncing to 0x0002.
[2021-11-14 16:47:43] [00:05.0] xen_pt_config_reg_init: Offset 0x0052 mismatch! Emulated=0x0000, host=0x0003, syncing to 0x0003.
[2021-11-14 16:47:43] [00:05.0] xen_pt_config_reg_init: Offset 0x00a2 mismatch! Emulated=0x0000, host=0x0084, syncing to 0x0080.
[2021-11-14 16:47:43] [00:05.0] xen_pt_config_reg_init: Offset 0x0068 mismatch! Emulated=0x0000, host=0x8fa1, syncing to 0x8fa1.
[2021-11-14 16:47:43] [00:05.0] xen_pt_config_reg_init: Offset 0x0076 mismatch! Emulated=0x0000, host=0x1104, syncing to 0x1104.
[2021-11-14 16:47:43] [00:05.0] xen_pt_pci_intx: intx=1
[2021-11-14 16:47:43] [00:05.0] xen_pt_realize: Real physical device 07:00.0 registered successfully

… but that does not seem to impress sys-gui-gpu’s amdgpu driver, at all, it still claims:

[2021-11-14 16:47:47] [    2.656523] amdgpu: Topology: Add CPU node
[2021-11-14 16:47:47] [    2.656616] amdgpu 0000:00:05.0: vgaarb: deactivate vga console
[2021-11-14 16:47:47] [    2.657625] [drm] initializing kernel modesetting (RENOIR 0x1002:0x1636 0x1462:0x12AC 0xC6).
[2021-11-14 16:47:47] [    2.657651] amdgpu 0000:00:05.0: amdgpu: Trusted Memory Zone (TMZ) feature disabled as experimental (default)
[2021-11-14 16:47:47] [    2.657678] [drm] register mmio base: 0xF1200000
[2021-11-14 16:47:47] [    2.657688] [drm] register mmio size: 524288
[2021-11-14 16:47:47] [    2.658964] [drm] add ip block number 0 <soc15_common>
[2021-11-14 16:47:47] [    2.658977] [drm] add ip block number 1 <gmc_v9_0>
[2021-11-14 16:47:47] [    2.658987] [drm] add ip block number 2 <vega10_ih>
[2021-11-14 16:47:47] [    2.658998] [drm] add ip block number 3 <psp>
[2021-11-14 16:47:47] [    2.659008] [drm] add ip block number 4 <smu>
[2021-11-14 16:47:47] [    2.659018] [drm] add ip block number 5 <gfx_v9_0>
[2021-11-14 16:47:47] [    2.659028] [drm] add ip block number 6 <sdma_v4_0>
[2021-11-14 16:47:47] [    2.659039] [drm] add ip block number 7 <dm>
[2021-11-14 16:47:47] [    2.659049] [drm] add ip block number 8 <vcn_v2_0>
[2021-11-14 16:47:47] [    2.659059] [drm] add ip block number 9 <jpeg_v2_0>
[2021-11-14 16:47:47] [    2.701134] [drm] BIOS signature incorrect 0 0
[2021-11-14 16:47:47] [    2.701152] amdgpu 0000:00:05.0: Invalid PCI ROM data signature: expecting 0x52494350, got 0xcb03aa55
[2021-11-14 16:47:47] [    2.742791] [drm] BIOS signature incorrect 0 0
[2021-11-14 16:47:47] [    2.742881] [drm:amdgpu_get_bios [amdgpu]] *ERROR* Unable to locate a BIOS ROM
[2021-11-14 16:47:47] [    2.742898] amdgpu 0000:00:05.0: amdgpu: Fatal error during GPU init
[2021-11-14 16:47:47] [    2.742911] amdgpu 0000:00:05.0: amdgpu: amdgpu: finishing device.

… so it may well be that this ROM is still not provided to the VM where the driver is looking for it (I’m specifically double-checking that this 0x55 0xaa BIOS magic is there)
@marmarek, will gladly accept more ideas at this point

As I’m having doubts (from Qubes 4.0.4 era) that the 5.4 default VM kernel would be able to properly support this hardware anyway, and since that really seems to be the most recent VM kernel around, I also tried to let sys-gui-gpu boot the fc33-provided 5.14 kernel (through qvm-prefs sys-gui-gpu kernel ""). In that case, the amdgpu driver does not even seem to be loaded, and sys-gui-gpu does not appear to start well enough for the Qubes agent to start, and it gets killed soon – the reason from kernel logs being lack of blkfront driver, obviously it cannot start this way without an enhanced initramfs.
Is there really no way to tell dracut not to omit any kernel hardware module ? I can’t believe it but no such thing appears to be documented

For reference:

Edit: I’ve started to doubt whether the fc33 ramdisk is indeed correctly generated at all, it should include the proper xen block drivers, right ? And a small step back allowed me to see it was kernel-latest-qubes-vm I was really looking for – though it does not help with the PCI ROM. Back to digging

yann · November 20, 2021, 7:24pm

It looks like all checks for e820_host in libxl_x86.c are in fact conditioned on b_info->type == LIBXL_DOMAIN_TYPE_PV, which would explain why it has no impact on an HVM. According to the commit introducing e820_host that’s just how it is, “being a PV guest” is a prerequisite.

That said, it would seem that dom0 gets the host e820 map, and as I understand it that 0x000c0000-0x000dffff range does lie in the same reserved region (well, except if dom0 does not get the real BIOS e820 map – but hypervisor.log does not seem to dump it either, and this message during the series review seems to imply that dom0 indeed shows the host’s e820):

[    0.000000] BIOS-provided physical RAM map:
[    0.000000] Xen: [mem 0x0000000000000000-0x000000000009ffff] usable
[    0.000000] Xen: [mem 0x00000000000a0000-0x00000000000fffff] reserved
[    0.000000] Xen: [mem 0x0000000000100000-0x0000000009bfefff] usable

(but then, I was only poking around with a fresh look on those previous attempts, that should not get in the way of providing explicit expansion ROM data)

yann · November 21, 2021, 11:05am

Thinking twice about it: I thought it would be expected to see the PCI devices’ physical addresses protected from the OS by being declared in reserved regions. However, if we compare the ranges of the different BARs:

[2021-11-21 00:13:11] pci 0000:00:00.0: [1002:1636] type 00 class 0x030000
[2021-11-21 00:13:11] pci 0000:00:00.0: reg 0x10: [mem 0xb0000000-0xbfffffff 64bit pref]
[2021-11-21 00:13:11] pci 0000:00:00.0: reg 0x18: [mem 0xc0000000-0xc01fffff 64bit pref]
[2021-11-21 00:13:11] pci 0000:00:00.0: reg 0x20: [io  0xe000-0xe0ff]
[2021-11-21 00:13:11] pci 0000:00:00.0: reg 0x24: [mem 0xfe400000-0xfe47ffff]
[2021-11-21 00:13:11] pci 0000:00:00.0: reg 0x30: [mem 0x000c0000-0x000dffff pref]
[2021-11-21 00:13:11] pcifront pci-0: claiming resource 0000:00:00.0/0
[2021-11-21 00:13:11] pcifront pci-0: claiming resource 0000:00:00.0/2
[2021-11-21 00:13:11] pcifront pci-0: claiming resource 0000:00:00.0/4
[2021-11-21 00:13:11] pcifront pci-0: claiming resource 0000:00:00.0/5
[2021-11-21 00:13:11] pcifront pci-0: claiming resource 0000:00:00.0/6
[2021-11-21 00:13:11] pci 0000:00:00.0: can't claim BAR 6 [mem 0x000c0000-0x000dffff pref]: address conflict with Reserved [mem 0x000a0000-0x000fffff]

… with the map provided by the BIOS:

[    0.000000] BIOS-provided physical RAM map:
[    0.000000] Xen: [mem 0x0000000000000000-0x000000000009ffff] usable
[    0.000000] Xen: [mem 0x00000000000a0000-0x00000000000fffff] reserved
[    0.000000] Xen: [mem 0x0000000000100000-0x0000000009bfefff] usable
[    0.000000] Xen: [mem 0x0000000009bff000-0x0000000009ffffff] reserved
[    0.000000] Xen: [mem 0x000000000a000000-0x000000000a1fffff] usable
[    0.000000] Xen: [mem 0x000000000a200000-0x000000000a20cfff] ACPI NVS
[    0.000000] Xen: [mem 0x000000000a20d000-0x00000000a9eaafff] usable
[    0.000000] Xen: [mem 0x00000000a9eab000-0x00000000ab3c8fff] reserved
[    0.000000] Xen: [mem 0x00000000ab3c9000-0x00000000ab419fff] ACPI data
[    0.000000] Xen: [mem 0x00000000ab41a000-0x00000000ab786fff] ACPI NVS
[    0.000000] Xen: [mem 0x00000000ab787000-0x00000000ab787fff] reserved
[    0.000000] Xen: [mem 0x00000000ab788000-0x00000000ab98dfff] ACPI NVS
[    0.000000] Xen: [mem 0x00000000ab98e000-0x00000000ad5fefff] reserved
[    0.000000] Xen: [mem 0x00000000ad5ff000-0x00000000adffffff] usable
[    0.000000] Xen: [mem 0x00000000ae000000-0x00000000afffffff] reserved
[    0.000000] Xen: [mem 0x00000000f0000000-0x00000000f7ffffff] reserved
[    0.000000] Xen: [mem 0x00000000fd000000-0x00000000ffffffff] reserved
[    0.000000] Xen: [mem 0x0000000100000000-0x0000000155bc1fff] usable
[    0.000000] Xen: [mem 0x000000042f340000-0x00000004701fffff] reserved
[    0.000000] Xen: [mem 0x000000fd00000000-0x000000ffffffffff] reserved

… we can see that the 0xfe400000-0xfe47ffff range of BAR 5 is indeed intersecting with the 0x00000000fd000000-0x00000000ffffffff reserved region, but the BAR 0 and BAR 2 ranges fall in an “undeclared” gap between 2 reserved regions. And that difference does not result in different handling of those 3 BARs, whose resources are apparently all successfully claimed.
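To make the comparison above easier to re-check, here is a minimal Python sketch that intersects the memory BARs from the stubdom log with the reserved ranges of the map above. The function names are made up for illustration and the ranges are copied by hand from the two logs. It only says something about the dom0 map, since the stubdom’s own view of the map does not show up in its log.

```python
def overlaps(a, b):
    """True if the inclusive ranges a=(start, end) and b=(start, end) intersect."""
    return a[0] <= b[1] and b[0] <= a[1]


# Reserved ranges, copied from the "BIOS-provided physical RAM map" above.
RESERVED = [
    (0x000A0000, 0x000FFFFF),
    (0x09BFF000, 0x09FFFFFF),
    (0xA9EAB000, 0xAB3C8FFF),
    (0xAB787000, 0xAB787FFF),
    (0xAB98E000, 0xAD5FEFFF),
    (0xAE000000, 0xAFFFFFFF),
    (0xF0000000, 0xF7FFFFFF),
    (0xFD000000, 0xFFFFFFFF),
    (0x42F340000, 0x4701FFFFF),
    (0xFD00000000, 0xFFFFFFFFFF),
]

# Memory BARs as reported by the stubdom kernel (BAR 4 is port I/O, so left out).
BARS = {
    "BAR 0": (0xB0000000, 0xBFFFFFFF),
    "BAR 2": (0xC0000000, 0xC01FFFFF),
    "BAR 5": (0xFE400000, 0xFE47FFFF),
    "BAR 6": (0x000C0000, 0x000DFFFF),
}


def reserved_conflicts(bars, reserved):
    """Map each BAR name to the list of reserved ranges it intersects."""
    return {
        name: [r for r in reserved if overlaps(rng, r)]
        for name, rng in bars.items()
    }


if __name__ == "__main__":
    for name, hits in reserved_conflicts(BARS, RESERVED).items():
        pretty = ", ".join(f"{lo:#x}-{hi:#x}" for lo, hi in hits) or "none"
        print(f"{name}: {pretty}")
```

With these inputs only BAR 5 and BAR 6 intersect a reserved range, which matches the reading above: if the stubdom had the same map, BAR 5 would be expected to conflict as well.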

@marmarek, do you see why BAR 5 would not be detected as a conflict by request_resource_conflict() ? The most obvious would be that the stubdom’s memory map would not match the dom0 one (which I guess the e820_host trick would correct if it was supported for HVM), but then the stubdom kernel is very quiet and does not report its view of the map. Its kconfig does not show a change in default loglevel, and its cmdline is reported as empty, there’s definitely no quiet flag there. How then is it so quiet ?

I also note that pci_claim_resource() would not have to make such a check if we had a shadow copy of the ROM at this point; this could look like a path worth investigating?

yann · November 21, 2021, 5:08pm

Back to the other end of the problem, namely getting to understand why amdgpu is still unable to access the expansion ROM exposed by the stubdom qemu…

Note that I’m starting to consider a new path, which would be to teach amdgpu to load a ROM directly from within the guest (eg. using request_firmware) as a way to advance the PoC, since obviously I’m not getting there as fast as I’d like with the current approaches. Before attempting this, though, there are still a few things that puzzle me and could possibly hint at something:

(traces based on the above-mentioned stubdom qemu patches, and those guest linux patches)

One thing that had been there before my eyes from the start but had not stood out until now:

yann:

[2021-11-14 16:47:43] [00:05.0] xen_pt_config_reg_init: Offset 0x0030 mismatch! Emulated=0x0000, host=0xc0002, syncing to 0x0002.
[2021-11-14 16:47:43] [00:05.0] xen_pt_config_reg_init: Offset 0x0052 mismatch! Emulated=0x0000, host=0x0003, syncing to 0x0003.
[2021-11-14 16:47:43] [00:05.0] xen_pt_config_reg_init: Offset 0x00a2 mismatch! Emulated=0x0000, host=0x0084, syncing to 0x0080.
[2021-11-14 16:47:43] [00:05.0] xen_pt_config_reg_init: Offset 0x0068 mismatch! Emulated=0x0000, host=0x8fa1, syncing to 0x8fa1.
[2021-11-14 16:47:43] [00:05.0] xen_pt_config_reg_init: Offset 0x0076 mismatch! Emulated=0x0000, host=0x1104, syncing to 0x1104.

I’m not really clear yet on why we have those mismatches to start with, but whereas virtually all of them result in syncing the emulated values to the host’s, the first one (offset 0x0030, the expansion ROM one) does not, with 0xc0002 (in the range which is causing those headaches) becoming 0x0002.
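To see what actually changes in that 0x0030 value, here is a small Python sketch decoding it, assuming the standard layout of the PCI Expansion ROM Base Address register (bit 0 is the enable bit, bits 31:11 hold the base address, bits 10:1 are reserved). The function name is made up for illustration, and the sketch says nothing about why qemu drops the address bits.

```python
PCI_ROM_ADDRESS = 0x30             # config space offset of the register
PCI_ROM_ADDRESS_ENABLE = 0x01      # bit 0: ROM address decoding enabled
PCI_ROM_ADDRESS_MASK = 0xFFFFF800  # bits 31:11: ROM base address
PCI_ROM_RESERVED_MASK = 0x000007FE # bits 10:1: reserved


def decode_rom_bar(value):
    """Split a raw Expansion ROM Base Address register value into its fields."""
    return {
        "base": value & PCI_ROM_ADDRESS_MASK,
        "enabled": bool(value & PCI_ROM_ADDRESS_ENABLE),
        "reserved_bits": value & PCI_ROM_RESERVED_MASK,
    }


if __name__ == "__main__":
    # The two values from the xen_pt_config_reg_init trace for offset 0x0030.
    for label, raw in (("host", 0xC0002), ("synced", 0x0002)):
        fields = decode_rom_bar(raw)
        print(
            f"{label}: base={fields['base']:#x} "
            f"enabled={fields['enabled']} "
            f"reserved={fields['reserved_bits']:#x}"
        )
```

Under that layout the host value carries base 0xc0000 with the enable bit clear, and the synced value keeps the same low bits but has no base address left at all.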

A second thing is that qemu claims to expose the BARs for the device at addresses:

[2021-11-21 00:13:11] [00:05.0] xen_pt_register_regions: IO region 0 registered (size=0x10000000 base_addr=0xb0000000 type: 0x4)
[2021-11-21 00:13:11] [00:05.0] xen_pt_register_regions: IO region 2 registered (size=0x00200000 base_addr=0xc0000000 type: 0x4)
[2021-11-21 00:13:11] [00:05.0] xen_pt_register_regions: IO region 4 registered (size=0x00000100 base_addr=0x0000e000 type: 0x1)
[2021-11-21 00:13:11] [00:05.0] xen_pt_register_regions: IO region 5 registered (size=0x00080000 base_addr=0xfe400000 type: 0)
[2021-11-21 00:13:11] [00:05.0] xen_pt_register_regions: Expansion ROM registered (size=0x00020000 base_addr=0x000c0000)

… that closely mimic that of the host:

[    1.104549] pci 0000:07:00.0: [1002:1636] type 00 class 0x030000
[    1.104571] pci 0000:07:00.0: reg 0x10: [mem 0xb0000000-0xbfffffff 64bit pref]
[    1.104586] pci 0000:07:00.0: reg 0x18: [mem 0xc0000000-0xc01fffff 64bit pref]
[    1.104597] pci 0000:07:00.0: reg 0x20: [io  0xe000-0xe0ff]
[    1.104607] pci 0000:07:00.0: reg 0x24: [mem 0xfe400000-0xfe47ffff]
[    1.104624] pci 0000:07:00.0: enabling Extended Tags

… and that in the guest we see noticeably different addresses, which I’d expect to reflect the values set above by qemu, as being physical addresses:

[2021-11-21 00:13:12] [    0.318637] pci 0000:00:05.0: [1002:1636] type 00 class 0x030000
[2021-11-21 00:13:12] [    0.320239] pci 0000:00:05.0: reg 0x10: [mem 0xe0000000-0xefffffff 64bit pref]
[2021-11-21 00:13:12] [    0.322241] pci 0000:00:05.0: reg 0x18: [mem 0xf1000000-0xf11fffff 64bit pref]
[2021-11-21 00:13:12] [    0.324246] pci 0000:00:05.0: reg 0x20: [io  0xc200-0xc2ff]
[2021-11-21 00:13:12] [    0.326236] pci 0000:00:05.0: reg 0x24: [mem 0xf1200000-0xf127ffff]
[2021-11-21 00:13:12] [    0.329241] pci 0000:00:05.0: reg 0x30: [mem 0xf1280000-0xf129ffff pref]
[2021-11-21 00:13:12] [    0.329484] pci 0000:00:05.0: enabling Extended Tags
...
[2021-11-21 00:13:13] [    0.475690] pci 0000:00:05.0: Video device with shadowed ROM at [mem 0x000c0000-0x000dffff]

@marmarek isn’t that one really suspect ?
Is it telling that it’s the shadowed ROM that lives at 0xc0000 and not the ROM itself ?

A third one is the contents that the driver gets in the memory region it maps as being the expansion ROM:

[2021-11-21 00:13:14] [    2.601896] [drm] amdgpu_atrm_get_bios()
[2021-11-21 00:13:14] [    2.601905] [drm] amdgpu_acpi_vfct_bios()
[2021-11-21 00:13:14] [    2.601914] [drm] igp_read_bios_from_vram()
[2021-11-21 00:13:15] [    2.650226] [drm] BIOS signature incorrect 0 0
[2021-11-21 00:13:15] [    2.650272] [drm] amdgpu_read_bios()
[2021-11-21 00:13:15] [    2.650285] amdgpu 0000:00:05.0: pci_map_rom()
[2021-11-21 00:13:15] [    2.650296] amdgpu 0000:00:05.0: pci_map_rom: start=0000000017dcda60, size=20000
[2021-11-21 00:13:15] [    2.650313] amdgpu 0000:00:05.0: pci_enable_rom: shadow copy, nothing to do
[2021-11-21 00:13:15] [    2.650337] amdgpu 0000:00:05.0: PCI ROM @00: aa55 cb03 0000 0000
[2021-11-21 00:13:15] [    2.650379] amdgpu 0000:00:05.0: PCI ROM @16: 0000 0000 001c 5024
[2021-11-21 00:13:15] [    2.650396] amdgpu 0000:00:05.0: Invalid PCI ROM data signature: expecting 0x52494350, got 0xcb03aa55
[2021-11-21 00:13:15] [    2.650415] amdgpu 0000:00:05.0: pci_map_rom: pci_get_rom_size failed
[2021-11-21 00:13:15] [    2.650432] [drm] amdgpu_read_bios_from_rom()
[2021-11-21 00:13:15] [    2.650443] [drm] amdgpu_read_bios_from_rom: amdgpu_asic_read_bios_from_rom failed
[2021-11-21 00:13:15] [    2.650458] [drm] amdgpu_read_disabled_bios()
[2021-11-21 00:13:15] [    2.650469] [drm] igp_read_bios_from_vram()
[2021-11-21 00:13:15] [    2.692561] [drm] BIOS signature incorrect 0 0
[2021-11-21 00:13:15] [    2.692590] [drm] amdgpu_read_platform_bios()
[2021-11-21 00:13:15] [    2.692601] amdgpu 0000:00:05.0: amdgpu: Unable to locate a BIOS ROM

That is, the 0xaa55 expansion ROM signature is really there, but it’s the only value that looks right. At 0x16 where we should have the offset to the VBIOS signature, we get 0x0000, which explains the strange-looking signature causing the extraction to abort. I feel the biggest question here would be, why do we have the first 2 bytes correct, if the rest is just junk ?
I’m especially wondering if there would not be a link with my first question above, as the junk here starts at what should be physical address 0xc0002. Could it be that ROM shadowing gets broken because of this ?
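For reference, the check that produces this message can be sketched in Python as below. It mimics what the mainline kernel’s pci_get_rom_size() does as far as I can tell: check the 0xaa55 word at offset 0, read a 16-bit pointer at byte offset 24 (0x18), and expect the little-endian value 0x52494350 (“PCIR”) where it points. The function names and the two sample images are made up for illustration. The first image only reuses the first four words of the trace and assumes everything else is zero.

```python
import struct

ROM_HEADER_SIGNATURE = 0xAA55  # bytes 0x55 0xAA read as a little-endian word
PCIR_SIGNATURE = 0x52494350    # ASCII "PCIR" read as a little-endian dword
PCIR_POINTER_OFFSET = 0x18     # where the pointer to the data structure lives


def check_rom_header(image):
    """Return "ok" or a short description of the first failed check."""
    (sig,) = struct.unpack_from("<H", image, 0)
    if sig != ROM_HEADER_SIGNATURE:
        return f"bad header signature {sig:#06x}"
    (pds_offset,) = struct.unpack_from("<H", image, PCIR_POINTER_OFFSET)
    (pds_sig,) = struct.unpack_from("<I", image, pds_offset)
    if pds_sig != PCIR_SIGNATURE:
        return f"bad data signature {pds_sig:#010x} at offset {pds_offset:#x}"
    return "ok"


def traced_image():
    """First four words from the trace; the rest is ASSUMED to be zero."""
    return struct.pack("<4H", 0xAA55, 0xCB03, 0x0000, 0x0000) + bytes(0x20 - 8)


def well_formed_image():
    """Smallest made-up image that passes both checks."""
    image = bytearray(0x20)
    struct.pack_into("<H", image, 0, ROM_HEADER_SIGNATURE)
    struct.pack_into("<H", image, PCIR_POINTER_OFFSET, 0x1C)
    image[0x1C:0x20] = b"PCIR"
    return bytes(image)


if __name__ == "__main__":
    print("traced image:     ", check_rom_header(traced_image()))
    print("well-formed image:", check_rom_header(well_formed_image()))
```

With a zero pointer the “data signature” is simply read back from offset 0, which is how the 0x55 0xaa magic ends up in the low bytes of 0xcb03aa55.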

yann · November 24, 2021, 11:43pm

So here I am with a small PoC commit doing precisely this. And indeed there is some good news: the driver does load my VBIOS ROM and appears to like it, but soon things turn out not to be so fine (at first sight unrelated to the VBIOS) with…

a strange-looking MTRR write failure
some trouble with the PSP firmware failing to load, triggering the termination of the amdgpu driver
… and then dereferencing a bad pointer (bug in the error path?) sends the kernel into a panic, possibly inducing a qemu segfault
… which results in an unresponsive Qubes and requires a hard poweroff

[2021-11-23 21:05:52] [    4.297684] amdgpu 0000:00:05.0: amdgpu: Fetched VBIOS from firmware file
[2021-11-23 21:05:52] [    4.297709] amdgpu: ATOM BIOS: 113-RENOIR-025
[2021-11-23 21:05:52] [    4.302046] [drm] VCN decode is enabled in VM mode
[2021-11-23 21:05:52] [    4.302066] [drm] VCN encode is enabled in VM mode
[2021-11-23 21:05:52] [    4.302078] [drm] JPEG decode is enabled in VM mode
[2021-11-23 21:05:52] [    4.302144] [drm] vm size is 262144 GB, 4 levels, block size is 9-bit, fragment size is 9-bit
[2021-11-23 21:05:52] [    4.302181] amdgpu 0000:00:05.0: amdgpu: VRAM: 512M 0x000000F400000000 - 0x000000F41FFFFFFF (512M used)
[2021-11-23 21:05:52] [    4.302217] amdgpu 0000:00:05.0: amdgpu: GART: 1024M 0x0000000000000000 - 0x000000003FFFFFFF
[2021-11-23 21:05:52] [    4.302246] amdgpu 0000:00:05.0: amdgpu: AGP: 267419648M 0x000000F800000000 - 0x0000FFFFFFFFFFFF
[2021-11-23 21:05:52] [    4.302268] mtrr: base(0x430000000) is not aligned on a size(0x20000000) boundary
[2021-11-23 21:05:52] [    4.302289] Failed to add WC MTRR for [000000000998bb55-00000000eb9e681e]; performance may suffer.
[2021-11-23 21:05:52] [    4.302295] [drm] Detected VRAM RAM=512M, BAR=512M
[2021-11-23 21:05:52] [    4.302341] [drm] RAM width 128bits DDR4
[2021-11-23 21:05:52] [    4.302401] [drm] amdgpu: 512M of VRAM memory ready
[2021-11-23 21:05:52] [    4.302412] [drm] amdgpu: 691M of GTT memory ready.
[2021-11-23 21:05:52] [    4.302437] [drm] GART: num cpu pages 262144, num gpu pages 262144
[2021-11-23 21:05:52] [    4.302565] [drm] PCIE GART of 1024M enabled.
[2021-11-23 21:05:52] [    4.302575] [drm] PTB located at 0x000000F400900000
[2021-11-23 21:05:52] [    4.312921] amdgpu 0000:00:05.0: amdgpu: PSP runtime database doesn't exist
[2021-11-23 21:05:52] [    4.342353] [drm] Loading DMUB firmware via PSP: version=0x01010019
[2021-11-23 21:05:52] [    4.346679] [drm] Found VCN firmware Version ENC: 1.14 DEC: 5 VEP: 0 Revision: 20
[2021-11-23 21:05:52] [    4.346723] amdgpu 0000:00:05.0: amdgpu: Will use PSP to load VCN firmware
[2021-11-23 21:05:52] [    4.978736] [drm] reserve 0x400000 from 0xf41f800000 for PSP TMR
[2021-11-23 21:05:52] 
[2021-11-23 21:05:52] Fedora 33 (Thirty Three)
[2021-11-23 21:05:52] Kernel 5.14.15-1.fc32.qubes.x86_64 on an x86_64 (hvc0)
[2021-11-23 21:05:52] 
[2021-11-23 21:05:52] sys-gui-gpu login: [    5.136770] input: dom0: AT Translated Set 2 keyboard as /devices/virtual/input/input7
...
[2021-11-23 21:05:55] [    7.675982] [drm] psp command (0xFFFFFFFF) failed and response status is (0xFFFFFFFF)
[2021-11-23 21:05:55] [    7.676007] [drm:psp_hw_start [amdgpu]] *ERROR* PSP load tmr failed!
[2021-11-23 21:05:55] [    7.676213] [drm:psp_hw_init [amdgpu]] *ERROR* PSP firmware loading failed
[2021-11-23 21:05:55] [    7.676371] [drm:amdgpu_device_fw_loading [amdgpu]] *ERROR* hw_init of IP block <psp> failed -22
[2021-11-23 21:05:55] [    7.676530] amdgpu 0000:00:05.0: amdgpu: amdgpu_device_ip_init failed
[2021-11-23 21:05:55] [    7.676563] amdgpu 0000:00:05.0: amdgpu: Fatal error during GPU init
[2021-11-23 21:05:55] [    7.676578] amdgpu 0000:00:05.0: amdgpu: amdgpu: finishing device.
[2021-11-23 21:05:55] [    7.679044] amdgpu: probe of 0000:00:05.0 failed with error -22
[2021-11-23 21:05:55] [    7.679102] BUG: unable to handle page fault for address: ffffb1f120cdf000
[2021-11-23 21:05:55] [    7.679117] #PF: supervisor write access in kernel mode
[2021-11-23 21:05:55] [    7.679129] #PF: error_code(0x0002) - not-present page
[2021-11-23 21:05:55] [    7.679140] PGD 1000067 P4D 1000067 PUD 11dc067 PMD 0 
[2021-11-23 21:05:55] [    7.679154] Oops: 0002 [#1] SMP NOPTI
[2021-11-23 21:05:55] [    7.679163] CPU: 0 PID: 276 Comm: systemd-udevd Not tainted 5.14.15-1.fc32.qubes.x86_64 #1
[2021-11-23 21:05:55] [    7.679180] Hardware name: Xen HVM domU, BIOS 4.14.3 11/14/2021
[2021-11-23 21:05:55] [    7.679194] RIP: 0010:vcn_v2_0_sw_fini+0x10/0x40 [amdgpu]
[2021-11-23 21:05:55] [    7.679367] Code: 66 f0 83 c2 81 c6 ea 05 00 00 31 c9 4c 89 cf e9 b6 4d ee ff 66 0f 1f 44 00 00 0f 1f 44 00 00 55 48 8b 87 38 17 01 00 48 89 fd <c7> 00 00 00 00 00 e8 d5 d5 f1 ff 48 89 ef e8 2d 20 ff ff 85 c0 74
[2021-11-23 21:05:55] [    7.679402] RSP: 0018:ffffb1f1002cfc30 EFLAGS: 00010206
[2021-11-23 21:05:55] [    7.679414] RAX: ffffb1f120cdf000 RBX: ffff8b4d9a675620 RCX: 0000000000000000
[2021-11-23 21:05:55] [    7.679429] RDX: 000000000000000e RSI: 0000000000000003 RDI: ffff8b4d9a660000
[2021-11-23 21:05:55] [    7.679444] RBP: ffff8b4d9a660000 R08: 000000000000000f R09: 000000008010000f
[2021-11-23 21:05:55] [    7.679459] R10: 0000000040000000 R11: 000000001b99d000 R12: ffff8b4d9a675590
[2021-11-23 21:05:55] [    7.679474] R13: ffff8b4d9a676400 R14: 000000000000000c R15: ffff8b4d813ef36c
[2021-11-23 21:05:55] [    7.679490] FS:  000073bc16d48380(0000) GS:ffff8b4dbcc00000(0000) knlGS:0000000000000000
[2021-11-23 21:05:55] [    7.679507] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[2021-11-23 21:05:55] [    7.679520] CR2: ffffb1f120cdf000 CR3: 0000000004160000 CR4: 0000000000350ef0
[2021-11-23 21:05:55] [    7.679536] Call Trace:
[2021-11-23 21:05:55] [    7.679545]  amdgpu_device_ip_fini.isra.0+0xb6/0x1e0 [amdgpu]
[2021-11-23 21:05:55] [    7.679691]  amdgpu_device_fini_sw+0xe/0x100 [amdgpu]
[2021-11-23 21:05:55] [    7.679835]  amdgpu_driver_release_kms+0x12/0x30 [amdgpu]
[2021-11-23 21:05:55] [    7.679978]  devm_drm_dev_init_release+0x3d/0x60 [drm]
[2021-11-23 21:05:55] [    7.680008]  devres_release_all+0xb8/0x100
[2021-11-23 21:05:55] [    7.680019]  really_probe+0x100/0x310
[2021-11-23 21:05:55] [    7.680029]  __driver_probe_device+0xfe/0x180
[2021-11-23 21:05:55] [    7.680040]  driver_probe_device+0x1e/0x90
[2021-11-23 21:05:55] [    7.680050]  __driver_attach+0xc0/0x1c0
[2021-11-23 21:05:55] [    7.680059]  ? __device_attach_driver+0xe0/0xe0
[2021-11-23 21:05:55] [    7.680070]  ? __device_attach_driver+0xe0/0xe0
[2021-11-23 21:05:55] [    7.680081]  bus_for_each_dev+0x89/0xd0
[2021-11-23 21:05:55] [    7.680090]  bus_add_driver+0x12b/0x1e0
[2021-11-23 21:05:55] [    7.680099]  driver_register+0x8f/0xe0
[2021-11-23 21:05:55] [    7.680109]  ? 0xffffffffc0e7b000
[2021-11-23 21:05:55] [    7.680117]  do_one_initcall+0x57/0x200
[2021-11-23 21:05:55] [    7.680128]  do_init_module+0x5c/0x260
[2021-11-23 21:05:55] [    7.680137]  __do_sys_finit_module+0xae/0x110
[2021-11-23 21:05:55] [    7.680149]  do_syscall_64+0x3b/0x90
[2021-11-23 21:05:55] [    7.680158]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[2021-11-23 21:05:55] [    7.680170] RIP: 0033:0x73bc17ce9edd
[2021-11-23 21:05:55] [    7.680180] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 6b 7f 0c 00 f7 d8 64 89 01 48
[2021-11-23 21:05:55] [    7.680215] RSP: 002b:00007fffa9b51688 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[2021-11-23 21:05:55] [    7.680231] RAX: ffffffffffffffda RBX: 0000602da93e3120 RCX: 000073bc17ce9edd
[2021-11-23 21:05:55] [    7.680246] RDX: 0000000000000000 RSI: 000073bc17e2732c RDI: 0000000000000014
[2021-11-23 21:05:55] [    7.680260] RBP: 0000000000020000 R08: 0000000000000000 R09: 0000602da93e3bb0
[2021-11-23 21:05:55] [    7.680275] R10: 0000000000000014 R11: 0000000000000246 R12: 000073bc17e2732c
[2021-11-23 21:05:55] [    7.680290] R13: 0000602da9338960 R14: 0000000000000007 R15: 0000602da93e4000
[2021-11-23 21:05:55] [    7.680306] Modules linked in: joydev intel_rapl_msr amdgpu(+) intel_rapl_common crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel ip6table_filter ip6table_mangle ip6table_raw ip6_tables iommu_v2 gpu_sched ipt_REJECT i2c_algo_bit nf_reject_ipv4 drm_ttm_helper ttm xt_state xt_conntrack iptable_filter iptable_mangle iptable_raw drm_kms_helper ehci_pci xt_MASQUERADE iptable_nat nf_nat nf_conntrack ehci_hcd cec nf_defrag_ipv6 serio_raw nf_defrag_ipv4 i2c_piix4 ata_generic pata_acpi pcspkr xen_scsiback target_core_mod xen_netback uinput xen_privcmd xen_gntdev drm xen_gntalloc xen_blkback fuse xen_evtchn bpf_preload ip_tables overlay xen_blkfront
[2021-11-23 21:05:55] [    7.876218] CR2: ffffb1f120cdf000
[2021-11-23 21:05:55] [    7.876227] ---[ end trace 36c4552e098fcc4e ]---
[2021-11-23 21:05:55] [    7.876239] RIP: 0010:vcn_v2_0_sw_fini+0x10/0x40 [amdgpu]
[2021-11-23 21:05:55] [    7.876400] Code: 66 f0 83 c2 81 c6 ea 05 00 00 31 c9 4c 89 cf e9 b6 4d ee ff 66 0f 1f 44 00 00 0f 1f 44 00 00 55 48 8b 87 38 17 01 00 48 89 fd <c7> 00 00 00 00 00 e8 d5 d5 f1 ff 48 89 ef e8 2d 20 ff ff 85 c0 74
[2021-11-23 21:05:55] [    7.876439] RSP: 0018:ffffb1f1002cfc30 EFLAGS: 00010206
[2021-11-23 21:05:55] [    7.876451] RAX: ffffb1f120cdf000 RBX: ffff8b4d9a675620 RCX: 0000000000000000
[2021-11-23 21:05:55] [    7.876467] RDX: 000000000000000e RSI: 0000000000000003 RDI: ffff8b4d9a660000
[2021-11-23 21:05:55] [    7.876483] RBP: ffff8b4d9a660000 R08: 000000000000000f R09: 000000008010000f
[2021-11-23 21:05:55] [    7.876500] R10: 0000000040000000 R11: 000000001b99d000 R12: ffff8b4d9a675590
[2021-11-23 21:05:55] [    7.876515] R13: ffff8b4d9a676400 R14: 000000000000000c R15: ffff8b4d813ef36c
[2021-11-23 21:05:55] [    7.876533] FS:  000073bc16d48380(0000) GS:ffff8b4dbcc00000(0000) knlGS:0000000000000000
[2021-11-23 21:05:55] [    7.876551] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[2021-11-23 21:05:55] [    7.876565] CR2: ffffb1f120cdf000 CR3: 0000000004160000 CR4: 0000000000350ef0
[2021-11-23 21:05:55] [    7.876582] Kernel panic - not syncing: Fatal exception
[2021-11-23 21:05:55] [    7.877654] Kernel Offset: 0x1000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
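
About the mtrr line in the log above: the complaint is plain arithmetic on the two values it prints. A small Python check (the helper name is made up for illustration) shows how far the base is from a boundary of the requested size:

```python
def mtrr_misalignment(base: int, size: int) -> int:
    """Return how far base lies past the previous size-aligned boundary (0 if aligned)."""
    if size <= 0:
        raise ValueError("size must be positive")
    return base % size


# Values copied from the "mtrr: base(...) is not aligned" message.
base, size = 0x430000000, 0x20000000
print(hex(mtrr_misalignment(base, size)))  # 0x10000000
```

So the 512 MiB range starts on a 256 MiB boundary only, which is enough for the WC MTRR request to be refused. As the message says, this should only cost performance and is probably not what kills the driver.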

in stubdom:

[2021-11-23 21:05:55] qemu[195]: segfault at 0 ip 00005caaf4d1a060 sp 00007fffa06b82b8 error 4 in qemu[5caaf4a9f000+3e9000]
[2021-11-23 21:05:55] Code: 48 8b 4c 24 20 e8 e0 3b 0f 00 48 83 c4 20 e9 a4 fe ff ff 0f 1f 80 00 00 00 00 48 8b 07 48 8b 00 48 8b 00 c3 66 0f 1f 44 00 00 <48> 8b 07 c3 66 66 2e 0f 1f 84 00 00 00 00 00 90 48 8b 07 0f b6 40

The kernel crash does not seem too deep: the IP blocks are initialized in order, the PSP init failure causes the driver to stop and clean up, and the crash occurs in vcn_v2_0_sw_fini while dereferencing a fw_shared_cpu_addr pointer that was initialized during VCN init. When the fault occurs the pointer is non-NULL; could it be a use-after-free?
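
The "Code:" line of the oops can be used to double-check this reading. Below is a small Python sketch (the helper name is made up) that splits the dump at the <..> marker, which flags the first byte of the faulting instruction. The x86-64 decoding in the comments was done by hand and is my own reading, not tool output.

```python
def split_oops_code(code_line: str):
    """Split an oops 'Code:' dump into (bytes_before_fault, bytes_from_fault)."""
    before, after = [], []
    seen_marker = False
    for token in code_line.split():
        if token.startswith("<") and token.endswith(">"):
            seen_marker = True
            token = token[1:-1]
        (after if seen_marker else before).append(int(token, 16))
    if not seen_marker:
        raise ValueError("no <..> fault marker found")
    return bytes(before), bytes(after)


# Byte dump copied from the vcn_v2_0_sw_fini+0x10 oops above.
CODE = ("66 f0 83 c2 81 c6 ea 05 00 00 31 c9 4c 89 cf e9 b6 4d ee ff 66 0f "
        "1f 44 00 00 0f 1f 44 00 00 55 48 8b 87 38 17 01 00 48 89 fd <c7> "
        "00 00 00 00 00 e8 d5 d5 f1 ff 48 89 ef e8 2d 20 ff ff 85 c0 74")

before, after = split_oops_code(CODE)
prologue = before[-0x10:]  # RIP is at +0x10, so these 16 bytes start the function

# 0f 1f 44 00 00        nop (5 bytes)
# 55                    push %rbp
# 48 8b 87 <disp32>     mov disp32(%rdi),%rax
# 48 89 fd              mov %rdi,%rbp
# c7 00 00 00 00 00     movl $0x0,(%rax)   <- faulting store
displacement = int.from_bytes(prologue[9:13], "little")
print(prologue.hex(" "))
print(hex(displacement), after[:6].hex(" "))
```

The faulting instruction is a 32-bit store of zero through RAX, and RAX equals CR2 (ffffb1f120cdf000) in the register dump. That pointer was loaded from offset 0x11738 of the structure passed in RDI. This fits a cleanup path writing through a stale CPU mapping, though it does not by itself prove a use-after-free.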

Quite some things to investigate and try next:

check whether that bug still happens in 5.15/5.16rc; if it is still there, use the occasion to play with KASAN – but it may not be that much of a blocker if I can…
… avoid using the PSP (move the _ta and _asd firmware files away, or use the module params ip_block_mask or fw_load_type)
check whether the suspect-looking points in my former post have an impact here
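
For the ip_block_mask route, the parameter is a bitmask with one bit per IP block, in the order the driver registers them. A minimal Python sketch to build such a mask follows. The PSP index of 3 and the 16-bit width are assumptions for this ASIC, to be checked against the sw_init order in the driver log, and the helper name is made up.

```python
def ip_block_mask(disabled_indices, width=16):
    """Build a mask with every bit set except the listed IP block indices."""
    mask = (1 << width) - 1
    for index in disabled_indices:
        if not 0 <= index < width:
            raise ValueError(f"index {index} does not fit in {width} bits")
        mask &= ~(1 << index)
    return mask


PSP_INDEX = 3  # assumption: position of the psp block in this ASIC's list

# Line to drop into a file under /etc/modprobe.d/
print(f"options amdgpu ip_block_mask={ip_block_mask([PSP_INDEX]):#x}")
```

With these assumptions the printed value is 0xfff7. Disabling another block just means adding its index to the list.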

yann · November 25, 2021, 11:09pm

Still there in 5.15.4.

In this direction…

renaming firmware files: they’re not optional, so this causes an early PSP IP init failure (early enough that no vcn init/fini is run, thus no panic, but no help either)
options amdgpu fw_load_type=1 in /etc/modprobe.d/ (supposed to force firmware loading to go through the SMU instead of the PSP) seems to be ignored (and in fact the code shows it is ignored: only 0 changes anything)
options amdgpu ip_block_mask=0xfff7 to disable the PSP, OTOH, does have an impact: the PSP is not initialized (though several components still claim they’ll use it), which changes the error but still proceeds into the kernel panic path:

[2021-11-25 23:30:22] [    3.855687] [drm] sw_init of IP block <vega10_ih>...
[2021-11-25 23:30:22] [    3.856832] [drm] sw_init of IP block <smu>...
[2021-11-25 23:30:22] [    3.856864] [drm] sw_init of IP block <gfx_v9_0>...
[2021-11-25 23:30:22] [    3.865352] [drm] sw_init of IP block <sdma_v4_0>...
[2021-11-25 23:30:22] [    3.865439] [drm] sw_init of IP block <dm>...
[2021-11-25 23:30:22] [    3.865880] [drm] Loading DMUB firmware via PSP: version=0x01010019
[2021-11-25 23:30:22] [    3.865905] [drm] sw_init of IP block <vcn_v2_0>...
[2021-11-25 23:30:22] [    3.868761] [drm] Found VCN firmware Version ENC: 1.14 DEC: 5 VEP: 0 Revision: 20
[2021-11-25 23:30:22] [    3.868804] amdgpu 0000:00:05.0: amdgpu: Will use PSP to load VCN firmware
[2021-11-25 23:30:22] [    3.936773] [drm] sw_init of IP block <jpeg_v2_0>...
[2021-11-25 23:30:22] [    3.940481] amdgpu 0000:00:05.0: amdgpu: SMU is initialized successfully!
[2021-11-25 23:30:22] [    3.943960] [drm] kiq ring mec 2 pipe 1 q 0
[2021-11-25 23:30:22] [    4.106258] input: dom0: Power Button as /devices/virtual/input/input7
[2021-11-25 23:30:22] [    4.109534] input: dom0: Power Button as /devices/virtual/input/input8
[2021-11-25 23:30:22] [    4.109748] input: dom0: Video Bus as /devices/virtual/input/input9
[2021-11-25 23:30:22] [    4.109877] input: dom0: AT Translated Set 2 keyboard as /devices/virtual/input/input10
[2021-11-25 23:30:22] [    4.110764] input: dom0: ELAN2203:00 04F3:30AA Mouse as /devices/virtual/input/input11
[2021-11-25 23:30:22] [    4.131382] amdgpu 0000:00:05.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
[2021-11-25 23:30:22] [    4.131566] [drm:amdgpu_gfx_enable_kcq.cold [amdgpu]] *ERROR* KCQ enable failed
[2021-11-25 23:30:22] [    4.131761] [drm:amdgpu_device_ip_init [amdgpu]] *ERROR* hw_init of IP block <gfx_v9_0> failed -110
[2021-11-25 23:30:22] [    4.131953] amdgpu 0000:00:05.0: amdgpu: amdgpu_device_ip_init failed
[2021-11-25 23:30:22] [    4.131968] amdgpu 0000:00:05.0: amdgpu: Fatal error during GPU init
[2021-11-25 23:30:22] [    4.145153] input: dom0: ETPS/2 Elantech Touchpad as /devices/virtual/input/input12
[2021-11-25 23:30:22] [    4.149031] input: dom0: ELAN2203:00 04F3:30AA Touchpad as /devices/virtual/input/input13
[2021-11-25 23:30:22] [    4.160053] input: dom0: Sleep Button as /devices/virtual/input/input14
[2021-11-25 23:30:22] [    4.243266] amdgpu 0000:00:05.0: amdgpu: amdgpu: finishing device.
[2021-11-25 23:30:22] [    4.256416] amdgpu: probe of 0000:00:05.0 failed with error -110
[2021-11-25 23:30:22] [    4.256443] [drm] sw_fini of IP block <jpeg_v2_0>...
[2021-11-25 23:30:22] [    4.256466] [drm] sw_fini of IP block <vcn_v2_0>...
[2021-11-25 23:30:22] [    4.256482] BUG: unable to handle page fault for address: ffffbaa420cdf000