AMD iGPU passthrough attempt

This seems to be against mainline Xen. Qubes already has a patch that handles the vbios at 0xc0000, so I felt that part was not needed and only applied the remaining 2. The libxl log does show gfx_passthru and gfx_passthru_kind set to igd, but the dm log still reports the address conflict.

Reading the libxl_dm code, it looks like it does nothing special with gfx_passthru, and with this patch we neutralize everything it does for igd. On the qemu side I can’t find any special flag for a non-igd GPU. I feel something is missing.

Reading the gfx_passthru stuff in xl.cfg.5, I understand that when a GPU can be passed through without it, it will be secondary in the VM. I’d think that will not work as expected for sys-gui-gpu, so it would have to be set whatever the GPU vendor/model, right ?

Trying to step back a little, IIUC the core issue is the GPU driver in the guest not having access to the VBIOS ROM. It has several ways of accessing it, among which it prefers ACPI ATRM and VFCT tables (but then the guest gets a Xen-forged ACPI table), and reading the ROM BAR (which is thus our focus here).

Now qemu seems to have several ways of exposing the vbios through the ROM BAR. Notably, pci_assign_dev_load_option_rom(), which is one of the focuses of the IGD patch you mentioned, reads it from /sys. Using the romfile= parameter seems to be an alternative, and since we’re talking about code in xen_pt_load_rom.c it may have a chance of being useful even outside the KVM case.

It looks like we have several families of solutions, including:

  • full mmap-like passthrough, causing reads of the ROM BAR in the guest to result in reads of the ROM BAR in dom0 (which is what we’re trying to do right now)
  • providing the ROM data to qemu, so it can emulate the ROM BAR (which is possibly simpler than using passthrough, and could provide a viable long-term solution), where we have 2 distinct problems:
    1. get the VBIOS ROM data, with 2 options:
      • get it from /sys in dom0 (which in my particular case could be linked to ACPI VFCT not being visible in dom0, which looks like a problem we could overcome, as the Qubes kernel was seeing this VFCT table a couple of weeks ago)
      • get it from a file
    2. make it available to qemu in stubdom
      • provide to qemu with romfile=
      • get qemu to see the rom in /sys, which I’m not sure would provide any advantage over romfile=

A low-hanging fruit, which would provide a fallback for the (apparently many) cases where reading the ROM requires more work, would be to pass romfile= from a file, and we could fix the more difficult problem from there.
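For the "get it from /sys" option, a minimal sketch of dumping the ROM through the sysfs `rom` attribute could look like this in Python. It assumes the dom0 kernel exposes that attribute for the device and is actually able to read the ROM, which is precisely what is in doubt here; `rom_sysfs_path` and `dump_rom` are names made up for the illustration.

```python
from pathlib import Path


def rom_sysfs_path(bdf: str) -> Path:
    """Path of the sysfs `rom` attribute for a PCI device given as DDDD:BB:DD.F."""
    return Path("/sys/bus/pci/devices") / bdf / "rom"


def dump_rom(bdf: str, out_file: str) -> int:
    """Copy the expansion ROM of `bdf` to `out_file` and return the byte count.

    Needs root, and only works when the kernel can map the ROM at all.
    """
    rom = rom_sysfs_path(bdf)
    rom.write_text("1")              # writing 1 enables reads of the attribute
    try:
        data = rom.read_bytes()
    finally:
        rom.write_text("0")          # always disable it again
    Path(out_file).write_bytes(data)
    return len(data)
```

Run as root in dom0, `dump_rom("0000:07:00.0", "vbios.rom")` would leave a file that could then be tried with romfile=, if the read succeeds at all.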

Diving more in the qemu hw/xen code, the pci_assign_dev_load_option_rom() code will only run, through get_vgabios(), if igd-passthru has been set, so I’m trying without your patch commenting it out.

Unfortunately when not set, even though an error seems to be sent through QAPI, none gets to stderr and we can’t see this in the logs, so I’m adding a couple of XEN_PT_LOG calls there.

Iterating on this to assert what path in qemu is actually taken is quite painful though, with all stubdom being rebuilt on each make vmm-xen-stubdom-linux-dom0 (like everything gets rebuilt when asking make qubes-dom0). Isn’t there a simple “rebuild only changed stuff” feature in the builder ?

qubes-builder doesn’t give you this option, but to iterate quickly, you can easily clone https://github.com/qubesos/qubes-vmm-xen-stubdom-linux directly and build from there (see README).

Hm, time flies and I still did not find enough of it to do all the tests I had in mind for this answer… so it may feel a bit incomplete…

I hacked the xen_pt_realize() test that checks for a hardcoded PFN for the IGD and prevents access to xen_pt_setup_vga() for anything not on 0000:00.02.0 (apparently compared with the PFN in the stubdom, where my iGPU is on 0000:00.00.0 - and I’m wondering why an IGP would get such special treatment that it would not appear as 0000:00.00.0 too in a stubdom). With that, I can see qemu (expectedly) failing to get the vbios from sysfs, and then happily copying it from memory, getting to the Legacy VBIOS registered trace from xen_pt_direct_vbios_copy().

I find that slightly disturbing, after the can't claim BAR 6 message - but then, it (surprisingly?) does not bother to check for any magic number (nor does the /sys/ code path, though in this case modern kernels do their own checks, IIRC).

As for the if(dev->romfile in pci_assign_dev_load_option_rom(), I cannot see how it could result in the relevant pci_register_bar() call. So I went forward with hardcoding my video rom in the code for a test… and it turns out the amdgpu driver still prints the same Invalid PCI ROM data signature (with the same got 0xcb03aa55, which in memory starts with the bytes 0x55 0xaa … which happen to be the 2-byte magic for the BIOS ROM … which I find disturbing, but I could not make anything of it for now).
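To make the byte-order point concrete, here is a small Python sketch that lays out the two 32-bit values from the amdgpu message as they would sit in little-endian memory. The helper name `le32_bytes` is only illustrative.

```python
import struct


def le32_bytes(value: int) -> bytes:
    """Bytes of a 32-bit value in little-endian memory order."""
    return struct.pack("<I", value)


expected = le32_bytes(0x52494350)   # what the kernel expects to find
got = le32_bytes(0xCB03AA55)        # what amdgpu reported

print(expected)        # b'PCIR'
print(got.hex(" "))    # 55 aa 03 cb
```

So the value the driver complains about is the first four bytes of a ROM image (the 0x55 0xaa magic plus the next two bytes), read where the "PCIR" marker was expected.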

To make sure of what gets read in /dev/mem I added a check for the 0x55 0xaa magic number, and it indeed catches what appears not to be a bios rom, starting with 0x0000 - obviously I’ll have to double-check this, dump more memory, and see how this results in amdgpu finding that signature.

Slow progress, and I again won’t have any time for this until next weekend :disappointed:

Well, the README does not tell about make full, whereas the images generated by make all do not appear to be used (at least the xen.xml template references the “full” version). Maybe this README would benefit from a bit more info ?

Also, building such packages separately, although it avoids a full rebuild of everything, requires installing specific qubes devel packages, which ideally are only installed in a chroot to make sure they don’t pollute - or in a separate VM, but having separate temporary VMs to build each such package separately is starting to be heavy.
Maybe I’ll end up resuming my experiments with ISAR first :slight_smile: … sooo many nested projects and soo little time :frowning:

It is used by default. The reference to “full” version you’ve found is an alternative path (overriding the default) that is used only for very specific configs (with USB or audio passthrough via stubdom).

Finally I settled with disabling the build of the full version to cut build time in half… and enable parallel stubdom builds to divide it further by vcpus. 3 minutes to build started to make iterations reasonable again.

As a first PoC I started with compiling my extracted ROM as static data…

… but to get it loaded at all I also had to revert this patch hunk which assumes that previous code creates a proper shadow copy, which is probably not the case here (or is it ?).

Now my stubdom seems to expose a VGA device with rombar, showing…

[2021-11-14 16:47:43] [00:05.0] xen_pt_realize: Assigning real physical device 07:00.0 to devfn 0x28
[2021-11-14 16:47:43] [00:05.0] xen_pt_realize:  real_device = 0000:07:00.0
[2021-11-14 16:47:43] [00:05.0] xen_pt_realize: Assigning VGA (passthru=1)...
[2021-11-14 16:47:43] [00:05.0] xen_pt_setup_vga: Legacy VBIOS imported
[2021-11-14 16:47:43] [00:05.0] xen_pt_register_regions: IO region 0 registered (size=0x10000000 base_addr=0xb0000000 type: 0x4)
[2021-11-14 16:47:43] [00:05.0] xen_pt_register_regions: IO region 2 registered (size=0x00200000 base_addr=0xc0000000 type: 0x4)
[2021-11-14 16:47:43] [00:05.0] xen_pt_register_regions: IO region 4 registered (size=0x00000100 base_addr=0x0000e000 type: 0x1)
[2021-11-14 16:47:43] [00:05.0] xen_pt_register_regions: IO region 5 registered (size=0x00080000 base_addr=0xfe400000 type: 0)
[2021-11-14 16:47:43] [00:05.0] xen_pt_register_regions: Expansion ROM registered (size=0x00020000 base_addr=0x000c0000)
[2021-11-14 16:47:43] [00:05.0] xen_pt_config_reg_init: Offset 0x0010 mismatch! Emulated=0x0000, host=0xb000000c, syncing to 0xb000000c.
[2021-11-14 16:47:43] [00:05.0] xen_pt_config_reg_init: Offset 0x0018 mismatch! Emulated=0x0000, host=0xc000000c, syncing to 0xc000000c.
[2021-11-14 16:47:43] [00:05.0] xen_pt_config_reg_init: Offset 0x0020 mismatch! Emulated=0x0000, host=0xe001, syncing to 0xe001.
[2021-11-14 16:47:43] [00:05.0] xen_pt_config_reg_init: Offset 0x0024 mismatch! Emulated=0x0000, host=0xfe400000, syncing to 0xfe400000.
[2021-11-14 16:47:43] [00:05.0] xen_pt_config_reg_init: Offset 0x0030 mismatch! Emulated=0x0000, host=0xc0002, syncing to 0x0002.
[2021-11-14 16:47:43] [00:05.0] xen_pt_config_reg_init: Offset 0x0052 mismatch! Emulated=0x0000, host=0x0003, syncing to 0x0003.
[2021-11-14 16:47:43] [00:05.0] xen_pt_config_reg_init: Offset 0x00a2 mismatch! Emulated=0x0000, host=0x0084, syncing to 0x0080.
[2021-11-14 16:47:43] [00:05.0] xen_pt_config_reg_init: Offset 0x0068 mismatch! Emulated=0x0000, host=0x8fa1, syncing to 0x8fa1.
[2021-11-14 16:47:43] [00:05.0] xen_pt_config_reg_init: Offset 0x0076 mismatch! Emulated=0x0000, host=0x1104, syncing to 0x1104.
[2021-11-14 16:47:43] [00:05.0] xen_pt_pci_intx: intx=1
[2021-11-14 16:47:43] [00:05.0] xen_pt_realize: Real physical device 07:00.0 registered successfully

… but that does not seem to impress sys-gui-gpu's amdgpu driver, at all, it still claims:

[2021-11-14 16:47:47] [    2.656523] amdgpu: Topology: Add CPU node
[2021-11-14 16:47:47] [    2.656616] amdgpu 0000:00:05.0: vgaarb: deactivate vga console
[2021-11-14 16:47:47] [    2.657625] [drm] initializing kernel modesetting (RENOIR 0x1002:0x1636 0x1462:0x12AC 0xC6).
[2021-11-14 16:47:47] [    2.657651] amdgpu 0000:00:05.0: amdgpu: Trusted Memory Zone (TMZ) feature disabled as experimental (default)
[2021-11-14 16:47:47] [    2.657678] [drm] register mmio base: 0xF1200000
[2021-11-14 16:47:47] [    2.657688] [drm] register mmio size: 524288
[2021-11-14 16:47:47] [    2.658964] [drm] add ip block number 0 <soc15_common>
[2021-11-14 16:47:47] [    2.658977] [drm] add ip block number 1 <gmc_v9_0>
[2021-11-14 16:47:47] [    2.658987] [drm] add ip block number 2 <vega10_ih>
[2021-11-14 16:47:47] [    2.658998] [drm] add ip block number 3 <psp>
[2021-11-14 16:47:47] [    2.659008] [drm] add ip block number 4 <smu>
[2021-11-14 16:47:47] [    2.659018] [drm] add ip block number 5 <gfx_v9_0>
[2021-11-14 16:47:47] [    2.659028] [drm] add ip block number 6 <sdma_v4_0>
[2021-11-14 16:47:47] [    2.659039] [drm] add ip block number 7 <dm>
[2021-11-14 16:47:47] [    2.659049] [drm] add ip block number 8 <vcn_v2_0>
[2021-11-14 16:47:47] [    2.659059] [drm] add ip block number 9 <jpeg_v2_0>
[2021-11-14 16:47:47] [    2.701134] [drm] BIOS signature incorrect 0 0
[2021-11-14 16:47:47] [    2.701152] amdgpu 0000:00:05.0: Invalid PCI ROM data signature: expecting 0x52494350, got 0xcb03aa55
[2021-11-14 16:47:47] [    2.742791] [drm] BIOS signature incorrect 0 0
[2021-11-14 16:47:47] [    2.742881] [drm:amdgpu_get_bios [amdgpu]] *ERROR* Unable to locate a BIOS ROM
[2021-11-14 16:47:47] [    2.742898] amdgpu 0000:00:05.0: amdgpu: Fatal error during GPU init
[2021-11-14 16:47:47] [    2.742911] amdgpu 0000:00:05.0: amdgpu: amdgpu: finishing device.

… so it may well be that this ROM is still not provided to the VM where the driver is looking for it (I’m specifically double-checking that this 0x55 0xaa BIOS magic is there) :confused:
@marmarek, will gladly accept more ideas at this point :slight_smile:

As I’m having doubts (from Qubes 4.0.4 era) that the 5.4 default VM kernel would be able to properly support this hardware anyway, and since that really seems to be the most recent VM kernel around, I also tried to let sys-gui-gpu boot the fc33-provided 5.14 kernel (through qvm-prefs sys-gui-gpu kernel ""). In that case, the amdgpu driver does not even seem to be loaded, and sys-gui-gpu does not appear to start well enough for the Qubes agent to start, and it gets killed soon – the reason from kernel logs being lack of blkfront driver, obviously it cannot start this way without an enhanced initramfs.
Is there really no way to tell dracut not to omit any kernel hardware module ? I can’t believe it but no such thing appears to be documented :frowning:

For reference:

Edit: I’ve started to doubt whether the fc33 ramdisk is indeed correctly generated at all, it should include the proper xen block drivers, right ? And a small step back allowed me to see it was kernel-latest-qubes-vm I was really looking for – though it does not help with the PCI ROM. Back to digging :slight_smile:

It looks like all checks for e820_host in libxl_x86.c are in fact conditioned by b_info->type == LIBXL_DOMAIN_TYPE_PV, which would explain why it has no impact on an HVM. According to the commit introducing e820_host that’s just how it is, “being a PV guest” is a prerequisite.

That said, it would seem that dom0 gets the host e820 map, and as I understand it that 0x000c0000-0x000dffff range does lie in the same reserved region (well, except if dom0 does not get the real BIOS e820 map – but hypervisor.log does not seem to dump it either, and this message during the series review seems to imply that dom0 indeed shows the host’s e820):

[    0.000000] BIOS-provided physical RAM map:
[    0.000000] Xen: [mem 0x0000000000000000-0x000000000009ffff] usable
[    0.000000] Xen: [mem 0x00000000000a0000-0x00000000000fffff] reserved
[    0.000000] Xen: [mem 0x0000000000100000-0x0000000009bfefff] usable

(but then, I was only poking around with a fresh look on those previous attempts, that should not get in the way of providing explicit expansion ROM data)

Thinking twice about it: I thought it would be expected to see the PCI devices physical addresses protected from the OS by being declared in reserved regions. However, if we compare the ranges of the different BARs:

[2021-11-21 00:13:11] pci 0000:00:00.0: [1002:1636] type 00 class 0x030000
[2021-11-21 00:13:11] pci 0000:00:00.0: reg 0x10: [mem 0xb0000000-0xbfffffff 64bit pref]
[2021-11-21 00:13:11] pci 0000:00:00.0: reg 0x18: [mem 0xc0000000-0xc01fffff 64bit pref]
[2021-11-21 00:13:11] pci 0000:00:00.0: reg 0x20: [io  0xe000-0xe0ff]
[2021-11-21 00:13:11] pci 0000:00:00.0: reg 0x24: [mem 0xfe400000-0xfe47ffff]
[2021-11-21 00:13:11] pci 0000:00:00.0: reg 0x30: [mem 0x000c0000-0x000dffff pref]
[2021-11-21 00:13:11] pcifront pci-0: claiming resource 0000:00:00.0/0
[2021-11-21 00:13:11] pcifront pci-0: claiming resource 0000:00:00.0/2
[2021-11-21 00:13:11] pcifront pci-0: claiming resource 0000:00:00.0/4
[2021-11-21 00:13:11] pcifront pci-0: claiming resource 0000:00:00.0/5
[2021-11-21 00:13:11] pcifront pci-0: claiming resource 0000:00:00.0/6
[2021-11-21 00:13:11] pci 0000:00:00.0: can't claim BAR 6 [mem 0x000c0000-0x000dffff pref]: address conflict with Reserved [mem 0x000a0000-0x000fffff]

… with the map provided by the BIOS:

[    0.000000] BIOS-provided physical RAM map:
[    0.000000] Xen: [mem 0x0000000000000000-0x000000000009ffff] usable
[    0.000000] Xen: [mem 0x00000000000a0000-0x00000000000fffff] reserved
[    0.000000] Xen: [mem 0x0000000000100000-0x0000000009bfefff] usable
[    0.000000] Xen: [mem 0x0000000009bff000-0x0000000009ffffff] reserved
[    0.000000] Xen: [mem 0x000000000a000000-0x000000000a1fffff] usable
[    0.000000] Xen: [mem 0x000000000a200000-0x000000000a20cfff] ACPI NVS
[    0.000000] Xen: [mem 0x000000000a20d000-0x00000000a9eaafff] usable
[    0.000000] Xen: [mem 0x00000000a9eab000-0x00000000ab3c8fff] reserved
[    0.000000] Xen: [mem 0x00000000ab3c9000-0x00000000ab419fff] ACPI data
[    0.000000] Xen: [mem 0x00000000ab41a000-0x00000000ab786fff] ACPI NVS
[    0.000000] Xen: [mem 0x00000000ab787000-0x00000000ab787fff] reserved
[    0.000000] Xen: [mem 0x00000000ab788000-0x00000000ab98dfff] ACPI NVS
[    0.000000] Xen: [mem 0x00000000ab98e000-0x00000000ad5fefff] reserved
[    0.000000] Xen: [mem 0x00000000ad5ff000-0x00000000adffffff] usable
[    0.000000] Xen: [mem 0x00000000ae000000-0x00000000afffffff] reserved
[    0.000000] Xen: [mem 0x00000000f0000000-0x00000000f7ffffff] reserved
[    0.000000] Xen: [mem 0x00000000fd000000-0x00000000ffffffff] reserved
[    0.000000] Xen: [mem 0x0000000100000000-0x0000000155bc1fff] usable
[    0.000000] Xen: [mem 0x000000042f340000-0x00000004701fffff] reserved
[    0.000000] Xen: [mem 0x000000fd00000000-0x000000ffffffffff] reserved

… we can see that the 0xfe400000-0xfe47ffff range of BAR 5 is indeed intersecting with the 0x00000000fd000000-0x00000000ffffffff reserved region, but the BAR 0 and BAR 2 ranges fall in an “undeclared” gap between 2 reserved regions. And that difference does not result in different handling of those 3 BARs, whose resources are apparently all successfully claimed.
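The comparison above can be redone mechanically. This is only a sketch in Python, with the reserved ranges copied by hand from the dom0 map and the BAR ranges from the stubdom log; it says nothing about what the stubdom kernel itself sees as its memory map. The function name `conflicts` is made up for the example.

```python
# Reserved e820 ranges from the dom0 log (inclusive start/end).
RESERVED = [
    (0x000A0000, 0x000FFFFF),
    (0x09BFF000, 0x09FFFFFF),
    (0xA9EAB000, 0xAB3C8FFF),
    (0xAB787000, 0xAB787FFF),
    (0xAB98E000, 0xAD5FEFFF),
    (0xAE000000, 0xAFFFFFFF),
    (0xF0000000, 0xF7FFFFFF),
    (0xFD000000, 0xFFFFFFFF),
    (0x42F340000, 0x4701FFFFF),
    (0xFD00000000, 0xFFFFFFFFFF),
]

# Memory BARs of the passed-through device as seen in the stubdom.
BARS = {
    "BAR 0": (0xB0000000, 0xBFFFFFFF),
    "BAR 2": (0xC0000000, 0xC01FFFFF),
    "BAR 5": (0xFE400000, 0xFE47FFFF),
    "BAR 6 (ROM)": (0x000C0000, 0x000DFFFF),
}


def conflicts(start, end, regions=RESERVED):
    """Return the reserved regions that intersect [start, end]."""
    return [(s, e) for s, e in regions if start <= e and s <= end]


for name, (start, end) in BARS.items():
    hits = conflicts(start, end)
    where = ", ".join(f"{s:#x}-{e:#x}" for s, e in hits) or "none"
    print(f"{name}: {start:#010x}-{end:#010x} overlaps reserved: {where}")
```

Against this map, BAR 5 and the ROM BAR both intersect a reserved region while BAR 0 and BAR 2 do not, yet only the ROM BAR is refused.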

@marmarek, do you see why BAR 5 would not be detected as a conflict by request_resource_conflict() ? The most obvious would be that the stubdom’s memory map would not match the dom0 one (which I guess the host_e820 trick would correct if it was supported for HVM), but then the stubdom kernel is very quiet and does not report its view of the map. Its kconfig does not show a change in default loglevel, and its cmdline is reported as empty, there’s definitely no quiet flag there. How then is it so quiet ?

I also note that pci_claim_resource() would not have to make such a check if we had a shadow copy of the ROM at this point – this could look like a path worth investigating?

Back to the other end of the problem, namely getting to understand why amdgpu is still unable to access the expansion ROM exposed by the stubdom qemu…

Note that I’m starting to consider a new path, which would be to teach amdgpu to load a ROM directly from within the guest (eg. using request_firmware) as a way to advance the PoC, since obviously I’m not getting there as fast as I’d like with the current approaches. Before attempting this, though, there are still a few things that puzzle me and could possibly hint to something:

(traces based on above-mentioned stubdom qemu patches, and those guest linux patches)

One thing that had been there before my eyes from the start but had not stood out until now:

I’m not really clear yet why we have those mismatches to start with, but whereas virtually all of them result in sync’ing the emulated addresses to the host’s, the first of those expansion-ROM-related ones is not, with 0xc0002 (in the range which is causing those headaches) becoming 0x0002.
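To read those two values, here is a short Python sketch decoding the Expansion ROM Base Address register at config offset 0x30, using the standard PCI layout (bit 0 is the enable bit, bits 31:11 hold the base address, the bits in between are reserved). The function name and the dictionary keys are just for illustration, and this only decodes the numbers; it does not explain why qemu syncs them this way.

```python
ROM_ADDRESS_MASK = 0xFFFFF800   # bits 31:11, the ROM base address
ROM_ENABLE = 0x00000001         # bit 0, address decoding enable
ROM_RESERVED = 0x000007FE       # bits 10:1


def decode_rom_bar(value: int) -> dict:
    """Split a raw Expansion ROM BAR value into its fields."""
    return {
        "base": value & ROM_ADDRESS_MASK,
        "enabled": bool(value & ROM_ENABLE),
        "reserved_bits": value & ROM_RESERVED,
    }


for label, raw in (("host", 0xC0002), ("synced", 0x0002)):
    f = decode_rom_bar(raw)
    print(f"{label}: base={f['base']:#x} enabled={f['enabled']} "
          f"reserved_bits={f['reserved_bits']:#x}")
```

With these inputs the host value decodes to base 0xc0000 with decoding disabled, and the synced value keeps only the low reserved bit: the address part is the piece that gets dropped.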

A second thing is that qemu claims to expose the BARs for the device at addresses:

[2021-11-21 00:13:11] [00:05.0] xen_pt_register_regions: IO region 0 registered (size=0x10000000 base_addr=0xb0000000 type: 0x4)
[2021-11-21 00:13:11] [00:05.0] xen_pt_register_regions: IO region 2 registered (size=0x00200000 base_addr=0xc0000000 type: 0x4)
[2021-11-21 00:13:11] [00:05.0] xen_pt_register_regions: IO region 4 registered (size=0x00000100 base_addr=0x0000e000 type: 0x1)
[2021-11-21 00:13:11] [00:05.0] xen_pt_register_regions: IO region 5 registered (size=0x00080000 base_addr=0xfe400000 type: 0)
[2021-11-21 00:13:11] [00:05.0] xen_pt_register_regions: Expansion ROM registered (size=0x00020000 base_addr=0x000c0000)

… that closely mimic that of the host:

[    1.104549] pci 0000:07:00.0: [1002:1636] type 00 class 0x030000
[    1.104571] pci 0000:07:00.0: reg 0x10: [mem 0xb0000000-0xbfffffff 64bit pref]
[    1.104586] pci 0000:07:00.0: reg 0x18: [mem 0xc0000000-0xc01fffff 64bit pref]
[    1.104597] pci 0000:07:00.0: reg 0x20: [io  0xe000-0xe0ff]
[    1.104607] pci 0000:07:00.0: reg 0x24: [mem 0xfe400000-0xfe47ffff]
[    1.104624] pci 0000:07:00.0: enabling Extended Tags

… and that in the guest we see noticeably different addresses, which I’d expect to reflect the values set above by qemu, as being physical addresses:

[2021-11-21 00:13:12] [    0.318637] pci 0000:00:05.0: [1002:1636] type 00 class 0x030000
[2021-11-21 00:13:12] [    0.320239] pci 0000:00:05.0: reg 0x10: [mem 0xe0000000-0xefffffff 64bit pref]
[2021-11-21 00:13:12] [    0.322241] pci 0000:00:05.0: reg 0x18: [mem 0xf1000000-0xf11fffff 64bit pref]
[2021-11-21 00:13:12] [    0.324246] pci 0000:00:05.0: reg 0x20: [io  0xc200-0xc2ff]
[2021-11-21 00:13:12] [    0.326236] pci 0000:00:05.0: reg 0x24: [mem 0xf1200000-0xf127ffff]
[2021-11-21 00:13:12] [    0.329241] pci 0000:00:05.0: reg 0x30: [mem 0xf1280000-0xf129ffff pref]
[2021-11-21 00:13:12] [    0.329484] pci 0000:00:05.0: enabling Extended Tags
...
[2021-11-21 00:13:13] [    0.475690] pci 0000:00:05.0: Video device with shadowed ROM at [mem 0x000c0000-0x000dffff]

@marmarek isn’t that one really suspect ?
Is it telling that it’s the shadowed ROM that lives at 0xc0000 and not the ROM itself ?
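As a sanity check on those three views, the BAR sizes (rather than the addresses) can be compared between host and guest. The Python sketch below parses the `reg 0x..` lines quoted above; the helper name `bar_sizes` is made up. One possible reading, which I have not verified here, is that the guest addresses are guest-physical ones assigned inside the VM, in which case only the sizes would be expected to match.

```python
import re

BAR_RE = re.compile(r"reg (0x[0-9a-f]+): \[(?:mem|io)\s+0x([0-9a-f]+)-0x([0-9a-f]+)")

HOST = """\
pci 0000:07:00.0: reg 0x10: [mem 0xb0000000-0xbfffffff 64bit pref]
pci 0000:07:00.0: reg 0x18: [mem 0xc0000000-0xc01fffff 64bit pref]
pci 0000:07:00.0: reg 0x20: [io  0xe000-0xe0ff]
pci 0000:07:00.0: reg 0x24: [mem 0xfe400000-0xfe47ffff]
"""

GUEST = """\
pci 0000:00:05.0: reg 0x10: [mem 0xe0000000-0xefffffff 64bit pref]
pci 0000:00:05.0: reg 0x18: [mem 0xf1000000-0xf11fffff 64bit pref]
pci 0000:00:05.0: reg 0x20: [io  0xc200-0xc2ff]
pci 0000:00:05.0: reg 0x24: [mem 0xf1200000-0xf127ffff]
pci 0000:00:05.0: reg 0x30: [mem 0xf1280000-0xf129ffff pref]
"""


def bar_sizes(dmesg: str) -> dict:
    """Map each config register (e.g. '0x10') to the size of its range."""
    return {
        reg: int(end, 16) - int(start, 16) + 1
        for reg, start, end in BAR_RE.findall(dmesg)
    }


host, guest = bar_sizes(HOST), bar_sizes(GUEST)
for reg, size in guest.items():
    if reg not in host:
        status = "not listed on host"
    else:
        status = "same size" if host[reg] == size else "DIFFERENT size"
    print(f"reg {reg}: guest size {size:#x} ({status})")
```

The guest ROM BAR at reg 0x30 comes out at 0x20000 bytes, the same size qemu reports for the expansion ROM it registered.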

A third one is the contents that the driver gets in that memory region it maps as being the expansion ROM:

[2021-11-21 00:13:14] [    2.601896] [drm] amdgpu_atrm_get_bios()
[2021-11-21 00:13:14] [    2.601905] [drm] amdgpu_acpi_vfct_bios()
[2021-11-21 00:13:14] [    2.601914] [drm] igp_read_bios_from_vram()
[2021-11-21 00:13:15] [    2.650226] [drm] BIOS signature incorrect 0 0
[2021-11-21 00:13:15] [    2.650272] [drm] amdgpu_read_bios()
[2021-11-21 00:13:15] [    2.650285] amdgpu 0000:00:05.0: pci_map_rom()
[2021-11-21 00:13:15] [    2.650296] amdgpu 0000:00:05.0: pci_map_rom: start=0000000017dcda60, size=20000
[2021-11-21 00:13:15] [    2.650313] amdgpu 0000:00:05.0: pci_enable_rom: shadow copy, nothing to do
[2021-11-21 00:13:15] [    2.650337] amdgpu 0000:00:05.0: PCI ROM @00: aa55 cb03 0000 0000
[2021-11-21 00:13:15] [    2.650379] amdgpu 0000:00:05.0: PCI ROM @16: 0000 0000 001c 5024
[2021-11-21 00:13:15] [    2.650396] amdgpu 0000:00:05.0: Invalid PCI ROM data signature: expecting 0x52494350, got 0xcb03aa55
[2021-11-21 00:13:15] [    2.650415] amdgpu 0000:00:05.0: pci_map_rom: pci_get_rom_size failed
[2021-11-21 00:13:15] [    2.650432] [drm] amdgpu_read_bios_from_rom()
[2021-11-21 00:13:15] [    2.650443] [drm] amdgpu_read_bios_from_rom: amdgpu_asic_read_bios_from_rom failed
[2021-11-21 00:13:15] [    2.650458] [drm] amdgpu_read_disabled_bios()
[2021-11-21 00:13:15] [    2.650469] [drm] igp_read_bios_from_vram()
[2021-11-21 00:13:15] [    2.692561] [drm] BIOS signature incorrect 0 0
[2021-11-21 00:13:15] [    2.692590] [drm] amdgpu_read_platform_bios()
[2021-11-21 00:13:15] [    2.692601] amdgpu 0000:00:05.0: amdgpu: Unable to locate a BIOS ROM

That is, the 0xaa55 expansion ROM signature is really there, but it’s the only value that looks right. At 0x16 where we should have the offset to the VBIOS signature, we get 0x0000, which explains the strange-looking signature causing the extraction to abort. I feel the biggest question here would be, why do we have the first 2 bytes correct, if the rest is just junk ?
I’m especially wondering if there would not be a link with my first question above, as the junk here starts at what should be physical address 0xc0002. Could it be that ROM shadowing gets broken because of this ?
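For reference, here is how I understand the header walk that produces this message, as a Python sketch: check the 0xaa55 signature, read a 16-bit pointer from the ROM header (the PCI firmware spec places it at byte offset 0x18), and expect "PCIR" where it points. The two images are synthetic, built only from the first four bytes seen in the trace; `find_pcir` and `make_image` are illustrative names, not kernel functions.

```python
import struct

ROM_SIGNATURE = 0xAA55
PCIR_SIGNATURE = 0x52494350      # "PCIR" read as a little-endian u32
PCIR_POINTER_OFFSET = 0x18       # per the PCI expansion ROM header layout


def find_pcir(image: bytes) -> int:
    """Return the offset of the PCI data structure, or raise ValueError."""
    (sig,) = struct.unpack_from("<H", image, 0)
    if sig != ROM_SIGNATURE:
        raise ValueError(f"bad ROM signature 0x{sig:04x}")
    (ptr,) = struct.unpack_from("<H", image, PCIR_POINTER_OFFSET)
    (pcir,) = struct.unpack_from("<I", image, ptr)
    if pcir != PCIR_SIGNATURE:
        raise ValueError(
            f"Invalid PCI ROM data signature: expecting "
            f"0x{PCIR_SIGNATURE:08x}, got 0x{pcir:08x}"
        )
    return ptr


def make_image(pcir_ptr: int) -> bytes:
    """Synthetic 64-byte image starting with the bytes seen in the trace."""
    image = bytearray(64)
    image[0:4] = bytes([0x55, 0xAA, 0x03, 0xCB])
    struct.pack_into("<H", image, PCIR_POINTER_OFFSET, pcir_ptr)
    if pcir_ptr:
        image[pcir_ptr:pcir_ptr + 4] = b"PCIR"
    return bytes(image)


print(find_pcir(make_image(0x20)))       # a well-formed header
try:
    find_pcir(make_image(0))             # pointer reads back as zero
except ValueError as exc:
    print(exc)
```

A zero pointer makes the check land back on offset 0, so the "got" value is simply the start of the image, which matches 0xcb03aa55 in the log.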

So here I am with a small PoC commit doing precisely this. And indeed there is some good news: the driver does load my VBIOS ROM and appears to like it, but soon things turn out not to be so fine (at first sight unrelated to the VBIOS) with…

  • a strange-looking MTRR write failure
  • some trouble with the PSP firmware failing to load, triggering the termination of the amdgpu driver
  • … and then dereferencing a bad pointer (bug in the error path?) sends the kernel to panic, and possibly inducing a qemu segfault
  • … which results in an unresponsive Qubes and requires a hard poweroff
[2021-11-23 21:05:52] [    4.297684] amdgpu 0000:00:05.0: amdgpu: Fetched VBIOS from firmware file
[2021-11-23 21:05:52] [    4.297709] amdgpu: ATOM BIOS: 113-RENOIR-025
[2021-11-23 21:05:52] [    4.302046] [drm] VCN decode is enabled in VM mode
[2021-11-23 21:05:52] [    4.302066] [drm] VCN encode is enabled in VM mode
[2021-11-23 21:05:52] [    4.302078] [drm] JPEG decode is enabled in VM mode
[2021-11-23 21:05:52] [    4.302144] [drm] vm size is 262144 GB, 4 levels, block size is 9-bit, fragment size is 9-bit
[2021-11-23 21:05:52] [    4.302181] amdgpu 0000:00:05.0: amdgpu: VRAM: 512M 0x000000F400000000 - 0x000000F41FFFFFFF (512M used)
[2021-11-23 21:05:52] [    4.302217] amdgpu 0000:00:05.0: amdgpu: GART: 1024M 0x0000000000000000 - 0x000000003FFFFFFF
[2021-11-23 21:05:52] [    4.302246] amdgpu 0000:00:05.0: amdgpu: AGP: 267419648M 0x000000F800000000 - 0x0000FFFFFFFFFFFF
[2021-11-23 21:05:52] [    4.302268] mtrr: base(0x430000000) is not aligned on a size(0x20000000) boundary
[2021-11-23 21:05:52] [    4.302289] Failed to add WC MTRR for [000000000998bb55-00000000eb9e681e]; performance may suffer.
[2021-11-23 21:05:52] [    4.302295] [drm] Detected VRAM RAM=512M, BAR=512M
[2021-11-23 21:05:52] [    4.302341] [drm] RAM width 128bits DDR4
[2021-11-23 21:05:52] [    4.302401] [drm] amdgpu: 512M of VRAM memory ready
[2021-11-23 21:05:52] [    4.302412] [drm] amdgpu: 691M of GTT memory ready.
[2021-11-23 21:05:52] [    4.302437] [drm] GART: num cpu pages 262144, num gpu pages 262144
[2021-11-23 21:05:52] [    4.302565] [drm] PCIE GART of 1024M enabled.
[2021-11-23 21:05:52] [    4.302575] [drm] PTB located at 0x000000F400900000
[2021-11-23 21:05:52] [    4.312921] amdgpu 0000:00:05.0: amdgpu: PSP runtime database doesn't exist
[2021-11-23 21:05:52] [    4.342353] [drm] Loading DMUB firmware via PSP: version=0x01010019
[2021-11-23 21:05:52] [    4.346679] [drm] Found VCN firmware Version ENC: 1.14 DEC: 5 VEP: 0 Revision: 20
[2021-11-23 21:05:52] [    4.346723] amdgpu 0000:00:05.0: amdgpu: Will use PSP to load VCN firmware
[2021-11-23 21:05:52] [    4.978736] [drm] reserve 0x400000 from 0xf41f800000 for PSP TMR
[2021-11-23 21:05:52] 
[2021-11-23 21:05:52] Fedora 33 (Thirty Three)
[2021-11-23 21:05:52] Kernel 5.14.15-1.fc32.qubes.x86_64 on an x86_64 (hvc0)
[2021-11-23 21:05:52] 
[2021-11-23 21:05:52] sys-gui-gpu login: [    5.136770] input: dom0: AT Translated Set 2 keyboard as /devices/virtual/input/input7
...
[2021-11-23 21:05:55] [    7.675982] [drm] psp command (0xFFFFFFFF) failed and response status is (0xFFFFFFFF)
[2021-11-23 21:05:55] [    7.676007] [drm:psp_hw_start [amdgpu]] *ERROR* PSP load tmr failed!
[2021-11-23 21:05:55] [    7.676213] [drm:psp_hw_init [amdgpu]] *ERROR* PSP firmware loading failed
[2021-11-23 21:05:55] [    7.676371] [drm:amdgpu_device_fw_loading [amdgpu]] *ERROR* hw_init of IP block <psp> failed -22
[2021-11-23 21:05:55] [    7.676530] amdgpu 0000:00:05.0: amdgpu: amdgpu_device_ip_init failed
[2021-11-23 21:05:55] [    7.676563] amdgpu 0000:00:05.0: amdgpu: Fatal error during GPU init
[2021-11-23 21:05:55] [    7.676578] amdgpu 0000:00:05.0: amdgpu: amdgpu: finishing device.
[2021-11-23 21:05:55] [    7.679044] amdgpu: probe of 0000:00:05.0 failed with error -22
[2021-11-23 21:05:55] [    7.679102] BUG: unable to handle page fault for address: ffffb1f120cdf000
[2021-11-23 21:05:55] [    7.679117] #PF: supervisor write access in kernel mode
[2021-11-23 21:05:55] [    7.679129] #PF: error_code(0x0002) - not-present page
[2021-11-23 21:05:55] [    7.679140] PGD 1000067 P4D 1000067 PUD 11dc067 PMD 0 
[2021-11-23 21:05:55] [    7.679154] Oops: 0002 [#1] SMP NOPTI
[2021-11-23 21:05:55] [    7.679163] CPU: 0 PID: 276 Comm: systemd-udevd Not tainted 5.14.15-1.fc32.qubes.x86_64 #1
[2021-11-23 21:05:55] [    7.679180] Hardware name: Xen HVM domU, BIOS 4.14.3 11/14/2021
[2021-11-23 21:05:55] [    7.679194] RIP: 0010:vcn_v2_0_sw_fini+0x10/0x40 [amdgpu]
[2021-11-23 21:05:55] [    7.679367] Code: 66 f0 83 c2 81 c6 ea 05 00 00 31 c9 4c 89 cf e9 b6 4d ee ff 66 0f 1f 44 00 00 0f 1f 44 00 00 55 48 8b 87 38 17 01 00 48 89 fd <c7> 00 00 00 00 00 e8 d5 d5 f1 ff 48 89 ef e8 2d 20 ff ff 85 c0 74
[2021-11-23 21:05:55] [    7.679402] RSP: 0018:ffffb1f1002cfc30 EFLAGS: 00010206
[2021-11-23 21:05:55] [    7.679414] RAX: ffffb1f120cdf000 RBX: ffff8b4d9a675620 RCX: 0000000000000000
[2021-11-23 21:05:55] [    7.679429] RDX: 000000000000000e RSI: 0000000000000003 RDI: ffff8b4d9a660000
[2021-11-23 21:05:55] [    7.679444] RBP: ffff8b4d9a660000 R08: 000000000000000f R09: 000000008010000f
[2021-11-23 21:05:55] [    7.679459] R10: 0000000040000000 R11: 000000001b99d000 R12: ffff8b4d9a675590
[2021-11-23 21:05:55] [    7.679474] R13: ffff8b4d9a676400 R14: 000000000000000c R15: ffff8b4d813ef36c
[2021-11-23 21:05:55] [    7.679490] FS:  000073bc16d48380(0000) GS:ffff8b4dbcc00000(0000) knlGS:0000000000000000
[2021-11-23 21:05:55] [    7.679507] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[2021-11-23 21:05:55] [    7.679520] CR2: ffffb1f120cdf000 CR3: 0000000004160000 CR4: 0000000000350ef0
[2021-11-23 21:05:55] [    7.679536] Call Trace:
[2021-11-23 21:05:55] [    7.679545]  amdgpu_device_ip_fini.isra.0+0xb6/0x1e0 [amdgpu]
[2021-11-23 21:05:55] [    7.679691]  amdgpu_device_fini_sw+0xe/0x100 [amdgpu]
[2021-11-23 21:05:55] [    7.679835]  amdgpu_driver_release_kms+0x12/0x30 [amdgpu]
[2021-11-23 21:05:55] [    7.679978]  devm_drm_dev_init_release+0x3d/0x60 [drm]
[2021-11-23 21:05:55] [    7.680008]  devres_release_all+0xb8/0x100
[2021-11-23 21:05:55] [    7.680019]  really_probe+0x100/0x310
[2021-11-23 21:05:55] [    7.680029]  __driver_probe_device+0xfe/0x180
[2021-11-23 21:05:55] [    7.680040]  driver_probe_device+0x1e/0x90
[2021-11-23 21:05:55] [    7.680050]  __driver_attach+0xc0/0x1c0
[2021-11-23 21:05:55] [    7.680059]  ? __device_attach_driver+0xe0/0xe0
[2021-11-23 21:05:55] [    7.680070]  ? __device_attach_driver+0xe0/0xe0
[2021-11-23 21:05:55] [    7.680081]  bus_for_each_dev+0x89/0xd0
[2021-11-23 21:05:55] [    7.680090]  bus_add_driver+0x12b/0x1e0
[2021-11-23 21:05:55] [    7.680099]  driver_register+0x8f/0xe0
[2021-11-23 21:05:55] [    7.680109]  ? 0xffffffffc0e7b000
[2021-11-23 21:05:55] [    7.680117]  do_one_initcall+0x57/0x200
[2021-11-23 21:05:55] [    7.680128]  do_init_module+0x5c/0x260
[2021-11-23 21:05:55] [    7.680137]  __do_sys_finit_module+0xae/0x110
[2021-11-23 21:05:55] [    7.680149]  do_syscall_64+0x3b/0x90
[2021-11-23 21:05:55] [    7.680158]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[2021-11-23 21:05:55] [    7.680170] RIP: 0033:0x73bc17ce9edd
[2021-11-23 21:05:55] [    7.680180] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 6b 7f 0c 00 f7 d8 64 89 01 48
[2021-11-23 21:05:55] [    7.680215] RSP: 002b:00007fffa9b51688 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[2021-11-23 21:05:55] [    7.680231] RAX: ffffffffffffffda RBX: 0000602da93e3120 RCX: 000073bc17ce9edd
[2021-11-23 21:05:55] [    7.680246] RDX: 0000000000000000 RSI: 000073bc17e2732c RDI: 0000000000000014
[2021-11-23 21:05:55] [    7.680260] RBP: 0000000000020000 R08: 0000000000000000 R09: 0000602da93e3bb0
[2021-11-23 21:05:55] [    7.680275] R10: 0000000000000014 R11: 0000000000000246 R12: 000073bc17e2732c
[2021-11-23 21:05:55] [    7.680290] R13: 0000602da9338960 R14: 0000000000000007 R15: 0000602da93e4000
[2021-11-23 21:05:55] [    7.680306] Modules linked in: joydev intel_rapl_msr amdgpu(+) intel_rapl_common crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel ip6table_filter ip6table_mangle ip6table_raw ip6_tables iommu_v2 gpu_sched ipt_REJECT i2c_algo_bit nf_reject_ipv4 drm_ttm_helper ttm xt_state xt_conntrack iptable_filter iptable_mangle iptable_raw drm_kms_helper ehci_pci xt_MASQUERADE iptable_nat nf_nat nf_conntrack ehci_hcd cec nf_defrag_ipv6 serio_raw nf_defrag_ipv4 i2c_piix4 ata_generic pata_acpi pcspkr xen_scsiback target_core_mod xen_netback uinput xen_privcmd xen_gntdev drm xen_gntalloc xen_blkback fuse xen_evtchn bpf_preload ip_tables overlay xen_blkfront
[2021-11-23 21:05:55] [    7.876218] CR2: ffffb1f120cdf000
[2021-11-23 21:05:55] [    7.876227] ---[ end trace 36c4552e098fcc4e ]---
[2021-11-23 21:05:55] [    7.876239] RIP: 0010:vcn_v2_0_sw_fini+0x10/0x40 [amdgpu]
[2021-11-23 21:05:55] [    7.876400] Code: 66 f0 83 c2 81 c6 ea 05 00 00 31 c9 4c 89 cf e9 b6 4d ee ff 66 0f 1f 44 00 00 0f 1f 44 00 00 55 48 8b 87 38 17 01 00 48 89 fd <c7> 00 00 00 00 00 e8 d5 d5 f1 ff 48 89 ef e8 2d 20 ff ff 85 c0 74
[2021-11-23 21:05:55] [    7.876439] RSP: 0018:ffffb1f1002cfc30 EFLAGS: 00010206
[2021-11-23 21:05:55] [    7.876451] RAX: ffffb1f120cdf000 RBX: ffff8b4d9a675620 RCX: 0000000000000000
[2021-11-23 21:05:55] [    7.876467] RDX: 000000000000000e RSI: 0000000000000003 RDI: ffff8b4d9a660000
[2021-11-23 21:05:55] [    7.876483] RBP: ffff8b4d9a660000 R08: 000000000000000f R09: 000000008010000f
[2021-11-23 21:05:55] [    7.876500] R10: 0000000040000000 R11: 000000001b99d000 R12: ffff8b4d9a675590
[2021-11-23 21:05:55] [    7.876515] R13: ffff8b4d9a676400 R14: 000000000000000c R15: ffff8b4d813ef36c
[2021-11-23 21:05:55] [    7.876533] FS:  000073bc16d48380(0000) GS:ffff8b4dbcc00000(0000) knlGS:0000000000000000
[2021-11-23 21:05:55] [    7.876551] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[2021-11-23 21:05:55] [    7.876565] CR2: ffffb1f120cdf000 CR3: 0000000004160000 CR4: 0000000000350ef0
[2021-11-23 21:05:55] [    7.876582] Kernel panic - not syncing: Fatal exception
[2021-11-23 21:05:55] [    7.877654] Kernel Offset: 0x1000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)

in stubdom:

[2021-11-23 21:05:55] qemu[195]: segfault at 0 ip 00005caaf4d1a060 sp 00007fffa06b82b8 error 4 in qemu[5caaf4a9f000+3e9000]
[2021-11-23 21:05:55] Code: 48 8b 4c 24 20 e8 e0 3b 0f 00 48 83 c4 20 e9 a4 fe ff ff 0f 1f 80 00 00 00 00 48 8b 07 48 8b 00 48 8b 00 c3 66 0f 1f 44 00 00 <48> 8b 07 c3 66 66 2e 0f 1f 84 00 00 00 00 00 90 48 8b 07 0f b6 40

The kernel crash does not seem too deep: the IP blocks are initialized in order, and a PSP init failure makes the driver stop and clean up. The crash happens in vcn_v2_0_sw_fini, when it dereferences a fw_shared_cpu_addr pointer that was set during VCN init. The pointer is non-NULL when the fault occurs, so could this be a use-after-free?
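
One cheap check on an oops like this is whether the faulting address (CR2) matches one of the general-purpose registers, which hints at which pointer was being dereferenced. Below is a minimal sketch, assuming the register dump format shown above; oops_reg is a hypothetical helper name, not an existing tool.

```sh
#!/bin/sh
# Print the 16-digit hex value of register $1 (e.g. RAX, CR2), taken from the
# first line of an oops on stdin that mentions it.
oops_reg() {
    sed -n 's/.*[[:space:]]'"$1"': \([0-9a-f]\{16\}\).*/\1/p' | head -n 1
}

# Usage (oops.txt is a hypothetical file holding the kernel register dump):
#   fault=$(oops_reg CR2 < oops.txt)
#   ptr=$(oops_reg RAX < oops.txt)
#   [ "$fault" = "$ptr" ] && echo "faulting access went through RAX"
```

In the dump above, both CR2 and RAX read ffffb1f120cdf000, so the faulting access went through the pointer held in RAX.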

Quite some things to investigate and try next:

  • check if that bug still happens in 5.15/5.16rc; if still there use this occasion to play with KASAN – but it may not be that much of a blocker if I can…
  • … avoid use of PSP (move away _ta and _asd firmwares, or use module params ip_block_mask or fw_load_type)
  • check whether the suspect-looking points in my previous post have an impact here

Still there in 5.15.4.

In this direction…

  • renaming firmware files: they’re not optional, so that causes an early psp IP init failure (early enough that no vcn init/fini is run, thus no panic, but no help either)
  • options amdgpu fw_load_type=1 in /etc/modprobe.d/ (supposed to force firmware load to go through smu instead of psp) seems to be ignored (and in fact the code shows it is ignored, only 0 can change anything)
  • options amdgpu ip_block_mask=0xfff7 to disable the PSP, OTOH, does have an impact: the psp is not initialized (though several components still claim they’ll use it), which changes the error but still ends up in the kernel panic path:
[2021-11-25 23:30:22] [    3.855687] [drm] sw_init of IP block <vega10_ih>...
[2021-11-25 23:30:22] [    3.856832] [drm] sw_init of IP block <smu>...
[2021-11-25 23:30:22] [    3.856864] [drm] sw_init of IP block <gfx_v9_0>...
[2021-11-25 23:30:22] [    3.865352] [drm] sw_init of IP block <sdma_v4_0>...
[2021-11-25 23:30:22] [    3.865439] [drm] sw_init of IP block <dm>...
[2021-11-25 23:30:22] [    3.865880] [drm] Loading DMUB firmware via PSP: version=0x01010019
[2021-11-25 23:30:22] [    3.865905] [drm] sw_init of IP block <vcn_v2_0>...
[2021-11-25 23:30:22] [    3.868761] [drm] Found VCN firmware Version ENC: 1.14 DEC: 5 VEP: 0 Revision: 20
[2021-11-25 23:30:22] [    3.868804] amdgpu 0000:00:05.0: amdgpu: Will use PSP to load VCN firmware
[2021-11-25 23:30:22] [    3.936773] [drm] sw_init of IP block <jpeg_v2_0>...
[2021-11-25 23:30:22] [    3.940481] amdgpu 0000:00:05.0: amdgpu: SMU is initialized successfully!
[2021-11-25 23:30:22] [    3.943960] [drm] kiq ring mec 2 pipe 1 q 0
[2021-11-25 23:30:22] [    4.106258] input: dom0: Power Button as /devices/virtual/input/input7
[2021-11-25 23:30:22] [    4.109534] input: dom0: Power Button as /devices/virtual/input/input8
[2021-11-25 23:30:22] [    4.109748] input: dom0: Video Bus as /devices/virtual/input/input9
[2021-11-25 23:30:22] [    4.109877] input: dom0: AT Translated Set 2 keyboard as /devices/virtual/input/input10
[2021-11-25 23:30:22] [    4.110764] input: dom0: ELAN2203:00 04F3:30AA Mouse as /devices/virtual/input/input11
[2021-11-25 23:30:22] [    4.131382] amdgpu 0000:00:05.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
[2021-11-25 23:30:22] [    4.131566] [drm:amdgpu_gfx_enable_kcq.cold [amdgpu]] *ERROR* KCQ enable failed
[2021-11-25 23:30:22] [    4.131761] [drm:amdgpu_device_ip_init [amdgpu]] *ERROR* hw_init of IP block <gfx_v9_0> failed -110
[2021-11-25 23:30:22] [    4.131953] amdgpu 0000:00:05.0: amdgpu: amdgpu_device_ip_init failed
[2021-11-25 23:30:22] [    4.131968] amdgpu 0000:00:05.0: amdgpu: Fatal error during GPU init
[2021-11-25 23:30:22] [    4.145153] input: dom0: ETPS/2 Elantech Touchpad as /devices/virtual/input/input12
[2021-11-25 23:30:22] [    4.149031] input: dom0: ELAN2203:00 04F3:30AA Touchpad as /devices/virtual/input/input13
[2021-11-25 23:30:22] [    4.160053] input: dom0: Sleep Button as /devices/virtual/input/input14
[2021-11-25 23:30:22] [    4.243266] amdgpu 0000:00:05.0: amdgpu: amdgpu: finishing device.
[2021-11-25 23:30:22] [    4.256416] amdgpu: probe of 0000:00:05.0 failed with error -110
[2021-11-25 23:30:22] [    4.256443] [drm] sw_fini of IP block <jpeg_v2_0>...
[2021-11-25 23:30:22] [    4.256466] [drm] sw_fini of IP block <vcn_v2_0>...
[2021-11-25 23:30:22] [    4.256482] BUG: unable to handle page fault for address: ffffbaa420cdf000
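
For reference, here is a minimal sketch of how an ip_block_mask value like the one above can be derived. It assumes that bit i of the mask enables the i-th IP block in the driver's list, and that the list for this APU is the one hard-coded below. That order is only inferred from the sw_init/sw_fini lines in these logs, with index 0 assumed to be the SoC "common" block. ip_block_mask here is a hypothetical shell helper, not something shipped with the driver.

```sh
#!/bin/sh
# ASSUMPTION: block order inferred from the logs in this thread (Renoir APU);
# index 0 is assumed to be the SoC "common" block, which the logs do not show.
BLOCKS="common gmc_v9_0 vega10_ih psp smu gfx_v9_0 sdma_v4_0 dm vcn_v2_0 jpeg_v2_0"

# Print a mask with one bit set per block in BLOCKS, except for the blocks
# named as arguments, whose bits are left cleared.
ip_block_mask() {
    mask=0
    i=0
    for b in $BLOCKS; do
        keep=1
        for off in "$@"; do
            if [ "$b" = "$off" ]; then
                keep=0
            fi
        done
        if [ "$keep" -eq 1 ]; then
            mask=$((mask | (1 << i)))
        fi
        i=$((i + 1))
    done
    printf '0x%x\n' "$mask"
}

# Usage:
#   ip_block_mask psp                  # mask with only bit 3 cleared
#   ip_block_mask vcn_v2_0 jpeg_v2_0   # mask with bits 8 and 9 cleared
```

With this 10-entry list, clearing psp gives 0x3f7. That clears the same bit 3 as 0xfff7; under the assumption above, the extra high bits in 0xfff7 have no block to act on.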

FWIW, kernel 5.14.15 gave me problems in dom0, resulting in a bootloop.

Seemed to be amdgpu related. The issue was present in 5.14.15 and 5.14.16, but not in 5.14.17.

Ref: dom0 boot loop with kernel-latest-5.14.15 · Issue #7089 · QubesOS/qubes-issues · GitHub

That must be an ASIC-specific issue then: no such issue with the RENOIR. However, I still have my NAVI14 dGPU (RX 5500 M) disabled because of a boot loop too.

Since the kernel panic (which induces a qemu crash and forces me to power down) is linked to VCN, let’s check what happens when we disable this non-essential IP (and the equally non-essential jpeg one while I’m at it), with amdgpu.ip_block_mask=0xff. More IPs get finalized, and we then hit a new problem:

[2021-11-28 13:54:36] <4>[    7.604916] amdgpu: probe of 0000:00:05.0 failed with error -22
[2021-11-28 13:54:36] <6>[    7.605226] [drm] sw_fini of IP block <dm>...
[2021-11-28 13:54:36] <6>[    7.605252] [drm] sw_fini of IP block <sdma_v4_0>...
[2021-11-28 13:54:36] <6>[    7.605275] [drm] sw_fini of IP block <gfx_v9_0>...
[2021-11-28 13:54:36] <4>[    7.605426] ------------[ cut here ]------------
[2021-11-28 13:54:36] <4>[    7.605437] WARNING: CPU: 1 PID: 278 at drivers/gpu/drm/ttm/ttm_bo.c:409 ttm_bo_release+0x2d1/0x300 [ttm]
[2021-11-28 13:54:36] <4>[    7.605465] Modules linked in: intel_rapl_msr intel_rapl_common crct10dif_pclmul crc32_pclmul crc32c_intel joydev ghash_clmulni_intel serio_raw pcspkr amdgpu(+) ip6table_filter ip6table_mangle ip6table_raw ip6_tables ipt_REJECT nf_reject_ipv4 iommu_v2 gpu_sched xt_state xt_conntrack i2c_algo_bit drm_ttm_helper ttm iptable_filter iptable_mangle drm_kms_helper iptable_raw xt_MASQUERADE cec ehci_pci ata_generic ehci_hcd i2c_piix4 iptable_nat nf_nat nf_conntrack pata_acpi nf_defrag_ipv6 nf_defrag_ipv4 xen_scsiback target_core_mod xen_netback uinput xen_privcmd xen_gntdev xen_gntalloc drm xen_blkback fuse xen_evtchn bpf_preload ip_tables overlay xen_blkfront
[2021-11-28 13:54:36] <4>[    7.605611] CPU: 1 PID: 278 Comm: systemd-udevd Not tainted 5.15.4-1.fc32.qubes.x86_64 #1
[2021-11-28 13:54:36] <4>[    7.605629] Hardware name: Xen HVM domU, BIOS 4.14.3 11/25/2021
[2021-11-28 13:54:36] <4>[    7.605644] RIP: 0010:ttm_bo_release+0x2d1/0x300 [ttm]
[2021-11-28 13:54:36] <4>[    7.605658] Code: 35 25 00 00 e9 83 fd ff ff e8 7b ae 33 f5 e9 bc fd ff ff 49 8b 7e 98 b9 30 75 00 00 31 d2 be 01 00 00 00 e8 f1 d2 33 f5 eb a2 <0f> 0b e9 50 fd ff ff e8 33 b4 33 f5 e9 fd fe ff ff be 03 00 00 00
[2021-11-28 13:54:36] <4>[    7.605693] RSP: 0018:ffffbd34002dbbe0 EFLAGS: 00010202
[2021-11-28 13:54:36] <4>[    7.605705] RAX: 0000000000000001 RBX: ffff9761144e92e0 RCX: 000000000000000f
[2021-11-28 13:54:36] <4>[    7.605720] RDX: 0000000000000001 RSI: ffffe8b5c0313200 RDI: ffff97610c4e79b8
[2021-11-28 13:54:36] <4>[    7.605736] RBP: ffff9761144e5270 R08: ffff97610c4e79b8 R09: ffffe8b5c0312d00
[2021-11-28 13:54:36] <4>[    7.605751] R10: 0000000000000000 R11: 0000000000000004 R12: ffff9761144f5a70
[2021-11-28 13:54:36] <4>[    7.605766] R13: ffff97610c4e7858 R14: ffff97610c4e79b8 R15: ffff976101bbf37c
[2021-11-28 13:54:36] <4>[    7.605782] FS:  00007203fc664380(0000) GS:ffff97613cd00000(0000) knlGS:0000000000000000
[2021-11-28 13:54:36] <4>[    7.605799] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[2021-11-28 13:54:36] <4>[    7.605811] CR2: 00007ca9fc42a000 CR3: 0000000008a2e000 CR4: 0000000000350ee0
[2021-11-28 13:54:36] <4>[    7.605828] Call Trace:
[2021-11-28 13:54:36] <4>[    7.605835]  <TASK>
[2021-11-28 13:54:36] <4>[    7.605843]  amdgpu_bo_unref+0x1a/0x30 [amdgpu]
[2021-11-28 13:54:36] <4>[    7.606024]  gfx_v9_0_sw_fini+0xca/0x1a0 [amdgpu]
[2021-11-28 13:54:36] <4>[    7.606180]  amdgpu_device_ip_fini.isra.0.cold+0x27/0x55 [amdgpu]
[2021-11-28 13:54:36] <4>[    7.606369]  amdgpu_device_fini_sw+0x16/0x100 [amdgpu]
[2021-11-28 13:54:36] <4>[    7.606514]  amdgpu_driver_release_kms+0x12/0x30 [amdgpu]
[2021-11-28 13:54:36] <4>[    7.606657]  devm_drm_dev_init_release+0x3d/0x60 [drm]
[2021-11-28 13:54:36] <4>[    7.606686]  devres_release_all+0xb8/0x100
[2021-11-28 13:54:36] <4>[    7.606700]  really_probe+0x100/0x310
[2021-11-28 13:54:36] <4>[    7.606710]  __driver_probe_device+0xfe/0x180
[2021-11-28 13:54:36] <4>[    7.606722]  driver_probe_device+0x1e/0x90
[2021-11-28 13:54:36] <4>[    7.606732]  __driver_attach+0xc0/0x1c0
[2021-11-28 13:54:36] <4>[    7.606741]  ? __device_attach_driver+0xe0/0xe0
[2021-11-28 13:54:36] <4>[    7.606753]  ? __device_attach_driver+0xe0/0xe0
[2021-11-28 13:54:36] <4>[    7.606763]  bus_for_each_dev+0x89/0xd0
[2021-11-28 13:54:36] <4>[    7.606773]  bus_add_driver+0x12b/0x1e0
[2021-11-28 13:54:36] <4>[    7.606782]  driver_register+0x8f/0xe0
[2021-11-28 13:54:36] <4>[    7.606791]  ? 0xffffffffc0db9000
[2021-11-28 13:54:36] <4>[    7.606800]  do_one_initcall+0x57/0x200
[2021-11-28 13:54:36] <4>[    7.606811]  do_init_module+0x5c/0x260
[2021-11-28 13:54:36] <4>[    7.606821]  __do_sys_finit_module+0xae/0x110
[2021-11-28 13:54:36] <4>[    7.802026]  do_syscall_64+0x3b/0x90
[2021-11-28 13:54:36] <4>[    7.802038]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[2021-11-28 13:54:36] <4>[    7.802051] RIP: 0033:0x7203fd605edd
[2021-11-28 13:54:36] <4>[    7.802061] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 6b 7f 0c 00 f7 d8 64 89 01 48
[2021-11-28 13:54:36] <4>[    7.802099] RSP: 002b:00007fffb0573118 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[2021-11-28 13:54:36] <4>[    7.802117] RAX: ffffffffffffffda RBX: 00006397ad8e6370 RCX: 00007203fd605edd
[2021-11-28 13:54:36] <4>[    7.802133] RDX: 0000000000000000 RSI: 00006397ad8e6f80 RDI: 0000000000000014
[2021-11-28 13:54:36] <4>[    7.802149] RBP: 0000000000020000 R08: 0000000000000000 R09: 00006397ad8e6fe0
[2021-11-28 13:54:36] <4>[    7.802166] R10: 0000000000000014 R11: 0000000000000246 R12: 00006397ad8e6f80
[2021-11-28 13:54:36] <4>[    7.802182] R13: 00006397ad8e0710 R14: 0000000000000000 R15: 00006397ad8e74f0
[2021-11-28 13:54:36] <4>[    7.802199]  </TASK>
[2021-11-28 13:54:36] <4>[    7.802206] ---[ end trace b49c9edf581387d3 ]---
[2021-11-28 13:54:36] <6>[    7.802286] [drm] sw_fini of IP block <smu>...
[2021-11-28 13:54:36] <6>[    7.802302] [drm] sw_fini of IP block <psp>...
[2021-11-28 13:54:36] <6>[    7.802332] [drm] sw_fini of IP block <vega10_ih>...
[2021-11-28 13:54:36] <6>[    7.802496] [drm] sw_fini of IP block <gmc_v9_0>...
[2021-11-28 13:54:36] <4>[    7.802519] ------------[ cut here ]------------
[2021-11-28 13:54:36] <4>[    7.802530] Memory manager not clean during takedown.

This one seems to point at a GPU memory management issue. Guess I’ll stop chasing those downstream crashes here; at least this one doesn’t crash qemu and spares me some reboots.
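
When comparing runs like the ones above, it can help to pull just the IP block sequence out of a log. Below is a minimal sketch, assuming the log contains "[drm] sw_init of IP block <name>..." and "[drm] sw_fini of IP block <name>..." lines in exactly the format shown in this thread; ip_blocks is a hypothetical helper name.

```sh
#!/bin/sh
# Read a kernel log on stdin and print the IP block names for one phase,
# one per line, in the order they appear.
# $1 is the phase to extract: sw_init or sw_fini.
ip_blocks() {
    sed -n 's/.*\[drm\] '"$1"' of IP block <\([^>]*\)>.*/\1/p'
}

# Usage (guest-dmesg.log is a hypothetical file name):
#   ip_blocks sw_init < guest-dmesg.log
#   ip_blocks sw_fini < guest-dmesg.log
```

Diffing the sw_fini output of two runs shows at a glance how far the teardown got before a warning or a crash.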

Progress has been slow, and is happening mostly on an amd-gfx thread. Only today did I see the guest amdgpu driver start up for the first time. This is a big step, but there are still a couple of glitches getting in the way of video output.
With a bit of luck, Santa may be only slightly late with this Christmas present :wink:

Damn this post and the linked/related ones are a great way to understand how things work under the hood ! ^^

Just a noob remark, have you tried blacklisting amdgpu in dom0 and assigning the device to xen-pciback ? I read nowhere that you tried it.
This would prevent dom0 and/or the driver from doing nasty things with your GPU before PT-ing !

Below is my working method for a RX580, maybe that works for you too ?

Some notes before
  • I know the RX580 is not an iGPU, and I’m using it in a Ryzen desktop CPU (Ryzen 1700X), and there are many things I don’t know, but this method may be of help to others
  • the RX580 card has no FLR, is on the primary x16 PCI slot, so it’s used for displaying BIOS POST and early kernel messages, then xen-pciback seizes it, and the display switches to my other GPU, fortunately an Nvidia (so no driver conflict).
  • those instructions are for a Debian-based dom0, please carefully adapt. I just started Qubes, so I don’t know the correct paths and don’t wanna say 5h!t ! ^^
  • the RX580 must NEVER leave the pci-assignable pool, or hell will fall on you.

Steps

1. Modules config

  • First ensure that /etc/modules or modprobe.d/ contains this
    (PS: it’s already done on Qubes, in /etc/sysconfig/modules/qubes-dom0.modules)
xen-pciback
  • In /etc/modprobe.d/atigpu-blacklist.conf (for Qubes /etc/sysconfig/modules/atigpu-blacklist.conf seems the right place)
blacklist amdgpu

As you also have an AMD dGPU, I think you need an extra step to reload the driver once the domU containing the iGPU is started, but I’ve not tested it : my setup uses a Nvidia GPU for dom0, so it’s easier.

2. initramfs config

  • Create a new script like /usr/share/initramfs-tools/scripts/init-top/zload_xen-pciback, and don’t forget to chmod +x zload_xen-pciback, it’s a sh script.
    PS: no idea where this script should be in Qubes !
#!/bin/sh
modprobe xen-pciback hide=\(0000:25:00.0\)\(0000:25:00.1\)
  • In /usr/share/initramfs-tools/scripts/init-top/udev
    PS: no idea where this script should be in Qubes !
# change
PREREQS=""
# to
PREREQS="zload_xen-pciback"
  • Last thing, don’t forget to regenerate your initramfs (this too I dunno how to do on Qubes/Fedora).
  • To correctly adapt the paths to Qubes, read the “credit link” below. In short, in Debian, initramfs scripts in /usr take precedence over initramfs scripts in /etc.
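
The hide= value in the script above is just each PCI address wrapped in parentheses, so it can be generated rather than typed by hand. Below is a minimal sketch; pciback_hide is a hypothetical helper name, and the addresses are the ones from this example, to be replaced with your own.

```sh
#!/bin/sh
# Build the value of xen-pciback's "hide" parameter from PCI addresses given
# as domain:bus:device.function (e.g. 0000:25:00.0), one per argument.
pciback_hide() {
    hide=
    for bdf in "$@"; do
        hide="$hide($bdf)"
    done
    printf '%s\n' "$hide"
}

# Usage, as root in the initramfs script:
#   modprobe xen-pciback "hide=$(pciback_hide 0000:25:00.0 0000:25:00.1)"
```

On Debian the initramfs is then regenerated with update-initramfs -u. After a reboot, xl pci-assignable-list in dom0 should list the hidden devices.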

3. End credits ^^

Voilà, I hope it works for you !
For more detailed explanations of how and why it works, and the credit for inspiration, check this link.

Sure, but I’m using pci-stub for this rather than xen-pciback.

Just noticed an interesting patch on amd-gfx, will have to get back to testing this soon: [PATCH] drm/amdgpu/gmc: use PCI BARs for APUs in passthrough