AMD iGPU passthrough attempt

yann · November 21, 2021, 11:05am

Thinking twice about it: I would have expected the physical addresses of the PCI devices to be protected from the OS by being declared in reserved regions. However, if we compare the ranges of the different BARs:

[2021-11-21 00:13:11] pci 0000:00:00.0: [1002:1636] type 00 class 0x030000
[2021-11-21 00:13:11] pci 0000:00:00.0: reg 0x10: [mem 0xb0000000-0xbfffffff 64bit pref]
[2021-11-21 00:13:11] pci 0000:00:00.0: reg 0x18: [mem 0xc0000000-0xc01fffff 64bit pref]
[2021-11-21 00:13:11] pci 0000:00:00.0: reg 0x20: [io  0xe000-0xe0ff]
[2021-11-21 00:13:11] pci 0000:00:00.0: reg 0x24: [mem 0xfe400000-0xfe47ffff]
[2021-11-21 00:13:11] pci 0000:00:00.0: reg 0x30: [mem 0x000c0000-0x000dffff pref]
[2021-11-21 00:13:11] pcifront pci-0: claiming resource 0000:00:00.0/0
[2021-11-21 00:13:11] pcifront pci-0: claiming resource 0000:00:00.0/2
[2021-11-21 00:13:11] pcifront pci-0: claiming resource 0000:00:00.0/4
[2021-11-21 00:13:11] pcifront pci-0: claiming resource 0000:00:00.0/5
[2021-11-21 00:13:11] pcifront pci-0: claiming resource 0000:00:00.0/6
[2021-11-21 00:13:11] pci 0000:00:00.0: can't claim BAR 6 [mem 0x000c0000-0x000dffff pref]: address conflict with Reserved [mem 0x000a0000-0x000fffff]

… with the map provided by the BIOS:

[    0.000000] BIOS-provided physical RAM map:
[    0.000000] Xen: [mem 0x0000000000000000-0x000000000009ffff] usable
[    0.000000] Xen: [mem 0x00000000000a0000-0x00000000000fffff] reserved
[    0.000000] Xen: [mem 0x0000000000100000-0x0000000009bfefff] usable
[    0.000000] Xen: [mem 0x0000000009bff000-0x0000000009ffffff] reserved
[    0.000000] Xen: [mem 0x000000000a000000-0x000000000a1fffff] usable
[    0.000000] Xen: [mem 0x000000000a200000-0x000000000a20cfff] ACPI NVS
[    0.000000] Xen: [mem 0x000000000a20d000-0x00000000a9eaafff] usable
[    0.000000] Xen: [mem 0x00000000a9eab000-0x00000000ab3c8fff] reserved
[    0.000000] Xen: [mem 0x00000000ab3c9000-0x00000000ab419fff] ACPI data
[    0.000000] Xen: [mem 0x00000000ab41a000-0x00000000ab786fff] ACPI NVS
[    0.000000] Xen: [mem 0x00000000ab787000-0x00000000ab787fff] reserved
[    0.000000] Xen: [mem 0x00000000ab788000-0x00000000ab98dfff] ACPI NVS
[    0.000000] Xen: [mem 0x00000000ab98e000-0x00000000ad5fefff] reserved
[    0.000000] Xen: [mem 0x00000000ad5ff000-0x00000000adffffff] usable
[    0.000000] Xen: [mem 0x00000000ae000000-0x00000000afffffff] reserved
[    0.000000] Xen: [mem 0x00000000f0000000-0x00000000f7ffffff] reserved
[    0.000000] Xen: [mem 0x00000000fd000000-0x00000000ffffffff] reserved
[    0.000000] Xen: [mem 0x0000000100000000-0x0000000155bc1fff] usable
[    0.000000] Xen: [mem 0x000000042f340000-0x00000004701fffff] reserved
[    0.000000] Xen: [mem 0x000000fd00000000-0x000000ffffffffff] reserved

… we can see that the 0xfe400000-0xfe47ffff range of BAR 5 does indeed intersect with the 0x00000000fd000000-0x00000000ffffffff reserved region, but the BAR 0 and BAR 2 ranges fall in an “undeclared” gap between two reserved regions (0xae000000-0xafffffff and 0xf0000000-0xf7ffffff). Yet that difference does not result in different handling of those 3 BARs, whose resources are apparently all successfully claimed.
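For anyone who wants to redo the comparison above without eyeballing hex ranges, here is a minimal Python sketch. The ranges are copied from the two logs quoted in this post; the helper names (`resource_index`, `classify`) are made up for illustration and are not kernel functions. It assumes the quoted map is the one seen by the kernel doing the claiming, which is exactly what is in doubt here, so treat the result as a check of the arithmetic only.

```python
# Memory map entries copied from the "BIOS-provided physical RAM map" log
# above, as (start, end, type) with inclusive bounds.
E820 = [
    (0x0000000000, 0x000009ffff, "usable"),
    (0x00000a0000, 0x00000fffff, "reserved"),
    (0x0000100000, 0x0009bfefff, "usable"),
    (0x0009bff000, 0x0009ffffff, "reserved"),
    (0x000a000000, 0x000a1fffff, "usable"),
    (0x000a200000, 0x000a20cfff, "ACPI NVS"),
    (0x000a20d000, 0x00a9eaafff, "usable"),
    (0x00a9eab000, 0x00ab3c8fff, "reserved"),
    (0x00ab3c9000, 0x00ab419fff, "ACPI data"),
    (0x00ab41a000, 0x00ab786fff, "ACPI NVS"),
    (0x00ab787000, 0x00ab787fff, "reserved"),
    (0x00ab788000, 0x00ab98dfff, "ACPI NVS"),
    (0x00ab98e000, 0x00ad5fefff, "reserved"),
    (0x00ad5ff000, 0x00adffffff, "usable"),
    (0x00ae000000, 0x00afffffff, "reserved"),
    (0x00f0000000, 0x00f7ffffff, "reserved"),
    (0x00fd000000, 0x00ffffffff, "reserved"),
    (0x0100000000, 0x0155bc1fff, "usable"),
    (0x042f340000, 0x04701fffff, "reserved"),
    (0xfd00000000, 0xffffffffff, "reserved"),
]

# Memory resources of 0000:00:00.0, keyed by the config-space register
# offset printed in the "reg 0x..:" lines (the I/O BAR at 0x20 is skipped).
MEM_BARS = {
    0x10: (0xb0000000, 0xbfffffff),
    0x18: (0xc0000000, 0xc01fffff),
    0x24: (0xfe400000, 0xfe47ffff),
    0x30: (0x000c0000, 0x000dffff),
}


def resource_index(reg):
    """Map a register offset to the index used in the 'claiming resource' lines."""
    if reg == 0x30:
        return 6  # expansion ROM, reported as "BAR 6" in the conflict message
    return (reg - 0x10) // 4


def classify(start, end):
    """Return the types of all map entries overlapping [start, end], or 'undeclared'."""
    types = {t for (s, e, t) in E820 if start <= e and s <= end}
    return ", ".join(sorted(types)) or "undeclared"


for reg, (start, end) in sorted(MEM_BARS.items()):
    print(f"resource {resource_index(reg)}: {start:#x}-{end:#x} -> {classify(start, end)}")
```

With these values, resources 0 and 2 come out as undeclared while resources 5 and 6 both land entirely inside a reserved entry, which matches the reading above: going by this map alone, BAR 5 and the ROM are in the same situation, yet only the ROM is reported as a conflict.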

@marmarek, do you see why BAR 5 would not be detected as a conflict by request_resource_conflict()? The most obvious explanation would be that the stubdom’s memory map does not match the dom0 one (which I guess the host_e820 trick would correct if it were supported for HVM), but the stubdom kernel is very quiet and does not report its view of the map. Its kconfig does not show a change in default loglevel, and its cmdline is reported as empty, so there’s definitely no quiet flag there. How then is it so quiet?

I also note that pci_claim_resource() would not have to make such a check if we had a shadow copy of the ROM at this point. Could this be a path worth investigating?