Kernel panic during installation on Lenovo Thinkpad P16s Gen 2 (AMD 7840U with 780M iGPU)

The first installation attempt with Qubes release 4.1.2 printed 4 or 5 lines of output and then hung for minutes with a blinking cursor.
The second installation attempt with release 4.2.0-rc4 produced this output.

A web search for the relevant parts “acpi_ev_address_space_dispatch” and “__die_body.cold” led me to

It was stated there: “Try --cpus max_phys_bits=39 or 38. We place our MMIO regions at the end of the guest physical address space - we have seen some CPUs from AMD wrongly report the address space size.”

My interpretation of ‘guest physical address-space’ is the CPU’s total physical address space of 48 bits minus the bits used for MMIO, which is reported by the CPU microcode to the kernel function acpi_ev_address_space_dispatch().
I assume that my AMD CPU microcode contains a similar bug, even though it is a modern Zen 4 architecture. Is there a chance to work around the bug in Xen or the Linux kernel via command-line arguments in the GRUB boot manager, similar to the KVM hypervisor command-line argument ‘--cpus max_phys_bits=’?
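For reference, the arithmetic behind the quoted max_phys_bits hint can be sketched in a few lines of Python. This is only an illustration: the helper names are made up here, and the sample line mimics the "address sizes" field that Linux prints in /proc/cpuinfo on x86. It says nothing about what Xen or the ACPI code actually does with the value.

```python
import re

def parse_address_sizes(cpuinfo_text):
    """Return (physical_bits, virtual_bits) from /proc/cpuinfo-style text, or None."""
    match = re.search(
        r"address sizes\s*:\s*(\d+) bits physical, (\d+) bits virtual",
        cpuinfo_text,
    )
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))

def addressable_gib(bits):
    """Size in GiB of an address space that is `bits` wide."""
    return 2 ** bits / 2 ** 30

# Sample line, made up for illustration; on a running Linux system
# the real value can be read from /proc/cpuinfo.
sample = "address sizes\t: 48 bits physical, 48 bits virtual\n"
phys_bits, virt_bits = parse_address_sizes(sample)

# Compare the reported width with the reduced widths from the quoted hint.
for bits in (phys_bits, 39, 38):
    print(f"{bits} bits -> {addressable_gib(bits):,.0f} GiB")
```

Each bit removed halves the addressable range, which is why the quoted hint tries 39 and then 38.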

Hi,

According to your picture, you already have the latest BIOS for your model.
(Just writing this for possible future readers:
You could try to update your BIOS to the latest version.)

In the link you posted, the ACPI problem was solved by adding the kernel option acpi=off.
The --cpus max_phys_bits= option is for the next problem that user encountered, with a virtio device.

This post might be useful: https://forum.qubes-os.org/t/thinkpad-t16-amd-ryzen-acpi-issue-no-keyboard-no-touchpad/21685

It reports that ACPI must be disabled (acpi=off) to install Qubes.
It also links to another forum topic,
this one: https://forum.qubes-os.org/t/my-adventures-with-qubes-4-1-1-on-a-lenovo-t14-gen-3/14370

Maybe that will help you.

Hi,
that hint was helpful. Thanks!
My first attempt to install with acpi=off and the ‘kernel-latest’ boot option resulted in an endless loop of
‘watchdog: BUG: soft lockup - CPU#0 stuck for 25s! [init:1]’, counting up by 25s per loop.
The second attempt, choosing the first GRUB boot option, got me to the graphical install screen.
Only the Linux kernel parameter acpi=off was needed.
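
For future readers, here is roughly where the parameter goes. In the installer's GRUB entry (press "e" to edit it), Xen and the dom0 Linux kernel have separate lines, and acpi=off belongs on the Linux kernel line. The Python sketch below only illustrates that on a made-up entry; the paths and existing options are assumptions, and the entry on your ISO may look different.

```python
def add_kernel_param(grub_entry, param):
    """Append `param` to the Linux kernel line of a Xen-style GRUB entry.

    Assumes the kernel line is the first `module2` line that mentions
    vmlinuz; the Xen line (`multiboot2 ... xen.gz`) is left untouched.
    """
    lines = grub_entry.splitlines()
    for i, line in enumerate(lines):
        words = line.split()
        if words and words[0] == "module2" and "vmlinuz" in line:
            if param not in words:  # do not add the parameter twice
                lines[i] = line.rstrip() + " " + param
            break
    return "\n".join(lines)

# Hypothetical installer entry, for illustration only.
entry = """\
multiboot2 /images/pxeboot/xen.gz console=none
module2 /images/pxeboot/vmlinuz quiet
module2 /images/pxeboot/initrd.img"""

print(add_kernel_param(entry, "acpi=off"))
```

The point of the sketch is the distinction between the two lines: options on the multiboot2 line go to Xen, options on the vmlinuz line go to the Linux kernel.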

I’m impressed that you hit the mark. Before posting here I tried several Xen and kernel parameters suggested in the Qubes OS User Support forum, without success.

In order to proceed I first have to obtain a USB keyboard/mouse, as the internal devices are not functional…

@methusalix I’m glad you were able to progress!

A little pro tip: I marked @szz9pza’s post as the solution, so future folks (and our future selves!) can see that the topic includes a solution, and get a highlight of that solution in the first post.

If you want to do that in the future, it’s something that you can do yourself by using the little “checkbox” icon at the bottom of the post that you want to mark as the solution. Welcome to the forum! :slightly_smiling_face:

Hi,

I’m encountering the same issue (same kernel logs) on 4.2.0 with the latest kernel the installer ISO has (6.7 IIRC), latest UEFI/BIOS firmware from Lenovo, on:

Thinkpad Z16 Gen 2
AMD Ryzen™ PRO 7040 H + Radeon Graphics

Same issue with Qubes 4.1.2

Blacklisting the ucsi_acpi module in the kernel params allowed me to boot, install and set up Qubes, and use the keyboard and trackpoint (no touchpad), but the machine restarts intermittently when removing USB devices; it’s quite unstable. Also, since the Z16 has the “fake” trackpoint buttons in the touchpad, it’s still unusable, because the clicking feel comes from haptic feedback in the touchpad, which does not work if you blacklist ucsi_acpi.

The only improvement over the posts above is that blacklisting only ucsi_acpi allowed me to use the built-in keyboard and trackpoint, and the installation completed successfully.

Plain Fedora 39 works fine on these machines. Installing Xen 4.18 on plain Fedora 39 from the rawhide repos and booting it gives this same kernel panic, so it seems like a Xen issue?

@methusalix Did you manage to get a usable machine? (internal keyboard/touchpad working)

Any idea where the issue stems from exactly or an ETA? Still in the return window for the machine.

Hi,

No, I did not manage to get a working Qubes installation. Although the setup finished without error messages, the Qubes boot process fails. The kernel log shows
“pciback … can’t find IRQ for PCI INT D …
pciback … can’t find IRQ for PCI INT A …
pciback … can’t find IRQ for PCI INT B …”

Fedora 39 works fine on my machine too.

Whenever Lenovo releases new BIOS updates I’ll start new installation attempts. No progress up to now.

Hi,
I am trying a Lenovo Z16 gen2 (Ryzen 9 pro). Installation of 4.3rc1 works flawlessly without acpi=off or any other “trick” (from 64GB USB stick) .
First boot after installation also flawless - but then it doesn’t start the “finish setup” process and ends up with a system with no templates or qubes except dom0. That part works well - trackpad, sleep, reboot, display - all good.
Started “initial setup” manually but it hangs on trying to install templates. Obviously - where would it get them from - since there is no network and USB stick is removed.
Plugged the stick back in and tried initial setup again. It does some stuff but then complains that LVM already exists. At this point, the system does seem to have at least a Debian template installed, so there is functionality.
Tried installation again (with latest kernel) - this time didn’t remove USB stick for restart. It seems to install and reboot properly but then hangs on black screen. Tried rebooting - same - and heats up like mad. Shouldn’t do more than one change at a time (new kernel + leave USB in).
Installed again. (Each time, I “reclaim” all disk-space of the 4TB built in nvme and use automatic partitioning)
Install with standard kernel and leaving USB stick in for reboot process (but obviously booting installed system).
→ Boot, disk decrypt, user login, all good - but no initial-setup start.
Starting manually - leaving all options default. Hangs on Installing TemplateVMdebian-13-xfce - fans running up and the machine is getting hotter.
But wait - and wait some more - it doesn’t hang, just takes a very long time (and uses a lot of compute resources it seems). Too much heat will kill any laptop, they’re not servers. But let’s see. It’s now installing one template every couple of minutes or so. Patience young Skywalker.
Ok- it ends up “failed” - I should check /var/log/salt/minion:
Got empty response from qubesd - and some tracebacks.
It started a bunch of qubes, including sys-net etc. - so maybe we are good to go.
If I can get completely up and running, I will file all the details from that machine, including logs, hardware etc.
(Just posting this here in case anybody cares, don’t expect specific feedback unless somebody has other proposals to try)
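
In case somebody wants to do the same check, here is a small Python sketch for pulling the interesting lines out of /var/log/salt/minion. It just looks for the strings ERROR, Traceback and qubesd; that choice of keywords is my assumption, not something the log format guarantees, and the sample text below is made up.

```python
def interesting_lines(log_text, keywords=("ERROR", "Traceback", "qubesd")):
    """Return (line_number, line) pairs for log lines containing any keyword."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if any(keyword in line for keyword in keywords):
            hits.append((number, line))
    return hits

# Made-up sample. On a real system, read the log file in dom0 instead
# (root is usually required), e.g. open("/var/log/salt/minion").read().
sample_log = """\
starting up
[ERROR   ] Got empty response from qubesd
Traceback (most recent call last):
  File "example.py", line 1, in <module>
done"""

for number, line in interesting_lines(sample_log):
    print(f"{number}: {line}")
```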

There is an option in Initial Setup to “Use existing LVM thin pool”, instead of creating a new one.

Thanks. The problem I have, in general, is that I don’t know whether something in Qubes is as intended (e.g. no wifi access) or is caused by a problem. After using Linux exclusively for >30 years (and UNIX before that), I usually know what to look for in Ubuntu or the like, but Qubes is a different animal. Maybe there is no wifi because there is not supposed to be wifi, but how do I know?

I have read the notes about AMD (“not recommended”) and it’s saddening. I was always an AMD fan because they acted less monopolistic than Intel. But I guess there really are too many issues. The interaction between BIOS and higher levels (OS) and even on/off key I really see as a problem. “Absolute persistence” in the BIOS is also a red flag as well as the inclusion of multiple Micros*** hooks and camera, microphone and all that without any physical switches. I know - physical switches, removable batteries- all a thing of times long gone.
Instead we now have “Absolute Persistence” in our BIOS which is reprogrammable from within the OS! Nice, somebody read Joanna’s paper of ten (or so) years ago (and did the opposite, just in case you weren’t getting my sarcasm)

I followed the advice in another comment and tried the release version (4.2) again instead, but I get the usual kernel panic. acpi=off fixes that, but then I am unable to do much because not much works. I tried several other options (iommu etc.), but to no avail.

It should work out of the box if there is no problem.
I guess you’re using the stable kernel from dom0 for your qubes, and the Debian template.
Newer hardware may not be supported by the stable kernel and the stable firmware package used in the Debian template of Qubes OS 4.2.4.
Use a Fedora template for your sys-net qube and set sys-net to use the in-VM kernel. Fedora includes a newer firmware package and has a newer in-VM kernel version.

It’s been a while since I looked at the initial setup, but as I recall, the first part of the installer should copy the files needed to install the templates to the drive … and as you’ve noticed, installing templates can take a looong time … :-/

:slight_smile:

You can try using module_blacklist=ucsi_acpi instead of acpi=off.
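
If it helps, a quick way to confirm after boot that the option took effect is to look at /proc/cmdline and /proc/modules in dom0. A minimal Python sketch, assuming the usual formats of those two files; the function names are just for illustration and the sample strings are made up:

```python
def blacklisted_modules(cmdline):
    """Collect module names listed in module_blacklist= options on a command line.

    How the kernel treats a repeated option is not modelled here; this
    simply gathers every name it sees.
    """
    names = set()
    for word in cmdline.split():
        if word.startswith("module_blacklist="):
            value = word.split("=", 1)[1]
            names.update(name for name in value.split(",") if name)
    return names

def loaded_modules(proc_modules_text):
    """Return the names of loaded modules (first column of /proc/modules)."""
    return {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }

# Made-up samples; on a real system read /proc/cmdline and /proc/modules.
cmdline = "root=/dev/mapper/qubes_dom0-root ro module_blacklist=ucsi_acpi quiet"
modules = (
    "typec_ucsi 65536 0 - Live 0x0000000000000000\n"
    "typec 114688 1 typec_ucsi, Live 0x0000000000000000\n"
)

print("blacklisted on command line:", "ucsi_acpi" in blacklisted_modules(cmdline))
print("currently loaded:", "ucsi_acpi" in loaded_modules(modules))
```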

Thanks ChrisA and MellowPoison!
I’ll try those, but I have to rethink the whole decision. In the process I have bumped into multiple issues (e.g. BIOS updates, “secure boot”…) which call the whole concept into question. I have to do some more reading first. Joanna wrote some good papers years ago; I wonder if I can find something similar about up-to-date aspects.

Great - this has worked and I am one step further. Next error is in initial-setup:
Failed to install template whonix-workstation.17
/etc/qubes-rpc/qubes.TemplateSearch
and then a few more errors later on which are most likely consequences of the first one. Will search for it