X1 Extreme Gen 5 Suspend issues

I did a fresh install of Qubes v4.1.1 on my X1 Extreme Gen 5 (i7-12700H, RTX 3050 Ti, BIOS v1.12). For this, I:

  • disabled Secure Boot
  • enabled the 3rd Party UEFI CA (in the hope of enabling Secure Boot later)
  • disabled hyperthreading
  • enabled discrete graphics (despite syspacket telling me otherwise in the “Thinkpad X1 Extreme Gen 5” thread :slight_smile:)

The installation went fine (the display too, with no tearing, despite kernel 5.15.52-1), and after an update via a WiFi USB adapter (where I had to disable MAC cloning… gee, so many hurdles!), it seemed as if everything (including WiFi) was working.

But now I have run into the same issue as so many others already: a hot laptop on idle, which won’t come back out of suspend.

What I tried:

  • updated to kernel-latest (currently: 6.0.12)
  • appended mem_sleep_default=deep to GRUB_CMDLINE_LINUX (and ran grub2-mkconfig -o /boot/efi/EFI/qubes/grub.cfg)
  • checked xenpm get-cpufreq-para for turbo mode :white_check_mark: (which seemed fine, except for failures on the later CPUs, possibly the efficiency cores: “CPU[14…31] failed to get cpufreq parameter”) and checked scaling while running a benchmark :white_check_mark: (xenpm start 1 | grep "Avg freq"), as per “MSI Pro Z690-A WiFi DDR4 with Alder Lake 12900K - #2 by tzwcfq”
  • ran xenpm set-scaling-governor performance (which again failed on the later CPUs, possibly the efficiency cores: “CPU[14…31] failed to get cpufreq parameter”), as per “Cannot get locally built xen packages to install in dom0 - #14 by tzwcfq”
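
The GRUB step above can be sketched as a small helper. This is a minimal sketch, not a tested procedure for this laptop: the helper name add_kernel_param is made up for illustration, and it assumes GNU sed and a file that contains a single double-quoted GRUB_CMDLINE_LINUX="..." line, as /etc/default/grub usually does.

```shell
#!/bin/sh
# add_kernel_param FILE PARAM  (hypothetical helper, not part of Qubes or GRUB)
# Appends PARAM to the double-quoted GRUB_CMDLINE_LINUX line in FILE,
# unless that line already contains it.
add_kernel_param() {
    file=$1
    param=$2
    # Skip the edit when the parameter is already on the line.
    if grep -q "^GRUB_CMDLINE_LINUX=.*${param}" "$file"; then
        return 0
    fi
    # Insert the parameter just before the closing quote (GNU sed -i assumed).
    sed -i "s|^GRUB_CMDLINE_LINUX=\"\(.*\)\"|GRUB_CMDLINE_LINUX=\"\1 ${param}\"|" "$file"
}
```

In dom0 this would be run as root against /etc/default/grub, followed by the grub2-mkconfig command quoted above. After a reboot, cat /proc/cmdline should list the parameter if the change took effect.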

Any help to get my new baby on its feet is greatly appreciated!

Are you using S3 or S0ix?

S3 should work on some ADL boards, but S0ix is currently not supported by Qubes.

A hot laptop that won’t come back out of suspend is a sign that it has entered the S0ix sleep state, and Qubes currently cannot handle that.

There is a possible way to address this issue: enter the BIOS and change “Config → Power → Sleep State” to “Linux”. If you cannot find such an option, then S3 is unavailable, and we can only blame the BIOS engineers of your laptop’s manufacturer for not implementing S3 sleep.
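
It may also help to check which suspend mode Linux currently selects. A rough sketch: /sys/power/mem_sleep lists the available modes and puts the selected one in square brackets, where deep corresponds to S3 and s2idle is suspend-to-idle (the mode associated with S0ix). The function name active_sleep_mode is made up for illustration, and whether dom0 under Xen reports exactly the same values as bare-metal Linux is an assumption.

```shell
#!/bin/sh
# active_sleep_mode TEXT  (hypothetical helper)
# Prints the bracketed entry from the contents of /sys/power/mem_sleep,
# e.g. "s2idle [deep]" gives "deep".
active_sleep_mode() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Example, only meaningful where the file exists:
# active_sleep_mode "$(cat /sys/power/mem_sleep)"
```

If deep does not appear in the file at all, the firmware is probably not offering S3 in its current configuration.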

Indeed, this solved the “not being able to suspend” part!
The laptop enters suspend, “cools down”, and wakes back up OK. :white_check_mark:

Unfortunately, even when no VMs (other than the required ones) are running, the laptop remains “hot” (that is what I meant by “on idle”: nothing is really producing load).

  • System Status shows: cpu:0%, mem:27%
  • Same with another BIOS setting changed: CPU Power Management → Off

Any further ideas, or should I mark this as solved and start a new thread?

Try setting smt=on and the scaling governor to powersave, and see if that changes anything.

Unfortunately, nothing changed: I can see the processors running at 400 MHz, but the laptop is still running hot.
Could it have anything to do with the undetected CPUs (CPU[14…31]), which respond to xenpm commands with “failed to get cpufreq parameter”?

My BIOS settings are as follows:

  • Core Multi-Processing: On
  • Efficient-cores Support: On
  • Hyper-Threading Technology: Off

If I turn off Core Multi-Processing (and with it Efficient-cores Support), I get a failure response from all remaining CPUs [CPU1-31]: “failed to set governor name (22 - Invalid argument)”. Heating persists…

You need to enable hyper-threading for all the CPUs to be detected. Try enabling both HTT and SMT and setting the CPU to powersave.

I remember reading something on GitHub about some ThinkPads having an issue with the disabled cores not being set to the correct power state.

If it doesn’t do anything, then just change back to the settings you have now; HTT and SMT off is correct.

Unfortunately, the result remains the same.
With Hyper-Threading enabled, an interesting batch of cores is not detected:
CPUs 1, 3, 5, 7, 9, 11, and then 20-31 result in “invalid argument”.

Is there anything further I can test or report? I’m surprised that @syspacket seems to be running that model (with an i7-12800H; mine is an i7-12700H) flawlessly…

Did you change the boot parameter from smt=off to smt=on?

Having smt=off does the same as disabling HTT: it disables all the sibling threads, which is why only every other P-core CPU is enabled. It does not affect the E-cores, as they are unable to hyper-thread.
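
For reference, smt is an option of the Xen hypervisor rather than of the Linux kernel, so it only takes effect on the Xen command line. Below is a minimal sketch of flipping it, assuming the option already sits on a GRUB_CMDLINE_XEN_DEFAULT line in /etc/default/grub (an assumption about this install) and that GNU sed is available. The helper name set_xen_smt is made up for illustration.

```shell
#!/bin/sh
# set_xen_smt FILE VALUE  (hypothetical helper; VALUE is "on" or "off")
# Rewrites an existing smt=... token, but only on the
# GRUB_CMDLINE_XEN_DEFAULT line; other lines are left untouched.
set_xen_smt() {
    file=$1
    value=$2
    sed -i "/^GRUB_CMDLINE_XEN_DEFAULT=/ s/smt=[a-z]*/smt=${value}/" "$file"
}
```

As with any change to /etc/default/grub, grub.cfg has to be regenerated with grub2-mkconfig and the machine rebooted before the new value applies.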

I had it set on the wrong line :slight_smile:
Unfortunately, that still didn’t have the desired effect:

  • I now have CPU [20-31] not being recognized
  • Laptop stays hot

I ran sudo xl dmesg, which brought up, specifically:

  • parameter “no-real-mode” unknown!
  • Unrecognized CPU model 0x9a - assuming vulnerable to LazyFPU
  • Some “Unknown cachability for MFNs” messages
  • Brought up 20 CPUs
  • CPUx: Temperature above threshold → running in modulated clock mode

I also tried setting dom0_max_vcpus=32, which didn’t do anything either.

But while typing (which is extremely slow: the cursor moves second by second), I noticed that the laptop heats up even in boot mode…?!

I also get the no-real-mode and 0x9a messages. I don’t think they matter; I’m not sure about the others.

I don’t think what you are trying to do makes much sense.

You have 20 CPUs, but you are trying to allocate 32 to dom0 alone, which is probably also why you are experiencing performance issues.

CPUs 20-31 are not recognized because they don’t exist; you should have CPUs 0-19.

You also shouldn’t use dom0_max_vcpus to allocate the maximum to dom0; you use it to limit the number of cores dom0 runs on.

I have the 12900K with 24 CPUs; here is their layout:
0-15 are the P-cores
16-23 are the E-cores
Your system should look similar with HTT and SMT enabled.

If you disable HTT and SMT, you lose half of the P-core CPUs, and it should look something like this:
0,2,4,6,8,10,12,14 are enabled
16-23 are enabled

The E-cores are always going to be enabled; they don’t have the ability to hyper-thread and are not affected by HTT/SMT.
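
The layout described above can be written down as a small sketch. It assumes, as in the 12900K example, that the P-core threads are numbered first with siblings adjacent (0/1, 2/3, and so on) and that the E-cores follow; numbering on other systems may differ. The function name enabled_cpus is made up for illustration.

```shell
#!/bin/sh
# enabled_cpus P_CORES E_CORES SMT  (hypothetical helper; SMT is "on" or "off")
# Prints the CPU numbers expected to be usable under the assumed numbering.
enabled_cpus() {
    p=$1
    e=$2
    smt=$3
    i=0
    out=""
    # P-cores: two threads each; with SMT off only the even-numbered one stays.
    while [ "$i" -lt $((p * 2)) ]; do
        if [ "$smt" = on ] || [ $((i % 2)) -eq 0 ]; then
            out="$out $i"
        fi
        i=$((i + 1))
    done
    # E-cores: one thread each, never affected by the SMT setting.
    while [ "$i" -lt $((p * 2 + e)) ]; do
        out="$out $i"
        i=$((i + 1))
    done
    printf '%s\n' "${out# }"
}

# enabled_cpus 8 8 off  ->  0 2 4 6 8 10 12 14 16 17 18 19 20 21 22 23
```

For the 12700H in this thread (6 P-cores, 8 E-cores), enabled_cpus 6 8 off lists 0, 2, 4, 6, 8, 10 and 12-19, which is consistent with CPUs 1, 3, 5, 7, 9 and 11 being the ones that returned “invalid argument”.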

This is normal, since your CPU has a total of 20 threads (6Px2+8E). CPU[0-19] are correctly recognized.

I suspect that Xen is expecting a 2^n number of CPU threads. I have 12 threads, and Xen reports CPU[12-15] as not recognized. Alternatively, Xen may use twice the maximum number of physical cores in your CPU model family as the expected number of threads.

Yes, I do get 20 CPUs, that’s correct :slight_smile:

These things were just a test to get my head around it.
The thing is, even with two idle CPUs (or one), I do not understand why a brand new laptop is struggling with heat problems.
My “old” T460 with Qubes 4.1 remains cold, even with 4 additional VMs running…

Is there anything else I can try? I would hate to let Qubes OS go…

You can run xenpm start 5 when your laptop is idle.

The command will give you an idea of which idle states your CPUs get into (C0, C1, C2, C3, etc.).

If most of the CPUs are in C0 or C1 most of the time, then it is no wonder your laptop heats up.
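
To put a number on this, the residency figures can be totalled. A rough sketch, assuming you have copied one CPU's figures from the xenpm output into lines of the form "state milliseconds". Both this input format and the name shallow_share are made up for illustration; xenpm does not print its results this way.

```shell
#!/bin/sh
# shallow_share  (hypothetical helper)
# Reads "STATE MILLISECONDS" lines on stdin and prints the percentage of
# the total time spent in C0 or C1, rounded down.
shallow_share() {
    awk '{ total += $2 }
         $1 == "C0" || $1 == "C1" { shallow += $2 }
         END { if (total > 0) printf "%d\n", shallow * 100 / total }'
}

# Example with invented numbers for a 5-second sample:
# printf 'C0 1000\nC1 3000\nC2 500\nC3 500\n' | shallow_share
```

A result close to 100 would suggest the CPU rarely reaches the deeper idle states, which would fit a laptop that stays hot with nothing running.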