Fedora sees only 1 CPU core after updating the kernel to 6.9.x

After yesterday’s dom0 update and restart(s), cpuinfo and cpugraph (Task Manager v1.2) show only one core, although I have a 12th-generation CPU with:

Total Cores 16
# of Performance-cores 8
# of Efficient-cores 8
Total Threads 24

xen_version : 4.14.6
Linux 6.9.4-1.qubes.fc32.x86_64
Qubes v4.1.2

I have never touched CPU pinning at all, so that’s out of the question.

Any idea what to check, and how, to find out what is going on?

You can check the output of the xl dmesg command in dom0. Maybe there will be some info about your CPU.

Thanks @apparatus

This is what I get from xl dmesg

(XEN) Brought up 16 CPUs
(XEN) Scheduling granularity: cpu, 1 CPU per sched-resource

(XEN) Dom0 has maximum 16 VCPUs

What’s the output of these commands in dom0?

lscpu
xl cpupool-list
xl vcpu-list

lscpu
CPU(s): 1
On-line CPU(s) list: 0
Thread(s) per core: 1
Core(s) per socket: 1
Socket(s): 1
NUMA node(s): 1

xl cpupool-list
Name CPUs Sched Active Domain count
Pool-0 16 credit2 y xx

xl vcpu-list
Name ID VCPU CPU State Time(s) Affinity (Hard / Soft)
Domain-0 0 0 0 r-- 8464.7 0,2,4,6,8,10,12,14,16-23 / 0,2,4,6,8,10,12,14,16-23
Domain-0 0 1 - --p 0.0 0,2,4,6,8,10,12,14,16-23 / 0,2,4,6,8,10,12,14,16-23
Domain-0 0 2 - --p 0.0 0,2,4,6,8,10,12,14,16-23 / 0,2,4,6,8,10,12,14,16-23
Domain-0 0 3 - --p 0.0 0,2,4,6,8,10,12,14,16-23 / 0,2,4,6,8,10,12,14,16-23
Domain-0 0 4 - --p 0.0 0,2,4,6,8,10,12,14,16-23 / 0,2,4,6,8,10,12,14,16-23
Domain-0 0 5 - --p 0.0 0,2,4,6,8,10,12,14,16-23 / 0,2,4,6,8,10,12,14,16-23
Domain-0 0 6 - --p 0.0 0,2,4,6,8,10,12,14,16-23 / 0,2,4,6,8,10,12,14,16-23
Domain-0 0 7 - --p 0.0 0,2,4,6,8,10,12,14,16-23 / 0,2,4,6,8,10,12,14,16-23
Domain-0 0 8 - --p 0.0 0,2,4,6,8,10,12,14,16-23 / 0,2,4,6,8,10,12,14,16-23
Domain-0 0 9 - --p 0.0 0,2,4,6,8,10,12,14,16-23 / 0,2,4,6,8,10,12,14,16-23
Domain-0 0 10 - --p 0.0 0,2,4,6,8,10,12,14,16-23 / 0,2,4,6,8,10,12,14,16-23
Domain-0 0 11 - --p 0.0 0,2,4,6,8,10,12,14,16-23 / 0,2,4,6,8,10,12,14,16-23
Domain-0 0 12 - --p 0.0 0,2,4,6,8,10,12,14,16-23 / 0,2,4,6,8,10,12,14,16-23
Domain-0 0 13 - --p 0.0 0,2,4,6,8,10,12,14,16-23 / 0,2,4,6,8,10,12,14,16-23
Domain-0 0 14 - --p 0.0 0,2,4,6,8,10,12,14,16-23 / 0,2,4,6,8,10,12,14,16-23
Domain-0 0 15 - --p 0.0 0,2,4,6,8,10,12,14,16-23 / 0,2,4,6,8,10,12,14,16-23
disp993 11 0 18 -b- 1059.8 all / 0-23
disp993 11 1 6 -b- 872.3 all / 0-23
disp21 12 0 14 -b- 716.7 all / 0-23
disp21 12 1 17 -b- 590.5 all / 0-23
disp4498 13 0 2 -b- 222.3 all / 0-23
disp4498 13 1 12 -b- 224.9 all / 0-23

etc. for the rest of the open domains…
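
The listing above is long, so a small helper can summarise it. This is only a sketch: count_vcpu_states is a hypothetical name (not part of xl), it assumes the column layout shown in this thread, and it treats rows with state --p and no CPU assigned as offline.

```shell
# Summarise vCPU states for one domain from `xl vcpu-list` output on stdin.
# Assumed columns: Name ID VCPU CPU State Time(s) Affinity
# count_vcpu_states is a hypothetical helper, not an xl subcommand.
count_vcpu_states() {
    domain="$1"
    awk -v dom="$domain" '
        $1 == dom {
            total++
            # Rows such as "- --p" have no physical CPU; count them as offline.
            if ($5 ~ /p/) offline++
            else online++
        }
        END { printf "total=%d online=%d offline=%d\n", total, online, offline }
    '
}
```

In dom0 it could be used as: xl vcpu-list | count_vcpu_states Domain-0. For the output posted here, only VCPU 0 of Domain-0 would be counted as online.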

What’s the output of this command?

xl dmesg | grep "Command line"
(XEN) Command line: placeholder console=none dom0_mem=min:6144M dom0_mem=max:6144M ucode=scan smt=off gnttab_max_frames=2048 gnttab_max_maptrack_frames=4096 no-real-mode edd=off

I can’t remember adding the “no-real-mode edd=off” part; at least I don’t have it in my notes.

It’s the default, I have it as well.

What’s the output of these commands?

xl dmesg | grep VCPU
xl list 0

xl dmesg | grep VCPU
(XEN) Dom0 has maximum 16 VCPUs

xl list 0
Name ID Mem VCPUs State Time(s)
Domain0 0 6128 1 r----- 10748.4

What if you explicitly tell Xen how many VCPUs to assign to dom0? Will you then have all your cores in dom0?
Try adding dom0_max_vcpus=16 or dom0_max_vcpus=16- to GRUB_CMDLINE_XEN_DEFAULT in /etc/default/grub and then rebuild the GRUB config.
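
A minimal sketch of that edit, for anyone who prefers to script it. add_xen_option is a hypothetical helper name; it assumes the file has a single line of the form GRUB_CMDLINE_XEN_DEFAULT="..." and that the option contains no characters special to sed (such as | or &).

```shell
# add_xen_option FILE OPTION
# Appends OPTION inside the quotes of GRUB_CMDLINE_XEN_DEFAULT="..." in FILE,
# unless the option is already present. Hypothetical helper, not a GRUB tool.
add_xen_option() {
    file="$1"
    opt="$2"
    # Do nothing if the option is already on the Xen command line.
    if grep '^GRUB_CMDLINE_XEN_DEFAULT=' "$file" | grep -qF -- "$opt"; then
        return 0
    fi
    # Insert " OPTION" just before the closing double quote of the value.
    sed "s|^\(GRUB_CMDLINE_XEN_DEFAULT=\".*\)\"|\1 ${opt}\"|" "$file" > "$file.new" &&
        mv "$file.new" "$file"
}
```

Run as root in dom0, after backing up the file, this might look like: add_xen_option /etc/default/grub dom0_max_vcpus=16. The config then still has to be regenerated. On Qubes 4.1 that is usually grub2-mkconfig -o /boot/grub2/grub.cfg for legacy boot or grub2-mkconfig -o /boot/efi/EFI/qubes/grub.cfg for UEFI, but check which path your installation uses.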

Thanks for the tip and your time. To try that I’ll need to wait for the next restart in a couple of days. I’ll be back with the results, although I would have thought that was the default…
It looks like they’re all there but somehow reassigned one per domain…

It should be the default, but there seems to be a bug of some kind.
In the output of xl vcpu-list for dom0 you can see that all 16 PCPUs are assigned to the single VCPU 0. I’m not sure whether that is even a valid configuration. It looks like a bug to me, but I don’t know what causes it.
Maybe you should ask about this on the XenDevel Matrix channel or mailing list to confirm whether it’s a bug or not:


Thanks for the idea.

Does this look interesting/related to you:

since I had these in dmesg

(XEN) CPU10: Temperature above threshold
(XEN) CPU10: Running in modulated clock mode
(XEN) CPU14: Temperature above threshold
(XEN) CPU14: Running in modulated clock mode
(XEN) CPU6: Temperature above threshold
(XEN) CPU6: Running in modulated clock mode
(XEN) CPU2: Temperature above threshold
(XEN) CPU2: Running in modulated clock mode
(XEN) CPU8: Temperature above threshold
(XEN) CPU8: Running in modulated clock mode
(XEN) CPU10: Temperature above threshold
(XEN) CPU10: Running in modulated clock mode
(XEN) CPU4: Temperature above threshold
(XEN) CPU4: Running in modulated clock mode
(XEN) CPU2: Temperature above threshold
(XEN) CPU2: Running in modulated clock mode
(XEN) CPU0: Temperature above threshold
(XEN) CPU0: Running in modulated clock mode
(XEN) CPU12: Temperature above threshold
(XEN) CPU12: Running in modulated clock mode

These messages mean that your CPU is overheating and being throttled.
You can run the sensors command in dom0 to check the CPU temperature.
I’m not sure whether overheating could be related to your issue.


I see. Thanks.

Well, I found this in my journalctl output. Could this be a firmware bug?

Jun 23 08:42:33 dom0 kernel: alsactl[1672]: segfault at 28 ip 000058f0aa1e0aca sp 00007ffe50bb72e0 error 4 in alsactl[58f0aa1d1000+12000] likely on CPU 12 (core 24, socket 0)

...

Jun 23 08:43:12 dom0 kernel: Xen PV: Detected 16 vCPUS
Jun 23 08:43:12 dom0 kernel: CPU topo: Enumerated BSP APIC 0 is not marked in APICBASE MSR
Jun 23 08:43:12 dom0 kernel: CPU topo: Assuming crash kernel. Limiting to one CPU to prevent machine INIT
Jun 23 08:43:12 dom0 kernel: CPU topo: [Firmware Bug]: APIC enumeration order not specification compliant
Jun 23 08:43:12 dom0 kernel: CPU topo: Boot CPU APIC ID not the first enumerated APIC ID: 0 != 1
Jun 23 08:43:12 dom0 kernel: CPU topo: Crash kernel detected. Disabling real BSP to prevent machine INIT
Jun 23 08:43:12 dom0 kernel: CPU topo: CPU limit of 1 reached. Ignoring further CPUs
Jun 23 08:43:12 dom0 kernel: CPU topo: Max. logical packages:   1
Jun 23 08:43:12 dom0 kernel: CPU topo: Max. logical dies:       1
Jun 23 08:43:12 dom0 kernel: CPU topo: Max. dies per package:   1
Jun 23 08:43:12 dom0 kernel: CPU topo: Max. threads per core:   1
Jun 23 08:43:12 dom0 kernel: CPU topo: Num. cores per package:     1
Jun 23 08:43:12 dom0 kernel: CPU topo: Num. threads per package:   1
Jun 23 08:43:12 dom0 kernel: CPU topo: Allowing 1 present CPUs plus 0 hotplug CPUs
Jun 23 08:43:12 dom0 kernel: CPU topo: Rejected CPUs 23
Jun 23 08:43:12 dom0 kernel: setup_percpu: NR_CPUS:8192 nr_cpumask_bits:1 nr_cpu_ids:1 nr_node_ids:1
Jun 23 08:43:12 dom0 kernel: SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
Jun 23 08:43:12 dom0 kernel: rcu:         RCU restricting CPUs from NR_CPUS=8192 to nr_cpu_ids=1.
Jun 23 08:43:12 dom0 kernel: installing Xen timer for CPU 0
Jun 23 08:43:12 dom0 kernel: Performance Events: unsupported p6 CPU model 151 no PMU driver, software events only.
Jun 23 08:43:12 dom0 kernel: smp: Bringing up secondary CPUs ...
Jun 23 08:43:12 dom0 kernel: smp: Brought up 1 node, 1 CPU
Jun 23 08:43:12 dom0 kernel: ACPI: _OSC evaluated successfully for all CPUs
Jun 23 08:43:12 dom0 kernel: intel_pstate: CPU model not supported
Jun 23 08:43:28 dom0 systemd[1]: systemd-udevd.service: Consumed 2.456s CPU time.
Jun 23 08:44:08 dom0 systemd[1]: user-994.slice: Consumed 1.048s CPU time.
Jun 23 08:47:37 dom0 systemd[1]: session-2.scope: Consumed 41.020s CPU time.
Jun 23 08:47:37 dom0 systemd[3032]: qubes\x2dwidget.slice: Consumed 2.948s CPU time.
Jun 23 08:47:37 dom0 systemd[1]: user@1000.service: Consumed 5.301s CPU time.
Jun 23 08:47:37 dom0 systemd[1]: lightdm.service: Consumed 16.121s CPU time.
Jun 23 08:47:37 dom0 systemd[1]: user-1000.slice: Consumed 46.344s CPU time.
Jun 23 08:47:38 dom0 systemd[1]: qubes-qrexec-policy-daemon.service: Consumed 3.796s CPU time.
Jun 23 08:47:40 dom0 systemd[1]: libvirtd.service: Consumed 35.624s CPU time.
Jun 23 08:47:40 dom0 systemd[1]: qubesd.service: Consumed 33.869s CPU time.
Jun 23 08:47:40 dom0 systemd[1]: user.slice: Consumed 48.138s CPU time.
Jun 23 08:48:14 dom0 kernel: Xen PV: Detected 16 vCPUS
Jun 23 08:48:14 dom0 kernel: CPU topo: Enumerated BSP APIC 0 is not marked in APICBASE MSR
Jun 23 08:48:14 dom0 kernel: CPU topo: Assuming crash kernel. Limiting to one CPU to prevent machine INIT
Jun 23 08:48:14 dom0 kernel: CPU topo: [Firmware Bug]: APIC enumeration order not specification compliant
Jun 23 08:48:14 dom0 kernel: CPU topo: Boot CPU APIC ID not the first enumerated APIC ID: 0 != 1
Jun 23 08:48:14 dom0 kernel: CPU topo: Crash kernel detected. Disabling real BSP to prevent machine INIT
Jun 23 08:48:14 dom0 kernel: CPU topo: CPU limit of 1 reached. Ignoring further CPUs
Jun 23 08:48:14 dom0 kernel: CPU topo: Max. logical packages:   1
Jun 23 08:48:14 dom0 kernel: CPU topo: Max. logical dies:       1
Jun 23 08:48:14 dom0 kernel: CPU topo: Max. dies per package:   1
Jun 23 08:48:14 dom0 kernel: CPU topo: Max. threads per core:   1
Jun 23 08:48:14 dom0 kernel: CPU topo: Num. cores per package:     1
Jun 23 08:48:14 dom0 kernel: CPU topo: Num. threads per package:   1
Jun 23 08:48:14 dom0 kernel: CPU topo: Allowing 1 present CPUs plus 0 hotplug CPUs
Jun 23 08:48:14 dom0 kernel: CPU topo: Rejected CPUs 23
Jun 23 08:48:14 dom0 kernel: setup_percpu: NR_CPUS:8192 nr_cpumask_bits:1 nr_cpu_ids:1 nr_node_ids:1
Jun 23 08:48:14 dom0 kernel: SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
Jun 23 08:48:14 dom0 kernel: rcu:         RCU restricting CPUs from NR_CPUS=8192 to nr_cpu_ids=1.
Jun 23 08:48:14 dom0 kernel: installing Xen timer for CPU 0
Jun 23 08:48:14 dom0 kernel: Performance Events: unsupported p6 CPU model 151 no PMU driver, software events only.

That was the last restart when it all started…
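
To check a given boot for the same symptom without reading the whole journal, something like the following sketch could be used. check_cpu_limit is a hypothetical helper name, and the pattern is taken directly from the "CPU topo" lines quoted above.

```shell
# Reads kernel log lines on stdin and reports whether the dom0 kernel
# decided to limit itself to a single CPU (the "CPU topo" message above).
# check_cpu_limit is a hypothetical helper name.
check_cpu_limit() {
    if grep -q 'CPU topo: .*Limiting to one CPU'; then
        echo "dom0 kernel limited itself to one CPU"
    else
        echo "no CPU limit message found"
    fi
}
```

For the current boot this could be run in dom0 as: journalctl -k -b | check_cpu_limit. Using -b -1 instead of -b would check the previous boot, which helps when comparing the boots before and after the kernel update.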

I found these from earlier, during the dom0 update I was talking about in my OP. I only restarted the laptop 4 days later…

Jun 19 14:49:43 dom0 kernel: Performance Events: unsupported p6 CPU model 151 no PMU driver, software events only.
Jun 19 14:49:43 dom0 kernel: smp: Bringing up secondary CPUs ...
Jun 19 14:49:43 dom0 kernel: installing Xen timer for CPU 2
Jun 19 14:49:43 dom0 kernel: installing Xen timer for CPU 4
Jun 19 14:49:43 dom0 kernel: installing Xen timer for CPU 6
Jun 19 14:49:43 dom0 kernel: installing Xen timer for CPU 8
Jun 19 14:49:43 dom0 kernel: installing Xen timer for CPU 10
Jun 19 14:49:43 dom0 kernel: installing Xen timer for CPU 12
Jun 19 14:49:43 dom0 kernel: installing Xen timer for CPU 14
Jun 19 14:49:43 dom0 kernel: [Firmware Bug]: CPU   2: APIC ID mismatch. CPUID: 0x0002 APIC: 0x0008
Jun 19 14:49:43 dom0 kernel: [Firmware Bug]: CPU   4: APIC ID mismatch. CPUID: 0x0004 APIC: 0x0010
Jun 19 14:49:43 dom0 kernel: [Firmware Bug]: CPU   6: APIC ID mismatch. CPUID: 0x0006 APIC: 0x0018
Jun 19 14:49:43 dom0 kernel: [Firmware Bug]: CPU   8: APIC ID mismatch. CPUID: 0x0008 APIC: 0x0020
Jun 19 14:49:43 dom0 kernel: [Firmware Bug]: CPU  10: APIC ID mismatch. CPUID: 0x000a APIC: 0x0028
Jun 19 14:49:43 dom0 kernel: [Firmware Bug]: CPU  12: APIC ID mismatch. CPUID: 0x000c APIC: 0x0030
Jun 19 14:49:43 dom0 kernel: [Firmware Bug]: CPU  14: APIC ID mismatch. CPUID: 0x000e APIC: 0x0038
Jun 19 14:49:43 dom0 kernel: installing Xen timer for CPU 1
Jun 19 14:49:43 dom0 kernel: installing Xen timer for CPU 3
Jun 19 14:49:43 dom0 kernel: installing Xen timer for CPU 5
Jun 19 14:49:43 dom0 kernel: installing Xen timer for CPU 7
Jun 19 14:49:43 dom0 kernel: installing Xen timer for CPU 9
Jun 19 14:49:43 dom0 kernel: installing Xen timer for CPU 11
Jun 19 14:49:43 dom0 kernel: installing Xen timer for CPU 13
Jun 19 14:49:43 dom0 kernel: installing Xen timer for CPU 15
Jun 19 14:49:43 dom0 kernel: [Firmware Bug]: CPU   3: APIC ID mismatch. CPUID: 0x0003 APIC: 0x0009
Jun 19 14:49:43 dom0 kernel: [Firmware Bug]: CPU   5: APIC ID mismatch. CPUID: 0x0005 APIC: 0x0011
Jun 19 14:49:43 dom0 kernel: [Firmware Bug]: CPU   7: APIC ID mismatch. CPUID: 0x0007 APIC: 0x0019
Jun 19 14:49:43 dom0 kernel: [Firmware Bug]: CPU   9: APIC ID mismatch. CPUID: 0x0009 APIC: 0x0021
Jun 19 14:49:43 dom0 kernel: [Firmware Bug]: CPU  11: APIC ID mismatch. CPUID: 0x000b APIC: 0x0029
Jun 19 14:49:43 dom0 kernel: [Firmware Bug]: CPU  13: APIC ID mismatch. CPUID: 0x000d APIC: 0x0031
Jun 19 14:49:43 dom0 kernel: [Firmware Bug]: CPU  15: APIC ID mismatch. CPUID: 0x000f APIC: 0x0039
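
The mismatch lines above are easier to compare when reduced to one row per CPU. A small sketch, assuming the exact message format shown in this log; list_apic_mismatches is a hypothetical helper name.

```shell
# Reads kernel log lines on stdin and prints "CPU CPUID APIC" for every
# "[Firmware Bug]: CPU N: APIC ID mismatch" line; other lines are ignored.
# list_apic_mismatches is a hypothetical helper name.
list_apic_mismatches() {
    sed -n 's/.*CPU *\([0-9][0-9]*\): APIC ID mismatch\. CPUID: \(0x[0-9a-fA-F]*\) APIC: \(0x[0-9a-fA-F]*\).*/\1 \2 \3/p'
}
```

For example: journalctl -k | list_apic_mismatches | sort -n. On the lines quoted here, the reported CPUID value always equals the CPU number, while the APIC value is spaced out differently (0x0008, 0x0009, 0x0010, 0x0011 and so on).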

That is not how throttling is supposed to work.

The CPU should automatically detect that it is overheating and reduce its clock speed, by lowering the long-boost power limit, until the CPU temperature is stable at around 70 °C.

If Xen has to handle the throttling, there is something wrong with either the cooling or the CPU power limit settings.

I think these are just informational messages saying that the CPU is throttling, not that Xen is handling the throttling instead of the CPU.

Can you post the full kernel crash log?
By the look of it, there could be an issue with your audio device driver.

Maybe. I run a lot of CPU-intensive applications and I never get any messages about the CPU overheating.

But it could be a difference between 12th- and 13th-gen CPUs.