I’ve read other posts on the topic and I’d like to know if there have been further developments on power management, scaling drivers and frequency ranges.
In particular, these discussions seemed quite relevant:
Only receiving maximum 2 hours of battery life
AMD CPU Frequency Scaling Broken #8008
First issue:
As stated in the linked posts, the main issue here seems to be the apparently outdated scaling driver powernow, which doesn’t provide the fine granularity and support of intel_pstate or amd_pstate.
Honestly, on my machine I experienced the same issue until kernel version 6.1, which started to provide relatively good support.
With both the default Qubes Xen setup, which uses powernow, and the non-Xen setup using the acpi_cpufreq driver, the range of available frequencies is 1600 MHz - 3300 MHz, while with the non-Xen setup using the amd_pstate driver the range becomes 400 MHz - 3300 MHz or 400 MHz - 4950 MHz, depending on whether turbo mode is enabled.
This obviously improves power management, cooling and fan activity (no more Tctl constantly above 40°C with the fan always on).
Default xen setup:
[user@dom0 ~]$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 48 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 8
On-line CPU(s) list: 0-7
Vendor ID: AuthenticAMD
BIOS Vendor ID: Advanced Micro Devices, Inc.
Model name: AMD Ryzen 9 6900HX with Radeon Graphics
BIOS Model name: AMD Ryzen 9 6900HX with Radeon Graphics Unknown CPU @ 3.3GHz
BIOS CPU family: 107
CPU family: 25
Model: 68
Thread(s) per core: 2
Core(s) per socket: 8
Socket(s): 1
Stepping: 1
BogoMIPS: 6587.61
Flags: fpu de tsc msr pae mce cx8 apic mca cmov pat clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid tsc_known_freq pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm cmp_legacy abm sse4a misalignsse 3dnowprefetch bpext ibpb vmmcall fsgsbase bmi1 avx2 bmi2 erms rdseed adx clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 clzero xsaveerptr arat vaes vpclmulqdq rdpid fsrm
Hypervisor vendor: Xen
Virtualization type: none
L1d cache: 32 KiB (1 instance)
L1i cache: 32 KiB (1 instance)
L2 cache: 512 KiB (1 instance)
L3 cache: 16 MiB (1 instance)
NUMA node(s): 1
NUMA node0 CPU(s): 0-7
...
[user@dom0 ~]$ xenpm get-cpufreq-para 0
cpu id : 0
affected_cpus : 0
cpuinfo frequency : max [3300000] min [1600000] cur [1600000]
scaling_driver : powernow
scaling_avail_gov : hwp-internal userspace performance powersave ondemand
current_governor : ondemand
ondemand specific :
sampling_rate : max [10000000] min [10000] cur [20000]
up_threshold : 80
scaling_avail_freq : 3300000 1800000 *1600000
scaling frequency : max [3300000] min [1600000] cur [1600000]
turbo mode : enabled
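To compare setups without reading the whole dump each time, the relevant fields can be pulled out with a few lines of Python. This is only a minimal sketch: the sample text is copied from the output above, parse_xenpm is a hypothetical helper name, and I’m assuming the values are in kHz and that the entry marked with * is the current frequency.

```python
# Sample copied from the `xenpm get-cpufreq-para 0` output above.
SAMPLE = """\
cpu id : 0
affected_cpus : 0
cpuinfo frequency : max [3300000] min [1600000] cur [1600000]
scaling_driver : powernow
scaling_avail_gov : hwp-internal userspace performance powersave ondemand
current_governor : ondemand
scaling_avail_freq : 3300000 1800000 *1600000
turbo mode : enabled
"""


def parse_xenpm(text):
    """Return driver, governor and available frequencies from xenpm text."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # skip lines without a "key : value" shape
        key, value = key.strip(), value.strip()
        if key == "scaling_driver":
            info["driver"] = value
        elif key == "current_governor":
            info["governor"] = value
        elif key == "scaling_avail_freq":
            # Assumption: values are kHz and '*' marks the current frequency.
            info["freqs_khz"] = [int(f.lstrip("*")) for f in value.split()]
    return info


info = parse_xenpm(SAMPLE)
low, high = min(info["freqs_khz"]) // 1000, max(info["freqs_khz"]) // 1000
print(f"{info['driver']} / {info['governor']}: {low} - {high} MHz")
```

On a live system the same function could be fed the text captured from xenpm instead of the embedded sample.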
Non-xen dom0 setup with acpi-cpufreq driver:
[user@dom0 ~]$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 48 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 16
On-line CPU(s) list: 0-15
Vendor ID: AuthenticAMD
BIOS Vendor ID: Advanced Micro Devices, Inc.
Model name: AMD Ryzen 9 6900HX with Radeon Graphics
BIOS Model name: AMD Ryzen 9 6900HX with Radeon Graphics Unknown CPU @ 3.3GHz
BIOS CPU family: 107
CPU family: 25
Model: 68
Thread(s) per core: 2
Core(s) per socket: 8
Socket(s): 1
Stepping: 1
Frequency boost: enabled
CPU(s) scaling MHz: 42%
CPU max MHz: 4933.8862
CPU min MHz: 1600.0000
BogoMIPS: 6587.68
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf rapl pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate ssbd mba ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid cqm rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local clzero irperf xsaveerptr rdpru wbnoinvd cppc arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif v_spec_ctrl umip pku ospke vaes vpclmulqdq rdpid overflow_recov succor smca fsrm
Virtualization: AMD-V
L1d cache: 256 KiB (8 instances)
L1i cache: 256 KiB (8 instances)
L2 cache: 4 MiB (8 instances)
L3 cache: 16 MiB (1 instance)
NUMA node(s): 1
NUMA node0 CPU(s): 0-15
...
[user@dom0 ~]$ cpupower --cpu 0 frequency-info
analyzing CPU 0:
driver: acpi-cpufreq
CPUs which run at the same hardware frequency: 0
CPUs which need to have their frequency coordinated by software: 0
maximum transition latency: Cannot determine or is not supported.
hardware limits: 1.60 GHz - 4.94 GHz
available frequency steps: 3.30 GHz, 1.80 GHz, 1.60 GHz
available cpufreq governors: conservative ondemand userspace powersave performance schedutil
current policy: frequency should be within 1.60 GHz and 3.30 GHz.
The governor "schedutil" may decide which speed to use
within this range.
current CPU frequency: 1.60 GHz (asserted by call to hardware)
boost state support:
Supported: yes
Active: yes
Boost States: 0
Total States: 3
Pstate-P0: 3300MHz
Pstate-P1: 1800MHz
Pstate-P2: 1600MHz
Non-xen dom0 setup with amd-pstate driver:
[user@dom0 ~]$ cpupower --cpu 0 frequency-info
analyzing CPU 0:
driver: amd-pstate
CPUs which run at the same hardware frequency: 0
CPUs which need to have their frequency coordinated by software: 0
maximum transition latency: 20.0 us
hardware limits: 400 MHz - 4.94 GHz
available cpufreq governors: conservative ondemand userspace powersave performance schedutil
current policy: frequency should be within 400 MHz and 4.94 GHz.
The governor "schedutil" may decide which speed to use
within this range.
current CPU frequency: Unable to call hardware
current CPU frequency: 1.45 GHz (asserted by call to kernel)
boost state support:
Supported: yes
Active: yes
AMD PSTATE Highest Performance: 166. Maximum Frequency: 4.94 GHz.
AMD PSTATE Nominal Performance: 111. Nominal Frequency: 3.30 GHz.
AMD PSTATE Lowest Non-linear Performance: 37. Lowest Non-linear Frequency: 1.10 GHz.
AMD PSTATE Lowest Performance: 14. Lowest Frequency: 400 MHz.
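The performance levels and frequencies in this output look roughly proportional to each other. As a quick sanity check, here is a small sketch that scales the nominal pair (111, 3.30 GHz) linearly. The linear relation is my assumption, not something the output states, and perf_to_mhz is a hypothetical helper name.

```python
# Values copied from the cpupower output above.
NOMINAL_PERF = 111
NOMINAL_FREQ_MHZ = 3300


def perf_to_mhz(perf):
    """Estimate a frequency from a performance level, assuming linear scaling."""
    return perf * NOMINAL_FREQ_MHZ / NOMINAL_PERF


for name, perf in [("highest", 166), ("nominal", 111),
                   ("lowest non-linear", 37), ("lowest", 14)]:
    print(f"{name}: perf {perf} -> ~{perf_to_mhz(perf):.0f} MHz")
```

The estimates come out at about 4935, 3300, 1100 and 416 MHz, close to the reported 4.94 GHz, 3.30 GHz, 1.10 GHz and 400 MHz. The small gap at the low end suggests the reported lowest frequency is not derived purely from this ratio.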
On Debian-based distros, using amd_pstate and laptop-mode-tools, the estimated battery life increased a lot. Even though this machine is quite powerful and the dGPU (Radeon RX 6650M) can be a heavy power drain, battery life under normal workloads seems quite fine.
The fact is that, comparing the Xen and non-Xen setups with the same kernel (kernel-latest 6.4.8-1.qubes.fc37.x86_64), with 3 qubes running on the Xen setup (sys-firewall, sys-net and an AppVM) and similar workloads, for example visiting a wiki web page (1 tab open), I haven’t noticed a difference in battery life between the setups, even though users often mention the overhead of virtualization and Qubes internals. The difference only appears when switching from acpi_cpufreq to amd_pstate.
So it seems to me that the main issue remains that the lowest-performance P-state runs at too high a frequency.
I saw the suggestion about the Xen cmdline option cpufreq=xen:hwp, but I think hwp works only on Intel systems, while I’m on an AMD one.
I was also wondering what exactly the scaling governor hwp-internal is, which shows up in xenpm get-cpufreq-para. Is this one also specific to Intel systems?
Since Qubes R4.2 with the latest updates already provides kernel version 6.4.8, which is pretty stable on my machine with almost everything working, I’d really like to take advantage of the AMD P-State EPP (Energy Performance Preference) mode and the P-State Guided Autonomous Mode (amd_pstate=guided) too, so that I can finally switch back to Qubes when the stable R4.2 is released.
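For anyone who wants to check which amd_pstate mode a given kernel actually ended up in, this is a minimal sketch that reads the sysfs attributes the driver exposes on recent kernels. The paths are the ones I know from bare-metal Linux. Whether they exist at all in a Xen dom0 is exactly what is in question, so every missing file is simply reported as None. amd_pstate_state and read_sysfs are hypothetical helper names.

```python
from pathlib import Path


def read_sysfs(path):
    """Return the stripped contents of a sysfs file, or None if unreadable."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def amd_pstate_state(sysfs_cpu="/sys/devices/system/cpu", cpu=0):
    """Collect the amd_pstate mode and EPP settings visible to this kernel."""
    base = Path(sysfs_cpu)
    policy = base / f"cpu{cpu}" / "cpufreq"
    return {
        # Driver mode; absent on older kernels or when amd_pstate is not loaded.
        "mode": read_sysfs(base / "amd_pstate" / "status"),
        # EPP attributes are only expected when the EPP (active) mode is in use.
        "epp": read_sysfs(policy / "energy_performance_preference"),
        "epp_choices": read_sysfs(
            policy / "energy_performance_available_preferences"),
    }


print(amd_pstate_state())
```

If all three values come back as None, the kernel in that environment has not bound amd_pstate.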
Second issue:
From what I understood reading the Xen Wiki (xenproject - Xen_power_management), while the cpuidle states are always managed by the hypervisor, the cpufreq states can be managed either by the hypervisor or by the dom0 kernel, and this behaviour is controlled with the Xen cmdline parameter cpufreq: cpufreq=xen (default) or cpufreq=dom0-kernel.
Domain0 based cpufreq reuse the domain0 kernel cpufreq code and let domain0 handle the cpufreq logic. Xen hypervisor provides two hypercalls, which are platform hypercall XENPF_change_freq and XENPF_getidletime, to assist the domain0 kernel to get the system status and also change the CPU frequency.
So, theoretically, would it be possible to transfer almost all of the cpufreq logic to dom0 and handle it with dom0’s own scaling drivers, like intel_pstate and amd_pstate, or wouldn’t it change much?
I’ve tried it, and with cpufreq=dom0-kernel it seems that neither the Xen hypervisor nor dom0 can manage CPU frequencies any longer.
[user@dom0 ~]$ xenpm get-cpufreq-para
Xen cpufreq is not enabled!
[user@dom0 ~]$ cpupower --cpu 0 frequency-info
analyzing CPU 0:
no or unknown cpufreq driver is active on this CPU
CPUs which run at the same hardware frequency: Not Available
CPUs which need to have their frequency coordinated by software: Not Available
maximum transition latency: Cannot determine or is not supported.
Not Available
available cpufreq governors: Not Available
Unable to determine current policy
current CPU frequency: Unable to call hardware
current CPU frequency: Unable to call to kernel
boost state support:
Supported: yes
Active: no
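To double-check the cpupower result from the kernel side, this rough sketch lists whatever cpufreq policies the dom0 kernel has registered in sysfs. The directory layout is the standard Linux one, and list_cpufreq_policies is a hypothetical helper name.

```python
from pathlib import Path


def list_cpufreq_policies(root="/sys/devices/system/cpu/cpufreq"):
    """Map each cpufreq policy directory to its driver and governor, if any."""
    base = Path(root)
    if not base.is_dir():
        return {}  # no cpufreq subsystem directory at all
    policies = {}
    for policy in sorted(base.glob("policy*")):
        entry = {}
        for name in ("scaling_driver", "scaling_governor"):
            try:
                entry[name] = (policy / name).read_text().strip()
            except OSError:
                entry[name] = None  # attribute missing or unreadable
        policies[policy.name] = entry
    return policies


policies = list_cpufreq_policies()
if policies:
    for name, entry in policies.items():
        print(name, entry)
else:
    print("no cpufreq policy registered in this kernel")
```

An empty result would match the "no or unknown cpufreq driver" message above and would mean no scaling driver registered in dom0 at all, rather than a cpupower problem.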
What am I missing here?