if you take the time to read what i have already posted, i have made it quite clear what i have already attempted. everything i will post in this summary below is included above.
i am attempting to install and test a modified version of the vmm-xen component of qubes
after making some modifications to the sources in the qubes-builder repo, i was able to get it to fetch sources with “make get-sources”
i patched the vmm-xen component and successfully built the corresponding packages with “make vmm-xen”
i copied the rpms i built to dom0 and attempted to install them
i could not install the rpms that were built because of the aforementioned conflicts with the qubes-core-dom0 metapackage “problem: the operation would result in removing the following protected packages: qubes-core-dom0”
since it is clear that it is not possible to build and update the vmm-xen component by itself via packages built from qubes-builder using the normal fedora pkg manager (dnf), there is clearly some missing tribal knowledge required to update these packages. i am seeking this tribal knowledge, which boils down to “how do qubes developers build, install, and test patches to vmm-xen?”.
the fact that qubes rides on top of xen is fine most of the time, but it is a shitshow when xen does not support the hardware properly. i am stuck on a laptop that alternates between no fan spinning and 3500+ rpm spinning with no middle ground as a result of this missing hw support in xen. see this issue for reference, which suggests patching vmm-xen as a resolution
opened 10:23PM - 12 Dec 18 UTC
T: bug
C: Xen
P: major
hardware support
needs diagnosis
### Qubes OS version:
<!-- (e.g., `R3.2`)
You can get it from the dom0 te… rminal with the command
`cat /etc/qubes-release`
Type below this line. -->
R4.0
### Affected component(s):
intel_pstate
acpi-cpufreq
xenpm
---
### Steps to reproduce the behavior:
<!-- Use single backticks (`) for in-line code snippets and
triple backticks (```) for code blocks.
Type below this line. -->
Tested on:
- Lenovo T480s
- Lenovo X1 Carbon Gen. 6
- Huawei Matebook X Pro.
All with Intel i7-8550U.
Latest BIOS revisions for the respective systems as of Dec. 2018
Kernel: 4.19.2-3.pvops.qubes.x86_64.
EFI install.
In dom0, `sudo xenpm get-cpufreq-para`
### Expected behavior:
The processor is rated at 1.8 GHz (4.0 turbo), so we would expect to see appropriate scaling in that range, available frequencies from 1800000 - 4000000.
Further, we would expect to see `scaling_driver = intel_pstate`.
### Actual behavior:
The CPU frequencies do not scale correctly. Why?
Frequencies are pinned at 2 GHz max, 400 MHz min, across all cores.
```
# xenpm cpu-freq-para
...
cpu id : 0
affected_cpus : 0
cpuinfo frequency : max [2001000] min [400000] cur [2001000]
scaling_driver : acpi-cpufreq
scaling_avail_gov : userspace performance powersave ondemand
current_governor : ondemand
ondemand specific :
sampling_rate : max [10000000] min [10000] cur [20000]
up_threshold : 80
scaling_avail_freq : 2001000 2000000 1900000 1800000 1700000 1500000 1400000 1300000 1200000 1100000 1000000 800000 700000 600000 500000 *400000
scaling frequency : max [2001000] min [400000] cur [400000]
turbo mode : enabled
...
```
Confirmed with `watch -n1 "cat /proc/cpuinfo | grep \"[c]pu MHz\""`
`xenpm set-scaling-maxfreq` and `-minfreq` have no effect.
`xenpm get-cpufreq-states` shows 16 total/usable P-states.
Changing the governor to `performance` has no effect. Default is `ondemand`
`dmidecode` reports a max of 2 GHz on the Lenovos, and an apparently erroneous speed on the Huawei (~ 8 GHz).
```
# dmidecode | grep -i speed
Speed: 2400 MT/s
Configured Clock Speed: 2400 MT/s
Speed: 2400 MT/s
Configured Clock Speed: 2400 MT/s
Speed: Unknown
Speed: Unknown
Speed: Unknown
Max Speed: 2000 MHz
Current Speed: 1800 MHz
```
The `scaling_driver` is legacy `acpi-cpufreq`. Interestingly, `intel_pstate` can be seen initializing during boot, but it does not take over handling anything. Attempting to `blacklist acpi-cpufreq` in `modprobe.d` has no effect.
```
# dmesg | grep pstate
[ 5.067624] intel_pstate: Intel P-state driver initializing
```
`/sys/devices/system/cpu/intel_pstate/` contains the expected attributes, but as mentioned in the "related issue" linked below, `no_turbo`, `num_pstates`, and `turbo_pct` error `Resource temporarily unavailable`.
`/sys/devices/system/cpu/intel_pstate/status` always returns `off`, and does not respond to `echo "active" >`. This behavior has been tested with various kernel command line parameters, including `intel_pstate=force`, `intel_pstate=disabled`, `intel_pstate=no_hwp`, `intel_pstate=enable` with no change in performance aside from `../cpu/intel_pstate/` attributes disappearing when `no_hwp` or `disabled` were in effect. Also tried `processor.ignore_ppc=1`.
Strangely, none of the appropriate attributes for `cpufreq` exist in `/sys/devices/system/cpu/cpu*/`.
```
# ls /sys/devices/system/cpu/cpu0/
acpi_cppc driver hotplug power topology
cache firmware_node node0 subsystem uevent
```
`lsmod | grep cpufreq` shows no results, trying to `modprobe acpi-cpufreq` or `cpufreq-xen` returns errors. `xen_acpi_processor` is loaded.
```
# modprobe acpi-cpufreq
modprobe: ERROR: could not insert 'acpi_cpufreq': No such device
# modprobe cpufreq-xen
modprobe: FATAL: Module cpufreq-xen not found in directory /lib/modules/4.19.2-3.pvops.qubes.x86_64
```
`cpupower frequency-info` is completely unresponsive, with zero information available about the processor.
```
analyzing CPU 0:
no or unknown cpufreq driver is active on this CPU
CPUs which run at the same hardware frequency: Not Available
CPUs which need to have their frequency coordinated by software: Not Available
maximum transition latency: Cannot determine or is not supported.
Not Available
available cpufreq governors: Not Available
Unable to determine current policy
current CPU frequency: Unable to call hardware
current CPU frequency: Unable to call to kernel
boost state support:
Supported: yes
Active: yes
```
Though it shouldn't have any effect, testing was attempted with `smt=on` and `off`, and `Hyperthreading` enabled/disabled in the BIOS appropriately.
Testing was also performed while toggling various BIOS settings.
- enable/disable `Intel SpeedStep`
- power settings at `Maximum Performance` vs. `Balanced`
It does not appear to be a thermal throttling issue, with idle ~ 37*C and under load ~60*C observed consistently.
`tlp` was tested with no effect on the frequency scaling, regardless of being enabled or disabled. `tlp-stat` yields minimal additional info, with what seems to be an outdated recommendation for the Lenovos to install `tp-smapi kernel modules`, that are in fact deprecated in favor of `thinkpad_acpi`, which appears to be active on the Thinkpads.
```
dmesg | grep thinkpad
[ 19.589434] thinkpad_acpi: ThinkPad ACPI Extras v0.26
[ 19.589439] thinkpad_acpi: http://ibm-acpi.sf.net/
[ 19.589440] thinkpad_acpi: ThinkPad BIOS N22ET50W (1.27 ), EC unknown
[ 19.589441] thinkpad_acpi: Lenovo ThinkPad T480s, model 20L7CTO1WW
[ 19.591883] thinkpad_acpi: radio switch found; radios are enabled
[ 19.591898] thinkpad_acpi: This ThinkPad has standard ACPI backlight brightness control, supported by the ACPI video driver
[ 19.591899] thinkpad_acpi: Disabling thinkpad-acpi brightness events by default...
[ 19.612278] thinkpad_acpi: rfkill switch tpacpi_wwan_sw: radio is unblocked
[ 19.643468] thinkpad_acpi: Standard ACPI backlight interface available, not loading native one
[ 19.674512] thinkpad_acpi: battery 1 registered (start 0, stop 100)
[ 19.674576] input: ThinkPad Extra Buttons as /devices/platform/thinkpad_acpi/
```
`thermald` is not loaded.
### General notes:
https://www.kernel.org/doc/html/v4.12/admin-guide/pm/intel_pstate.html
- This link suggests removing `irqbalance` but I'm skeptical.
https://askubuntu.com/questions/1067866/ubuntu-18-04-steam-games-frame-rate-drop/1073353#1073353?newreg=c7c120f373da4effb7317104571cd573
- https://cateee.net/lkddb/web-lkddb/XEN_ACPI_PROCESSOR.html
Regarding xen_acpi_processor: "It also registers itself as the SMM so that other drivers (such as ACPI cpufreq scaling driver) will not load."
How could `lsmod` report `xen_acpi_processor` as loaded but `xenpm` shows the scaling driver `acpi-cpufreq` ? This might make sense as to the missing `/sys/devices/.../cpufreq` entries.
- The following exchange is dubious at best, the final post gets down to the point of disabling intel microcode. They also suggest the use of `msr-tools`, but that really shouldn't be necessary.
https://bbs.archlinux.org/viewtopic.php?id=231077
- This is good work, but in my opinion, running a script every few seconds in dom0 isn't a legitimate fix.
https://github.com/erpalma/lenovo-throttling-fix
---
### Related issues:
https://github.com/QubesOS/qubes-issues/issues/4491
https://github.com/QubesOS/qubes-issues/issues/450
while i have really enjoyed running qubes for the past 7+ years, the fact that joanna effectively abandoned the project to work on golem (or whatever she has pivoted to since) has not gone unnoticed by myself and many other users. i’m not asking the world here, i just need access to some undocumented tribal knowledge so i can avoid wasting many hours of my own time and/or money on new hardware.
that nobody is either willing or able to read this and point me in the right direction speaks to the extent that qubes is becoming / is a ghost town. i’ve posted on the user mailing list and posted on the forum, wtf else am i supposed to do? post this on the dev list?