I know that Qubes OS is built around compartmentalization, so I’m aware (as far as I can be) that it has a higher impact on machine resources. However, when my laptop is idle, dom0 uses between 2 and 5 percent of the CPU (as displayed in the qui-domains short-info pop-up), and when I watch a 1080p YouTube video (not even full-screen) in Firefox in my personal qube (Debian template), that qube uses 50-80% of the CPU. I don’t see any visual artifacts, but the laptop goes wild in both fan speed and heat. I use an XPS 13 with an i7 and a fresh KDE desktop. I switched from Manjaro and never noticed anything like that before.
Is this really the cost of compartmentalization, or is something perhaps not working well? How do I troubleshoot this kind of thing?
Are there any tweaks I can use to reduce the impact of heavy tasks on the CPU? Perhaps by switching something off, etc.?
This might be relevant: "Qubes is EXTREMELY slow on very new p1 gen 5 lenovo laptop".
More links:
opened 05:09PM - 01 Apr 22 UTC
T: bug
C: Xen
P: major
r4.0-dom0-cur-test
r4.1-dom0-stable
diagnosed
pr submitted
r4.2-host-stable
affects-4.1
[How to file a helpful issue](https://www.qubes-os.org/doc/issue-tracking/)
### Qubes OS release
R4.1
### Brief summary
The Xen hypervisor has performance problems on certain compute-intensive workloads
### Steps to reproduce
See @fepitre for details
### Expected behavior
Same (or almost same) performance as bare hardware
### Actual behavior
Less performance than bare hardware
opened 09:59PM - 29 Dec 19 UTC
T: enhancement
C: Xen
security
P: default
<!-- IMPORTANT: Please read our issue tracker guidelines before submitting this issue:
https://www.qubes-os.org/doc/reporting-bugs/
Please do not delete or remove any part of this issue template. -->
### Qubes OS version:
R 4.0.1
All current versions as of 1 Jan 2020
### Affected component(s) or functionality:
Xen, hyperthreading, Intel
---
### Steps to reproduce the behavior:
Run Qubes on machine with hyperthreading (HT) hardware
### Expected or desired behavior:
HT should ideally be available
### Actual behavior:
HT disabled by default
### General notes:
HT has been deliberately disabled for security reasons, and with the current release versions of Xen this is sensible. The reason is that there exist several exploits on Intel kit that involve abuse of shared cache, etc., when one thread in a CPU is running on one VM and another on another.
However, the Xen wizards are on the case, and they have a fix that will ensure that all the threads running on a cpu are allocated together. This means that the only software exposed by such exploits will be software that already has access to that machine. This may reduce the attack surface to an acceptable level for at least some users.
The patch is now included in some unstable branches of Xen, and is invoked by the command line parameters
smt on sched-gran=core
(or socket or cpu). My request is that this is implemented by Qubes, but ONLY once we start using a Xen version that includes this feature.
However, it is currently (Jan 2020) too soon to change the current behaviour, as the versions of Xen currently in use ignore this combination without warning.
My request therefore is that this is assigned a sensibly long timescale.
---
### I have consulted the following relevant [documentation](https://www.qubes-os.org/doc/):
https://www.slideshare.net/xen_com_mgr/xpdds19-core-scheduling-in-xen-jrgen-gro-suse
https://patchwork.kernel.org/cover/11086677/
https://xenbits.xen.org/docs/unstable/misc/xen-command-line.html#sched-gran-x86
(NOTE the above path is in the "unstable" branch)
### I am aware of the following related, [non-duplicate](https://www.qubes-os.org/doc/reporting-bugs/#new-issues-should-not-be-duplicates-of-existing-issues) issues:
opened 12:55AM - 01 Aug 19 UTC
T: bug
C: Xen
P: default
needs diagnosis
C: power management
affects-4.1
**Qubes OS version**
R4.0.2-rc1
**Affected component(s) or functionality**
Power Management (Xen)
**Brief summary**
Power usage 2-3X after first suspend/resume cycle
**To Reproduce**
(Tested on Lenovo T470 and T490)
1. Turn off laptop
2. Turn on laptop and login to desktop
3. Note power usage using powertop, and look at power consumption - you can wait as long as you like for sampling
4. Suspend computer by closing lid or through system menu
5. Resume computer
6. Note that power usage reported in powertop is 2-3x what it was before, and is sustained until the machine is rebooted
**Expected behavior**
Power management is restored after resumption, and not materially different than before putting laptop to sleep.
**Actual behavior**
Laptop consumes 2-3x the power (in watts) that it did before suspend
**Screenshots**
N/A
**Additional context**
Not sure how many systems this impacts. I lived with this for years on the T470, but when I got the T490 the symptoms were way more noticeable and I can't carry around a spare battery anymore to mitigate it.
**Solutions you've tried**
* Setting xenpm cpufreq to "ondemand"
* Using tlp in dom0 before/after suspend
* Testing on more than one device (T470/T490)
* Updating to kernel-latest
**Relevant [documentation](https://www.qubes-os.org/doc/) you've consulted**
* https://wiki.xenproject.org/wiki/Xenpm_command
* https://www.qubes-os.org/doc/newer-hardware-troubleshooting/
* https://groups.google.com/forum/#!searchin/qubes-users/T470%7Csort:date/qubes-users/eqhw02hHtCk/zf77k7q5EwAJ
* https://groups.google.com/forum/#!searchin/qubes-users/T490%7Csort:date/qubes-users/Z0Kfm53zMxQ/oRRR155qDQAJ
**Related, [non-duplicate](https://www.qubes-os.org/doc/reporting-bugs/#new-issues-should-not-be-duplicates-of-existing-issues) issues**
none
opened 10:23PM - 12 Dec 18 UTC
T: bug
C: Xen
P: major
r4.1-buster-stable
hardware support
r4.1-bullseye-stable
r4.1-dom0-stable
needs diagnosis
pr submitted
r4.1-centos-stream8-cur-test
r4.1-bookworm-stable
r4.2-host-cur-test
affects-4.1
affects-4.2
### Qubes OS version:
<!-- (e.g., `R3.2`)
You can get it from the dom0 terminal with the command
`cat /etc/qubes-release`
Type below this line. -->
R4.0
### Affected component(s):
intel_pstate
acpi-cpufreq
xenpm
---
### Steps to reproduce the behavior:
<!-- Use single backticks (`) for in-line code snippets and
triple backticks (```) for code blocks.
Type below this line. -->
Tested on:
- Lenovo T480s
- Lenovo X1 Carbon Gen. 6
- Huawei Matebook X Pro.
All with Intel i7-8550U.
Latest BIOS revisions for the respective systems as of Dec. 2018
Kernel: 4.19.2-3.pvops.qubes.x86_64.
EFI install.
In dom0, `sudo xenpm get-cpufreq-para`
### Expected behavior:
The processor is rated at 1.8 GHz (4.0 turbo), so we would expect to see appropriate scaling in that range, available frequencies from 1800000 - 4000000.
Further, we would expect to see `scaling_driver = intel_pstate`.
### Actual behavior:
The CPU frequencies do not scale correctly. Why?
Frequencies are pinned at 2 GHz max, 400 MHz min, across all cores.
```
# xenpm get-cpufreq-para
...
cpu id : 0
affected_cpus : 0
cpuinfo frequency : max [2001000] min [400000] cur [2001000]
scaling_driver : acpi-cpufreq
scaling_avail_gov : userspace performance powersave ondemand
current_governor : ondemand
ondemand specific :
sampling_rate : max [10000000] min [10000] cur [20000]
up_threshold : 80
scaling_avail_freq : 2001000 2000000 1900000 1800000 1700000 1500000 1400000 1300000 1200000 1100000 1000000 800000 700000 600000 500000 *400000
scaling frequency : max [2001000] min [400000] cur [400000]
turbo mode : enabled
...
```
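One way to compare a dump like the one above against the rated range is to parse the `cpuinfo frequency` line and check the reported maximum. This is only a sketch: the function names are made up for illustration, the regular expression assumes the line layout shown in the dump (values taken to be in kHz), and the 4000000 figure is the turbo clock quoted under "Expected behavior".

```python
import re

# Matches lines such as:
#   cpuinfo frequency : max [2001000] min [400000] cur [2001000]
# The layout is assumed from the dump above; other xenpm versions may differ.
FREQ_LINE = re.compile(
    r"cpuinfo frequency\s*:\s*max\s*\[(\d+)\]\s*min\s*\[(\d+)\]\s*cur\s*\[(\d+)\]"
)


def parse_cpuinfo_freq(xenpm_output):
    """Return a list of (max_khz, min_khz, cur_khz) tuples, one per CPU block."""
    return [
        tuple(int(value) for value in match.groups())
        for match in FREQ_LINE.finditer(xenpm_output)
    ]


def max_below_rated(xenpm_output, rated_turbo_khz):
    """True if any CPU reports a maximum frequency below the rated turbo clock."""
    return any(
        max_khz < rated_turbo_khz
        for max_khz, _min_khz, _cur_khz in parse_cpuinfo_freq(xenpm_output)
    )


sample = "cpuinfo frequency : max [2001000] min [400000] cur [2001000]"
print(parse_cpuinfo_freq(sample))
print(max_below_rated(sample, rated_turbo_khz=4_000_000))
```

In practice the text saved from `xenpm get-cpufreq-para` in dom0 would be passed in instead of the one-line sample.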
Confirmed with `watch -n1 "cat /proc/cpuinfo | grep \"[c]pu MHz\""`
`xenpm set-scaling-maxfreq` and `-minfreq` have no effect.
`xenpm get-cpufreq-states` shows 16 total/usable P-states.
Changing the governor to `performance` has no effect. Default is `ondemand`
`dmidecode` reports a max of 2 GHz on the Lenovos, and an apparently erroneous speed on the Huawei (~ 8 GHz).
```
# dmidecode | grep -i speed
Speed: 2400 MT/s
Configured Clock Speed: 2400 MT/s
Speed: 2400 MT/s
Configured Clock Speed: 2400 MT/s
Speed: Unknown
Speed: Unknown
Speed: Unknown
Max Speed: 2000 MHz
Current Speed: 1800 MHz
```
The `scaling_driver` is legacy `acpi-cpufreq`. Interestingly, `intel_pstate` can be seen initializing during boot, but it does not take over handling anything. Attempting to `blacklist acpi-cpufreq` in `modprobe.d` has no effect.
```
# dmesg | grep pstate
[ 5.067624] intel_pstate: Intel P-state driver initializing
```
`/sys/devices/system/cpu/intel_pstate/` contains the expected attributes, but as mentioned in the "related issue" linked below, `no_turbo`, `num_pstates`, and `turbo_pct` error `Resource temporarily unavailable`.
`/sys/devices/system/cpu/intel_pstate/status` always returns `off`, and does not respond to `echo "active" >`. This behavior has been tested with various kernel command line parameters, including `intel_pstate=force`, `intel_pstate=disabled`, `intel_pstate=no_hwp`, `intel_pstate=enable` with no change in performance aside from `../cpu/intel_pstate/` attributes disappearing when `no_hwp` or `disabled` were in effect. Also tried `processor.ignore_ppc=1`.
Strangely, none of the appropriate attributes for `cpufreq` exist in `/sys/devices/system/cpu/cpu*/`.
```
# ls /sys/devices/system/cpu/cpu0/
acpi_cppc driver hotplug power topology
cache firmware_node node0 subsystem uevent
```
`lsmod | grep cpufreq` shows no results, trying to `modprobe acpi-cpufreq` or `cpufreq-xen` returns errors. `xen_acpi_processor` is loaded.
```
# modprobe acpi-cpufreq
modprobe: ERROR: could not insert 'acpi_cpufreq': No such device
# modprobe cpufreq-xen
modprobe: FATAL: Module cpufreq-xen not found in directory /lib/modules/4.19.2-3.pvops.qubes.x86_64
```
`cpupower frequency-info` is completely unresponsive, with zero information available about the processor.
```
analyzing CPU 0:
no or unknown cpufreq driver is active on this CPU
CPUs which run at the same hardware frequency: Not Available
CPUs which need to have their frequency coordinated by software: Not Available
maximum transition latency: Cannot determine or is not supported.
Not Available
available cpufreq governors: Not Available
Unable to determine current policy
current CPU frequency: Unable to call hardware
current CPU frequency: Unable to call to kernel
boost state support:
Supported: yes
Active: yes
```
Though it shouldn't have any effect, testing was attempted with `smt=on` and `off`, and `Hyperthreading` enabled/disabled in the BIOS appropriately.
Testing was also performed while toggling various BIOS settings.
- enable/disable `Intel SpeedStep`
- power settings at `Maximum Performance` vs. `Balanced`
It does not appear to be a thermal throttling issue, with idle ~37 °C and under load ~60 °C observed consistently.
`tlp` was tested with no effect on the frequency scaling, regardless of being enabled or disabled. `tlp-stat` yields minimal additional info, with what seems to be an outdated recommendation for the Lenovos to install `tp-smapi kernel modules`, that are in fact deprecated in favor of `thinkpad_acpi`, which appears to be active on the Thinkpads.
```
dmesg | grep thinkpad
[ 19.589434] thinkpad_acpi: ThinkPad ACPI Extras v0.26
[ 19.589439] thinkpad_acpi: http://ibm-acpi.sf.net/
[ 19.589440] thinkpad_acpi: ThinkPad BIOS N22ET50W (1.27 ), EC unknown
[ 19.589441] thinkpad_acpi: Lenovo ThinkPad T480s, model 20L7CTO1WW
[ 19.591883] thinkpad_acpi: radio switch found; radios are enabled
[ 19.591898] thinkpad_acpi: This ThinkPad has standard ACPI backlight brightness control, supported by the ACPI video driver
[ 19.591899] thinkpad_acpi: Disabling thinkpad-acpi brightness events by default...
[ 19.612278] thinkpad_acpi: rfkill switch tpacpi_wwan_sw: radio is unblocked
[ 19.643468] thinkpad_acpi: Standard ACPI backlight interface available, not loading native one
[ 19.674512] thinkpad_acpi: battery 1 registered (start 0, stop 100)
[ 19.674576] input: ThinkPad Extra Buttons as /devices/platform/thinkpad_acpi/
```
`thermald` is not loaded.
### General notes:
https://www.kernel.org/doc/html/v4.12/admin-guide/pm/intel_pstate.html
- This link suggests removing `irqbalance` but I'm skeptical.
https://askubuntu.com/questions/1067866/ubuntu-18-04-steam-games-frame-rate-drop/1073353#1073353?newreg=c7c120f373da4effb7317104571cd573
- https://cateee.net/lkddb/web-lkddb/XEN_ACPI_PROCESSOR.html
Regarding xen_acpi_processor: "It also registers itself as the SMM so that other drivers (such as ACPI cpufreq scaling driver) will not load."
How could `lsmod` report `xen_acpi_processor` as loaded while `xenpm` shows the scaling driver as `acpi-cpufreq`? This might explain the missing `/sys/devices/.../cpufreq` entries.
- The following exchange is dubious at best, the final post gets down to the point of disabling intel microcode. They also suggest the use of `msr-tools`, but that really shouldn't be necessary.
https://bbs.archlinux.org/viewtopic.php?id=231077
- This is good work, but in my opinion, running a script every few seconds in dom0 isn't a legitimate fix.
https://github.com/erpalma/lenovo-throttling-fix
---
### Related issues:
https://github.com/QubesOS/qubes-issues/issues/4491
https://github.com/QubesOS/qubes-issues/issues/450
And another extremely annoying bug: Xorg takes 100% CPU with every scroll, and using LibreOffice is extremely slow.
Maybe I should ditch LibreOffice and move to another office suite?
My Ryzen 5600G doesn’t seem to benefit from its iGPU, and the CPU usage is very high, at least 60% (I assume that’s on a single core?).
@fsflover So, to summarize:
- it is supposed to use more CPU, due to disabled hardware acceleration (is it only the external GPU or both?)
- we can now enable HyperThreading with a kernel param

Is there anything else I missed in this jungle? (I really do appreciate your help with finding and providing the above links.)
I enabled HyperThreading in the BIOS and added the kernel params. This didn’t solve the 50-80% CPU usage problem, but it did in fact change something: right now I’m getting 0% in dom0 while idle.
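A minimal sketch for checking whether such options actually reached the hypervisor, assuming the `xen_commandline` line printed by `xl info` in dom0 is pasted in as a string. The helper name `xen_options` and the sample command line are hypothetical; `smt` and `sched-gran` are the option names discussed in the linked issue, which describes them as Xen command-line parameters.

```python
def xen_options(cmdline):
    """Split a Xen command line into a dict; bare flags map to None."""
    options = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        options[key] = value if sep else None
    return options


# Hypothetical command line, for illustration only.
sample = "placeholder console=none smt=on sched-gran=core"
opts = xen_options(sample)
print("smt:", opts.get("smt"))
print("sched-gran:", opts.get("sched-gran"))
```

If either key comes back as `None`, the option was not on the command line that was checked.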
Correct.
AFAIK both. You can try (second) GPU passthrough though, to have GPU acceleration in a VM.
It’s indeed a lot of links. This post and this post suggest that some things can work better if you tell your software not to rely on GPU (as far as I understood).
That took a while to go through. Even though it hasn’t been updated in a long time, I guess GPU passthrough should work… I see lots of recent topics on this forum about it.
Yeah, I’ve dug into this a lot by now, and nothing helped noticeably. I guess the passthrough solution would be great to use; it’s a reasonable compromise, but it requires an upgrade.
Is it 50-80% of the total resources or the resources allocated to the appVM?
Per qube (the personal one). I do have 2 vCPUs allocated to it, though.
renehoj
October 25, 2022, 11:02am
I don’t think ~65% sounds that crazy if you are using 2 vCPUs with hyperthreading; I think the qube has access to 25% of the total resources.
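The arithmetic behind that estimate can be sketched as follows. The 8 logical CPUs are an assumption inferred from the 25% figure (2 vCPUs out of 8 threads), not something stated in the thread, and the calculation also assumes the displayed percentage is relative to the qube's own vCPUs. The function name `host_share` is made up for illustration.

```python
def host_share(qube_usage_pct, vcpus, host_threads):
    """Convert a per-qube CPU percentage into a share of the whole machine."""
    return qube_usage_pct * vcpus / host_threads


# Assumed: 8 logical CPUs on the host, since 2 vCPUs / 8 threads = 25%.
for usage in (50, 65, 80):
    print(usage, "% of the qube ->", host_share(usage, vcpus=2, host_threads=8), "% of the host")
```

Under those assumptions, 50-80% inside the qube corresponds to roughly 12.5-20% of the whole machine.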
@renehoj It’s kind of new to me, as I never used VMs before and never experienced anything like that. Thank you for clarifying this for me.