Much higher CPU usage than on other distros

Frank_boje · October 19, 2022, 5:12pm

I know that Qubes OS is built around compartmentalization, so I’m aware (as far as I can be) that it has a higher impact on machine resources. However, when my laptop is idle, dom0 uses between 2 and 5 percent of the CPU (as displayed in the qui-domains short-info pop-up), and when I watch a 1080p YouTube video (not even full-screen) with Firefox in my personal template (Debian), that particular template uses 50-80% of the CPU. I don’t see any visual artifacts, but the laptop goes wild in both fan speed and heat. I use an XPS 13 with an i7 and a fresh KDE desktop. I switched from Manjaro and never noticed anything like that before.

Is this really the cost of compartmentalization, or is something perhaps not working well? How do I troubleshoot this kind of thing?
Are there any tweaks I can use to reduce the impact of heavy tasks on the CPU, perhaps by switching something off?

fsflover · October 19, 2022, 5:16pm

This might be relevant: Qubes is EXTREMELY slow on very new p1 gen 5 lenovo laptop.

Xen-related performance problems

opened 05:09PM - 01 Apr 22 UTC

DemiMarie

T: bug C: Xen P: major r4.0-dom0-cur-test r4.1-dom0-stable diagnosed pr submitted r4.2-host-stable affects-4.1

[How to file a helpful issue](https://www.qubes-os.org/doc/issue-tracking/)

### Qubes OS release

R4.1

### Brief summary

The Xen hypervisor has performance problems on certain compute-intensive workloads

### Steps to reproduce

See @fepitre for details

### Expected behavior

Same (or almost same) performance as bare hardware

### Actual behavior

Less performance than bare hardware

github.com/QubesOS/qubes-issues

Safe use of Hyperthreading when Xen stable includes new sched-gran parameter

opened 09:59PM - 29 Dec 19 UTC

trueriver

T: enhancement C: Xen security P: default

### Qubes OS version:

R 4.0.1

All current versions as of 1 Jan 2020

### Affected component(s) or functionality:

Xen, hyperthreading, Intel

---

### Steps to reproduce the behavior:

Run Qubes on machine with hyperthreading (HT) hardware

### Expected or desired behavior:

HT should ideally be available

### Actual behavior:

HT disabled by default

### General notes:

HT has been deliberately disabled for security reasons, and with the current release versions of Xen this is sensible. The reason is that there exist several exploits on Intel kit that involve abuse of shared cache, etc, when one thread in a CPU is running on one VM and another on another.

However, the Xen wizards are on the case, and they have a fix that will ensure that all the threads running on a cpu are allocated together. This means that the only software exposed by such exploits will be software that already has access to that machine. This may reduce the attack surface to an acceptable level for at least some users.

The patch is now included in some unstable branches of Xen, and is invoked by the command line parameters smt=on sched-gran=core (or socket or cpu).

My request is that this is implemented by Qubes, but ONLY once we start using a Xen version that includes this feature.

However, it is currently (Jan 2020) too soon to change the current behaviour, as the versions of Xen currently in use ignore this combination without warning. My request therefore is that this is assigned a sensibly long timescale.

---

### I have consulted the following relevant [documentation](https://www.qubes-os.org/doc/):

https://www.slideshare.net/xen_com_mgr/xpdds19-core-scheduling-in-xen-jrgen-gro-suse
https://patchwork.kernel.org/cover/11086677/
https://xenbits.xen.org/docs/unstable/misc/xen-command-line.html#sched-gran-x86 (NOTE the above path is in the "unstable" branch)

### I am aware of the following related, [non-duplicate](https://www.qubes-os.org/doc/reporting-bugs/#new-issues-should-not-be-duplicates-of-existing-issues) issues:

github.com/QubesOS/qubes-issues

Power Consumption 2-3x after first suspend/resume

opened 12:55AM - 01 Aug 19 UTC

AndrewX192

T: bug C: Xen P: default needs diagnosis C: power management affects-4.1

**Qubes OS version**
R4.0.2-rc1

**Affected component(s) or functionality**
Power Management (Xen)

**Brief summary**
Power usage 2-3X after first suspend/resume cycle

**To Reproduce**
(Tested on Lenovo T470 and T490)
1. Turn off laptop
2. Turn on laptop and login to desktop
3. Note power usage using powertop, and look at power consumption - you can wait as long as you like for sampling
4. Suspend computer by closing lid or through system menu
5. Resume computer
6. Note that power usage reported in powertop is 2-3x what it was before, and is sustained until the machine is rebooted

**Expected behavior**
Power management is restored after resumption, and not materially different than before putting laptop to sleep.

**Actual behavior**
Laptop consumes 2-3x the power (in watts) that it did before suspend

**Screenshots**
N/A

**Additional context**
Not sure how many systems this impacts. I lived with this for years on the T470, but when I got the T490 the symptoms were way more noticeable and I can't carry around a spare battery anymore to mitigate it.

**Solutions you've tried**
* Setting xenpm cpufreq to "ondemand"
* Using tlp in dom0 before/after suspend
* Testing on more than one device (T470/T490)
* Updating to kernel-latest

**Relevant [documentation](https://www.qubes-os.org/doc/) you've consulted**
* https://wiki.xenproject.org/wiki/Xenpm_command
* https://www.qubes-os.org/doc/newer-hardware-troubleshooting/
* https://groups.google.com/forum/#!searchin/qubes-users/T470%7Csort:date/qubes-users/eqhw02hHtCk/zf77k7q5EwAJ
* https://groups.google.com/forum/#!searchin/qubes-users/T490%7Csort:date/qubes-users/Z0Kfm53zMxQ/oRRR155qDQAJ

**Related, [non-duplicate](https://www.qubes-os.org/doc/reporting-bugs/#new-issues-should-not-be-duplicates-of-existing-issues) issues**
none

github.com/QubesOS/qubes-issues

Intel CPU Frequency Scaling Broken

opened 10:23PM - 12 Dec 18 UTC

sylentprofet

T: bug C: Xen P: major r4.1-buster-stable hardware support r4.1-bullseye-stable r4.1-dom0-stable needs diagnosis pr submitted r4.1-centos-stream8-cur-test r4.1-bookworm-stable r4.2-host-cur-test affects-4.1 affects-4.2

### Qubes OS version:

R4.0

### Affected component(s):

intel_pstate
acpi-cpufreq
xenpm

---

### Steps to reproduce the behavior:

Tested on:
- Lenovo T480s
- Lenovo X1 Carbon Gen. 6
- Huawei Matebook X Pro.

All with Intel i7-8550U. Latest BIOS revisions for the respective systems as of Dec. 2018

Kernel: 4.19.2-3.pvops.qubes.x86_64. EFI install.

In dom0, `sudo xenpm get-cpufreq-para`

### Expected behavior:

The processor is rated at 1.8 GHz (4.0 turbo), so we would expect to see appropriate scaling in that range, available frequencies from 1800000 - 4000000. Further, we would expect to see `scaling_driver = intel_pstate`.

### Actual behavior:

The CPU frequencies do not scale correctly. Why? Frequencies are pinned at 2 GHz max, 400 MHz min, across all cores.

```
# xenpm cpu-freq-para
...
cpu id               : 0
affected_cpus        : 0
cpuinfo frequency    : max [2001000] min [400000] cur [2001000]
scaling_driver       : acpi-cpufreq
scaling_avail_gov    : userspace performance powersave ondemand
current_governor     : ondemand
  ondemand specific  :
    sampling_rate    : max [10000000] min [10000] cur [20000]
    up_threshold     : 80
scaling_avail_freq   : 2001000 2000000 1900000 1800000 1700000 1500000 1400000 1300000 1200000 1100000 1000000 800000 700000 600000 500000 *400000
scaling frequency    : max [2001000] min [400000] cur [400000]
turbo mode           : enabled
...
```

Confirmed with `watch -n1 "cat /proc/cpuinfo | grep \"[c]pu MHz\""`

`xenpm set-scaling-maxfreq` and `-minfreq` have no effect. `xenpm get-cpufreq-states` shows 16 total/usable P-states. Changing the governor to `performance` has no effect. Default is `ondemand`

`dmidecode` reports a max of 2 GHz on the Lenovos, and an apparently erroneous speed on the Huawei (~ 8 GHz).

```
# dmidecode | grep -i speed
Speed: 2400 MT/s
Configured Clock Speed: 2400 MT/s
Speed: 2400 MT/s
Configured Clock Speed: 2400 MT/s
Speed: Unknown
Speed: Unknown
Speed: Unknown
Max Speed: 2000 MHz
Current Speed: 1800 MHz
```

The `scaling_driver` is legacy `acpi-cpufreq`.

Interestingly, `intel_pstate` can be seen initializing during boot, but it does not take over handling anything. Attempting to `blacklist acpi-cpufreq` in `modprobe.d` has no effect.

```
# dmesg | grep pstate
[ 5.067624] intel_pstate: Intel P-state driver initializing
```

`/sys/devices/system/cpu/intel_pstate/` contains the expected attributes, but as mentioned in the "related issue" linked below, `no_turbo`, `num_pstates`, and `turbo_pct` error `Resource temporarily unavailable`.

`/sys/devices/system/cpu/intel_pstate/status` always returns `off`, and does not respond to `echo "active" >`.

This behavior has been tested with various kernel command line parameters, including `intel_pstate=force`, `intel_pstate=disabled`, `intel_pstate=no_hwp`, `intel_pstate=enable` with no change in performance aside from `../cpu/intel_pstate/` attributes disappearing when `no_hwp` or `disabled` were in effect. Also tried `processor.ignore_ppc=1`.

Strangely, none of the appropriate attributes for `cpufreq` exist in `/sys/devices/system/cpu/cpu*/`.

```
# ls /sys/devices/system/cpu/cpu0/
acpi_cppc  driver         hotplug  power      topology
cache      firmware_node  node0    subsystem  uevent
```

`lsmod | grep cpufreq` shows no results, trying to `modprobe acpi-cpufreq` or `cpufreq-xen` returns errors. `xen_acpi_processor` is loaded.

```
# modprobe acpi-cpufreq
modprobe: ERROR: could not insert 'acpi_cpufreq': No such device
# modprobe cpufreq-xen
modprobe: FATAL: Module cpufreq-xen not found in directory /lib/modules/4.19.2-3.pvops.qubes.x86_64
```

`cpupower frequency-info` is completely unresponsive, with zero information available about the processor.

```
analyzing CPU 0:
  no or unknown cpufreq driver is active on this CPU
  CPUs which run at the same hardware frequency: Not Available
  CPUs which need to have their frequency coordinated by software: Not Available
  maximum transition latency: Cannot determine or is not supported.
Not Available
  available cpufreq governors: Not Available
  Unable to determine current policy
  current CPU frequency: Unable to call hardware
  current CPU frequency: Unable to call to kernel
  boost state support:
    Supported: yes
    Active: yes
```

Though it shouldn't have any effect, testing was attempted with `smt=on` and `off`, and `Hyperthreading` enabled/disabled in the BIOS appropriately.

Testing was also performed while toggling various BIOS settings.
- enable/disable `Intel SpeedStep`
- power settings at `Maximum Performance` vs. `Balanced`

It does not appear to be a thermal throttling issue, with idle ~ 37*C and under load ~60*C observed consistently.

`tlp` was tested with no effect on the frequency scaling, regardless of being enabled or disabled. `tlp-stat` yields minimal additional info, with what seems to be an outdated recommendation for the Lenovos to install `tp-smapi kernel modules`, that are in fact deprecated in favor of `thinkpad_acpi`, which appears to be active on the Thinkpads.

```
dmesg | grep thinkpad
[ 19.589434] thinkpad_acpi: ThinkPad ACPI Extras v0.26
[ 19.589439] thinkpad_acpi: http://ibm-acpi.sf.net/
[ 19.589440] thinkpad_acpi: ThinkPad BIOS N22ET50W (1.27 ), EC unknown
[ 19.589441] thinkpad_acpi: Lenovo ThinkPad T480s, model 20L7CTO1WW
[ 19.591883] thinkpad_acpi: radio switch found; radios are enabled
[ 19.591898] thinkpad_acpi: This ThinkPad has standard ACPI backlight brightness control, supported by the ACPI video driver
[ 19.591899] thinkpad_acpi: Disabling thinkpad-acpi brightness events by default...
[ 19.612278] thinkpad_acpi: rfkill switch tpacpi_wwan_sw: radio is unblocked
[ 19.643468] thinkpad_acpi: Standard ACPI backlight interface available, not loading native one
[ 19.674512] thinkpad_acpi: battery 1 registered (start 0, stop 100)
[ 19.674576] input: ThinkPad Extra Buttons as /devices/platform/thinkpad_acpi/
```

`thermald` is not loaded.

### General notes:

https://www.kernel.org/doc/html/v4.12/admin-guide/pm/intel_pstate.html

- This link suggests removing `irqbalance` but I'm skeptical.
  https://askubuntu.com/questions/1067866/ubuntu-18-04-steam-games-frame-rate-drop/1073353#1073353?newreg=c7c120f373da4effb7317104571cd573
- https://cateee.net/lkddb/web-lkddb/XEN_ACPI_PROCESSOR.html
  Regarding xen_acpi_processor: "It also registers itself as the SMM so that other drivers (such as ACPI cpufreq scaling driver) will not load." How could `lsmod` report `xen_acpi_processor` as loaded but `xenpm` shows the scaling driver `acpi-cpufreq` ? This might make sense as to the missing `/sys/devices/.../cpufreq` entries.
- The following exchange is dubious at best, the final post gets down to the point of disabling intel microcode. They also suggest the use of `msr-tools`, but that really shouldn't be necessary.
  https://bbs.archlinux.org/viewtopic.php?id=231077
- This is good work, but in my opinion, running a script every few seconds in dom0 isn't a legitimate fix.
  https://github.com/erpalma/lenovo-throttling-fix

---

### Related issues:

https://github.com/QubesOS/qubes-issues/issues/4491
https://github.com/QubesOS/qubes-issues/issues/450

Frank_boje · October 19, 2022, 5:53pm

@fsflover So to summarize:

It is supposed to use more CPU because hardware acceleration is disabled (is that only for an external GPU, or for both?)
We can now enable HyperThreading with a kernel parameter

Is there anything else I missed in this jungle? (I really do appreciate your help with finding and providing the above links.)

I enabled HyperThreading in the BIOS and added the kernel params. This didn’t solve the 50-80% CPU usage problem, but it did in fact change something, as right now I’m getting 0% in dom0 while idle.
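
For anyone following along, here is a rough Python sketch of how one might double-check that the boot parameters actually took effect. It assumes the parameters in question are the `smt=on sched-gran=core` pair from the linked issue, and that the active Xen command line can be read in dom0 (for example from the `xen_commandline` field of `xl info`, as far as I know). The sample string and the function names are made up for illustration and are not taken from a real system.

```python
def parse_xen_cmdline(cmdline):
    """Split a Xen command line into a {option: value} dict.

    Bare flags (no '=') map to an empty string. If an option is
    repeated, the last occurrence wins in this simplified sketch.
    """
    options = {}
    for token in cmdline.split():
        key, _, value = token.partition("=")
        options[key] = value
    return options


def smt_summary(options):
    """Describe the SMT-related settings found in the parsed options."""
    smt = options.get("smt", "(not set)")
    gran = options.get("sched-gran", "(not set)")
    return f"smt={smt}, sched-gran={gran}"


# Hypothetical command line, for illustration only.
sample = "placeholder console=none smt=on sched-gran=core"
opts = parse_xen_cmdline(sample)
print(smt_summary(opts))
```

If either option shows up as "(not set)", the parameters probably did not reach the hypervisor and the boot configuration is worth rechecking.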

fsflover · October 19, 2022, 6:01pm

Correct.

AFAIK both. You can try (second) GPU passthrough though, to have GPU acceleration in a VM.

It’s indeed a lot of links. This post and this post suggest that some things can work better if you tell your software not to rely on GPU (as far as I understood).

Frank_boje · October 19, 2022, 7:46pm

That took a while to go through. Even though it hasn’t been updated for a long time, I guess GPU passthrough should work… I see lots of recent topics on this forum about it.

Yeah, I have dug into this a lot by now, and nothing helped noticeably. I guess the passthrough solution would be great to use. It’s a reasonable compromise, but it requires an upgrade.

renehoj · October 19, 2022, 8:28pm

Is it 50-80% of the total resources or the resources allocated to the appVM?

Frank_boje · October 25, 2022, 10:22am

Per qube (the personal one). I do have 2 vCPUs allocated to it, though.

renehoj · October 25, 2022, 11:02am

I don’t think ~65% sounds that crazy if you are using 2 vCPUs with hyperthreading. I think the qube has access to 25% of the total resources.
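
To make the arithmetic concrete, here is a small sketch. It assumes the i7 exposes 8 logical CPUs with hyperthreading on (which is what makes 2 vCPUs come out as 25%), and that the percentage shown for a qube is relative to its own vCPUs. Both are assumptions, and `host_share` is just an illustrative helper.

```python
def host_share(qube_usage_pct, vcpus, host_threads):
    """Convert usage measured against a qube's own vCPUs into a
    percentage of the whole host's logical CPUs."""
    if vcpus <= 0 or host_threads <= 0:
        raise ValueError("vcpus and host_threads must be positive")
    return qube_usage_pct * vcpus / host_threads


# Assumed hardware: 4 cores / 8 threads, qube with 2 vCPUs.
VCPUS = 2
HOST_THREADS = 8

# Ceiling: a fully busy qube can use at most this share of the host.
print(f"max share of host: {host_share(100.0, VCPUS, HOST_THREADS):.2f}%")

# The values reported in this thread, translated to whole-host terms.
for usage in (50.0, 65.0, 80.0):
    share = host_share(usage, VCPUS, HOST_THREADS)
    print(f"{usage:.0f}% of the qube = {share:.2f}% of the host")
```

Under those assumptions, 50-80% inside the qube corresponds to roughly 12.5-20% of the whole machine.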

Frank_boje · October 26, 2022, 2:22pm

@renehoj It’s kind of new to me, as I never used VMs before and never experienced anything like that. Thank you for clarifying this for me.