Web browser's (Firefox) video playback is laggy

KitsuneNoBaka · June 9, 2025, 5:48am

Rare i915-related GPU hangs during seemingly random times

opened 04:47PM - 16 Aug 24 UTC

C: other P: major hardware support needs diagnosis affects-4.2

[How to file a helpful issue](https://www.qubes-os.org/doc/issue-tracking/)

### Qubes OS release

4.2.2

### Brief summary

GPU hangs occur at seemingly random and rare times, e.g. once per 2 months, and can be worked around only with a hard reset.

The HCL report of the hardware on which these hangs happen is the following, with one exception (more on this later):

```
---
layout: 'hcl'
type: 'Notebook'
hvm: 'yes'
iommu: 'yes'
slat: 'yes'
tpm: '2.0'
remap: 'yes'
brand: |
  LENOVO
model: |
  20L6SBJW00
bios: |
  N24ET60W (1.35 )
cpu: |
  Intel(R) Core(TM) i7-8650U CPU @ 1.90GHz
cpu-short: |
  FIXME
chipset: |
  Intel Corporation Xeon E3-1200 v6/7th Gen Core Processor Host Bridge/DRAM Registers [8086:5914] (rev 08)
chipset-short: |
  FIXME
gpu: |
  Intel Corporation UHD Graphics 620 [8086:5917] (rev 07) (prog-if 00 [VGA controller])
gpu-short: |
  FIXME
network: |
  Intel Corporation Ethernet Connection (4) I219-LM [8086:15d7] (rev 21)
  Intel Corporation Wireless 8265 / 8275 [8086:24fd] (rev 78)
memory: |
  65406
scsi: |

usb: |
  2
certified: 'no'
versions:
  - works: 'FIXME:yes|no|partial'
    qubes: |
      R4.2.2
    xen: |
      4.17.4
    kernel: |
      6.6.42-1
    remark: |
      FIXME
    credit: |
      FIXAUTHOR
    link: |
      FIXLINK
```

The exception is that the kernel in use when the hang happened was actually 6.6.36, as the logs say:

```
Aug 11 17:54:17 dom0 kernel: Linux version 6.6.36-1.qubes.fc37.x86_64 (mockbuild@01d867aa44b046b59b72b56f0f81e904) (gcc (GCC) 12.3.1 20230508 (Red Hat 12.3.1-1), GNU ld version 2.38-27.fc37) #1 SMP PREEMPT_DYNAMIC Tue Jul 2 03:51:16 GMT 2024
```

The logs below cover the period from when I closed the lid of my laptop to when I came back and noticed the unresponsiveness. They contain a lot of noise, which I preserved as evidence that I couldn't see anything suspicious on my end. Notice the lines with "i915" (more on them later):

```
[user@dom0 ~]$ sudo journalctl --since="2024-08-12 12:39:00" --until="2024-08-12 13:19:00" --no-pager
Aug 12 12:39:21 dom0 systemd-logind[1758]: Lid closed.
Aug 12 12:39:32 dom0 xscreensaver-auth[14684]: PAM unable to dlopen(/usr/lib64/security/pam_sss.so): /usr/lib64/security/pam_sss.so: cannot open shared object file: No such file or directory
Aug 12 12:39:32 dom0 xscreensaver-auth[14684]: PAM adding faulty module: /usr/lib64/security/pam_sss.so
Aug 12 12:40:30 dom0 kernel: i915 0000:00:02.0: [drm] Resetting rcs0 for preemption time out
Aug 12 12:40:30 dom0 kernel: i915 0000:00:02.0: [drm] Xorg[5133] context reset due to GPU hang
Aug 12 12:40:30 dom0 kernel: i915 0000:00:02.0: [drm] GPU HANG: ecode 9:1:85dffffb, in Xorg [5133]
Aug 12 13:00:01 dom0 CROND[14739]: (root) CMD (/usr/bin/qvm-sync-clock > /dev/null 2>&1 || true)
Aug 12 13:00:02 dom0 audit[14743]: USYS_CONFIG pid=14743 uid=0 auid=4294967295 ses=4294967295 msg='op=change-system-time exe="/usr/sbin/hwclock" hostname=? addr=? terminal=? res=success'
Aug 12 13:00:02 dom0 kernel: audit: type=1111 audit(1723460402.499:615): pid=14743 uid=0 auid=4294967295 ses=4294967295 msg='op=change-system-time exe="/usr/sbin/hwclock" hostname=? addr=? terminal=? res=success'
Aug 12 13:00:02 dom0 CROND[14738]: (root) CMDEND (/usr/bin/qvm-sync-clock > /dev/null 2>&1 || true)
Aug 12 13:01:01 dom0 CROND[14747]: (root) CMD (run-parts /etc/cron.hourly)
Aug 12 13:01:01 dom0 run-parts[14750]: (/etc/cron.hourly) starting 0anacron
Aug 12 13:01:01 dom0 run-parts[14756]: (/etc/cron.hourly) finished 0anacron
Aug 12 13:01:01 dom0 CROND[14746]: (root) CMDEND (run-parts /etc/cron.hourly)
Aug 12 13:04:50 dom0 qrexec-policy-daemon[2809]: qrexec: qubes.GetDate+nanoseconds: social-media -> @default: allowed to dom0
Aug 12 13:04:50 dom0 audit: BPF prog-id=101 op=LOAD
Aug 12 13:04:50 dom0 kernel: audit: type=1334 audit(1723460690.049:616): prog-id=101 op=LOAD
Aug 12 13:04:50 dom0 kernel: audit: type=1334 audit(1723460690.049:617): prog-id=102 op=LOAD
Aug 12 13:04:50 dom0 kernel: audit: type=1334 audit(1723460690.049:618): prog-id=103 op=LOAD
Aug 12 13:04:50 dom0 audit: BPF prog-id=102 op=LOAD
Aug 12 13:04:50 dom0 audit: BPF prog-id=103 op=LOAD
Aug 12 13:04:50 dom0 systemd[1]: Starting systemd-hostnamed.service - Hostname Service...
Aug 12 13:04:50 dom0 systemd[1]: Started systemd-hostnamed.service - Hostname Service.
Aug 12 13:04:50 dom0 audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-hostnamed comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Aug 12 13:04:50 dom0 kernel: audit: type=1130 audit(1723460690.150:619): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-hostnamed comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Aug 12 13:05:20 dom0 systemd[1]: systemd-hostnamed.service: Deactivated successfully.
Aug 12 13:05:20 dom0 kernel: audit: type=1131 audit(1723460720.186:620): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-hostnamed comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Aug 12 13:05:20 dom0 audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-hostnamed comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Aug 12 13:05:20 dom0 audit: BPF prog-id=103 op=UNLOAD
Aug 12 13:05:20 dom0 kernel: audit: type=1334 audit(1723460720.231:621): prog-id=103 op=UNLOAD
Aug 12 13:05:20 dom0 kernel: audit: type=1334 audit(1723460720.231:622): prog-id=102 op=UNLOAD
Aug 12 13:05:20 dom0 kernel: audit: type=1334 audit(1723460720.231:623): prog-id=101 op=UNLOAD
Aug 12 13:05:20 dom0 audit: BPF prog-id=102 op=UNLOAD
Aug 12 13:05:20 dom0 audit: BPF prog-id=101 op=UNLOAD
Aug 12 13:18:16 dom0 systemd-logind[1758]: Lid opened.
```

Then, after the GPU hang happened, the Alt+SysRq+K sequence did nothing. I tried it multiple times and also made sure that the hotkeys were indeed active:

```
[user@dom0 ~]$ sysctl kernel.sysrq
kernel.sysrq = 4
```

So the problem might lie somewhere deeper than, as I initially thought, a faulty X11 configuration/installation, which in my case is:

```
[user@dom0 ~]$ cat /etc/X11/xorg.conf.d/20-gfx.conf
Section "device"
    Identifier "intel"
    Driver "modesetting"
    Option "AccelMethod" "glamor"
EndSection
```

There's the (closed) issue https://github.com/QubesOS/qubes-issues/issues/7813, which contains the aforementioned "i915" lines, but that case differs from mine: I have no visual artifacts, only the unresponsiveness. That issue links the comment https://github.com/QubesOS/qubes-issues/issues/7785#issuecomment-1254095362, which describes my case more precisely.
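For context on the `kernel.sysrq = 4` value shown above: as far as I understand the kernel's SysRq documentation, values greater than 1 are treated as a bitmask of permitted function groups. The following is a minimal Python sketch that decodes such a value. The helper name `decode_sysrq` is hypothetical, and the group descriptions are paraphrased from that documentation rather than taken from this report.

```python
# Bit values documented for kernel.sysrq when the setting is a bitmask (> 1).
# Descriptions are paraphrased from the kernel's SysRq admin guide.
SYSRQ_GROUPS = {
    2: "control of console logging level",
    4: "control of keyboard (SAK, unraw)",
    8: "debugging dumps of processes etc.",
    16: "sync command",
    32: "remount read-only",
    64: "signalling of processes (term, kill, oom-kill)",
    128: "reboot/poweroff",
    256: "nicing of all RT tasks",
}


def decode_sysrq(value: int) -> list[str]:
    """Return the SysRq function groups permitted by a kernel.sysrq value."""
    if value == 0:
        return []  # SysRq is disabled entirely
    if value == 1:
        return list(SYSRQ_GROUPS.values())  # every function is enabled
    # Otherwise the value is a bitmask: keep the groups whose bit is set.
    return [name for bit, name in SYSRQ_GROUPS.items() if value & bit]


if __name__ == "__main__":
    for group in decode_sysrq(4):
        print(group)
```

Under this reading, a value of 4 permits the keyboard group that includes SAK (Alt+SysRq+K), which is consistent with the check above, but it does not permit sync, remount, or reboot, so any softer alternative to a hard reset would need a wider mask.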
Issue https://github.com/QubesOS/qubes-issues/issues/7785 itself, however, was about Qubes OS 4.1 and the 5.15/5.18 kernels, and it was closed due to a bug about Xorg pages, as the comment at https://github.com/QubesOS/qubes-issues/issues/7785#issuecomment-1320171556 says. Since these characteristics don't match my case, I found opening a new ticket wiser than requesting to reopen the linked one and modify its title.

This might be related to https://github.com/QubesOS/qubes-issues/issues/7902, but that ticket is also about Qubes OS 4.1, where I don't recall having any of these GPU hangs, as well as about the i3 window manager. If it's more appropriate to raise this in that ticket, please close this one and let me know that I should paste my report there.

It might also be related to the fixes described in https://wiki.archlinux.org/index.php?title=Intel_graphics&oldid=814542#Crash/freeze_on_low_power_Intel_CPUs, but it is hard to tell whether any of these fixes work, considering how rare and random the GPU hangs are (e.g. once per 3 months).

I could provide more information, such as a kernel dump/backtrace, the next time this hang happens, but I'd need assistance with how to prepare for it (should I just use kdump or do something else beforehand?) and with where exactly I can read that specific information without verbose noise obscuring the valuable parts, so that it might shed some light on this issue.

### Steps to reproduce

Unknown at the time of writing this ticket. Trying to list any would amount to fortune telling that provides unrelated noise at best and misleading information at worst; the [Brief summary](#brief-summary) section should be more appropriate in this case.

### Expected behavior

The system works without random GPU hangs that force the user to perform a hard reset.

### Actual behavior

GPU hangs occur at seemingly random and rare times, e.g. once per 2 months, and can be worked around only with a hard reset.
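
As one possible way to cut the noise in a journal excerpt like the one in this report, the i915 hang lines can be picked out and their details extracted. This is a minimal Python sketch, assuming the text was saved from `journalctl --no-pager` output. The function name `find_gpu_hangs` and the regular expression are illustrative, and they only match the `GPU HANG: ecode ...` line format seen in the logs above.

```python
import re

# Matches lines of the form seen in this report, for example:
#   Aug 12 12:40:30 dom0 kernel: i915 0000:00:02.0: [drm] GPU HANG: ecode 9:1:85dffffb, in Xorg [5133]
HANG_RE = re.compile(
    r"^(?P<time>\w{3}\s+\d+\s+[\d:]{8})\s+\S+\s+kernel:\s+"
    r"i915\s+(?P<device>\S+):\s+\[drm\]\s+GPU HANG:\s+"
    r"ecode\s+(?P<ecode>[0-9a-fA-F:]+),\s+in\s+(?P<process>\S+)\s+\[(?P<pid>\d+)\]"
)


def find_gpu_hangs(journal_text):
    """Return one dict per 'GPU HANG' line found in the journal text."""
    hangs = []
    for line in journal_text.splitlines():
        match = HANG_RE.match(line.strip())
        if match:
            hangs.append(match.groupdict())
    return hangs


if __name__ == "__main__":
    sample = (
        "Aug 12 12:40:30 dom0 kernel: i915 0000:00:02.0: [drm] Resetting rcs0 for preemption time out\n"
        "Aug 12 12:40:30 dom0 kernel: i915 0000:00:02.0: [drm] GPU HANG: ecode 9:1:85dffffb, in Xorg [5133]\n"
        "Aug 12 13:18:16 dom0 systemd-logind[1758]: Lid opened.\n"
    )
    for hang in find_gpu_hangs(sample):
        print(hang["time"], hang["device"], hang["ecode"], hang["process"], hang["pid"])
```

On a live system, restricting the query to kernel messages with `journalctl -k`, combined with the `--since` and `--until` options already used above, may reduce the noise before any filtering is needed.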