QubesOS freeze, crash and reboots

For me they disappeared when switching from a self-built HEADS version to one from the Nitrokey repository.

Interesting, I was on the Nitrokey version all the time.

15 months after QubesOS started crashing randomly and I created this post, I still have not been able to resolve it.

Today I took another look at it and ran sudo journalctl. The last two lines in red before the reboot show:

dom0 xscreensaver[17577]: pam_unix(xscreensaver:auth): conversation failed
dom0 xscreensaver[17577]: pam_unix(xscreensaver:auth): auth could not identify password for [system76]
--reboot--
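Looking up the last entries before each reboot can be scripted instead of scrolling. The following is only a sketch: it assumes the journal has been saved as plain text and that reboot boundaries look like the `--reboot--` line above (journalctl versions print this marker differently). The names `REBOOT_MARKER`, `lines_before_reboot` and `SAMPLE` are made up for illustration.

```python
import re

# Assumption: reboot boundaries look like "--reboot--" or "-- Reboot --".
REBOOT_MARKER = re.compile(r"^-+\s*reboot\s*-+$", re.IGNORECASE)


def lines_before_reboot(journal_text, count=2):
    """Return, for each reboot marker, the `count` non-empty lines before it."""
    lines = [ln.strip() for ln in journal_text.splitlines() if ln.strip()]
    tails = []
    for idx, line in enumerate(lines):
        if REBOOT_MARKER.match(line):
            tails.append(lines[max(0, idx - count):idx])
    return tails


# Sample input copied from the excerpt above.
SAMPLE = """\
dom0 xscreensaver[17577]: pam_unix(xscreensaver:auth): conversation failed
dom0 xscreensaver[17577]: pam_unix(xscreensaver:auth): auth could not identify password for [system76]
--reboot--
"""

for tail in lines_before_reboot(SAMPLE):
    print("\n".join(tail))
```

To use it on a real system, the text could come from something like `sudo journalctl --no-pager > journal.txt` in dom0. Note that the last lines before a reboot are not necessarily the cause of the crash.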

My Qubes 4.1 is not happy with kernel 6.5.(x up to 8-1). It freezes, crashes, and gets shut down due to uncontrolled hotness... It works with 6.4.13, though.

I updated to QubesOS 4.2-RC4 and selected the option to install with the latest kernel. I am using QubesOS and all the templates with kernel 6.5.

From time to time, roughly once every couple of start-ups or shutdowns, QubesOS freezes and I have to hold down the power button to shut it down.

Today it crashed twice.

I saw this as the last message:

dom0 anacron[28220]: Can't find sendmail at /usr/sbin/sendmail, not mailing output

I also saw this message a few lines before:

Xen free = 251622285 too small for satisfy assignments! assigned_but_unused=23471681968, domdict={'0': {'memor
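To get a feeling for the two numbers in that message, they can be pulled out and converted. This is a rough sketch and rests on an assumption I have not verified: that both counters are byte counts. If so, Xen had roughly 240 MiB free while roughly 21.9 GiB was assigned but unused. The helper names `parse_xen_free` and `to_mib` are made up for illustration, and the log line is kept truncated exactly as it was pasted.

```python
import re

# The message as pasted above (truncated in the original post).
LINE = ("Xen free = 251622285 too small for satisfy assignments! "
        "assigned_but_unused=23471681968, domdict={'0': {'memor")


def parse_xen_free(line):
    """Extract the 'Xen free' and 'assigned_but_unused' counters as integers."""
    free = re.search(r"Xen free = (\d+)", line)
    unused = re.search(r"assigned_but_unused=(\d+)", line)
    if not (free and unused):
        raise ValueError("line does not look like the 'Xen free' message")
    return int(free.group(1)), int(unused.group(1))


def to_mib(n_bytes):
    """Convert to MiB. Assumption: the counters are bytes."""
    return n_bytes / 2**20


free, unused = parse_xen_free(LINE)
print(f"Xen free:            {to_mib(free):10.1f} MiB")
print(f"assigned_but_unused: {to_mib(unused):10.1f} MiB")
```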

I saw that others see these errors too when they have crashes.

However, so far I have not found the cause of, or a solution for, QubesOS crashing.

I did a memory test using https://memtest.org/ and no hardware problems were found. Memory is working fine. The crashes must be caused by something else, so far unknown to me.

After every crash I now open the log and find only unrelated entries. Nothing that points to the crash shows in the log.

But it always crashes if I resize a window. What can possibly cause a crash each time I resize the window? And how can I solve this issue?

@trounces
I had a similar problem. My thread started here and was moved to its own by admin. Similarly, when resizing (or sometimes loading) a KeePassX window the whole screen would lock up but background things (like audio) would still work.

Oddly, replacing KeePassX with KeePass2 “solved” the problem. In my case at least, looking in the Xen logs, I see IOMMU warnings about the graphics driver. The best I can figure is that it’s one of two things: (1) poor IOMMU driver compatibility that’s agitated by certain applications doing certain things visually, or (2) some poor graphics interop between VM space and the final Xen composite master display that is agitated by how some applications do certain things graphically.

I am able to replicate my test case reliably on two different laptops (one running Ryzen 5 PRO and another running Ryzen 7 PRO). The 7 model seems to tolerate a bit more of the KeePassX windows open but can be agitated to freeze with minimal effort.

Qubes R4.2.0, kernel 6.1.75-1

Had a freeze with minor visual artifacts. Sound kept going.

Right before the freeze I had the following ERROR messages related to the i915 driver:

Feb 29 03:12:08 dom0 kernel: i915 0000:00:02.0: [drm] *ERROR* [CRTC:51:pipe A] flip_done timed out
Feb 29 03:12:18 dom0 kernel: i915 0000:00:02.0: [drm] *ERROR* flip_done timed out
Feb 29 03:12:18 dom0 kernel: i915 0000:00:02.0: [drm] *ERROR* [CRTC:51:pipe A] commit wait timed out

Some time before the freeze had another error message related to i915:

Feb 29 01:15:31 dom0 kernel: i915 0000:00:02.0: [drm] *ERROR* Atomic update failure on pipe A (start=2537188 end=2537189) time 175 us, min 1017, max 1023, scanline start 1016, end 1027

These are all the entries related to i915 in my journal:

Feb 27 04:19:00 dom0 kernel: i915 0000:00:02.0: [drm] VT-d active for gfx access
Feb 27 04:19:00 dom0 kernel: i915 0000:00:02.0: vgaarb: deactivate vga console
Feb 27 04:19:00 dom0 kernel: i915 0000:00:02.0: [drm] Transparent Hugepage support is recommended for optimal performance when IOMMU is enabled!
Feb 27 04:19:00 dom0 kernel: i915 0000:00:02.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=io+mem
Feb 27 04:19:00 dom0 kernel: i915 0000:00:02.0: [drm] Finished loading DMC firmware i915/kbl_dmc_ver1_04.bin (v1.4)
Feb 27 04:19:01 dom0 kernel: [drm] Initialized i915 1.6.0 20201103 for 0000:00:02.0 on minor 0
Feb 27 04:19:01 dom0 kernel: fbcon: i915drmfb (fb0) is primary device
Feb 27 04:19:01 dom0 kernel: i915 0000:00:02.0: [drm] fb0: i915drmfb frame buffer device
Feb 27 04:19:19 dom0 kernel: snd_hda_intel 0000:00:1f.3: bound 0000:00:02.0 (ops i915_audio_component_bind_ops [i915])
Feb 27 04:19:20 dom0 kernel: mei_hdcp 0000:00:16.0-b638ab7e-94e2-4ea2-a552-d1c54b627f04: bound 0000:00:02.0 (ops i915_hdcp_component_ops [i915])
Feb 28 13:31:17 dom0 kernel: i915 0000:00:02.0: [drm] VT-d active for gfx access
Feb 28 13:31:17 dom0 kernel: i915 0000:00:02.0: vgaarb: deactivate vga console
Feb 28 13:31:17 dom0 kernel: i915 0000:00:02.0: [drm] Transparent Hugepage support is recommended for optimal performance when IOMMU is enabled!
Feb 28 13:31:17 dom0 kernel: i915 0000:00:02.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=io+mem
Feb 28 13:31:17 dom0 kernel: i915 0000:00:02.0: [drm] Finished loading DMC firmware i915/kbl_dmc_ver1_04.bin (v1.4)
Feb 28 13:31:17 dom0 kernel: i915 0000:00:02.0: [drm] [ENCODER:111:DDI C/PHY C] is disabled/in DSI mode with an ungated DDI clock, gate it
Feb 28 13:31:17 dom0 kernel: i915 0000:00:02.0: [drm] [ENCODER:115:DDI D/PHY D] is disabled/in DSI mode with an ungated DDI clock, gate it
Feb 28 13:31:17 dom0 kernel: i915 0000:00:02.0: [drm] [ENCODER:119:DDI E/PHY E] is disabled/in DSI mode with an ungated DDI clock, gate it
Feb 28 13:31:18 dom0 kernel: [drm] Initialized i915 1.6.0 20201103 for 0000:00:02.0 on minor 0
Feb 28 13:31:18 dom0 kernel: fbcon: i915drmfb (fb0) is primary device
Feb 28 13:31:18 dom0 kernel: i915 0000:00:02.0: [drm] fb0: i915drmfb frame buffer device
Feb 28 13:31:37 dom0 kernel: mei_hdcp 0000:00:16.0-b638ab7e-94e2-4ea2-a552-d1c54b627f04: bound 0000:00:02.0 (ops i915_hdcp_component_ops [i915])
Feb 28 13:31:37 dom0 kernel: snd_hda_intel 0000:00:1f.3: bound 0000:00:02.0 (ops i915_audio_component_bind_ops [i915])
Feb 29 01:15:31 dom0 kernel: i915 0000:00:02.0: [drm] *ERROR* Atomic update failure on pipe A (start=2537188 end=2537189) time 175 us, min 1017, max 1023, scanline start 1016, end 1027
Feb 29 03:12:08 dom0 kernel: i915 0000:00:02.0: [drm] *ERROR* [CRTC:51:pipe A] flip_done timed out
Feb 29 03:12:18 dom0 kernel: i915 0000:00:02.0: [drm] *ERROR* flip_done timed out
Feb 29 03:12:18 dom0 kernel: i915 0000:00:02.0: [drm] *ERROR* [CRTC:51:pipe A] commit wait timed out
Feb 29 03:15:45 dom0 kernel: i915 0000:00:02.0: [drm] VT-d active for gfx access
Feb 29 03:15:45 dom0 kernel: i915 0000:00:02.0: vgaarb: deactivate vga console
Feb 29 03:15:45 dom0 kernel: i915 0000:00:02.0: [drm] Transparent Hugepage support is recommended for optimal performance when IOMMU is enabled!
Feb 29 03:15:45 dom0 kernel: i915 0000:00:02.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=io+mem
Feb 29 03:15:45 dom0 kernel: i915 0000:00:02.0: [drm] Finished loading DMC firmware i915/kbl_dmc_ver1_04.bin (v1.4)
Feb 29 03:15:45 dom0 kernel: i915 0000:00:02.0: [drm] [ENCODER:111:DDI C/PHY C] is disabled/in DSI mode with an ungated DDI clock, gate it
Feb 29 03:15:45 dom0 kernel: i915 0000:00:02.0: [drm] [ENCODER:115:DDI D/PHY D] is disabled/in DSI mode with an ungated DDI clock, gate it
Feb 29 03:15:45 dom0 kernel: i915 0000:00:02.0: [drm] [ENCODER:119:DDI E/PHY E] is disabled/in DSI mode with an ungated DDI clock, gate it
Feb 29 03:15:46 dom0 kernel: [drm] Initialized i915 1.6.0 20201103 for 0000:00:02.0 on minor 0
Feb 29 03:15:46 dom0 kernel: fbcon: i915drmfb (fb0) is primary device
Feb 29 03:15:46 dom0 kernel: i915 0000:00:02.0: [drm] fb0: i915drmfb frame buffer device
Feb 29 03:16:04 dom0 kernel: mei_hdcp 0000:00:16.0-b638ab7e-94e2-4ea2-a552-d1c54b627f04: bound 0000:00:02.0 (ops i915_hdcp_component_ops [i915])
Feb 29 03:16:04 dom0 kernel: snd_hda_intel 0000:00:1f.3: bound 0000:00:02.0 (ops i915_audio_component_bind_ops [i915])
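In a listing this long, the error lines are easy to miss between the normal driver start-up messages. A small filter can separate them. This is only a sketch, assuming the journal text has been saved or piped in as plain text and that i915 errors always carry the `[drm] *ERROR*` tag seen above. The names `I915_ERROR`, `i915_errors` and `SAMPLE` are made up for illustration.

```python
import re

# Matches lines such as:
#   "... dom0 kernel: i915 0000:00:02.0: [drm] *ERROR* <message>"
I915_ERROR = re.compile(r"kernel: i915 \S+: \[drm\] \*ERROR\* (?P<msg>.*)$")


def i915_errors(journal_text):
    """Yield (full_line, message) for each i915 DRM error line."""
    for line in journal_text.splitlines():
        match = I915_ERROR.search(line)
        if match:
            yield line.strip(), match.group("msg")


# Three lines copied from the listing above: one normal, two errors.
SAMPLE = """\
Feb 27 04:19:00 dom0 kernel: i915 0000:00:02.0: [drm] VT-d active for gfx access
Feb 29 01:15:31 dom0 kernel: i915 0000:00:02.0: [drm] *ERROR* Atomic update failure on pipe A (start=2537188 end=2537189) time 175 us, min 1017, max 1023, scanline start 1016, end 1027
Feb 29 03:12:08 dom0 kernel: i915 0000:00:02.0: [drm] *ERROR* [CRTC:51:pipe A] flip_done timed out
"""

for full_line, _msg in i915_errors(SAMPLE):
    print(full_line)
```

The input could come from `sudo journalctl -k --no-pager` in dom0, which limits the output to kernel messages.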

Had another freeze, same error.

dom0 kernel: i915 0000:00:02.0: [drm] *ERROR* [CRTC:51:pipe A] flip_done timed out
dom0 kernel: i915 0000:00:02.0: [drm] *ERROR* flip_done timed out
dom0 kernel: i915 0000:00:02.0: [drm] *ERROR* [CRTC:51:pipe A] commit wait timed out

The affected device is 00:02.0 VGA compatible controller: Intel Corporation CoffeeLake-S GT2 [UHD Graphics 630]

Switching to kernel-latest, will see if errors persist.

Aaaaand another freeze, which after further investigation appears to be an Xorg crash. When the system crashes, the screen freezes and the desktop completely locks up. There is no response to mouse or keyboard input and I am unable to use Ctrl-Alt-F2 etc. to switch to a console tty, but the system itself keeps running (the next YouTube video loads automatically). After a few minutes I get a black screen.

Running kernel-latest 6.6.9-1

More of the same in journal messages:

Mar 03 05:29:21 dom0 kernel: i915 0000:00:02.0: [drm] *ERROR* Atomic update failure on pipe A (start=5712222 end=5712223) time 210 us, min 1017, max 1023, scanline start 1013, end 1026
Mar 03 12:19:50 dom0 kernel: i915 0000:00:02.0: [drm] *ERROR* [CRTC:51:pipe A] flip_done timed out
Mar 03 12:24:10 dom0 kernel: i915 0000:00:02.0: [drm] *ERROR* flip_done timed out
Mar 03 12:24:10 dom0 kernel: i915 0000:00:02.0: [drm] *ERROR* [CRTC:51:pipe A] commit wait timed out
Mar 03 12:24:21 dom0 kernel: i915 0000:00:02.0: [drm] *ERROR* flip_done timed out
Mar 03 12:24:21 dom0 kernel: i915 0000:00:02.0: [drm] *ERROR* [PLANE:31:plane 1A] commit wait timed out
Mar 03 12:24:31 dom0 kernel: i915 0000:00:02.0: [drm] *ERROR* [CRTC:51:pipe A] flip_done timed out
Mar 03 12:24:42 dom0 kernel: i915 0000:00:02.0: [drm] *ERROR* flip_done timed out
Mar 03 12:24:42 dom0 kernel: i915 0000:00:02.0: [drm] *ERROR* [CRTC:51:pipe A] commit wait timed out
Mar 03 12:24:52 dom0 kernel: i915 0000:00:02.0: [drm] *ERROR* flip_done timed out
Mar 03 12:24:52 dom0 kernel: i915 0000:00:02.0: [drm] *ERROR* [PLANE:31:plane 1A] commit wait timed out
Mar 03 12:25:02 dom0 kernel: i915 0000:00:02.0: [drm] *ERROR* [CRTC:51:pipe A] flip_done timed out
Mar 03 12:25:12 dom0 kernel: i915 0000:00:02.0: [drm] *ERROR* flip_done timed out
Mar 03 12:25:12 dom0 kernel: i915 0000:00:02.0: [drm] *ERROR* [CRTC:51:pipe A] commit wait timed out
Mar 03 12:25:22 dom0 kernel: i915 0000:00:02.0: [drm] *ERROR* flip_done timed out
Mar 03 12:25:22 dom0 kernel: i915 0000:00:02.0: [drm] *ERROR* [CONNECTOR:95:DP-1] commit wait timed out
Mar 03 12:25:33 dom0 kernel: i915 0000:00:02.0: [drm] *ERROR* flip_done timed out
Mar 03 12:25:33 dom0 kernel: i915 0000:00:02.0: [drm] *ERROR* [PLANE:31:plane 1A] commit wait timed out
Mar 03 12:25:43 dom0 kernel: i915 0000:00:02.0: [drm] *ERROR* [CRTC:51:pipe A] flip_done timed out
Mar 03 12:25:53 dom0 kernel: i915 0000:00:02.0: [drm] *ERROR* flip_done timed out
Mar 03 12:25:53 dom0 kernel: i915 0000:00:02.0: [drm] *ERROR* [CRTC:51:pipe A] commit wait timed out
Mar 03 12:26:03 dom0 kernel: i915 0000:00:02.0: [drm] *ERROR* flip_done timed out
Mar 03 12:26:03 dom0 kernel: i915 0000:00:02.0: [drm] *ERROR* [CONNECTOR:95:DP-1] commit wait timed out
Mar 03 12:26:14 dom0 kernel: i915 0000:00:02.0: [drm] *ERROR* flip_done timed out
Mar 03 12:26:14 dom0 kernel: i915 0000:00:02.0: [drm] *ERROR* [PLANE:31:plane 1A] commit wait timed out
Mar 03 12:26:24 dom0 kernel: i915 0000:00:02.0: [drm] *ERROR* [CRTC:51:pipe A] flip_done timed out
Mar 03 12:26:34 dom0 kernel: i915 0000:00:02.0: [drm] *ERROR* flip_done timed out
Mar 03 12:26:34 dom0 kernel: i915 0000:00:02.0: [drm] *ERROR* [CRTC:51:pipe A] commit wait timed out
Mar 03 12:26:44 dom0 kernel: i915 0000:00:02.0: [drm] *ERROR* flip_done timed out
Mar 03 12:26:44 dom0 kernel: i915 0000:00:02.0: [drm] *ERROR* [CONNECTOR:95:DP-1] commit wait timed out
Mar 03 12:26:55 dom0 kernel: i915 0000:00:02.0: [drm] *ERROR* [CRTC:51:pipe A] flip_done timed out
Mar 03 12:27:05 dom0 kernel: i915 0000:00:02.0: [drm] *ERROR* flip_done timed out
Mar 03 12:27:05 dom0 kernel: i915 0000:00:02.0: [drm] *ERROR* [CRTC:51:pipe A] commit wait timed out

Note: atomic update failure happens hours before the crash
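To put a number on "hours before", the gap between the atomic update failure and the first flip_done timeout can be computed from the timestamps. This is a sketch with two assumptions: the journal stamps carry no year, so the year is passed in explicitly (2024 here, which also makes the Feb 29 entries parseable), and the month names are parsed in an English locale. The function names are made up for illustration.

```python
from datetime import datetime


def parse_stamp(line, year=2024):
    """Parse the leading 'Mar 03 12:19:50' stamp. The year is an assumption."""
    stamp = " ".join(line.split()[:3])
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


def gap_before_freeze(lines, year=2024):
    """Time from the latest 'Atomic update failure' to the next 'flip_done timed out'."""
    atomic = None
    for line in lines:
        if "Atomic update failure" in line:
            atomic = parse_stamp(line, year)
        elif "flip_done timed out" in line and atomic is not None:
            return parse_stamp(line, year) - atomic
    return None


# Two lines copied from the journal excerpt above.
MARCH = [
    "Mar 03 05:29:21 dom0 kernel: i915 0000:00:02.0: [drm] *ERROR* Atomic update failure on pipe A (start=5712222 end=5712223) time 210 us, min 1017, max 1023, scanline start 1013, end 1026",
    "Mar 03 12:19:50 dom0 kernel: i915 0000:00:02.0: [drm] *ERROR* [CRTC:51:pipe A] flip_done timed out",
]

print(gap_before_freeze(MARCH))
```

With a gap this long, the atomic update failure may well be unrelated to the freeze. The script only measures the distance, it does not show cause and effect.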

Xorg crash messages in ~/.xsession-errors.old:

xscreensaver-systemd: 12:24:11.01: X connection closed
2024-03-03 12:24:11,014 root: Connection error: xcb connection errors because of socket, pipe and other stream errors.
X connection to :0.0 broken (explicit kill or server shutdown).
xscreensaver: 12:24:11.02: pid 5576: xscreensaver-systemd exited unexpectedly with status 1
XIO:  fatal IO error 10 (No child processes) on X server ":0.0"
      after 43381 requests (43381 known processed) with 0 events remaining.

** (xss-lock:5541): CRITICAL **: 12:24:11.046: X connection lost; exiting.

Note: process xss-lock periodically dumps core (don’t know if because of kernel-latest or not)

/var/log/Xorg.0.log.old has nothing interesting related to the crash


Still getting Xorg crashes related to the i915 driver. What other driver can I try?

00:02.0 VGA compatible controller: Intel Corporation CoffeeLake-S GT2 [UHD Graphics 630]
	DeviceName: Onboard - Video
	Subsystem: ASRock Incorporation Device 3e92
	Kernel driver in use: i915
	Kernel modules: i915