QubesOS freeze, crash and reboots

Frequent kernel updates. Still no freezes/crashes after the initial one when I upgraded to kernel 6.0.2.

Latest dom0 updates

Updating dom0

local:
    ----------
    kernel-latest:
        ----------
        new:
            1000:5.19.14-1.fc32.qubes,1000:6.0.2-2.fc32.qubes,1000:6.0.7-1.fc32.qubes
        old:
            1000:5.19.14-1.fc32.qubes,1000:5.19.9-1.fc32.qubes,1000:6.0.2-2.fc32.qubes
    kernel-latest-qubes-vm:
        ----------
        new:
            1000:5.19.14-1.fc32.qubes,1000:6.0.2-2.fc32.qubes,1000:6.0.7-1.fc32.qubes
        old:
            1000:5.19.14-1.fc32.qubes,1000:5.19.9-1.fc32.qubes,1000:6.0.2-2.fc32.qubes
    libxdo:
        ----------
        new:
            1:3.20210804.2-3.fc32
        old:
            1:3.20210804.2-2.fc32
    qubes-desktop-linux-common:
        ----------
        new:
            4.1.13-1.fc32
        old:
            4.1.12-1.fc32
    qubes-menus:
        ----------
        new:
            4.1.13-1.fc32
        old:
            4.1.12-1.fc32
    scrypt:
        ----------
        new:
            1.3.1-1.fc32
        old:
            1.2.1-3.fc32
    xdotool:
        ----------
        new:
            1:3.20210804.2-3.fc32
        old:
            1:3.20210804.2-2.fc32
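
For anyone comparing the kernel lists by eye: the "new" and "old" entries above are comma-separated version lists, so a set difference shows which kernel was installed and which was dropped. A minimal sketch, with `diff_versions` being a made-up helper name rather than anything shipped with Qubes:

```python
def diff_versions(old: str, new: str):
    """Return (added, removed) versions between two comma-separated lists."""
    old_set = {v.strip() for v in old.split(",") if v.strip()}
    new_set = {v.strip() for v in new.split(",") if v.strip()}
    added = sorted(new_set - old_set)      # present only after the update
    removed = sorted(old_set - new_set)    # present only before the update
    return added, removed


# The kernel-latest lines copied from the update output above.
old = "1000:5.19.14-1.fc32.qubes,1000:5.19.9-1.fc32.qubes,1000:6.0.2-2.fc32.qubes"
new = "1000:5.19.14-1.fc32.qubes,1000:6.0.2-2.fc32.qubes,1000:6.0.7-1.fc32.qubes"

added, removed = diff_versions(old, new)
print("added:  ", added)
print("removed:", removed)
```

For this update it reports 6.0.7-1 as added and 5.19.9-1 as removed, while 5.19.14-1 and 6.0.2-2 stay installed.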

Have not done any exhaustive testing, but just installed kernel latest 6.0.2-2. Ran my acid test in i3wm, which was to switch a window to floating and move it around: result was an instant crash and reboot!
Was pretty sure @marmarek said the xorg bug fix was backported to 6.0.2, but maybe I was mistaken…
Anyway, the pain continues…

It’s been 11 days and several updates incl. dom0 updates without any freeze. Starting to think (at least for my use case) this might be over.

Xen:            4.14.5
Kernel:         5.15.74-2

Yep. I’m on 6.0.7 and still stable, so the common factor is (was) Xen… But who’d care anyway…

For what it’s worth I got the very occasional “freeze” a few weeks back, too. Not enough to really notice a pattern. But it has been a while now that you mention it. (I’ve managed to lose mouse and keyboard but that was something I did wrong.)

Still totally reproducible here, despite the recent Xen updates. I’m still on kernel 6.0.2 but I guess the difference with me is that I’m on i3wm and the system halts continue…

Regrettably, I gave up on testing 5.15: didn’t fix issues for me.

I’m now testing kernel-latest 6.0.7 - with poor results so far.
I’ve had one crash the other day that left it impossible to activate
lvs. (With live distros, unlocking the drive left the same lv activated.)
Numerous freezes, and lock ups.
Starting (and stopping) qubes seems to be a trigger in many cases.

Just to check, I dropped in a 4.0.4 drive, and everything was as I
expect. Solid performance with many qubes open.
This still leaves open the possibility that 4.1 is stressing components
that are on their way out, but I’ve tried dropping the 4.1 drive in to
otherwise working machines, only to see them tank.

I never presume to speak for the Qubes team.
When I comment in the Forum or in the mailing lists I speak for myself.

@unman do you think there could be something to @enmus’ conclusion that it’s in fact Xen and not the kernel that caused/causes these issues? What version of Xen are you running?

I’ve started to think that the Xen project needs to upgrade from its current version, 4.16, to 5.0 to be more compatible with the current Linux version, 6.0.

4.14.5

Well perhaps.
But while these problems trouble some users, they don’t seem to trouble
all users, not even those using qubes-testing.
I assume from the fact that these packages are released to testing, and no
one else from the dev team has commented at all, that none of them
experience these problems.

Stating just 4.14.5 is not enough, since there are several package revisions of this version, and 4.14.5-12 resolved some other significant issues (can’t remember or search at the moment). Just updated to 4.14.5-14, though.
Frequent Xen updates are noticeable, anyway…

I thought I said I am always running latest.
4.14.5-12 fixed no issues for me.

Btw if anyone experiences freezes or reboots on VM starts with PCI devices attached, I recommend hiding those devices from dom0 via the kernel command-line option rd.qubes.hide_pci=[comma-separated list of PCI devices] (which should be the recommended default anyway).

However, IIRC most people on this thread have freezes on different occasions.
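
In case it helps anyone trying the rd.qubes.hide_pci suggestion above, here is a rough sketch of assembling the option string from a list of devices. It assumes the devices are given as bus:device.function addresses (the form `lspci` prints in its first column); the two addresses in the example are made up, and `hide_pci_option` is a hypothetical helper, not a Qubes tool.

```python
import re

# Assumption: devices are written as bus:device.function in hex, e.g. "00:1a.0".
BDF_RE = re.compile(r"[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]")


def hide_pci_option(devices):
    """Build the rd.qubes.hide_pci=... kernel command-line option."""
    bad = [d for d in devices if not BDF_RE.fullmatch(d)]
    if bad:
        raise ValueError(f"not a bus:device.function address: {bad}")
    return "rd.qubes.hide_pci=" + ",".join(devices)


# Hypothetical addresses; substitute the ones attached to your own qubes.
print(hide_pci_option(["00:1a.0", "00:1d.0"]))
```

The resulting string would still have to be added to the dom0 kernel command line in your bootloader configuration, which this sketch deliberately does not touch.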

:worried:

Once again, during update ...

Nov 21 09:37:39 dom0 kernel: Linux version 5.15.76-1.fc32.qubes.x86_64 (mockbuild@2b0cec5c8b1143fc8fd66539678daa>
-- Reboot --
Nov 21 09:35:42 dom0 qrexec-policy-daemon[29902]: 2022-11-21 09:35:42.471 qrexec-client[29902]: process_io.c:187>
Nov 21 09:35:40 dom0 qrexec-policy-daemon[1502]: qrexec: qubes.NotifyUpdates+: debian-11-net → @adminvm: allowe>
Nov 21 09:35:38 dom0 qrexec-policy-daemon[1502]: qrexec: qubes.UpdatesProxy+: debian-11-net → @default: allowed>
Nov 21 09:35:37 dom0 qrexec-policy-daemon[1502]: qrexec: qubes.NotifyUpdates+: debian-11-net → @adminvm: allowe>
Nov 21 09:35:35 dom0 qrexec-policy-daemon[1502]: qrexec: qubes.UpdatesProxy+: debian-11-net → @default: allowed>
Nov 21 09:35:33 dom0 qrexec-policy-daemon[1502]: qrexec: qubes.UpdatesProxy+: debian-11-net → @default: allowed>
Nov 21 09:35:31 dom0 qrexec-policy-daemon[1502]: qrexec: qubes.UpdatesProxy+: debian-11-net → @default: allowed>
Nov 21 09:35:27 dom0 qrexec-policy-daemon[1502]: qrexec: qubes.UpdatesProxy+: debian-11-net → @default: allowed>
Nov 21 09:35:23 dom0 qrexec-policy-daemon[1502]: qrexec: qubes.VMRootShell+: disp-mgmt-debian-11-net → debian-1>
Nov 21 09:35:23 dom0 qrexec-policy-daemon[1502]: qrexec: qubes.VMRootShell+: disp-mgmt-debian-11-net → debian-1>
Nov 21 09:35:22 dom0 qrexec-policy-daemon[1502]: qrexec: qubes.GetDate+: debian-11-net → @default: allowed to d>
Nov 21 09:35:19 dom0 qrexec-policy-daemon[1502]: qrexec: qubes.VMRootShell+: disp-mgmt-debian-11-net → debian-1>
Nov 21 09:35:19 dom0 qrexec-policy-daemon[1502]: qrexec: qubes.WindowIconUpdater+: debian-11-net → @adminvm: al>
Nov 21 09:35:18 dom0 qrexec-policy-daemon[1502]: qrexec: qubes.VMRootShell+: disp-mgmt-debian-11-net → debian-1>
Nov 21 09:35:18 dom0 qrexec-policy-daemon[1502]: qrexec: qubes.VMRootShell+: disp-mgmt-debian-11-net → debian-1>
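
If you want to measure how long the machine was gone, the reboot marker in a newest-first journal listing like the one above separates the first message of the new boot from the last message of the old one. A small sketch under a few assumptions: the listing is newest-first, the timestamps are in the short "Nov 21 09:35:42" form, and the year (2022 here, taken from the qrexec line) has to be supplied because the short form omits it. `reboot_gap` is a made-up name.

```python
import re
from datetime import datetime

# Accept the plain "-- Reboot --" marker as well as a variant where the
# double hyphens were turned into en dashes (U+2013) by copy-paste.
REBOOT_RE = re.compile(r"^\s*(?:--|\u2013) Reboot (?:--|\u2013)\s*$")
STAMP_RE = re.compile(r"^([A-Z][a-z]{2} +\d+ \d{2}:\d{2}:\d{2}) ")


def parse_stamp(line, year):
    """Parse the leading short timestamp of a journal line, or return None."""
    m = STAMP_RE.match(line)
    if m is None:
        return None
    return datetime.strptime(f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S")


def reboot_gap(lines, year=2022):
    """Time between the last pre-reboot line and the first post-reboot line.

    Expects newest-first order, so the line above the marker belongs to the
    new boot and the line below it to the old one.
    """
    for i, line in enumerate(lines):
        if REBOOT_RE.match(line) and 0 < i < len(lines) - 1:
            after = parse_stamp(lines[i - 1], year)
            before = parse_stamp(lines[i + 1], year)
            if after is not None and before is not None:
                return after - before
    return None


# Abbreviated copies of the three lines around the marker in the log above.
sample = [
    "Nov 21 09:37:39 dom0 kernel: Linux version 5.15.76-1.fc32.qubes.x86_64",
    "-- Reboot --",
    "Nov 21 09:35:42 dom0 qrexec-policy-daemon[29902]: 2022-11-21 09:35:42.471",
]
gap = reboot_gap(sample)
print(gap.total_seconds(), "seconds between last old-boot line and new boot")
```

Nothing here says why the box went down; it only puts a number on the gap.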

Quick side note for others who might in this context see a notification that “domain disp-mgmt-bla already exists with uuid …”. @marmarek posted in a qubes-issue comment how to fix this:

virsh -c xen:/// undefine disp-mgmt-bla

Had my first freeze today, after a debian template update finished and the template was about to shutdown.

I’m tracking kernel-latest, as the stable kernel did not solve issues
for me.
I had a disappointing response to my request for information from users.

I have reverted my system to vanilla out of the box settings, and scaled
down my qube use.
I still see hard crashes and freezes (sometimes recovered) in normal
use, particularly during updates or qube start/stop.
I also see arbitrary reaping of running qubes.

There are two obvious areas which could impact - I/O and memory. I
suspect memory reallocations between qubes.
(As I have said before, I have swapped in 4.0 on an identical SSD, and
see none of these issues with aggressive memory allocation to qubes.)

FWIW, after spending quite a bit of time trying to solve graphical issues I had, I found that there were quite a few issues/posts scattered around reporting crashes with intel/i915 that in hindsight seem closely related or even downright duplicates.
So far: issues #4782, #7507, #7664, #7902, #7894. Forum post. Blog post (+ I remember reading another two but haven’t written down their URLs).

The issue is that there’s tearing/glitches/artifacts/corruption with the fb driver when there’s no compositing (which i3wm doesn’t provide - probably explaining the high proportion of reports from i3wm users). People then switch to the intel driver because that’s the solution mentioned in posts to fix those glitches - but for some (yet to be found) reason intel is unstable for some people - from oopses to hard freeze/reboot.

In my case I’d get a few random reboots a day. I eventually made the connection and reverted to fb, and so far I haven’t encountered a single crash. In hindsight, given all the problems reported in this thread, I’m wondering what proportion is due to people using the intel driver instead of fb and being hit by bugs in intel.

Seems to happen for me on Qubes Updates; never during a Dom0 update – had hoped those recent updates were addressing this. Oddly, and I’m not certain this is related, the crashes only occur for me when the laptop (Librem 14) is running on battery.
