Qubes OS freezes, crashes and reboots

My previously rock solid x230 shows this behaviour intermittently, after
recent updates. (Not after 30-60 mins - I can’t see any pattern.)

Thanks, done.

Last week I had my first 2 system freezes ever, within 2 days… It looked like it was about the RAM (especially since I’m using zram swap), but for months before that I hadn’t faced any issues while using it, and I wasn’t using more resources than usual.

And it looks like it isn’t related to these kernels only. Just after writing my previous post, my Qubes simply crashed and rebooted; I had been updating dom0 while writing that post. Today it happened again, so I chose an older kernel (5.15.52, if I recall correctly) and the update went smoothly. After the update I set kernel-latest as the default kernel again. I don’t have the courage to try updating dom0 under it again.

xen_version : 4.14.5
Linux 5.18.16-1.fc32.qubes.x86_64

I updated dom0 the other day and now I managed to keep my laptop running for 8 hours. A few VMs crashed after this time but it’s still a big improvement compared to previously :+1:

Might have been too early to celebrate, seems like Qubes is back to crashing regularly today. Oh well…

I’ve got the same issue. I get graphics artefacts and, soon after, a crash: only the mouse can be moved but nothing responds. I’ve seen this issue: Xen-related dom0 freeze / crash · Issue #7751 · QubesOS/qubes-issues · GitHub and applied the kernel param nouveau.noaccel=1, but I just got a new crash. Really frustrating.
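When a kernel parameter seems to make no difference, it can be worth confirming that it actually reached the running kernel. A minimal Python sketch, assuming it is run in dom0, where `/proc/cmdline` holds the command line of the booted kernel; the helper name `cmdline_has` is made up for illustration:

```python
from pathlib import Path


def cmdline_has(cmdline: str, param: str) -> bool:
    """Return True if `param` appears as a whole token on a kernel command line."""
    # Split on whitespace so "nouveau.noaccel=1" does not match "nouveau.noaccel=10".
    return param in cmdline.split()


def running_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the command line the currently running kernel was booted with."""
    return Path(path).read_text()
```

Calling `cmdline_has(running_cmdline(), "nouveau.noaccel=1")` returns `False` if the parameter never took effect, for example because it was added to a boot entry that is not the one being booted.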

Had my first freeze today in my 2+ years of using Qubes.
Froze completely when trying to shut down a VM: no mouse movement, no switching between ttys. Immediately after it froze, the system looked like it was under massive stress, with fans spinning at full speed and temperature increasing, but no network or disk I/O activity. After a forced power-off it took longer to boot the first time. Guess it ran some checks?
Tried to look up something in the logs:
journalctl just shows --Reboot-- and that’s it, can’t see anything abnormal otherwise
Why does it show “Reboot” when the system was powered off, not rebooted?

/var/log/xen/console/hypervisor.log and vm-that-was-shutting-down.log don’t show anything abnormal either

Anywhere else I could look at what might be the cause?
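On the “Reboot” question: as far as I know, journalctl prints that marker wherever consecutive entries belong to different boots, so it also appears after a forced power-off and says nothing about how the previous boot ended. To look at the last messages written before the freeze, the previous boot can be queried directly. A minimal Python sketch, assuming a persistent journal in dom0; the function names are hypothetical, and only the standard journalctl options `--boot`, `--priority`, `--lines` and `--no-pager` are used:

```python
import subprocess
from typing import List


def previous_boot_cmd(priority: str = "warning", lines: int = 200) -> List[str]:
    """Build a journalctl call for the tail of the boot before the current one."""
    return [
        "journalctl",
        "--no-pager",
        "--boot=-1",               # -1 selects the previous boot
        f"--priority={priority}",  # this priority and anything more severe
        f"--lines={lines}",        # only the most recent matching entries
    ]


def run(cmd: List[str]) -> str:
    """Run a command and return its stdout (may need root to see all entries)."""
    return subprocess.run(cmd, capture_output=True, text=True, check=False).stdout
```

`print(run(previous_boot_cmd("err", 50)))` would then show the last 50 error-level entries of the previous boot. If a hard freeze stopped the journal from being flushed to disk, the final seconds may still be missing.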

This is all very troubling from a Qubes stability standpoint. I’m using an L380 which has been rock solid for several years until the other day. I use the i3 window manager and was setting a window to floating mode and resizing it when bam! …instant reboot of the system. Never happened before, ever. Nothing in logs, everything up to date.

I am also constantly crashing since updating to Qubes 4.1.1. I have reverted to my Debian machine, as a system that crashes every few hours is essentially useless. I’ve lost a lot of data and time on this.

It’s mostly caused by the kernel, or the hardware doesn’t have enough power to handle the load. Most often, though, it’s heating issues with the hardware.

It’s not exclusively heating issues that are causing the crashes, but there are heating issues as well. I get heating issues when doing large inter-VM file copies. I don’t know if these heating issues were around prior to 4.1.1.

Yes - my x220 and x230, which have always been stable, run much
hotter under 4.1.1 - I can drop in old drives and confirm that
this is the case.
But besides that, I now have seemingly random restarts, which like yours
are not connected to overheating, and for which I can find no reason,
and no help in the logs.
This is a huge backwards step for me - I’m now working on the basis that
my machine may go down at any time.

I never presume to speak for the Qubes team.
When I comment in the Forum or in the mailing lists I speak for myself.

For what it’s worth, I have just encountered a random logout, and when I logged back in, all VMs were there as I had left them. This time it happened while I was trying to update a deb-11-template. Don’t know if it’s related, but crashes have happened to me more than twice during updates…

I too have seen increased freezes with zero log entries explaining what happened. For me this happens exclusively during salt-based updates. Updating using the old method never leads to a freeze, but was discouraged by the core team.

My “workaround” is to run updates only when I am not actively using the computer (e.g. during breaks) and while all but the sys- qubes are shut down. This prevents data loss in case of a freeze. It’s a sad and bothersome affair, but lacking any usable debug data I don’t see what I could even report other than chiming into this general complaint.

I’ve made sure that it’s not temperature related by running a system monitor in the task bar at all times. Also a temperature shutdown would show up in the logs.
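A monitor in the task bar only helps while someone is watching it. Writing the readings to disk leaves a record of the last temperatures before a hard freeze. A minimal Python sketch, assuming dom0 exposes the standard Linux `/sys/class/thermal/thermal_zone*/temp` files (not guaranteed on every machine under Xen); the function names and the log path in the usage note are hypothetical:

```python
import os
import time
from pathlib import Path
from typing import Dict


def read_temps(base: str = "/sys/class/thermal") -> Dict[str, float]:
    """Return {zone name: degrees Celsius} for every readable thermal zone."""
    temps = {}
    for zone in sorted(Path(base).glob("thermal_zone*")):
        try:
            raw = (zone / "temp").read_text().strip()
            temps[zone.name] = int(raw) / 1000.0  # sysfs reports millidegrees C
        except (OSError, ValueError):
            continue  # skip zones without a readable temperature
    return temps


def log_forever(logfile: str, interval: float = 10.0) -> None:
    """Append one timestamped reading per interval until interrupted."""
    with open(logfile, "a") as fh:
        while True:
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            fh.write(f"{stamp} {read_temps()}\n")
            fh.flush()
            os.fsync(fh.fileno())  # force the line to disk before a possible freeze
            time.sleep(interval)
```

Running `log_forever("/var/tmp/temps.log")` in a dom0 terminal and checking the end of that file after the next freeze shows whether the temperature was climbing right before it happened.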

One notable exception to the above is that with 5.15.63 I had these freezes all the time without even touching updates. So I think that was a separate issue. Going back to 5.15.52 cured that, in the sense that those freezes only happened occasionally while running salt-based updates. I just saw that 5.15.64 came in; I’ll switch to that and see what happens.

I have seen sudden reboots after about a week of running with 5.15.61. So I have updated to 5.15.63 and so far it has been running for 14 days.

Update: It froze up after 3 weeks or so.

Well, it’s been happening for more than 2 months now…

I don’t want to jinx it, but it appears 5.15.64 is stable for me. I have not experienced any freezes so far.

… and I get what I deserve. The freeze came within hours of posting. :frowning:

If I had to guess, this has lasted too long to be related to the kernel. Cross-referencing recent issues and some workarounds, my guess is that it’s Qubes or Xen related, and most probably related to the GUI, as if someone is experimenting with it. I wouldn’t be surprised to get some good news about sys-gui soon. Just a hunch, nothing more, looking for something good in this never-before situation.