Would you be prepared to test this?
Can you identify a time when you were able to work without crashes?
Do you have the testing repositories enabled?
We should be able to identify packages that seemed to work, roll back to
them, and confirm that at that point all was good.
Will you help with this?
Also, and this is important, it isn’t true that everyone experiences
this. Some people do - anecdotally, it’s more common on older hardware,
but it’s not all old hardware, and not even all users with the same
model, and it’s not restricted to old hardware.
There are so many usage patterns in Qubes that it is very difficult to
find the common patterns that would allow a full analysis.
I’m taking the liberty of answering even though I wasn’t asked.
I am prepared to test whatever you @unman or some other core team member deem worth testing in this regard. I have two identical T430s: one for my actual work, the other for testing/backup/standby.
For me, unfortunately, that is back in R4.0, before the kernel changed to 5.x … which was never stable for me. Actually, it was quite horrible, with several freezes per day.
I agree that these three things are triggers for the crashing; however, I’m not certain that starting VMs is the issue, because copying between VMs also triggers it on my system.
I’m willing to help test. I don’t currently have the testing repositories enabled. For me, the crashing that we are discussing in this thread does not occur in the base 4.1.0 ISO without any updates applied.
That corresponds with the symptoms I observed. I nearly gave up using Qubes OS: once I was up to date on 4.1, it was happening even a few times a day.
I have experienced almost no freezes since updating dom0 to kernel-latest 5.18.16.
My workflow involves lots of copying between qubes, often of files 50+ GB in size. Back when this first started happening, the logs would always report the crashes as being due to overheating. However, while playing around trying to recreate the overheating, I’ve gotten many crashes without any overheat log, and with the temperature sensors visible I can see that the system was not overheating.
I’ll keep playing around, but it might be the case that I can now recreate these crashes on demand. Has anyone else encountered crashes when doing large transfers between VMs?
Some of the symptoms in this thread sound similar to the issues I have been experiencing over in “Crash on dom0 kernel 5.15.57”, in case it helps with the summarisation. The crashy behavior occurs on 5.18, 5.19, and 5.20 (aka 6.0), but not on 5.15 after .64, and usually after suspend/resume.