How to start troubleshooting a kernel panic / reboot?

I’m running into random crashes/reboots when launching VMs on every kernel from 5.10 onward. 5.4.98 works, but nothing newer does. I’ve tried reading through a few bug reports, but my main problem is getting any information at all to report.

  1. I set kernel.panic=0 in /etc/sysctl.conf in dom0 to try to stop automatic reboots, but it doesn’t work: the machine hard resets to the BIOS. Is there another setting, in Xen or elsewhere, to stop automatic reboots?
  2. There’s nothing in /sys/fs/pstore, even after crashes on the newest kernels (the crash occurs in 5.10+ and I reboot into the same 5.10+ kernel; I know this isn’t available if I boot back into 5.4). pstore is commonly mentioned as a source of crash information, but it doesn’t appear to be working for whatever problem I’m having.
  3. Almost every crash leaves behind a mess of *-volatile LVM volumes that I have to clean up by hand. More than once I’ve had to manually run vgcfgbackup, hack at the metadata, and then run vgcfgrestore just to be able to clean up the volumes. So doing a lot of debugging on this issue is a major pain.

I’d appreciate any advice on how to collect some kind of crash report data successfully, and/or how to simplify the repeated cleanup of the wreckage that a crash during VM start leaves on my filesystem.
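
For the cleanup in point 3, something like the following might at least take the hand-listing out of it. This is only a rough sketch, assuming the leftovers can be recognized purely by a name ending in "-volatile" and that `lvs --noheadings -o vg_name,lv_name` prints one "VG LV" pair per line. The function names are made up for illustration, and it only prints candidate `lvremove` commands rather than running them, since a volatile volume belonging to a running VM must not be removed.

```python
import subprocess


def find_volatile_lvs(lvs_output):
    """Return (vg_name, lv_name) pairs whose LV name ends in '-volatile'.

    Expects text in the shape printed by
    `lvs --noheadings -o vg_name,lv_name` (two whitespace-separated
    columns per line). Lines in any other shape are skipped.
    """
    leftovers = []
    for line in lvs_output.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue
        vg_name, lv_name = fields
        # Assumption: leftovers are identified by this suffix alone.
        if lv_name.endswith("-volatile"):
            leftovers.append((vg_name, lv_name))
    return leftovers


def list_lvs():
    """Run lvs (needs root in dom0) and return its raw output."""
    result = subprocess.run(
        ["lvs", "--noheadings", "-o", "vg_name,lv_name"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


def main():
    # Dry run: print the commands for review instead of executing them.
    for vg_name, lv_name in find_volatile_lvs(list_lvs()):
        print(f"lvremove {vg_name}/{lv_name}")
```

Calling `main()` as root in dom0, with no VMs running, would print one `lvremove` line per leftover volume to review before running anything by hand. It does not help with the cases where the metadata itself needs vgcfgbackup/vgcfgrestore surgery.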

Hm, funny, I have a similar issue and the 5.10.x kernels help…

Anyway I’d be interested, too.


Ah yes, you could do a blind kernel bisect … if you have a few spare days. ^^