After installing the latest dom0 kernel update (5.15.57-1.fc32), the whole system crashes after a couple of hours, which did not happen with 5.15.52. Xen is at 4.14.5 in both cases. My coreboot build is from a branch dated 2021-08-20, so I will update it to see whether there have been any fixes upstream.
The hardware is a System76 lemp9, which is otherwise fairly stable (some minor problems with the Intel AX201 Wi-Fi card, which are known issues).
Is there a kernel/Xen cmdline option to keep the kernel crash log on the console, at least so that the system does not restart right away?
Update 1: I am now testing 5.18.16 with some i915 kernel parameters.
That was a good suggestion, but unfortunately it didn’t help. I:
Updated the BIOS to 2022_08_12_2680d93 (which ships coreboot 4.17, I believe).
Got a crash and reboot on the 5.15.52 kernel after 8 hours of use. I have started to suspect the i915 display driver, because I recently began using a monitor with an HDMI cable of unknown quality and have been seeing occasional screen flickering, including just before the crash.
Got a display freeze just after logging into Xfce on kernel 5.18.16.
I am not using sys-gui yet, so I wonder whether it would help.
Update: I’ve booted again with 5.18.16 and added the following to the dom0 cmdline, based on the Arch Wiki i915 troubleshooting instructions: i915.enable_fbc=1 i915.enable_dc=0 intel_idle.max_cstate=1 ahci.mobile_lpm_policy=1. Will report back.
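In case it helps anyone following along: after a reboot it is worth confirming that the parameters actually reached the running kernel. Here is a minimal Python sketch that compares a kernel command line against the four parameters above. The function names are my own, and it assumes simple whitespace-separated tokens (quoted values containing spaces are not handled).

```python
# Parameters quoted in the post above.
EXPECTED = {
    "i915.enable_fbc": "1",
    "i915.enable_dc": "0",
    "intel_idle.max_cstate": "1",
    "ahci.mobile_lpm_policy": "1",
}


def parse_cmdline(cmdline):
    """Split a kernel command line into {name: value}; bare flags map to None."""
    params = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


def missing_params(cmdline, expected=EXPECTED):
    """Return the expected parameter names that are absent or have another value."""
    actual = parse_cmdline(cmdline)
    return [name for name, value in expected.items() if actual.get(name) != value]


def check_running_kernel(path="/proc/cmdline"):
    """Check the command line of the currently running kernel (Linux only)."""
    with open(path) as f:
        return missing_params(f.read())
```

Running `check_running_kernel()` in dom0 should return an empty list if all four parameters were picked up; anything it returns was not applied as written.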
Right. The Lemur Pro. Very nice hardware, actually. You said System76, and I thought Oryx. My bad.
Very nice find.
It seems to think your kernel is tainted. I’ve seen this before on my Tiger Lake machines. Older versions of Xorg don’t seem to cope well with Tiger Lake and above. Not sure why yet…
Not sure why the kernel is tainted. Even with the updated dom0 kernel params I still get occasional hangs on 5.18.16. I am going to try downgrading linux-firmware (because of the comments in https://github.com/QubesOS/qubes-issues/issues/7648), but this could also be related to https://github.com/QubesOS/qubes-issues/issues/7513 (on the 5.15 kernel I got a crash, but now it’s all silent hangs without even a black screen).
Kernel 6.0.2 crashes and reboots after resuming from suspend, about 10 seconds after logging in (could it be GUI-related?). Some new error messages:
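On the taint question: the kernel exposes its taint state as a bitmask in /proc/sys/kernel/tainted, and decoding it usually shows the reason. Below is a rough Python sketch of that decoding. The bit table is written from memory of the kernel's tainted-kernels documentation, so treat it as an assumption and check it against the docs for your kernel version; the function names are mine.

```python
# (bit, letter, meaning); assumed from the kernel's tainted-kernels documentation.
TAINT_FLAGS = [
    (0, "P", "proprietary module was loaded"),
    (1, "F", "module was force loaded"),
    (2, "S", "kernel running on an out-of-specification system"),
    (3, "R", "module was force unloaded"),
    (4, "M", "processor reported a machine check exception"),
    (5, "B", "bad page referenced or unexpected page flags"),
    (6, "U", "taint requested by userspace"),
    (7, "D", "kernel died recently (OOPS or BUG)"),
    (8, "A", "ACPI table overridden by user"),
    (9, "W", "kernel issued a warning"),
    (10, "C", "staging driver was loaded"),
    (11, "I", "workaround for a platform firmware bug applied"),
    (12, "O", "out-of-tree module was loaded"),
    (13, "E", "unsigned module was loaded"),
    (14, "L", "soft lockup occurred"),
    (15, "K", "kernel has been live patched"),
    (16, "X", "auxiliary taint, defined by the distribution"),
    (17, "T", "kernel built with the struct randomization plugin"),
]


def decode_taint(value):
    """Return (letter, meaning) for every taint bit set in value."""
    return [(letter, text) for bit, letter, text in TAINT_FLAGS if value & (1 << bit)]


def read_taint(path="/proc/sys/kernel/tainted"):
    """Decode the taint mask of the running kernel (Linux only)."""
    with open(path) as f:
        return decode_taint(int(f.read().strip()))
```

For example, a value of 512 decodes to the single flag W (a kernel warning was issued), which would be consistent with a driver warning rather than a third-party module.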
dom0 kernel: nvme nvme0: Shutdown timeout set to 10 seconds
dom0 kernel: i915 0000:00:02.0: [drm] [ENCODER:118:DDI C/PHY C] is disabled/in DSI mode with an ungated DDI clock, gate it
dom0 kernel: i915 0000:00:02.0: [drm] [ENCODER:102:DDI B/PHY B] is disabled/in DSI mode with an ungated DDI clock, gate it
dom0 kernel: i915 0000:00:02.0: [drm] [ENCODER:94:DDI A/PHY A] is disabled/in DSI mode with an ungated DDI clock, gate it
dom0 kernel: ACPI: EC: event unblocked
dom0 kernel: ACPI: EC: interrupt unblocked
dom0 kernel: ACPI: PM: Waking up from system sleep state S3
dom0 kernel: CPU3 is up
dom0 kernel: ACPI: \_SB_.CP03: Found 3 idle states
dom0 kernel: cpu 3 spinlock event irq 143
dom0 kernel: installing Xen timer for CPU 3
dom0 kernel: CPU2 is up
dom0 kernel: ACPI: \_SB_.CP02: Found 3 idle states
dom0 kernel: cpu 2 spinlock event irq 137
dom0 kernel: installing Xen timer for CPU 2
dom0 kernel: CPU1 is up
dom0 kernel: ACPI: \_SB_.CP01: Found 3 idle states
dom0 kernel: cpu 1 spinlock event irq 131
dom0 kernel: installing Xen timer for CPU 1
dom0 kernel: Enabling non-boot CPUs ...
dom0 kernel: xen_acpi_processor: (_PXX): Hypervisor error (-19) for ACPI CPU7
dom0 kernel: xen_acpi_processor: (_PXX): Hypervisor error (-19) for ACPI CPU5
dom0 kernel: xen_acpi_processor: (_PXX): Hypervisor error (-19) for ACPI CPU3
dom0 kernel: xen_acpi_processor: (_PXX): Hypervisor error (-19) for ACPI CPU1
dom0 kernel: xen_acpi_processor: Uploading Xen processor PM info
dom0 kernel: ACPI: PM: Restoring platform NVS memory
dom0 kernel: ACPI: EC: EC started
dom0 kernel: ACPI: PM: Low-level resume complete
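To compare logs like the one above between kernel versions, it can help to pull out just the xen_acpi_processor failures. This is a small Python sketch that extracts the CPU number and error code from such lines. The regular expression is modelled only on the lines pasted here, and mapping the code to a Linux errno name (19 being ENODEV) is my assumption about how to read it.

```python
import errno
import re

# Matches e.g. "xen_acpi_processor: (_PXX): Hypervisor error (-19) for ACPI CPU7"
_PXX_ERROR = re.compile(
    r"xen_acpi_processor: \(_PXX\): Hypervisor error \((-?\d+)\) for ACPI CPU(\d+)"
)


def hypervisor_errors(lines):
    """Return a list of (cpu, error_code) tuples found in the given log lines."""
    found = []
    for line in lines:
        match = _PXX_ERROR.search(line)
        if match:
            found.append((int(match.group(2)), int(match.group(1))))
    return found


def errno_name(code):
    """Best-effort errno name for a negative kernel-style error code."""
    return errno.errorcode.get(abs(code), "unknown")
```

Feeding it the output of the dom0 journal (one log line per list element) gives a short list that is easy to diff between boots.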
Kernel 5.15.74 survives suspend and comes back, but:
There is a new error message after resuming from suspend: dom0 kernel: i915 0000:00:02.0: [drm] *ERROR* Atomic update failure on pipe A (start=367 end=368) time 590 us, min 1073, max 1079, scanline start 1043, end 1083
GUI qubes need restarting because their X11 windows become unresponsive (and qvm-shutdown does not work, you need to kill…)