Hello all,
I’m experiencing an issue with my Lenovo notebook when the second monitor is connected through the HDMI port.
Notably, the problem only occurs when the second monitor is connected.
Randomly, the dom0 GUI gets stuck, rendering the machine unusable.
During these occurrences, I cannot select any items on the desktop, and after a few seconds, the mouse freezes.
At this point, the only solution is to force a hardware reset.
I’ve been unable to locate any logs that might report this issue. The only information I’ve found is that ‘journalctl’ logs the following lines when the problem arises:
dic 31 17:56:41 dom0 kernel: i915 0000:00:02.0: [drm] ERROR [CRTC:131:pipe B] flip_done timed out
dic 31 17:56:51 dom0 kernel: i915 0000:00:02.0: [drm] ERROR flip_done timed out
dic 31 17:56:51 dom0 kernel: i915 0000:00:02.0: [drm] ERROR [CRTC:131:pipe B] commit wait timed out
I have attached the ‘journalctl’ for your reference. journalctl.log (376.5 KB)
Here is the HCL report: hcl.log (765 Bytes)
As a newcomer to Qubes, I’m seeking your assistance in understanding the issue and guiding me through the troubleshooting process.
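For anyone trying to reproduce this, here is a minimal sketch of how the relevant lines could be pulled out of the kernel log. The helper name `filter_i915_timeouts` is made up for illustration, and the pattern only assumes the message format shown in the lines above.

```shell
#!/bin/sh
# Hypothetical helper: keep only the i915 display-timeout lines
# from kernel log text supplied on stdin.
filter_i915_timeouts() {
    grep -E 'i915 .*(flip_done|commit wait) timed out'
}

# Usage on dom0. After a forced reset the freeze belongs to the
# previous boot, hence "-b -1" (use "-b" for the current boot):
#   sudo journalctl -k -b -1 | filter_i915_timeouts
```

The `-k` flag limits the output to kernel messages, which is where the i915 driver logs these errors.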
Committed the configuration:
sudo grub2-mkconfig -o /boot/efi/EFI/qubes/grub.cfg
Generated the initramfs image:
sudo dracut -f
Rebooted the system.
After the reboot, I verified that the kernel parameter is correctly added:
cat /proc/cmdline
placeholder root=/dev/mapper/qubes_dom0-root ro rd.lvm.lv=qubes_dom0/root rd.lvm.lv=qubes_dom0/swap plymouth.ignore-serial-consoles 6.6.2-1.qubes.fc37.x86_64 x86_64 rhgb quiet pci=nomsi usbcore.authorized_default=0
Is this the right procedure?
I apologize for asking, but I’m new to Qubes, and I want to make sure I have followed the correct steps.
If what I have done is correct, I will let you know if the pci=nomsi option works.
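Since `/proc/cmdline` shows the command line of the running dom0 kernel, the check above can also be scripted. This is only a sketch: `cmdline_has` is a hypothetical helper name, and it simply looks for the parameter as a whole space-separated word.

```shell
#!/bin/sh
# Hypothetical helper: succeed if parameter $1 appears as a whole
# space-separated word in the command-line text $2.
cmdline_has() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage on dom0:
#   cmdline_has pci=nomsi "$(cat /proc/cmdline)" && echo "pci=nomsi is active"
```

Matching whole words avoids a false positive when one parameter is a prefix of another.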
But now, with the pci=nomsi option enabled, journalctl continuously reports the following error:
Jan 02 17:15:42 dom0 kernel: ata1: illegal qc_active transition (00000000->ffffffff)
Removing the pci=nomsi option and regenerating the GRUB configuration makes this error disappear from the log.
Attached you can find the journalctl. journalctlgen02_1.log (1.2 MB)
I suspect that disabling Message Signaled Interrupts (MSI) for PCI devices may impact the interrupt handling mechanism and lead to issues with the ATA controller.
What are your thoughts?
Despite this error, Qubes OS seems to be working fine.
It could cause some issues but I don’t have enough knowledge to comment on this.
I only offered this option as a test because a Qubes OS developer suggested trying it in one of the linked GitHub issues. If the problem goes away with this option, I suggest reporting that on the GitHub issue so it can be traced further.
Hi apparatus,
I’ll try to go deeper into this issue, searching and collecting more information.
In the meantime, I appreciate your valuable assistance, and I’ll keep you updated if there are any developments.
By the way, I have another question: when you suggest reporting this issue on GitHub, are you referring to this link: Issues · QubesOS/qubes-issues · GitHub?
Unfortunately, it’s not possible to reopen that case; I cannot use the Qubes issue tracker to ask for support.
I have a small update: if I suddenly disconnect the HDMI port of the second monitor, the dom0 GUI is recovered on the primary monitor.
After the GUI recovers, I can reconnect the second monitor and keep working until the next freeze.
In the journal log, I can see the following messages:
gen 03 18:18:40 dom0 kernel: i915 0000:00:02.0: [drm] ERROR [CRTC:131:pipe B] flip_done timed out
gen 03 18:18:55 dom0 kernel: i915 0000:00:02.0: [drm] ERROR flip_done timed out
gen 03 18:18:55 dom0 kernel: i915 0000:00:02.0: [drm] ERROR [CRTC:131:pipe B] commit wait timed out
gen 03 18:19:05 dom0 kernel: i915 0000:00:02.0: [drm] ERROR flip_done timed out
gen 03 18:19:05 dom0 kernel: i915 0000:00:02.0: [drm] ERROR [PLANE:82:plane 1B] commit wait timed out
gen 03 18:19:07 dom0 systemd[1]: Started getty@tty5.service - Getty on tty5.
This log contains more error messages, but in a nutshell, they don’t seem to help.
Attached, you can find the complete journal log. journalctl_03_1.log (682.7 KB)
Reading the “GUI randomly freezes” issue, I have seen:
Looks like a GPU driver problem, may be related to IOMMU. Try adding iommu=no-igfx to the hypervisor command line (options= in /boot/efi/EFI/qubes/xen.cfg).
Moreover, I have found this case in the forum:
So now, I have added the iommu option to /etc/default/grub:
GRUB_CMDLINE_XEN_DEFAULT="$GRUB_CMDLINE_XEN_DEFAULT iommu=no-igfx"
Then I regenerated the GRUB configuration and rebooted.
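Note that `iommu=no-igfx` is a Xen hypervisor option, so it is not expected to show up in `/proc/cmdline`, which only lists the dom0 kernel parameters. A possible way to confirm it after the reboot is to read the `xen_commandline` field printed by `xl info`. The helper name `xen_cmdline` below is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the Xen command line from `xl info`
# output supplied on stdin.
xen_cmdline() {
    sed -n 's/^xen_commandline[[:space:]]*:[[:space:]]*//p'
}

# Usage on dom0 after the reboot:
#   sudo xl info | xen_cmdline | grep -F 'iommu=no-igfx'
```

If the final `grep` prints nothing, the option most likely did not reach the hypervisor command line.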
Hi apparatus,
I have an update:
While searching for a solution, I came across this post: https://gitlab.freedesktop.org/drm/intel/-/issues/8685.
After reading the comments, I noticed that they were experiencing the same problem I’m facing in Qubes 4.2.
Therefore, I booted Qubes with the previous kernel, which uses the i915 driver version 2.16.
Since I switched to kernel 6.1.62-1.qubes.fc37.x86_64 with i915 driver version 2.16, the issue has not occurred again.
Now, I don’t have enough knowledge to troubleshoot this issue;
it’s not clear to me if this problem can be seen as a Qubes bug or a problem related to the i915 drivers.
What are your thoughts?
I have noticed another issue that could be related to the i915 driver version 2.20:
suspend doesn’t work, and the journalctl log reports this error:
gen 13 11:50:23 dom0 kernel: intel_pmc_core INT33A1:00: PM: dpm_run_callback(): acpi_subsys_suspend_late+0x0/0x50 returns -5
gen 13 11:50:23 dom0 kernel: intel_pmc_core INT33A1:00: PM: failed to suspend late: error -5
gen 13 11:50:23 dom0 kernel: PM: late suspend of devices failed
Attached you can find the journalctl log: lastboot-journallog.log (342.4 KB)
Just to summarize:
dom0 kernel 6.1.62-1 with i915 driver 2.16: no issue, suspend works
dom0 kernel 6.6.2-1 with i915 driver 2.20: flip_done timeout issue and suspend doesn’t work
dom0 kernel 6.6.9-1 with i915 driver 2.20: under test, suspend doesn’t work
At this stage, it seems that these are two different issues. What I mean is that:
the ‘flip_done’ issue is not present with kernel versions 6.1.62-1 and 6.6.9-1, so it appears to be a problem associated with version 6.6.2-1.
the suspend issue seems to be related to the i915 driver version; it is not present with i915 2.16.
Do you agree?
Another thing I would like to test is kernel version 6.6.9-1 with i915 2.16, but I don’t know how to downgrade the i915 drivers in kernel 6.6.9-1. I’m quite sure I need to recompile the kernel, but I do not have enough knowledge to do this.
This appears to indicate that my Lenovo doesn’t support the Suspend-to-RAM (S3) state, correct?
I have checked the BIOS, and it is updated to the latest version.
Additionally, I cannot find any BIOS option to enable the Suspend-to-RAM (S3) state.
As you can see in both cases, the system log reports:
PM: suspend entry (s2idle)
This should indicate that it is using state S0 and not S3.
According to kernel docs, s2idle maps to ACPI state S0:
This state is a generic, pure software, lightweight, system sleep state. It allows more energy to be saved relative to runtime idle by freezing user space and putting all I/O devices into low-power states (possibly lower-power than available at runtime), allowing processors to spend more time in their idle states.
So in this case, I could open an issue to report that Suspend-to-idle is not working with kernel 6.6.9,
but I cannot ask for S3 because it is not supported by my Lenovo.
Have I understood this correctly, or am I missing something?
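One way to cross-check which suspend variants the kernel offers is `/sys/power/mem_sleep`, assuming that file is present in dom0. It lists the available variants and marks the selected one in square brackets. As far as I understand the kernel documentation, `deep` in that list corresponds to suspend-to-RAM (S3), so a list containing only `s2idle` would be consistent with the firmware not exposing S3. The helper name `active_mem_sleep` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the bracketed (currently selected)
# entry from mem_sleep-style text on stdin,
# e.g. "[s2idle] deep" gives "s2idle".
active_mem_sleep() {
    sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

# Usage on dom0:
#   cat /sys/power/mem_sleep            # all variants the kernel offers
#   active_mem_sleep < /sys/power/mem_sleep
```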
Just for information,
attached you can find the logs reported with kernel 6.1.62 and 6.6.9. suspend_output_6.1.62.log (3.9 KB) suspend_output_6.6.9.log (5.5 KB)
With kernel 6.6.9, the log reports this error:
Jan 21 10:04:42 dom0 kernel: intel_pmc_core INT33A1:00: PM: dpm_run_callback(): acpi_subsys_suspend_late+0x0/0x50 returns -5
Jan 21 10:04:42 dom0 kernel: intel_pmc_core INT33A1:00: PM: failed to suspend late: error -5