Qubes GUI issues

Since yesterday, I have been seeing strange GUI issues in various qubes based on different templates: all windows disappear, but the VM keeps running.

All I have found so far are these entries in guid-vm-name.log:

Icon size: 128x128
docking an override-redirect window 0x56001e8 - clearing override-redirect
invalid PMaxSize for 0x560034c (16384/16384)
invalid PMaxSize for 0x560035d (32767/177)
invalid PMaxSize for 0x560035d (32767/177)
invalid PMaxSize for 0x560035d (32767/177)
invalid PMaxSize for 0x560035d (32767/177)
invalid PMaxSize for 0x560035d (32767/177)
ErrorHandler: 128
                 Major opcode: 130 ()
                 Minor opcode: 3
                 ResourceID:   0x5600371
                 Failed serial number:  9446
                 Current serial number: 9446
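
For anyone comparing several of these logs, here is a minimal Python sketch that pulls the `ErrorHandler` blocks out of the log text, so the opcode and resource ID of each X error can be compared across crashes. It assumes the block layout shown above (an `ErrorHandler:` line followed by indented `Key: value` lines); the function name `parse_x_errors` is made up for illustration and is not part of any Qubes tool.

```python
import re

# Matches the first line of an X error block as printed in the guid log above.
ERROR_START = re.compile(r"^ErrorHandler:\s*(\d+)\s*$")
# Matches the indented "Key: value" detail lines that follow it.
DETAIL = re.compile(r"^\s+([A-Za-z ]+?):\s*(\S+)")


def parse_x_errors(log_text):
    """Return one dict per 'ErrorHandler' block found in log_text.

    Assumption: every block looks like the excerpt in this thread.
    """
    errors = []
    current = None
    for line in log_text.splitlines():
        start = ERROR_START.match(line)
        if start:
            current = {"error_code": int(start.group(1))}
            errors.append(current)
            continue
        detail = DETAIL.match(line) if current is not None else None
        if detail:
            # Values are kept as strings, exactly as they appear in the log.
            current[detail.group(1).strip()] = detail.group(2)
        else:
            current = None  # any other line ends the current block
    return errors
```

Passing the contents of the guid log to `parse_x_errors` returns a list of dicts with keys such as `error_code`, `Major opcode`, `Minor opcode` and `ResourceID`.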

qvm-start-gui vm-name brings all windows back.

Any idea what causes this behaviour or where I can continue to look for the problem?

The system was set up with Qubes OS 4.1 and upgraded to 4.2 a few weeks ago, with no problems until now. Yesterday I deleted the outdated debian-11 template. But since the problem appears in qubes based on all templates, I guess it is update- or configuration-related.

PS: I don’t know if this is related. guest-vm-name.log shows:

...
[2024-07-28 10:55:37] [   91.905704] xen:grant_table: g.e. 0x2068 still pending
[2024-07-28 10:55:37] [   91.905767] xen:grant_table: g.e. 0x2067 still pending
[2024-07-28 10:55:37] [   91.905830] xen:grant_table: g.e. 0x2066 still pending
[2024-07-28 10:55:37] [   91.905893] xen:grant_table: g.e. 0x2065 still pending
...

You can enable debug mode for your qube and then check guid-vm-name.log; it should contain more details.
To enable it, open the qube's Settings → Advanced tab → “Run in debug mode”.
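
If you prefer the dom0 command line over the Settings dialog, the following Python sketch may help. It assumes that the checkbox corresponds to the qube's `debug` property as exposed by `qvm-prefs`; the helper names `debug_mode_command` and `set_debug_mode` are hypothetical, and `vm-name` is a placeholder for the qube's name.

```python
import subprocess


def debug_mode_command(qube_name, enabled=True):
    """Build the dom0 command line that sets the qube's `debug` property.

    Assumption: `qvm-prefs <qube> debug True` is equivalent to ticking
    "Run in debug mode" in the Settings dialog.
    """
    return ["qvm-prefs", qube_name, "debug", "True" if enabled else "False"]


def set_debug_mode(qube_name, enabled=True):
    """Run the command in dom0; raises CalledProcessError if qvm-prefs fails."""
    subprocess.run(debug_mode_command(qube_name, enabled), check=True)
```

Calling `set_debug_mode("vm-name")` in dom0 would then switch debug mode on, and `set_debug_mode("vm-name", enabled=False)` would switch it off again once you have collected the logs.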

Thank you! I will test that…

I am sorry, I gave up. Because I need this PC to function, I decided to reinstall and restore my qubes from backup.

Unfortunately, I ran into this problem, and it took me some hours to fix it.

I’ll come back once I have found out whether the reinstall and restore fixed the root cause of this issue.

I added some experiences from the reinstall process.

Hi @hildi - I am experiencing the exact same behavior you describe in your OP, except that for me it occurs on a completely fresh install of 4.2.3.

I have two video cards in my system, an Intel IGP and an older NVIDIA 1070. However, only the IGP is configured; the NVIDIA card is blacklisted in Xen via the startup flags as well as in dom0. So it’s difficult to imagine the card is the issue here: it’s not being used for anything, and I don’t have it assigned to any qubes.

At the exact moment that the VM loses all its windows, I get a coredump in the dom0 kernel messages:

Nov 21 09:50:25 dom0 systemd-coredump[273033]: [🡕] Process 245173 (qubes-guid) of user 1000 dumped core.
                                               
                                               Module /usr/bin/qubes-guid with build-id f8a19fd2db97345047e8c234023b12da7e2edf51
                                               Metadata for module /usr/bin/qubes-guid owned by FDO found: {
                                                       "type" : "rpm",
                                                       "name" : "qubes-gui-daemon",
                                                       "version" : "4.2.8-1.fc37",
                                                       "architecture" : "x86_64",
                                                       "osCpe" : "cpe:/o:fedoraproject:fedora:37"
                                               }
                                               
                                               Module linux-vdso.so.1 with build-id d6d823a6fa46e7acb285d9aff0a35643b3a4e4a9
                                               Module libXfixes.so.3 with build-id 8f3a78f737b814c0008f3cf57ef6d11cc646a7ef
                                               Metadata for module libXfixes.so.3 owned by FDO found: {
                                                       "type" : "rpm",
                                                       "name" : "libXfixes",
                                                       "version" : "6.0.0-4.fc37",
                                                       "architecture" : "x86_64",
                                                       "osCpe" : "cpe:/o:fedoraproject:fedora:37"
                                               }
                                               
                                               Module libXrender.so.1 with build-id e28b0d1a111084521bef458753ba20e0e28aa299
                                               Metadata for module libXrender.so.1 owned by FDO found: {
                                                       "type" : "rpm",
                                                       "name" : "libXrender",
                                                       "version" : "0.9.10-17.fc37",
                                                       "architecture" : "x86_64",
                                                       "osCpe" : "cpe:/o:fedoraproject:fedora:37"
                                               }
                                               
                                               Module libXcursor.so.1 with build-id b4392b1f9e68ce45d15a4c82c7052d0c160bbf7c
                                               Metadata for module libXcursor.so.1 owned by FDO found: {
                                                       "type" : "rpm",
                                                       "name" : "libXcursor",
                                                       "version" : "1.2.1-2.fc37",
                                                       "architecture" : "x86_64",
                                                       "osCpe" : "cpe:/o:fedoraproject:fedora:37"
                                               }

There’s a lot more in that list, but I don’t know if most of it is relevant. At the end of the dump I have:

                                               Module libm.so.6 with build-id fe777a29b3563583f764b5a8ae28782dbcd65352
                                               Stack trace of thread 245173:
                                               #0  0x00007e23fd259ecc __pthread_kill_implementation (libc.so.6 + 0x8cecc)
                                               #1  0x00007e23fd209ab6 raise (libc.so.6 + 0x3cab6)
                                               #2  0x00007e23fd1f37fc abort (libc.so.6 + 0x267fc)
                                               #3  0x00007e23fd1f371b __assert_fail_base.cold (libc.so.6 + 0x2671b)
                                               #4  0x00007e23fd202696 __assert_fail (libc.so.6 + 0x35696)
                                               #5  0x000062084fb9334c n/a (/usr/bin/qubes-guid + 0xb34c)
                                               #6  0x00007e230000001c n/a (n/a + 0x0)
                                               ELF object binary architecture: AMD x86-64
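
The trace above ends in `abort` called from `__assert_fail`, so an assertion inside qubes-guid failed; the frame in qubes-guid itself (#5) is not symbolized and only carries an offset. Here is a minimal Python sketch that extracts the frames from such a trace and picks out the ones belonging to a given module, which makes it easier to compare the offset between crashes or to hand it to a symbolizer once matching debug symbols are available. It assumes the frame format printed above; `parse_frames` and `frames_in_module` are illustrative names, not existing tools.

```python
import re

# One frame line as printed by systemd-coredump, e.g.
#   #5  0x000062084fb9334c n/a (/usr/bin/qubes-guid + 0xb34c)
FRAME = re.compile(
    r"#(?P<index>\d+)\s+0x(?P<address>[0-9a-f]+)\s+(?P<symbol>\S+)\s+"
    r"\((?P<module>.+?) \+ 0x(?P<offset>[0-9a-f]+)\)"
)


def parse_frames(trace_text):
    """Return a list of frame dicts, in the order they appear in the trace."""
    frames = []
    for line in trace_text.splitlines():
        match = FRAME.search(line)
        if match:
            frames.append({
                "index": int(match.group("index")),
                "symbol": match.group("symbol"),
                "module": match.group("module"),
                "offset": int(match.group("offset"), 16),
            })
    return frames


def frames_in_module(frames, module):
    """Keep only the frames whose module matches exactly."""
    return [frame for frame in frames if frame["module"] == module]
```

For the trace in this post, `frames_in_module(parse_frames(text), "/usr/bin/qubes-guid")` yields the single frame #5 with offset `0xb34c`.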

Did you ever end up getting any VM GUI crashes on your newly-reinstalled system?

For me, most of the time it doesn’t seem random: it happens when I’ve been away from my PC for a while, the screen has gone to sleep and locked itself, and I return to the system and log back in.

Hard to say if it’s the actual “login” event triggering the GUI crash or the screen coming back out of sleep.