This deserves a qubes-issues ticket in my opinion.
I can post a copy there if you think it would be prudent.
Are you using XFCE right now? … did you enable/disable the compositor under window manager tweaks?
I am and did. Disabling the compositor helped a little, leading me to believe it is somewhere in or adjacent to the graphics pipeline. When switching display managers (which was a little tricky to do) there were parts of the system that were trickier to access (e.g. not fully supported) but the intermittent visual freezes would still occur despite being in a completely different display manager.
Very interesting. Is your keyboard / touchpad connected via USB (I saw reports of that being the case in some AMD based laptops)?
Yes. It’s a USB optical mouse. Still responsive, but all elements on display stop responding. I would love to confirm if the mouse still works (e.g. is still interacting with things without being able to see the renders of it) but am not sure how one would do so reliably.
Changed my mind. Let’s collect more info here first and submit a ticket later.
This is what makes your report so different from the ‘freezes’ reported above. In case of a freeze the computer remains entirely unresponsive or restarts by itself. In your case Xen and at least some of the VMs continue to run (e.g. video streaming) but inputs from mouse and keyboard are no longer recognized. However the movement of the mouse cursor can still be observed.
To me this sounds not like a freeze/crash at all, but as if the part of the system that handles your input has an issue. Do you have a trackpad? What about the internal keyboard?
Agreed. My only caution is that I wonder how many of the prior reports indicate whether the underlying system was still operating. It is possible that more than one bug exists (visual freezes vs. crashes) and that some of the reports are getting mixed up, which makes it harder to narrow down. I suppose it would also be reasonable that one bug may manifest in slightly different ways on different chipsets.
Do you have a trackpad? What about the internal keyboard?
It has a trackpad. I have not tried using it during a hang. I can try next time if it will help.
The keyboard is integrated into the laptop board (not external). Not sure how to confirm if it’s still operating. Best I may be able to do there is indicate if the “Caps Lock” key still lights up without visual feedback.
@Sven - Would it help if I take a video recording of everything to post? I can set aside time to do it from my phone so that the setup and all peripherals are obvious. Since the glitch is reproducible, a recording should show it clearly.
Also, if we do not make headway from this chipset, I have a separate Ryzen 7 PRO Thinkpad I can take some time to install Qubes on to confirm if the problem is chipset/version specific. The two models are fairly close in most regards other than the CPU upgrade.
So the keyboard is internal but the key presses have no more effect.
Correct. Everything about it acts like the display rendering (in isolation) has completely frozen.
IIRC mouse pointers are handled by a different part of the rendering system from everything else to avoid a rendering lag for the pointer. So that may give a hint on what part of the rendering pipeline is responsible. At least… that’s how it worked when I programmed for graphics systems that weren’t driven by hypervisors.
Actually, I have an idea. I can start a video in a window and let it play in the corner. I can try to preposition the mouse cursor over the pause/resume button in the video playback so that when I get a freeze… I can confirm if the mouse pointer input still works to the video player without seeing it (can confirm with audio feedback). That might allow me to confirm if the keyboard still works once the video player is back in input focus.
Back in the times when R4.1 was unable to properly resume from suspend on Ryzen laptops, I observed similar behaviours on resume. But I don’t know why it can be triggered by software in a qube.
Have you installed xorg-x11-drm-amdgpu? Also, a more recent kernel as well as linux-firmware might be of help.
Not sure, but it wouldn’t hurt. Obviously only a subset of forum users would see it, so we should make sure to post a verbal description along with the video.
I would have uploaded the .mp4 here with (.gz) extension but the forums won’t let me (new user restriction). So, link to download video (shows Qubes OS visually freezing on fresh boot):
Shows:
Keepass2 induced screen freezing when opening more than 1 database
Delay between launching the second keepass2 and any visual response at all (in this case a rendering freeze)
Video playback continues in background while visuals lock up
Mouse pointer still works (both optical input and trackpad)
CapsLock key light no longer works
Full-screen TTY switch doesn’t work.
Input controls (from optical mouse, track-pad input, and keyboard) do not cause video to respond when clicking on its elements (indicating all input other than moving the mouse cursor no longer works)
Optical mouse distance/activation sensor laser still responds to varying distance of surfaces
Question: Could I run a diagnostic script in the background that probes hardware / logs / system status and flushes results to disk so that I can retrieve them on next boot? I wouldn’t know what to scan for precisely though. One would think dmesg logs and the like are already flushed to disk on the regular.
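One possible way to do this, as a rough sketch and not a tested diagnostic: a small shell loop in dom0 that appends a timestamped snapshot to a log file and calls `sync` after each write, so the last entries have a chance of surviving a hard reset. The log path, the interval, and the function names (`snapshot`, `run_logger`) are placeholders I made up. `dmesg` may need root, and `xl dmesg` (the Xen hypervisor message buffer) is only available in dom0. What to look for in the output is still an open question.

```bash
# Hypothetical freeze logger (sketch). LOG and INTERVAL are arbitrary choices.
LOG="${LOG:-$HOME/freeze-diag.log}"
INTERVAL="${INTERVAL:-5}"

# Append one timestamped snapshot to $LOG and force it out to disk.
snapshot() {
    {
        echo "=== $(date '+%F %T') ==="
        # Last kernel messages; may need root depending on system settings.
        dmesg 2>/dev/null | tail -n 20
        # Xen hypervisor messages; only present in dom0, usually needs root.
        command -v xl >/dev/null 2>&1 && xl dmesg 2>/dev/null | tail -n 20
        # Load average hints at whether dom0 is still scheduling work.
        [ -r /proc/loadavg ] && cat /proc/loadavg
    } >> "$LOG"
    sync   # flush buffered writes so the entry is on disk before a reset
}

# Take COUNT snapshots, INTERVAL seconds apart (COUNT 0 or unset: until killed).
run_logger() {
    local count="${1:-0}" i=0
    while [ "$count" -eq 0 ] || [ "$i" -lt "$count" ]; do
        snapshot
        i=$((i + 1))
        sleep "$INTERVAL"
    done
}
```

To use it, source the file in a dom0 terminal and start `run_logger &` before inducing the freeze. After the reboot, the timestamp of the last `===` line shows roughly how long dom0 kept running.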
For the first time the system froze when moving the KeePass window shortly after loading it. I am approaching the point where I may find myself debugging this at the lower level to see where the system condition is.
Also checked upgraded firmware as @augsch recommended (see prior post)
Results:
Laptop still freezes when launching second keepass2 from second appvm. I can do a clean boot, launch the two appvms, and just launch keepass2 from the terminal in both cases. Instant freeze.
Additional information:
I’ve used xfce4 on this same laptop pushing the hardware to the max (thrashing internal swap drive, etc…) and never had a freeze. It only seems to be happening in Qubes. It leads me to believe the problem might be in Xen.
Log of additional things attempted:
1. In a bash script running in the dom0 background while inducing the freeze:
sleep 30
xfwm4 --replace
No improvement/recovery during freeze.
2. BIOS power management which stops clock on individual processors disabled. No improvement.
3. BIOS AmdPowerNow! feature disabled. No improvement.
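For attempt 1, it is hard to tell afterwards whether the script actually ran during the freeze or was stalled along with everything else. A minimal sketch of a wrapper that records when a recovery command started and finished, syncing to disk each time; `log_attempt` and the log path are hypothetical names, not part of any Qubes tooling:

```bash
# Run a command, logging start/end timestamps and its exit status.
# Usage: log_attempt LOGFILE COMMAND [ARGS...]
log_attempt() {
    local logfile="$1" status=0
    shift
    echo "$(date '+%F %T') start: $*" >> "$logfile"
    sync   # make sure the start line is on disk before the command runs
    "$@" || status=$?
    echo "$(date '+%F %T') end (exit $status): $*" >> "$logfile"
    sync
    return "$status"
}
```

For example: `sleep 30; log_attempt "$HOME/recover.log" xfwm4 --replace`. A long-running command such as a window manager may not return, in which case only the start line appears in the log, but that alone would confirm dom0 was still executing the script at that moment.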
Additional boot-time ‘dmesg’ logs worth noting:
[ 6.695988] kfd kfd: amdgpu: error getting iommu info. is the iommu enabled?
[ 6.695990] kfd kfd: amdgpu: Error initializing iommuv2
[ 6.697130] kfd kfd: amdgpu: device 1002:15dd NOT added due to errors
…
[ 114.487618] pciback 0000:05:00.3: xen-pciback: Driver tried to write to a read-only configuration space field at offset 0x6c, size 2. This may be harmless, but if you have problems with your device:
1) see permissive attribute in sysfs
2) report problems to the xen-devel mailing list along with details of your device obtained from lspci.
I have confirmed AMD Virtualization Technology is enabled in the BIOS.
The xen error seems to be for a USB device: “05:00.3 USB controller: Advanced Micro Devices, Inc. [AMD] Raven USB 3.1”. Disabling USB devices in BIOS has no impact on this error message.
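To compare these messages across boots or across the two laptops, a small filter over saved logs may help. A minimal sketch, assuming the log has first been saved to a text file; the function name `gpu_iommu_lines` and the pattern list are my own choices:

```bash
# Print lines from a saved log that mention amdgpu, the IOMMU, or xen-pciback.
# Usage: gpu_iommu_lines LOGFILE
gpu_iommu_lines() {
    grep -i -E 'amdgpu|iommu|pciback' "$1"
}
```

For example, `dmesg > boot.log; gpu_iommu_lines boot.log` in dom0. The same filter can be run over the output of `xl dmesg` saved to a file, to see whether Xen itself reports anything about the IOMMU.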
KeePassX causes a screen lockup on the unit in this thread and a closely related model running Ryzen 7 Pro. However, KeePass2 allows the system to run rock solid. Nothing ever crashes on either when replacing KeePassX with KeePass2.
Possible causes that leap to mind:
The IOMMU control of Xen reports back warnings for the graphics driver at boot time.
Something in the KeePassX runtime accidentally exploits a weakness in the graphical control pipeline between VMs and the hypervisor.