ThinkPad (AMD) becomes unresponsive to input

I have my own report to submit. I believe this is the correct thread to do so.

I can reproduce the problem reliably. I have been experiencing it consistently for about a year now.

System: AMD Ryzen 5 PRO 2500U w/ Radeon Vega Mobile Gfx - Lenovo Thinkpad
Qubes Kernel: 5.15.103-1

There are a few things that agitate it. The most notable is running more than one keepassx or keepass2 instance at the same time in different VMs.

If lucky: After entering the appropriate password the keepass window will do a slow side-to-side (not up and down) old-school rendering blit before coming back up to full speed.

If unlucky: Display will completely seize either immediately or at some point when playing with keepass (e.g. opening an entry to edit - very annoying). The internal processes seem to continue to operate (e.g. videos / streams in background will continue to play indefinitely) but the display becomes completely unresponsive.

Attempting to switch fullscreen TTYs does nothing. No combination of hotkeys helps. The mouse pointer still moves around but nothing visually responds. I also updated the chipset firmware last year to see if it would fix the problem. It did not.

I have also noticed that electrum (the bitcoin application) seems to agitate it if a keepass is open at the same time. I find myself copying entries out of keepass, closing it, and running electrum separately to avoid a display hang. Again, all in separate VMs.

I tried changing the display manager early last year, which did not fix it, so I rolled back to my backup. None of the kernel updates have helped so far.

I might be able to show the techs the bug by capturing the screen with a phone. I’m not sure if seeing it with one’s own eyes would help though. It’s not particularly easy to capture logs because everything held in memory is lost when forcing a hard reboot on the system (along with temporary keepass edits… save often).

I saw an email address to contact someone with reports earlier in this thread. I’m not sure if that’s still appropriate. Let me know if so and I will.

Very good report @bitcipher.

Are you using XFCE right now? … did you enable/disable the compositor under window manager tweaks?

Very interesting. Is your keyboard / touchpad connected via USB (I saw reports of that being the case in some AMD based laptops)?

This deserves a qubes-issues ticket in my opinion.

I can post a copy there if you think it would be prudent.

Are you using XFCE right now? … did you enable/disable the compositor under window manager tweaks?

I am and did. Disabling the compositor helped a little, leading me to believe it is somewhere in or adjacent to the graphics pipeline. When switching display managers (which was a little tricky to do) there were parts of the system that were trickier to access (e.g. not fully supported) but the intermittent visual freezes would still occur despite being in a completely different display manager.

Very interesting. Is your keyboard / touchpad connected via USB (I saw reports of that being the case in some AMD based laptops)?

Yes. It’s a USB optical mouse. Still responsive, but all elements on display stop responding. I would love to confirm if the mouse still works (e.g. is still interacting with things without being able to see the renders of it) but am not sure how one would do so reliably.

Changed my mind. Let’s collect more info here first and submit a ticket later.

This is what makes your report so different from the ‘freezes’ reported above. In case of a freeze the computer remains entirely unresponsive or restarts by itself. In your case Xen and at least some of the VMs continue to run (e.g. video streaming) but inputs from mouse and keyboard are no longer recognized. However the movement of the mouse cursor can still be observed.

To me this sounds not like a freeze/crash at all but like the part of the system that deals with your input has an issue. Do you have a trackpad? What about the internal keyboard?

Happy to supply any additional information you need.

Agreed. My only caution is that I wonder how many of the prior reports indicate whether the underlying system is still operating. It is possible that more than one bug exists (visual freezes vs. crashes) and some of the reports are getting mixed up, making it harder to narrow down. I suppose it would also be reasonable that one bug may manifest in slightly different ways on different chipsets.

Do you have a trackpad? What about the internal keyboard?

It has a trackpad. I have not tried using it during a hang. I can try next time if it will help.

The keyboard is integrated into the laptop board (not external). Not sure how to confirm if it’s still operating. Best I may be able to do there is indicate if the “Caps Lock” key still lights up without visual feedback.

For sure, that’s why @unman is trying to get some order into the chaos.

Yes please try the trackpad. So the keyboard is internal but the key presses have no more effect.

@Sven - Would it help if I take a video recording of everything to post? I can set aside time to do it from my phone so that the setup and all peripherals are obvious. The reproducibility may help give a presentation of the glitch.

Also, if we do not make headway from this chipset, I have a separate Ryzen 7 PRO Thinkpad I can take some time to install Qubes on to confirm if the problem is chipset/version specific. The two models are fairly close in most regards other than the CPU upgrade.

So the keyboard is internal but the key presses have no more effect.

Correct. Everything about it acts like the display rendering (in isolation) has completely frozen.

IIRC mouse pointers are handled by a different part of the rendering system from everything else to avoid a rendering lag for the pointer. So that may give a hint on what part of the rendering pipeline is responsible. At least… that’s how it worked when I programmed for graphics systems that weren’t driven by hypervisors. :slight_smile:

Actually, I have an idea. I can start a video in a window and let it play in the corner. I can try to preposition the mouse cursor over the pause/resume button in the video playback so that when I get a freeze… I can confirm if the mouse pointer input still works to the video player without seeing it (can confirm with audio feedback). That might allow me to confirm if the keyboard still works once the video player is back in input focus.

Back in the times when R4.1 was unable to properly resume from suspend on Ryzen laptops, I observed similar behaviours on resume. But I don’t know why it can be triggered by software in a qube.

Have you installed xorg-x11-drv-amdgpu? Also, a more recent kernel as well as linux-firmware might be of help.

Have you installed xorg-x11-drv-amdgpu?

I have not. I would presume this would need to be done in dom0 as the whole system rendering freezes.

Also, my last reply to @Sven was eaten by the SPAM filter. In case anyone with admin powers reads this before fishing me out of the filter box. :slight_smile:

Right. This package can improve rendering on dom0 side.

Right. This package can improve rendering on dom0 side.

I will need to do a backup of my internal drive before trying that one, just in case it ends up being a one-way trip. :slight_smile:

Not sure, but it wouldn’t hurt. Obviously only a subset of forum users would see it, so we should make sure to post a verbal description along with the video.

I would have uploaded the .mp4 here with (.gz) extension but the forums won’t let me (new user restriction). So, link to download video (shows Qubes OS visually freezing on fresh boot):

Shows:

  • Keepass2 induced screen freezing when opening more than 1 database
  • Delayed response from when launching second keepass2 to the time there is any visual response at all (in this case a rendering freeze)
  • Video playback continues in background while visuals lock up
  • Mouse pointer still works (both optical input and trackpad)
  • CapsLock key light no longer works
  • Full-screen TTY switch doesn’t work.
  • Input controls (from optical mouse, track-pad input, and keyboard) do not cause video to respond when clicking on its elements (indicating all input other than moving the mouse cursor no longer works)
  • Optical mouse distance/activation sensor laser still responds to varying distance of surfaces

Let me know if you need anything else.

Installed xorg-x11-drv-amdgpu as root in dom0:

qubes-dom0-update xorg-x11-drv-amdgpu

  • Mouse cursor feels a little different after reboot.
  • No improvement with freezing. Reproduces with two keepass2 apps opened at once just as in video above.

Should I try a version bump on the kernel or linux-firmware? Tempted to think there’s a race condition somewhere tbh.

Question: Could I run a diagnostic script in the background that probes hardware / logs / system status and flushes results to disk so that I can retrieve them on next boot? I wouldn’t know what to scan for precisely though. One would think dmesg logs and the like are already flushed to disk on the regular.
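A background logger along these lines is one way to approach the question above. This is only a minimal sketch: the names snapshot, LOG_FILE, INTERVAL and PROBE_LOOP are made up for illustration, and it assumes dmesg is readable by the user running it (in dom0 that may require root). It appends a timestamped tail of the kernel log to a file and calls sync after every write so the last entry has a chance of surviving a hard power-off.

```shell
#!/bin/bash
# Hypothetical names, not part of any Qubes tool.
LOG_FILE="${LOG_FILE:-$HOME/freeze-probe.log}"
INTERVAL="${INTERVAL:-5}"

snapshot() {
    # $1: log file to append to
    {
        printf '=== %s ===\n' "$(date '+%Y-%m-%d %H:%M:%S')"
        # Last kernel messages; failures (e.g. no permission) are tolerated.
        dmesg 2>/dev/null | tail -n 20
    } >> "$1"
    # Ask the kernel to write buffered data to disk now.
    sync
}

# The loop only runs when PROBE_LOOP=1, so the function can be tried on its own.
if [ "${PROBE_LOOP:-0}" = "1" ]; then
    while true; do
        snapshot "$LOG_FILE"
        sleep "$INTERVAL"
    done
fi
```

Started with something like PROBE_LOOP=1 ./freeze-probe.sh & before inducing the freeze, the timestamps show after the reboot how long dom0 kept writing. If the systemd journal is persistent on the system, journalctl -k -b -1 should also show the kernel messages from the previous boot.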

For the first time the system froze when moving the KeePass window shortly after loading it. I am approaching the point where I may find myself debugging this at a lower level to see what state the system is in.

Current linux-firmware: linux-firmware-20230117-146.fc32.noarch

Appears to be up to date.

Will explore newer kernel versions.

Upgraded to kernel version: 6.2.10-1.qubes.fc32.x86_64

Followed upgrade instructions from @ChrisA on originating thread:

sudo qubes-dom0-update --disablerepo=* --enablerepo=qubes-dom0-current-testing kernel-latest

Confirmed running kernel version using:

uname -sr
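After installing kernel-latest it is worth making sure the machine actually booted into the new kernel rather than the old one. A small sketch of that check; kernel_matches is a made-up helper name, and the version strings are the ones mentioned in this thread:

```shell
#!/bin/bash
# Hypothetical helper: succeed if the kernel release starts with the expected version.
kernel_matches() {
    # $1: expected version prefix, $2: release string (defaults to `uname -r`)
    expected="$1"
    actual="${2:-$(uname -r)}"
    case "$actual" in
        "$expected"*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, kernel_matches 6.2.10 && echo "running the new kernel" compares against the output of uname -r on the running system.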

Also checked upgraded firmware as @augsch recommended (see prior post)

Results:

Laptop still freezes when launching second keepass2 from second appvm. I can do a clean boot, launch the two appvms, and just launch keepass2 from the terminal in both cases. Instant freeze.

Additional information:

I’ve used xfce4 on this same laptop pushing the hardware to the max (thrashing internal swap drive, etc…) and never had a freeze. It only seems to be happening in Qubes. It leads me to believe the problem might be in Xen.

Log of additional things attempted:

  • 1

Ran a bash script in the dom0 background while inducing the freeze:

sleep 30
xfwm4 --replace

No improvement/recovery during freeze.

  • 2

BIOS power management option that stops the clock on individual processors disabled. No improvement.

  • 3

BIOS AmdPowerNow! feature disabled. No improvement.
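For the xfwm4 --replace attempt in item 1, it can be hard to tell after the reboot whether the script actually ran during the freeze or whether dom0 had stopped executing by then. One possible way to find out is to wrap the command so that it leaves a timestamped trace on disk. A minimal sketch; run_logged and the log path in the usage line are made-up names:

```shell
#!/bin/bash
# Hypothetical wrapper: log when a command starts and how it exits, syncing each time.
run_logged() {
    # $1: log file; remaining arguments: the command to run
    log="$1"
    shift
    printf '%s start: %s\n' "$(date '+%H:%M:%S')" "$*" >> "$log"
    sync
    status=0
    "$@" || status=$?
    printf '%s exit=%s: %s\n' "$(date '+%H:%M:%S')" "$status" "$*" >> "$log"
    sync
    return "$status"
}
```

Usage would look like sleep 30; run_logged "$HOME/recover.log" xfwm4 --replace. Since xfwm4 normally keeps running as the window manager, the start line is the interesting one: if it is present in the log after the hard reboot, dom0 was still executing scripts while the display was frozen.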

Additional boot-time ‘dmesg’ logs worth noting:

[ 6.695988] kfd kfd: amdgpu: error getting iommu info. is the iommu enabled?
[ 6.695990] kfd kfd: amdgpu: Error initializing iommuv2
[ 6.697130] kfd kfd: amdgpu: device 1002:15dd NOT added due to errors

[ 114.487618] pciback 0000:05:00.3: xen-pciback: Driver tried to write to a read-only configuration space field at offset 0x6c, size 2. This may be harmless, but if you have problems with your device:
1) see permissive attribute in sysfs
2) report problems to the xen-devel mailing list along with details of your device obtained from lspci.

  • I have confirmed AMD Virtualization Technology is enabled in the BIOS.
  • The xen error seems to be for a USB device: “05:00.3 USB controller: Advanced Micro Devices, Inc. [AMD] Raven USB 3.1”. Disabling USB devices in BIOS has no impact on this error message.
  • I am unsure what the amdgpu iommuv2 device is.
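For anyone wanting to collect the same kind of boot-time messages, a small filter can pull the amdgpu, IOMMU and xen-pciback lines out of the kernel log. A minimal sketch; the function name filter_gpu_iommu is made up for illustration, and the pattern list is just a guess at what is relevant here:

```shell
#!/bin/bash
# Hypothetical helper: read log text on stdin and keep only lines that
# mention amdgpu, the IOMMU, or xen-pciback (case-insensitive).
filter_gpu_iommu() {
    grep -iE 'amdgpu|iommu|pciback'
}
```

Typical use would be dmesg | filter_gpu_iommu in dom0. The same filter can be fed from xl dmesg to look at the hypervisor log, and lspci -vv -s 05:00.3 prints the device details that the xen-pciback message asks to include in a report.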

Hardware compatibility:

My system is very close to the following entry: Hardware compatibility list (HCL) | Qubes OS

My unit is: Lenovo ThinkPad A485, model 20MU000VUS

Hypervisor error log:

Buried in the entries:

  • (XEN) xenoprof: Initialization failed. AMD processor family 23 is not supported

This seems to trace to this issue:

I’m not sure whose end this would be on or whether it would even directly fix my freezing issue. However, the ticket remains open.