Win10 HVM qrexec fails to connect

I have a Win10 HVM on Qubes R4.0.4 whose qrexec connection is intermittent. The qrexec timeout is currently set to 600 so that I can investigate before Qubes finally forces the VM to shut down, which can be catastrophic if Windows has already started installing updates or doing something else important. On occasion I have had to revive the VM by restoring it from backups because of this.

Each time I start the Win10 VM, the Qube Manager shows the yellow dot indicating that it is starting. It may connect and turn green for a number of sessions, until it doesn’t. Once it fails to connect, it keeps failing on all subsequent attempts, even after shutting down or rebooting Qubes.

The only remedy I have found is to start the VM as normal and then tell it to “restart” from the Windows start menu. At that point it gets a xenbus error and freezes solid, and the VM’s window cannot be closed by any Qubes GUI or qvm command I can find. I either have to restart Qubes or issue “xl destroy Win10” in dom0 to force that catatonic window to close. Once the window is finally closed, the VM starts and qrexec connects properly every time, until it decides once again that it’s time to start failing. Then I need to repeat the restart/xenbus-error cycle to get it working again.
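
For reference, the dom0 side of that workaround could be sketched roughly as below. This is only a sketch under assumptions: the helper names `recovery_commands` and `run` are made up for illustration, the VM name `Win10` and the timeout of 600 are the values from this post, and the in-guest “restart” that triggers the xenbus freeze still has to be done by hand before any of this. It defaults to a dry run that only lists the commands, since `xl destroy` kills the domain without a clean shutdown.

```python
import subprocess


def recovery_commands(vm="Win10", timeout=600):
    """Return the dom0 commands for the workaround, in order.

    The command names (qvm-prefs, xl destroy, qvm-start) are the standard
    dom0 tools; the VM name and timeout are the values used in this thread.
    """
    return [
        # Keep the long qrexec timeout so there is time to investigate.
        ["qvm-prefs", vm, "qrexec_timeout", str(timeout)],
        # Force-close the frozen domain after the xenbus error.
        ["xl", "destroy", vm],
        # Start the VM again; this is the start that has been connecting.
        ["qvm-start", vm],
    ]


def run(commands, dry_run=True):
    """Return the command lines as strings; execute them unless dry_run."""
    lines = []
    for cmd in commands:
        lines.append(" ".join(cmd))
        if not dry_run:
            # check=True stops at the first command that fails.
            subprocess.run(cmd, check=True)
    return lines


if __name__ == "__main__":
    for line in run(recovery_commands()):
        print(line)
```

Calling `run(recovery_commands(), dry_run=False)` in dom0 would actually execute the three commands; whether the next start then connects is just what I have observed, not something the script can guarantee.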

When qrexec does connect, the VM works just fine. When it doesn’t connect, the Win10 VM still works fine, but without the additional qrexec services, until it is forced to close by the timeout. In all other respects it seems to work with no visible errors. I have been snooping around, but I don’t know exactly what to look for when it’s not connecting properly, and it mystifies me why a xenbus failure would somehow correct the problem, but only for a while (hours, days, or months if the VM is not used often).

Q: What is supposed to happen on the Windows side needed to support this qrexec connection? Does that qrexec receiver process have a logfile on the Windows side? I’m just looking for ideas of where to look.

guest-Win10-dm.log (53.8 KB)

I haven’t fully read your post. But if you believe this is a bug, it may deserve a bug report:

I have similar problems. It always runs with a yellow state icon without the ability to do global copy/paste and copy to other AppVM. I can get the green icon every ‘even’ / other reboot by changing the qrexec_timeout to 25 but then it crashes every ‘odd’ reboot with the general qrexec error. If I get a blue screen, sometimes I get the Stop Code: ‘IRQL NOT LESS THAN OR EQUAL TO’ or, ‘SYSTEM SERVICE EXCEPTION’. Sometimes it says the crash is due to the xenbus driver.

I have a workaround by creating an AppVM and setting the template to start at the ‘even’ rotation. However, I can only use it for a few minutes before the blue screen.

I think it’s already reported https://github.com/QubesOS/qubes-issues/issues/5462

I found that on one TemplateVM with network access, I can consistently start with a green state if I previously did a restart instead of a shutdown.

On one template without network, I have to shut it down when I know the next start will have a green state. Then I run an AppVM off that last saved state, and the AppVM will always start with a green state. Once in a while I still get the BSOD, though.

This sounds very familiar to my problem reported here:

https://forum.qubes-os.org/t/win10-hvm-qrexec-fails-to-connect/5432?u=slcoleman

The one difference is whenever I do a “restart” from the VM’s start menu I always got a xenbus error, and I would then have to kill that frozen VM via ‘xl destroy Win10’, and then the next time the Win10 VM would start correctly.

It’s the same thread.

It always starts correctly after an unsuccessful start. If you have a TemplateVM, maybe it’s best to run an AppVM off it. The AppVM will always start correctly if the TemplateVM’s next start would be correct. It’s a workaround.