Qubes 4.1 + Windows 7 (w/ QWT) crashes

marioabbott · April 5, 2022, 8:39pm

Continuing the discussion from [WIP] Windows/QWT user reports:

Hello everyone!

This is my first post on the forum, so please bear with my newbie question.

With the new Qubes 4.1, I am struggling to get the Xen block device to work. My configuration in brief:

Qubes OS 4.1
Windows 7 HVM
QWT 4.1.67

I have seen all sorts of claims like “worked like a charm”, “only crash after first boot”, “you have to update Windows first”, etc., but in my case it always ended up crashing the VM with a BSOD.

The crash never happens right away. For me, the very first attach of a hard disk partition to the Windows 7 HVM always succeeds, but after detaching and re-attaching, it always crashes. After a reboot (an implicit detach) followed by an attach, it always crashes too, no matter what I do.

Can somebody with the expertise confirm, in black and white, A or B below, please? I think it would help clarify the situation and save newbies like me a lot of trouble.

A. This Xen PV disk driver will not work in Qubes OS 4.1 + Windows 7 + QWT 4.1.67 configuration, or
B. It works in Qubes OS 4.1 + Windows 7 + QWT 4.1.67 configuration and here is how I did it.

Cheers!

deeplow · April 5, 2022, 9:10pm

Moved it into its own topic. Although it is related to the other post, it will be easier to find here.

enmus · April 5, 2022, 9:44pm

And, you are attaching it… how?

P.S. I have no expertise but it works like a charm for me.

jevank · April 6, 2022, 10:05am

It looks like a reattach issue. Is your system updated? Could you show the versions of the xen and xen-hvm-stubdom-linux packages? Does attach-detach-attach work fine with Linux VMs?

GWeck · April 6, 2022, 12:02pm

I am getting these crashes, too, but the pattern is unclear: On the first attach of the (only) partition of a USB-stick, the Windows 7 VM crashed. After this, I could attach the whole device, and it appeared, but without a drive letter. After assigning a drive letter, the device could be used. Detaching worked, and subsequent attaching of the device or the partition worked without a crash several times, and this also after rebooting.

In the Windows explorer, the device several times did not disappear after detaching but stayed as an empty device still using the drive letter. Each new attach used a new drive letter when I assigned this letter. After the third or so attempt, the device was removed correctly, including its letter, and then could be reattached using the same letter.

My system is a fully updated Qubes R4.1 with Xen version 4.14.4.

marioabbott · April 6, 2022, 2:08pm

I use an external SATA disk with several NTFS partitions to exchange data with Linux. The hard drive is connected to the eSATA port before the machine boots. The “attaching” is then simply selecting a partition under “Data (Block) Devices” in the graphical widget and assigning it to the Windows VM.

I never select the whole drive, only a partition at a time, because other partitions are used by other VMs.

The “detaching” operation works like this: I use “Disk Management” to take the partition (which appears in the Windows VM as if it were an independent disk) offline, then “eject” it using Qubes’ graphical widget. I take it offline in Windows first to avoid data loss.

jevank · April 6, 2022, 2:12pm

I do not understand the reason to attach a partition instead of an entire device. As I understand it, Windows is unable to detect a volume without a partition table.

marioabbott · April 6, 2022, 2:16pm

Which “system” are you referring to? Qubes or Windows?

Here are all the “xen” packages I can see on my system:

[user@dom0 ~]$ rpm -qa '*xen*'
python3-xen-4.14.4-2.fc32.x86_64
xen-hvm-stubdom-linux-1.2.2-1.fc32.x86_64
xen-hvm-stubdom-legacy-4.13.0-1.fc32.x86_64
xen-runtime-4.14.4-2.fc32.x86_64
xen-licenses-4.14.4-2.fc32.x86_64
xen-hvm-stubdom-linux-full-1.2.2-1.fc32.x86_64
xen-hypervisor-4.14.4-2.fc32.x86_64
xen-4.14.4-2.fc32.x86_64
libvirt-daemon-xen-6.6.0-5.fc32.x86_64
xen-libs-4.14.4-2.fc32.x86_64
qubes-libvchan-xen-4.1.7-1.fc32.x86_64
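
To pick out just the versions that were asked about, each entry can be split into name, version, release and architecture. This is a minimal Python sketch, assuming the usual `name-version-release.arch` layout of `rpm -qa` output; `split_nvra` is a hypothetical helper, not part of any Qubes tool.

```python
def split_nvra(package):
    """Split 'name-version-release.arch' into its four parts."""
    # The architecture follows the last dot.
    rest, arch = package.rsplit(".", 1)
    # Version and release are the last two dash-separated fields;
    # the package name itself may contain dashes.
    name, version, release = rest.rsplit("-", 2)
    return name, version, release, arch


# Two entries copied from the listing above.
listing = [
    "xen-4.14.4-2.fc32.x86_64",
    "xen-hvm-stubdom-linux-1.2.2-1.fc32.x86_64",
]
for entry in listing:
    name, version, release, arch = split_nvra(entry)
    print(f"{name}: {version}-{release}")
```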

And certainly attaching/detaching works fine with Linux VMs. It smells very much like a Windows-only issue.

marioabbott · April 6, 2022, 2:24pm

The fact is that I have been using a Windows VM this way for years in Qubes 4.0. It seems that Windows can attach a single partition, or maybe the Xen driver is smart enough to create a virtual partition table before presenting it to Windows.

jevank · April 6, 2022, 2:55pm

Did you see something suspicious in the log files?
/var/log/xen/console/guest-[windows-qube]-dm.log

Did you see the BSOD screen (this requires the ‘debug’ option to make the emulated screen visible)?

deeplow · April 7, 2022, 7:45am

(@marioabbott I’ve added markdown formatting to your post for improved readability)

marioabbott · April 8, 2022, 2:03am

No, I don’t see anything suspicious in the “/var/log/xen/console/guest-[windows-qube]-dm.log” files. I searched for the keywords “error” and “fatal” but found none. There are a few warnings about deprecated options being used on the command line, but I don’t think they are suspicious. As I don’t know what to look for in that 500+ KB file, I cannot say for sure that everything is as it should be.

As regards the BSOD screen, I did see it several times, even now when I intentionally crash the VM by attaching a block device to it. It tells nothing specific except a few memory addresses in hex form, of course.

However, I remember seeing a BSOD saying something like “unhandled exception in system object” - not the exact quote - but it probably was from Windows 10 running Xen drivers version 8.2.2, not version 9.0, and not Windows 7. Until I can reproduce it, please consider this information unconfirmed.

marioabbott · April 8, 2022, 2:38am

Update: Comparing the log file before and after the Windows VM crashed, only 4 lines were added:

[] configure msg, x/y 640 294 (was 640 294), w/h 1024 768
[] qubes_gui: got unknown msg type 145, ignoring
[] qubes_gui: got unknown msg type 145, ignoring
[] qubes_gui: got unknown msg type 145, ignoring
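
The before/after comparison described above can be automated. This is a minimal Python sketch, assuming the log is append-only so that the earlier snapshot is a prefix of the later one; `appended_lines` is a hypothetical helper, and the first sample line is a placeholder rather than real log content.

```python
def appended_lines(before, after):
    """Return the lines that were added to the log after the snapshot.

    Assumes an append-only log: `before` must be a prefix of `after`.
    """
    if after[:len(before)] != before:
        raise ValueError("snapshots do not line up (log rotated or rewritten?)")
    return after[len(before):]


# Snapshot taken before the crash (placeholder content).
before = ["[] placeholder for earlier log content"]
# Snapshot taken after the crash: the same content plus the four new lines.
after = before + [
    "[] configure msg, x/y 640 294 (was 640 294), w/h 1024 768",
    "[] qubes_gui: got unknown msg type 145, ignoring",
    "[] qubes_gui: got unknown msg type 145, ignoring",
    "[] qubes_gui: got unknown msg type 145, ignoring",
]
for line in appended_lines(before, after):
    print(line)
```

In practice the two lists would come from a copy of the log saved before the crash and the live log read afterwards, for example with `pathlib.Path(...).read_text().splitlines()`.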

Previously I searched the log file for the keyword “error” and found nothing, but I did see the following, and I hope it means something. The pattern repeated a couple of dozen times during the 5-day period in which the Windows VM was being tested:

[] E: [vchan-sink] module-vchan-sink.c: sink cork req state =1, now state=-2
[] E: [vchan-sink] module-vchan-sink.c: source cork req state =1, now state=-2

[] E: [vchan-sink] module-vchan-sink.c: sink cork req state =0, now state=1
[] E: [vchan-sink] module-vchan-sink.c: source cork req state =0, now state=1
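
A plain search for “error” misses lines like these, because they carry only an `E:` severity tag. This is a minimal Python sketch of a broader scan, assuming that an `E:` tag or the words “error” and “fatal” are what is worth flagging; the pattern and the `flagged` helper are illustrative, not an official list of what the stubdomain log can contain.

```python
import re
from collections import Counter

# An "E:" severity tag (case-sensitive) or a spelled-out keyword (any case).
PATTERN = re.compile(r"\bE: |(?i:error|fatal)")


def flagged(lines):
    """Return the lines that carry an error tag or keyword."""
    return [line for line in lines if PATTERN.search(line)]


sample = [
    "[] qubes_gui: got unknown msg type 145, ignoring",
    "[] E: [vchan-sink] module-vchan-sink.c: sink cork req state =1, now state=-2",
    "[] E: [vchan-sink] module-vchan-sink.c: source cork req state =1, now state=-2",
    "[] E: [vchan-sink] module-vchan-sink.c: sink cork req state =1, now state=-2",
]
# Count how often each flagged message repeats.
for message, count in Counter(flagged(sample)).most_common():
    print(count, message)
```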

jevank · April 8, 2022, 6:37am

There might be kernel or qemu error messages that indicate problems outside the Windows VM. If you can reproduce the issue (BSOD/hang), then the latest messages are the important ones.

An exception error code (in hex, something like 0x0000007E) might be useful.

Have you installed 9.0 Xen drivers? QWT provides 8.2.2.

It is generally harmless.

enmus · April 8, 2022, 9:34am

When investigating, I run sudo journalctl -f in an “always on top” terminal window and do nothing else except run the process under examination and watch the output live. When the process finishes, I exit journalctl, copy the contents of the terminal window to a text editor, and examine the output. I am not sure this is the best way for an expert, but I mostly find it helpful.

marioabbott · April 10, 2022, 8:48am

The last messages were the four that I have posted (3 of them were “got unknown msg type 145, ignoring”).

The BSOD, among other blah-blah was this (so yes, the error code was 0x0000007E as you expected):

Technical information:

*** STOP: 0x0000007E (0xFFFFFFFFC0000005, 0xFFFFF800026E24C5, 0xFFFFF880031FBDD8, 0xFFFFF880031FB640)
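
The STOP line can be decoded a little further. Bug check 0x7E is SYSTEM_THREAD_EXCEPTION_NOT_HANDLED, and its four parameters are the exception code, the address where the exception occurred, the exception record address and the context record address. The low 32 bits of the first parameter here, 0xC0000005, are STATUS_ACCESS_VIOLATION. This is a minimal Python sketch of that decoding; `decode_stop` is a hypothetical helper, and its lookup tables cover only the values seen in this thread.

```python
import re

STOP_RE = re.compile(r"STOP:\s*(0x[0-9A-Fa-f]+)\s*\(([^)]*)\)")

# Deliberately tiny tables: only the codes from this report.
BUGCHECK_NAMES = {0x7E: "SYSTEM_THREAD_EXCEPTION_NOT_HANDLED"}
NTSTATUS_NAMES = {0xC0000005: "STATUS_ACCESS_VIOLATION"}


def decode_stop(line):
    """Parse a BSOD 'STOP:' line into the bug check code and its parameters."""
    match = STOP_RE.search(line)
    if match is None:
        raise ValueError("no STOP code found")
    code = int(match.group(1), 16)
    params = [int(p.strip(), 16) for p in match.group(2).split(",")]
    result = {
        "bugcheck": code,
        "bugcheck_name": BUGCHECK_NAMES.get(code, "unknown"),
        "params": params,
    }
    if code == 0x7E and len(params) == 4:
        # Parameter 1 is a sign-extended 32-bit NTSTATUS value.
        status = params[0] & 0xFFFFFFFF
        result["exception_code"] = status
        result["exception_name"] = NTSTATUS_NAMES.get(status, "unknown")
        result["fault_address"] = params[1]
    return result


line = ("*** STOP: 0x0000007E (0xFFFFFFFFC0000005, 0xFFFFF800026E24C5, "
        "0xFFFFF880031FBDD8, 0xFFFFF880031FB640)")
info = decode_stop(line)
print(info["bugcheck_name"], info["exception_name"], hex(info["fault_address"]))
```

The decoded fault address only becomes meaningful once it is matched against the list of loaded driver modules, for example from a memory dump, which this sketch does not attempt.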

Sorry if I was not clear: I did not install Xen drivers 9.0 on Windows 7. What happened was that, while trying to make sure QWT works at least on Windows 10, I forgot to install Xen drivers 9.0 before installing QWT 4.1.67, so I ended up with Xen drivers 8.2.2 in Windows 10, and that is where I think I saw something about an unhandled exception in a system object.

marioabbott · August 7, 2023, 1:44am

An update, just in case someone is looking for a solution:

I removed the PV Networking driver completely from the Windows VM, and after removing it, the problem is gone. No more crashes. Now I can assign/remove any block device to/from the Windows VM and everything works just fine.

It seems that the PV Networking driver and the PV Block Storage driver don’t like each other. I have no intention of allowing Windows access to the internet, so removing the PV Networking driver makes sense to me, and it solved the problem.

If you are someone capable of fixing the Xen driver, maybe this gives you a hint about where to look for the cause of the crash.

gonzalo-bulnes · August 7, 2023, 9:31am

Thank you for reporting back @marioabbott!

I marked your post as the solution to make it easier to find for folks who have a similar issue. Please feel free to change that if it doesn’t feel correct.