Major Qubes malfunction - PCI device not available, won't boot on certain laptop. Back up VMs?

I have had a major issue with Qubes since, I believe, updating dom0 and then shutting down.

Qubes will boot on my old laptop but not on the current one. The current laptop never reaches the password screen, but it doesn’t throw any errors.

On my old laptop, I can get past the user login and all my AppVMs etc. are there, but when I try to open anon-whonix or sys-net, I get this error:

First question: is this fixable on the current SSD? That would be the best option.

Second question: Is a recovery like this a possibility for me? https://self-hosted.tools/p/recover-vms-from-broken-qubesos/

My last option would be to buy a new SSD and do a fresh install. Thankfully I have a somewhat up-to-date backup of my files, but this might take a while for other reasons.

Side question: Did this happen perhaps because I did not reboot dom0/the other updated qubes before turning off?

As always, any help appreciated.

Thanks a lot

From the screenshot, I’m guessing the network device on sys-net has gone missing or has changed its name. I’m not on 4.3, but on 4.2.4 you could open the Qube Manager and then the settings of the sys-net qube. On the Devices tab, remove the missing device and add the correct device if there is one. The rest of those qubes seem to be failing because the dependency chain is broken from the top.

The issue isn’t with the SSD; it’s related to the PCI device.

Let me give you a simple example:

In Qubes OS, suppose I plug in my USB Wi-Fi adapter and permanently assign it to sys-usb by going to sys-usb Settings → Devices and adding the device there.

Then I shut down the machine. Later, I power it on again but this time without the USB Wi-Fi adapter connected. Qubes OS fails to boot properly and doesn’t even let me log in. This happens because the device previously assigned to sys-usb is now missing, and sys-usb is set to autostart. As a result, it throws an error.

In your case, you need to remove the old, unused PCI device from sys-net, sys-firewall, and sys-whonix to resolve the startup failure.
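
To see which devices are tied to a qube before changing anything, something like the following could be run in a dom0 terminal. This is only a sketch: `devices_used_by` is a hypothetical helper (not a Qubes tool), and it assumes that `qvm-pci ls` prints one device per line with the name of the qube using it somewhere on that line.

```shell
# Hypothetical helper: keep only the lines of a device listing
# that mention the given qube name as a whole word.
devices_used_by() {
    grep -w -- "$1"
}

# In dom0 (assumption: each listed device shows the qube that uses it):
#   qvm-pci ls | devices_used_by sys-net
```

If the listing itself fails because of the missing device, this will not help, and the settings or command-line route is the only option.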

Hi,

When I click the settings of sys-net, the Qube Manager, as well as anything else that is open (settings of other qubes, for example), just closes.

I can click settings of sys-whonix and sys-firewall however.

From what I can tell it seems to be the WiFi device… I don’t use any special wifi adapter, just the one in the laptop.

Furthermore, I did back up an AppVM, seemingly successfully. However, when I opened the zipped backup on another machine, all it contained was a backup-header file, despite the zip being 2 GB.

Any ideas?

I would do that, as it would at least mean I can, as a last resort, open the qubes to restore my data (backup via dom0 doesn’t seem to be working for some reason), but if I open the settings of sys-net via Qube Manager, everything closes!

Very odd

It seems to be a bug

According to this bug report you could try
sudo qvm-pci unassign sys-net
in dom0 and then adding the correct network device through Qube Manager settings.
I don’t know if a more exact method works:
sudo qvm-pci detach sys-net dom0:00_14.3:0x8086:0x7740::p028000
I tried to copy the device name from the error report. Double check it before running please.

Hi, thanks for linking that, the unassign did allow me to open the settings of sys-net!

Now for probably a stupid question: How do I remove that device? All I see under Devices is “available devices” on the left and “devices always connected to this qube” on the right.

I can’t see a remove option anywhere.

Thank you

It’s already removed. That is what the command did. Now you need to find the correct device from the available devices if there is one.

EDIT:
On 4.2.4 you click the right arrow to move the device from the available devices to the always connected ones, and the left arrow to remove it.

Oh, I didn’t run the detach command, just in case. I thought I could remove it from settings.

All my devices are in the left column. There are no devices in the always connected column.

Use the right arrow to move the correct device from the available devices to the always connected devices. The unassign command removed all the always connected devices from sys-net.

After doing

 sudo qvm-pci unassign sys-net

I was able to at least open my AppVMs’ data, which was good. I then stupidly added all the devices back to the sys-net always connected list (right column).

Now I cannot get past the user login on Qubes (I can still enter the disk passphrase and get to that screen).

Is there any way to do the above command before it boots? So I can at least do a fresh Qubes install and transfer my data over.

Thank you

I’ve never had to do it so I don’t know, but there is this:

But please stop gunning everything without thinking. Read the information at the link. If you get in, unassign the devices from the sys-net again and try to add the one, exact and correct network device. If you don’t recognize it from the devices list, don’t add any device.

Hi,

qubes.skip_autostart

didn’t change anything; it still just reboots after I enter the disk password. Perhaps the documentation is outdated, as the only thing I changed was the sys-net devices.

I was able to boot with 0 qubes starting up so this must be the case.

It sounds unlikely that qubes.skip_autostart does not prevent autostart, unless there is something very wrong with Dom0 or the initrd.

  • Did you understand that it must be present at every boot, until you have fixed your PCI attachment problem?
  • Did you make sure to put a space before it, but only one space?
  • You can delete the “rhgb” and “quiet” items, so that more information is shown after the disk is decrypted.
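
If the machine does reach a dom0 terminal, one way to confirm the parameter was really picked up for that boot is to look at the kernel command line. A minimal sketch, assuming dom0's `/proc/cmdline` reflects the line edited in the boot menu; `has_skip_autostart` is a hypothetical helper name:

```shell
# Succeeds only if qubes.skip_autostart appears as its own
# space-separated word in the given command line string.
has_skip_autostart() {
    case " $1 " in
        *" qubes.skip_autostart "*) return 0 ;;
        *) return 1 ;;
    esac
}

# After booting, in a dom0 terminal:
#   has_skip_autostart "$(cat /proc/cmdline)" && echo "present" || echo "missing"
```

A parameter glued to the previous word (no space before it) would be reported as missing, which is the mistake the second question above is about.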

[ It’s probably best to avoid making unnecessary new threads… ]

Hi,

I understand. I have tried it multiple times and keep failing to get to the user login.

I put one space.

I don’t seem to have rhgb or quiet items on the pre-boot menu.

Are you hiding any pci devices in the grub file? If so and the device is not present at that address it will cause major problems.

Unfortunately, there are now three threads relating to this single problem, and the latest one only links back to this oldest one - half hiding the second one.

Normally, better help comes when people understand the full history, and if you feed back lots of detail of what you see when you follow the suggestions.

The issue seems to be that the PCI device identifiers have changed in the most recent kernel.

It is probable that only three simple steps are necessary to resolve this problem.

  1. Use the solution in the second thread to prevent sys-net autostart (see below for a summary)
  2. Remove attached PCI devices from qubes that fail to start, using the Settings for the qubes.
  3. When it is possible to reboot the whole machine with no PCI-related failures and without disabling autostart, add only the single device to sys-net, using the recommendation of @mnfTgh in this first thread at comment 12.

In previous cases, I have found that those steps are the correct solution when kernel updates cause problems like this.

If you try them, make careful notes of every step you do, in case you get blocked.

For 1, follow the Autostart troubleshooting instructions

  1. Press the E key on the first prompt (or your custom prompt). It may say “Select operating system”.
  2. Then, press the down arrow key multiple times to reach the line starting with module2 /vmlinuz.
  3. Use the right arrow to go to the end of the line, which might be off the screen.
  4. Add a space and then qubes.skip_autostart at the end of the line.
  5. Press Ctrl-x or F10 to boot.

For 2, the Settings app is simple. sudo qvm-pci unassign sys-net can also work to remove the assignment of wrong devices.

For 3, take care at first to add only the necessary devices to sys-net. qvm-start sys-net can be used to test.
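
Steps 2 and 3 could be sketched as a pair of small shell functions for a dom0 terminal. This is an illustration, not a tested recipe: the function names and the `QVM_PCI` / `QVM_START` variables are made up here so the sequence can be dry-run with `echo`, and only commands already mentioned in this thread (plus `qvm-pci ls`) are used.

```shell
# Commands are kept in variables so they can be replaced with "echo"
# for a dry run. Relies on sh/bash word splitting of unquoted variables.
QVM_PCI="${QVM_PCI:-sudo qvm-pci}"
QVM_START="${QVM_START:-qvm-start}"

# Step 2: drop every PCI assignment from the qube, then show the device
# list so the single correct device can be identified.
clear_pci_assignments() {
    $QVM_PCI unassign "$1" && $QVM_PCI ls
}

# Step 3: after re-adding only the necessary device by hand (for example
# in the qube's Settings), try starting the qube and report the result.
try_start() {
    if $QVM_START "$1"; then
        echo "$1 started"
    else
        echo "$1 failed to start; recheck its devices"
        return 1
    fi
}

# Dry run:  QVM_PCI=echo; clear_pci_assignments sys-net
# For real: clear_pci_assignments sys-net, add the one device, then try_start sys-net
```

Keeping the manual re-add between the two functions is deliberate: nothing here guesses which device is the right one.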

If there is a problem, then it will be necessary to troubleshoot.

  • Maybe it will be necessary to use “qubes.skip_autostart” again.
  • There are some ideas in the pci troubleshooting document.
  • If something is not clear, then get help before making big changes.