Hi everyone, I think something went wrong around here… After researching this error, I saw that some people also experienced a similar problem, but always with a peculiarity that made their case unique. That’s why I am creating this topic.
After months of using Qubes 4.1.2 without any problems, I came across the following error screen after starting the system today:
This error appears exactly after I enter the password to decrypt the disk. I guarantee I’m not getting the password wrong. But I must say that the decryption process is taking a much longer time compared to when the system was working normally.
If I press Control-D to continue, this is what I get:
Some details that may be relevant:
Last night something quite strange happened. My keyboard just stopped working out of nowhere, something that had never happened before. As it was already the end of the day, I simply turned off the computer in the hope that when I started it today, the keyboard would work again naturally. How naive…
While I was researching the problem, I noticed in some instances that the cause may have something to do with running out of disk space. Well, again last night, I received a system alert message warning me that my disk was getting full. I don’t remember why, but at the time I thought that it was actually the space of the VM I was using that was getting full, so I simply increased the space allocated to that VM in its settings.
Again while I was researching, I noticed that when pressing Enter for maintenance, other people enter the dracut emergency shell (dracut:/#), whereas in my case I enter the sh-5.0# shell.
If the problem really is related to a lack of disk space, there is a particular VM that I’ve created that is taking up a considerable amount of space on the disk, but there will be no problem for me in simply deleting it. If it is possible to just delete a VM through this shell, perhaps this could be a solution, but I don’t know how to do that.
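For anyone in the same spot, deleting a VM's volumes from the emergency shell would look roughly like this. The volume group and VM names below ("qubes_dom0", "bigvm") are assumptions, so list your actual volumes first; this only frees the LVM space and does not clean up the Qubes VM metadata the way qvm-remove does in a booted system:

```shell
# Sketch only: names are assumptions -- check yours with lvs first.
# In the dracut/sh emergency shell the LVM tools may only be
# available through the "lvm" wrapper command.
lvm lvs qubes_dom0                          # list logical volumes and pools
lvm lvremove qubes_dom0/vm-bigvm-private    # the VM's private data volume
lvm lvremove qubes_dom0/vm-bigvm-root       # its root volume, if present
```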
Can you check the /run/initramfs/rdsosreport.txt log for more specific errors?
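For reference, viewing that log from the emergency shell could be as simple as the following (the grep keywords are just a guess at what to filter for):

```shell
# Skim the whole report:
less /run/initramfs/rdsosreport.txt
# Or filter it for likely failure keywords:
grep -iE 'error|fail|warn' /run/initramfs/rdsosreport.txt
```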
I don’t think that running out of space could cause qubes_dom0-root to stop existing.
On R3.2 or R4.0 I had a very similar situation. I ran out of space, and something bad was happening with the system after that. So I rebooted and found that LVM was broken; I managed to extract only part of the information from the LVM using a LiveCD, because the LVM was indeed corrupted.
All attempts to fix it in place failed and I had to reinstall the system completely. The drive’s SMART data showed no problems.
Hi, I’m back. So, I typed journalctl to access the system logs and noticed a familiar “Manual repair required!” warning, the same one other people who experienced this problem have reported.
About the rdsosreport.txt, I can say that I went through a tough time saving this file. First I tried to copy it directly to a USB drive, but no external device was being recognized. So I mounted the sda1 partition, where the EFI is located, and managed to copy the file onto it. Then I booted from a USB into a Qubes installation image and entered recovery mode. Through it I was able to mount the sda1 partition again and copy the file to another USB drive, which this time was recognized. Put like this it may sound easy, but the difficulty is having the insight into which steps to take.
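The steps above amount to something like the following; the device name /dev/sda1 is the one from my machine and may differ elsewhere:

```shell
# Mount the EFI system partition and park the report there
mkdir -p /mnt/efi
mount /dev/sda1 /mnt/efi
cp /run/initramfs/rdsosreport.txt /mnt/efi/
umount /mnt/efi
# Later, from the installer's rescue mode, mount /dev/sda1 again and
# copy the file onward to a USB drive that is recognized there.
```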
Anyway, it is not possible to share .txt files here, but perhaps it is possible via direct message?
Still on the problem, as suggested in other cases, I tried running the command "lvm lvconvert --repair qubes_dom0/root-pool", both in the supposed dracut environment (after trying to start Qubes) and in the Anaconda repair environment (on the Qubes live USB image), and both resulted in the same message:
Regarding my theory that the problem is a lack of disk space: using the lvremove command, I managed to remove that VM with the large allocated space I mentioned earlier, but the problem persists even after that.
Finally, it may be relevant to say that I managed to recover the files that were in the VMs by booting from a USB into a Linux Mint image. After starting Mint, the drive on which Qubes is installed appeared as available. I just double-clicked it, entered the password to decrypt the disk, and voila, the VMs showed up.
You’ve run out of space in dom0 (root-pool). All your VMs use vm-pool, so removing VMs is useless and won’t help with your problem. You need to free up the space in dom0.
If lvconvert --repair is not working, then maybe you need to first extend the root-pool and then try it again. But it’s just a guess.
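A guess-level sketch of that order of operations, assuming there are still free extents in the qubes_dom0 volume group (the +5G figure is arbitrary):

```shell
vgs qubes_dom0                            # check VFree for spare extents
lvextend -L +5G qubes_dom0/root-pool      # grow the thin pool's data area
lvconvert --repair qubes_dom0/root-pool   # then retry the repair
```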
Unfortunately the same error occurred when trying to extend the root-pool.
As you can see, after finding out that the root-pool had 20GiB allocated, I tried to extend the volume in two different ways, without success. Upon seeing the “Failed to activate” message, I then tried to manually activate the root-pool with lvchange -ay, but also without success.
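Roughly, the activation attempt and its follow-up check look like this:

```shell
lvchange -ay qubes_dom0/root-pool   # try to activate the thin pool
lvs -a qubes_dom0                   # the Attr column shows whether it came up
```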
Following your reasoning that all my VMs are using the vm-pool, I decided to try to extend the vm-pool volume, but I came across this strange warning message:
I mean, how can the sum of thin volume sizes exceed the size of the thin pools and the size of the whole volume group? I am guessing that this 360GiB refers to dynamically allocated (thin) volumes that are not necessarily completely filled?
Still, the vm-pool volume apparently increased successfully. I ran the vgdisplay command before and after extending the vm-pool and noticed that “Free PE / Size” actually decreased.
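For context, the before/after check went approximately like this (the +10G figure is illustrative, not the exact size I used):

```shell
vgdisplay qubes_dom0 | grep 'Free'    # note Free PE / Size before
lvextend -L +10G qubes_dom0/vm-pool   # extend the vm-pool thin pool
vgdisplay qubes_dom0 | grep 'Free'    # Free PE / Size is now smaller
```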
Finally, I tried running the repair command for root-pool one more time, but the same “Manual repair required!” error message pops up.
Hey @balko what were your attempts to try to fix this? With some luck maybe they can work for me.
I booted an Ubuntu LiveCD and, after some struggle, was able to manually mount the private LVM volumes of my qubes and extract part of the information. Unfortunately not all of it, but I had an up-to-date backup. I tried different ways of recovering LVM found on the internet, but they all failed.
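The rough shape of those recovery steps, with assumed names (sda2 for the LUKS partition, qubes_dom0 for the volume group, vm-work-private as an example private volume):

```shell
cryptsetup luksOpen /dev/sda2 qubes_luks   # prompts for the disk password
vgscan                                     # rescan for volume groups
vgchange -ay qubes_dom0                    # activate whatever LVs still can be
lvs qubes_dom0                             # see which volumes activated
mkdir -p /mnt/vm
mount -o ro /dev/qubes_dom0/vm-work-private /mnt/vm   # read-only, to be safe
```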
Since then I do not trust LVM that much, and I keep 50%+ free space on the Qubes OS disk, because I fear it can collapse unrecoverably when it runs out of space.
Now that this is no longer a single isolated case, you can report the issue on GitHub with your up-to-date details. I decided not to report it myself because I did not find identical cases, so I considered the possibility that the cause was a random hardware failure, or something rare like a cosmic ray, that just happened to strike the day I ran out of space. And by now I no longer recall the details needed for a report.