Hard shutdown broke my system

fsflover · June 3, 2022, 6:46pm

After a hard shutdown, my Qubes OS 4.1 broke. It would not let me start any VMs, failing with an error something like

with different numbers.

After trying these commands (I shouldn’t have done it!)

Help! Are my qubes gone after lvm resize?

vgcfgbackup qubes_dom0
# Then edit the backup with vim and replace transaction ID for the vm-pool logical volume
# Restore the new metadata
vgcfgrestore qubes_dom0 --file /etc/lvm/backup/qubes_dom0
# Need --force
vgcfgrestore qubes_dom0 --file /etc/lvm/backup/qubes_dom0 --force
# Need to deactivate vm-pool_tmeta and vm-pool_tdata before activating vm-pool
lvchange -a n qubes_dom0/vm-pool_tmeta
lvchange -a n qubes_dom0/vm-pool_tdata
# Finally activate vm-pool
lvchange -a y qubes_dom0/vm-pool

I completely broke the boot and it is now stuck at

Job is running for LVM event activation on device 253:0 (10min 10s / no limit)

Finally, as I feared, I reached a point at which qvm-volume-revert cannot save me. Please help.

51lieal · June 3, 2022, 10:07pm

Add rd.break to your kernel parameters,
then try manually:

cryptsetup luksOpen /dev/$drive luks
vgchange -ay qubes_dom0
lvchange -ay /dev/qubes_dom0/root
lvchange -ay /dev/qubes_dom0/swap
lvchange -ay /dev/qubes_dom0/vm
swapon /dev/qubes_dom0/swap
exit

Will that do?

scallyob · June 6, 2022, 3:31pm

Sounds like you made some alterations that I cannot help you with.

Just mentioning that hard shutdowns have been causing me problems too. I get back in by rebooting multiple times. Once I get past the login with no qubes/VMs launching, I open a dom0 terminal and run:

sudo qubesd

Then reboot some more.

airelemental · June 6, 2022, 5:15pm

As someone who hard-shutdowns every other day, this makes me nervous…

fsflover · June 6, 2022, 6:28pm

Thank you @51lieal for the prompt reply and for trying to help!

All I get is

cryptsetup: command not found
vgchange: command not found
lvchange: command not found
swapon: command not found

By the way, the password for the hard drive is correctly accepted before all this.

Yes, it is concerning that Qubes is so fragile with respect to hard shutdowns…
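
A note on the "command not found" output above: in a dracut emergency shell the LVM tools are often reachable only as subcommands of a single lvm binary (for example lvm vgchange -ay), and cryptsetup may not be included at all. That is an assumption about the initramfs, not something confirmed in this thread. A minimal sketch that checks which form is available; lvm_cmd is a hypothetical helper name:

```shell
#!/bin/sh
# Sketch only: lvm_cmd is a hypothetical helper, not part of LVM or Qubes.
# Prints the command line to use for an LVM subcommand such as "vgchange":
# the standalone tool if it is on PATH, else "lvm <subcommand>" if the lvm
# binary exists. Returns non-zero (and prints nothing) if neither is found.
lvm_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1"
    elif command -v lvm >/dev/null 2>&1; then
        echo "lvm $1"
    else
        return 1
    fi
}

# Report what is usable in the current shell.
for tool in vgchange lvchange; do
    if cmd=$(lvm_cmd "$tool"); then
        echo "$tool: use '$cmd'"
    else
        echo "$tool: not available in this shell"
    fi
done
```

If neither form is found, booting a live or rescue system is probably the simpler route.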

tzwcfq · June 6, 2022, 7:07pm

Following my intuition I would try to do this:

Boot from some Live USB and mount Qubes.
Assuming /dev/nvme0n1p3 is your Qubes LUKS partition.

# Decrypt LVM and mount dom0
sudo cryptsetup open /dev/nvme0n1p3 qubes_dom0
sudo vgchange -ay
sudo mount /dev/mapper/qubes_dom0-root /mnt
# Edit backup and change transaction ID to what it was before
sudo cp /mnt/etc/lvm/backup/qubes_dom0 /mnt/etc/lvm/backup/qubes_dom0.backup
sudo nano /mnt/etc/lvm/backup/qubes_dom0
# Restore the new metadata
sudo vgcfgrestore qubes_dom0 --file /mnt/etc/lvm/backup/qubes_dom0
# Need --force
sudo vgcfgrestore qubes_dom0 --file /mnt/etc/lvm/backup/qubes_dom0 --force
sync

Reboot.

fsflover · June 13, 2022, 9:17am

Thank you @tzwcfq for trying to help! Unfortunately it did not seem to solve the problem. The second command vgchange -ay took a very long time (about an hour) and printed a bunch of these errors:

Thin pool qubes_dom0-vm--pool-tpool (253:9) transaction_id is 8558, while expected 8560.

Then I could mount the partition, but file /mnt/etc/lvm/backup/qubes_dom0 does not contain any mentions of 8558 and the first transaction_id is actually 8560 (and later, many other numbers, too). So I don’t understand what I should change in this file. Subsequent reboot without changes did not solve the problem.
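
For reference, a minimal sketch of pulling the two numbers out of that error message. As far as I understand LVM's wording, "transaction_id is N" is what the thin pool itself reports and "expected N" is what the LVM metadata holds, which matches 8560 being the value in the backup file. The function names are hypothetical:

```shell
#!/bin/sh
# Sketch: extract the reported and expected transaction IDs from an LVM
# thin-pool mismatch message. Function names are hypothetical.

# ID the thin pool reports ("transaction_id is N,").
actual_tid() {
    printf '%s\n' "$1" | sed -n 's/.*transaction_id is \([0-9][0-9]*\),.*/\1/p'
}

# ID the LVM metadata expects ("while expected N").
expected_tid() {
    printf '%s\n' "$1" | sed -n 's/.*while expected \([0-9][0-9]*\).*/\1/p'
}

msg='Thin pool qubes_dom0-vm--pool-tpool (253:9) transaction_id is 8558, while expected 8560.'
echo "pool reports: $(actual_tid "$msg")"
echo "LVM metadata expects: $(expected_tid "$msg")"
```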

By the way, I had to perform a couple of impossible tasks for that: choosing an ultimately trusted USB stick and an ultimately trusted live distro… Is reinstalling always the more secure option in such cases (so that losing data might even be worth it)?

unman · June 13, 2022, 12:01pm

In my experience Qubes is extremely forgiving of hard shutdowns, and not
at all fragile.
It isn’t (of course) recommended.

tzwcfq · June 13, 2022, 12:01pm

You can try to follow this guide:
https://blog.monotok.org/lvm-transaction-id-mismatch-and-metadata-resize-error/
After you create volume config backup with:
vgcfgbackup vg-fed -f /home/<user>/backup
Open the file /home/<user>/backup and change the transaction_id in the qubes_dom0 { logical_volumes { vm-pool { segment1 { transaction_id = XXX } } } } block:

qubes_dom0 {
...
        physical_volumes {

                pv0 {
...
                }
        }

        logical_volumes {

                vm-pool {
...
                        segment_count = 1

                        segment1 {
...

                                type = "thin-pool"
                                metadata = "vm-pool_tmeta"
                                pool = "vm-pool_tdata"
                                transaction_id = XXX
...
                        }
                }

Change vg-fed to qubes_dom0 and thin-fed to vm-pool in the guide's commands.
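
If you would rather not edit the file by hand, here is a minimal sketch of the same change with awk. It rewrites only the first transaction_id inside the named pool block, since thin volumes in the same file carry their own transaction_id lines. set_pool_tid and the sample volume names are hypothetical, and the sample file is a cut-down stand-in for a real backup:

```shell
#!/bin/sh
# Sketch: set_pool_tid FILE POOL NEW_ID prints FILE with the pool's
# transaction_id replaced. Hypothetical helper; it writes to stdout and
# never modifies FILE itself.
set_pool_tid() {
    awk -v pool="$2" -v tid="$3" '
        # Enter the pool block on a line of the form "<pool> {".
        $1 == pool && $2 == "{" { in_pool = 1 }
        # Rewrite only the first transaction_id seen after that line.
        in_pool && $1 == "transaction_id" {
            sub(/transaction_id = [0-9]+/, "transaction_id = " tid)
            in_pool = 0
        }
        { print }
    ' "$1"
}

# Cut-down sample; vm-work-private is a made-up thin volume name.
sample=$(mktemp)
cat > "$sample" <<'EOF'
qubes_dom0 {
        logical_volumes {
                vm-pool {
                        segment1 {
                                type = "thin-pool"
                                transaction_id = 8560
                        }
                }
                vm-work-private {
                        segment1 {
                                type = "thin"
                                transaction_id = 8559
                        }
                }
        }
}
EOF

set_pool_tid "$sample" vm-pool 8558
```

Redirect the output to a new file and diff it against the original before passing it to vgcfgrestore.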

It’s good practice to keep a trusted USB stick with a live system for system rescue.
Since you’ll need to create a trusted USB stick with the Qubes installer ISO to reinstall the system anyway, you can use that same stick for rescue: boot from it and choose Troubleshooting → Rescue a Qubes OS system in GRUB.

fsflover · June 17, 2022, 5:53pm

Thank you very much, my problem is solved now! Pure magic!

The problem is (or was) that the transaction_id was already correct (if you trust the error message from vgchange -ay). So I changed it to the “wrong” one, 8558. I guess that’s not why the problem got solved.

lvextend qubes_dom0/vm-pool --poolmetadatasize -L+128M

This gave me a syntax error, however this

lvextend qubes_dom0/vm-pool --poolmetadatasize +128M

worked.
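
On the syntax error: --poolmetadatasize takes its size argument directly, while -L is a separate option for the size of the volume itself, so "-L+128M" cannot serve as its value. The thread does not say whether the pool metadata was actually full, but if you want to check before extending, a minimal sketch could look like this. metadata_nearly_full is a hypothetical helper, the 80 percent threshold is arbitrary, and the sample value is made up:

```shell
#!/bin/sh
# Sketch: succeed when metadata usage (a percentage such as "93.41") is
# at or above a threshold. Hypothetical helper; the threshold is arbitrary.
metadata_nearly_full() {
    awk -v used="$1" -v limit="$2" 'BEGIN { exit !(used + 0 >= limit + 0) }'
}

# On the real system (needs root and the LVM tools) the value could come from:
#   used=$(lvs --noheadings -o metadata_percent qubes_dom0/vm-pool | tr -d ' ')
used="93.41"   # made-up sample value for illustration

if metadata_nearly_full "$used" 80; then
    echo "metadata ${used}% used: consider lvextend --poolmetadatasize +128M qubes_dom0/vm-pool"
else
    echo "metadata ${used}% used: extending is probably not needed"
fi
```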

Then I tried

lvconvert --repair qubes_dom0/vm-pool

and, after some waiting, it fixed the problem for me. I can boot and even the VMs can start. Booting however takes a bit more time for some reason with a slight delay at

Job is running for LVM event activation on device 253:0

I guess I will just back up the qubes and reinstall the OS, just in case.

Div · September 30, 2023, 5:38pm

This is such a fantastic solution. Worked like a charm! Thank you!

qinix · October 2, 2023, 3:37pm

This “transaction-id is xxxx, while expected yyyy” issue just occurred here. Unfortunately, that blog post is down, but it is in the archives. So the solution was:

sudo vgcfgbackup qubes_dom0 -f /home/username/backup

Change transaction_id in the backup file as given above. Then:

sudo vgcfgrestore qubes_dom0 -f /home/username/backup --force

sudo lvchange -ay qubes_dom0/root

Reboot.