A clean install of Qubes 4.3 with restored VMs, on a machine I've run Qubes on for the past 3 years, had been resuming from suspend without a problem for weeks. Tonight it didn't come back from suspend, and I had to perform a hard shutdown to use the computer again.
TL;DR: I'm not sure what happened, why reverting to the oldest /var/lib/qubes/backup/qubes-* copy lost all of my VMs while the second-oldest backup restored them, or whether any of the warnings lvconvert throws are clues, given that my vm-pool has plenty of space.
The time between decrypting the drive and getting the login screen seemed unusually long. Upon logging in, no VMs were visible and sys-usb didn't auto-start. Any qvm-* command run from dom0 would throw errors similar to those in Qubesd fails to start ANY qube. (I saw similar errors during a failed attempt to upgrade from Qubes 4.2.4 to 4.3; I ended up resolving that by performing a clean installation of 4.3 and restoring my VMs from backup.)
I forget where I came across the idea, but I moved dom0's current /var/lib/qubes/qubes.xml file to /var/lib/qubes/backup/ and copied the oldest backup from /var/lib/qubes/backup/ to /var/lib/qubes/qubes.xml (roughly the commands sketched below).
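For reference, it amounted to something like this in a dom0 terminal; I don't have the exact timestamp of the oldest backup file in front of me, so the name is a placeholder:

```
# set the current (possibly broken) qubes.xml aside
sudo mv /var/lib/qubes/qubes.xml /var/lib/qubes/backup/
# restore the oldest automatic copy (placeholder filename)
sudo cp /var/lib/qubes/backup/qubes-<oldest-timestamp> /var/lib/qubes/qubes.xml
```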
This seemed to resolve the issue with qvm-* commands, but now dom0 was the only VM visible in the Qube Manager; all of my other qubes were gone.
I searched the forums, and Hard shutdown broke my system most closely described the circumstances of my issue, but I didn't have any transaction ID errors.
I tried to run the following commands from that post:
```
sudo lvconvert --repair qubes_dom/vm-pool
  WARNING: Sum of all thin volume sizes (1.42 TiB) exceeds the size of thin pools and the size of whole volume group (<475.34 GiB).
  WARNING: You have not turned on protection against thin pools running out of space.
  WARNING: Set activation/thin_pool_autoextend_threshold below 100 to trigger automatic extension of thin pools before they get full.
  WARNING: LV qubes_dom0/vm-pool_meta holds a backup of the unrepaired metadata. Use lvremove when no longer required.

sudo vgchange -ay
  175 logical volume(s) in volume group "qubes_dom" now active

sudo lvconvert --repair qubes_dom/vm-pool
  Cannot repair active pool qubes_dom0/vm-pool. Use lvchange -an first.

sudo lvchange -an
  No command with matching syntax recognised. Run 'lvchange --help' for more information.

sudo lvextend qubes_dom0/vm-pool --poolmetadatasize +128M
  Size of logical volume qubes_dom0/vm-pool_tmeta changed from 104.00 MiB (26 extents) to 232.00 MiB (58 extents).
  WARNING: Sum of all thin volume sizes (1.42 TiB) exceeds the size of thin pools and the size of whole volume group (<475.34 GiB).
  WARNING: You have not turned on protection against thin pools running out of space.
  WARNING: Set activation/thin_pool_autoextend_threshold below 100 to trigger automatic extension of thin pools before they get full.
  Logical volume qubes_dom0/vm-pool successfully resized.
```
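In hindsight, I suspect the `lvchange -an` failure was just me leaving off the volume name. If I read the error right, it wanted the pool itself as an argument, something like the sketch below; I haven't re-tried this since things came back, so treat it as a guess rather than a verified fix:

```
# guess: deactivate the thin pool so --repair can run on it, then reactivate
# (may also require first deactivating the thin volumes that use the pool)
sudo lvchange -an qubes_dom0/vm-pool
sudo lvconvert --repair qubes_dom0/vm-pool
sudo vgchange -ay qubes_dom0
```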
I don't get the sense any of these commands really did anything. I reverted to the second-oldest backup in /var/lib/qubes/backup/ (same copy procedure as before) and rebooted to see if it would make a difference.
This seemed to restore all of my VMs and gave me a chance to take another backup, document my adventures, and try to investigate each of these warnings. I'm not even sure the warnings are related. The disk usage tray widget indicates vm-pool has plenty of room for data and metadata.
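If anyone wants numbers rather than the widget, I believe the pool's actual data/metadata usage can be read from a dom0 terminal with something like (pool name taken from the lvconvert output above):

```
# report how full vm-pool's data and metadata areas actually are
sudo lvs -o lv_name,lv_size,data_percent,metadata_percent qubes_dom0/vm-pool
```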
WARNING: Sum of all thin volume sizes (1.42 TiB) exceeds the size of thin pools and the size of whole volume group (<475.34 GiB).
- How to compare dom0 snapshots, to find out possible malware / compromise? indicated:
  > The warning you get is a friendly reminder that the sum of all thin LV sizes can exceed the size of the Volume Group. Thin LVs are logical volumes that are not fully provisioned (the difference between thin LVs and plain LVs); "thin" means only the consumed space is taken from the Volume Group. Basically, this is telling you that if you consumed all the space assigned to the logical volumes, the Volume Group would be overfilled.
  > Thin LVs are copy-on-write (CoW) volumes: creating a snapshot costs nothing, but whatever changes happen on top of a snapshot, or on the original volume that was snapshotted, diverge later on. Once those volumes start to change, that is where the storage cost begins to show: the divergence between them grows as fast as the block-level delta between the snapshots.
- This made it seem more like information than an actual warning (quick sanity check below).
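If I understand that correctly, the 1.42 TiB in the warning is just the sum of the virtual sizes of the thin volumes, which thin provisioning lets exceed the ~475 GiB volume group. A rough way to see the two numbers side by side (a sketch, not something I've scripted carefully):

```
# physical space in the volume group vs. what's free
sudo vgs -o vg_name,vg_size,vg_free qubes_dom0
# virtual (provisioned) sizes of the volumes backed by vm-pool
sudo lvs -o lv_name,lv_size,pool_lv qubes_dom0
```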
WARNING: You have not turned on protection against thin pools running out of space.
- I couldn't find any relevant forum posts or guidance on this, but it does seem like a good idea to enable if possible.
WARNING: Set activation/thin_pool_autoextend_threshold below 100 to trigger automatic extension of thin pools before they get full.
- I found some posts recommending adjusting the thin_pool_autoextend_threshold variable, but the relevant settings are all commented out in dom0's /etc/lvm/lvm.conf file, so I'm not sure whether I should uncomment them (rough sketch after this list).
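For completeness, the change those posts describe looks roughly like the snippet below. The values are just the examples I saw, not something I've tested on this machine; as I understand it, autoextend also needs free space in the volume group, and the lvm2-monitor/dmeventd service has to be running for it to trigger.

```
# /etc/lvm/lvm.conf in dom0 -- inside the existing "activation" section
activation {
    # start auto-extending the thin pool once it is 70% full...
    thin_pool_autoextend_threshold = 70
    # ...growing it by 20% of its size each time
    thin_pool_autoextend_percent = 20
}
```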
Has anyone else experienced anything similar: qubesd breaking or all VMs disappearing after a hard shutdown from suspend? Does anyone have recommendations to prevent this from happening, or any repair commands I should consider?