Thin pool corruption after over-allocation - VMs won't start, txn id mismatch

TreadingWater · January 14, 2026, 1:36pm

After going back and forth with Claude, this is the summary it generated from our discussion:

What Went Wrong:

Qubes OS refused to start 4 critical VMs (sys-net, sys-firewall, sys-whonix, wallet)
Initial errors: thin pool activation failures, then “snapshot already exists” conflicts
Root cause: Massive over-allocation - 480GB promised to VMs but only 85GB of actual thin pool space
Final state: Thin pool corrupted with transaction ID mismatch, preventing all repairs

Diagnosis Process:

Started with snapshot conflicts → tried to remove them
Hit pool activation failures → attempted refresh commands
Discovered only 9.6GB free in volume group (98% full)
Found 102 total volumes (way too many for a relatively fresh install)
Uncovered the real problem: 24 VMs × 20GB = 480GB allocated vs 85GB pool capacity
Thin provisioning worked until actual usage filled the 85GB, then pool corrupted
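
To make the over-allocation arithmetic above concrete, here is a minimal Python sketch. It only redoes the numbers from this summary (24 VMs, 20GB each, 85GB pool). The function name and the returned fields are made up for illustration, and nothing here reads real LVM state.

```python
def overcommit_report(vm_count, gb_per_vm, pool_gb):
    """Summarise how far a thin pool is over-committed.

    Hypothetical helper: all inputs are plain numbers typed in by hand,
    not values queried from LVM or Qubes.
    """
    provisioned_gb = vm_count * gb_per_vm          # total size promised to VMs
    ratio = provisioned_gb / pool_gb               # promised space per GB of real pool
    shortfall_gb = max(provisioned_gb - pool_gb, 0)
    # Average real usage per VM at which the pool would be completely full
    fill_point_gb = pool_gb / vm_count
    return {
        "provisioned_gb": provisioned_gb,
        "ratio": ratio,
        "shortfall_gb": shortfall_gb,
        "fill_point_gb": fill_point_gb,
    }


# Figures taken from the diagnosis above
report = overcommit_report(vm_count=24, gb_per_vm=20, pool_gb=85)
print(f"Provisioned: {report['provisioned_gb']} GB")
print(f"Over-commit ratio: {report['ratio']:.2f}x")
print(f"Shortfall if every VM filled up: {report['shortfall_gb']} GB")
print(f"Pool is full at about {report['fill_point_gb']:.2f} GB average per VM")
```

With these inputs the pool is promised more than five times its real capacity, so it would fill once the VMs averaged roughly 3.5GB of actual data each. This is a simplification: it assumes every volume is 20GB and ignores snapshots and pool metadata, which also consume space.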

What Went Right:

Successfully accessed dom0 and ran diagnostics
Identified the space exhaustion issue
Pool corruption caught before data operations attempted

Prognosis: Poor. The thin pool has corrupted metadata with transaction mismatches. Standard repair commands failed. Without backups:

Recovery options limited to rescue boot or Qubes forum expertise
May require reinstall with proper space planning
Some/all VM data likely unrecoverable

I’ll add that the system was not properly shut down. I suspended the machine by closing the lid yesterday and opened it to the disk encryption screen this morning. It had plenty of battery remaining. In the month or so since I began using it, I’ve never properly shut it down. Could accumulated suspend cycles without clean shutdowns have caused the snapshot accumulation and pool corruption?

Any help would be greatly appreciated. Thanks