Vault VM not starting any more after disk space issues, lost 400GB data

Hi everyone, I think I’ve broken my vault VM and lost all 400GB of data :frowning: and I don’t know what to do now. I tried to find a solution but couldn’t find anything. Please help me restore my data.

Scenario of the issue:

  1. I got an “Out of disk space” error (yellow triangle warning saying 5% space left),
    so I started deleting the biggest files from vault, but it didn’t help.
  2. As that didn’t help, I thought it was safe to just put the data back, so I copied more data into vault. It was too much: the yellow triangle warning suddenly said there was 0% space left, 100% used. I then wanted to remove some data from vault, but that didn’t work because I got an error saying “file system is read only”.
  3. I restarted the whole system, and now when I try to start the vault VM it doesn’t start: the CPU sits at 50% and RAM at 400MB (the initial value set in the vault VM settings). After a few minutes the whole system freezes and only a cold restart helps.

I don’t have any issues with other VMs, I can run them.

Is there a possibility to get access to Vault VM from another VM or dom0, just to restore data?
Is there any possibility to run Vault VM?
I am scared to change anything now, but maybe changing settings will help? like giving more initial memory and increasing private storage max size?

please help me
thank you in advance!

Increasing vault private storage may help.
But in some cases it won’t help and you may need to fix the vault filesystem from dom0. First shut down vault, then run this command in dom0:
sudo fsck -f /dev/mapper/qubes_dom0-vm--vault--private
But the first thing I’d do is back up the data from vault’s private storage, and only after that try to recover the vault storage:
In dom0:
sudo losetup -f /dev/mapper/qubes_dom0-vm--vault--private
A new block device will be available as dom0:loopX. Attach it to some qube, then attach a drive to that qube that you can back up your data to.
After you have copied all the data from vault’s private storage and detached the block device dom0:loopX from your qube, run this command in dom0 to remove the loop device:
sudo losetup -d /dev/loopX
Now you can try to recover your vault storage as I’ve advised at the beginning.
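
The order of the steps above (back up first, repair second) can be sketched as a small script for dom0. It only prints the commands, it does not run them. The helper names `mapper_path` and `print_recovery_plan` and the qube name `backup-qube` are made up for illustration, the volume group is assumed to be the default `qubes_dom0` as in the commands above, and `loopX` has to be replaced with whatever `losetup --show` actually reports.

```shell
#!/bin/sh
# Sketch only: prints the dom0 commands in the order described above.
# Assumptions: default volume group "qubes_dom0"; "backup-qube" is a
# hypothetical qube name; loopX stands for whatever losetup reports.

# Device-mapper doubles every hyphen inside an LVM volume name,
# so LV "vm-vault-private" shows up as "vm--vault--private".
mapper_path() {
    # $1 = qube name, $2 = volume (private or root)
    printf '/dev/mapper/qubes_dom0-%s\n' \
        "$(printf 'vm-%s-%s' "$1" "$2" | sed 's/-/--/g')"
}

print_recovery_plan() {
    # $1 = broken qube, $2 = qube that will receive the loop device
    dev=$(mapper_path "$1" private)
    cat <<EOF
qvm-shutdown --wait $1
sudo losetup -f --show $dev
qvm-block attach $2 dom0:loopX
# ... copy the data out inside $2, then unmount there ...
qvm-block detach $2 dom0:loopX
sudo losetup -d /dev/loopX
sudo fsck -f $dev
EOF
}

print_recovery_plan vault backup-qube
```

Printing the plan instead of executing it is deliberate: on a damaged volume it is probably safer to read each command and run it by hand.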

thank you very much for your answer tzwcfq!

After running sudo losetup -f /dev/mapper/qubes_dom0-vm--vault--private I can see a new dom0:loop1 device under Qubes Devices, and I’ve attached it to the VM, but now I don’t know where to find the data. Could you please tell me the file path to the data I should copy?

You can look for available devices in VM:
sudo fdisk -l
When you connect a block device to a VM it shows up there as a /dev/xvd* device, starting from /dev/xvdi and continuing in alphabetical order, so the next device will be /dev/xvdj …
So you can mount your vault storage like this:
sudo mount /dev/xvdi /mnt
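
For the copy itself, a cautious variant might look like the sketch below, run inside the qube that received the device. It assumes the vault volume appeared as /dev/xvdi and that the backup drive is already mounted at /mnt/backup, which is a made-up path. `copy_out` and the `DRY_RUN` switch are illustrative names too. The volume is mounted read-only here to avoid accidental writes while copying.

```shell
#!/bin/sh
# Sketch only, to be run inside the qube that received the block device.
# Assumptions: the vault volume appeared as /dev/xvdi and the backup drive
# is already mounted at the hypothetical path /mnt/backup.

# With DRY_RUN=1 (the default here) commands are printed, not executed.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

copy_out() {
    # $1 = attached device, $2 = destination directory
    run sudo mount -o ro "$1" /mnt    # read-only mount to avoid accidental writes
    run sudo cp -a /mnt/. "$2"/       # copy everything, keeping owners and timestamps
    run sudo umount /mnt
}

copy_out /dev/xvdi /mnt/backup
```

Setting `DRY_RUN=0` would execute the commands instead of printing them.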

thank you very much tzwcfq!

  1. I can see the data. Is it possible that it is corrupted? Or would it be just the one file that was being copied when the disk ran out of space?
  2. If I can see the data but cannot run the VM, what exactly happened?
  3. Is it worth restoring the VM, or should I just delete it and create a new one with the same name?
  4. How can I prevent this from happening again? I remember I lost one VM in the past but didn’t know why; now I understand it was the same issue.

The data shouldn’t be corrupted, except for files that were being copied when the space ran out.

You can connect to your VM’s console from dom0 and see what error it is giving:
qvm-console-dispvm vault

It’s worth just restoring it.

Maybe install some software that periodically checks your disk usage and notifies you when free space is low. Or just write a script that starts at startup, periodically checks your disk space, and sends a notification if it’s low:
notify-send --icon=dialog-warning "$(hostname) disk space is low!"
Then add this script to the template so it starts at boot in all your qubes.
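
A minimal sketch of such a script, built around the notify-send line above, could look like this. It assumes the private volume is mounted at /rw, and the 90% threshold and 5-minute interval are just example values. The function names are made up.

```shell
#!/bin/sh
# Sketch only. Assumptions: the private volume is mounted at /rw, a 90%
# threshold, and a 5-minute interval; all three are illustrative values.

# Print the used percentage (digits only) of the filesystem holding $1.
usage_percent() {
    df -P "$1" | awk 'NR == 2 { sub("%", "", $5); print $5 }'
}

# Succeed when the used percentage $1 has reached the threshold $2.
is_low_on_space() {
    [ "$1" -ge "$2" ]
}

monitor() {
    # $1 = mount point, $2 = threshold in percent, $3 = seconds between checks
    while :; do
        used=$(usage_percent "$1")
        if is_low_on_space "$used" "$2"; then
            notify-send --icon=dialog-warning \
                "$(hostname) disk space is low! ($1 is ${used}% full)"
        fi
        sleep "$3"
    done
}

# Start it from a session autostart entry, for example:
# monitor /rw 90 300
```

The `monitor` call is left commented out so that sourcing the file has no side effects. How the script gets started at login depends on the template and desktop environment.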

thank you @tzwcfq !
I’ve increased private storage by 10GB and started vault, but it didn’t help.
The vault console shows:
“A start job is running for Initiali…rw and /home (5min 5s / no limit)” step
EXT4-fs (xvdb): error count since last fsck: 110
EXT4-fs (xvdb): initial error at time 1655… : ext4_ext_map_blocks:4310: inode 288…
“A start job is running for Initiali…rw and /home (8min 5s / no limit)” step

Run fsck for vault private storage then.

It fixed the file system and now I am able to run vault, but I’ve noticed I cannot run another VM because of “cannot connect to qrexec agent for 60 seconds…”. Should I fix it the same way as vault?

You can connect to its console and check for errors. If it shows EXT4-fs errors, you can fix it the same way.

yes, I can see error “Failed to start, File System Check on Root Device”

I am backing up the 2nd broken VM but cannot see the browser bookmarks anywhere. Is it possible to find and back up bookmarks from dom0/loopX? Or should they not be affected?

What type of VM is this? Is it a StandaloneVM or an AppVM? If it’s an AppVM, what template is it based on? What browser do you have there?

I ran the command sudo fsck -f /dev/mapper/qubes_dom0-vm--2nd_VM_NAME--private

it showed:
Pass 1:…
Pass 2:…
Pass 3:…
Pass 4:…
Pass 5:…
/dev/mapper/… (2.9% non-contiguous)…

I didn’t get any message about fixing anything like before (for vault), and when I try to run this 2nd VM it doesn’t start: it just shows the “cannot connect to qrexec agent for 60 seconds…” error and shuts itself down. The logs still show “Failed to start, File System Check on Root Device”

it’s standalone with a few different browsers, eg. FF and Brave

If it’s standalone then you need to fix its root storage, as the error says:

Run this command:
sudo fsck -f /dev/mapper/qubes_dom0-vm--2nd_VM_NAME--root
And your /home directory seems to be on the root storage as well, since it seems you didn’t use the private storage for your /home directory.

would it be possible to restore browser bookmarks before this operation?

Mount root storage:

I tried to run it but got error:
ext2fs_open2: Bad magic number in super-block
fsck.ext2: Superblock invalid, trying backup blocks …
fsck.ext2: Bad magic number in super-block while trying to open /dev/mapper/qubes_dom0-vm--2nd_VM_NAME--root
The superblock could not be read or does not describe a valid ext2/ext3/ext4 filesystem…you might try running e2fsck with an alternate superblock:
e2fsck -b 8193
or
e2fsck -b 32768
Found a gpt partition table in /dev/mapper/qubes_dom0-vm--2nd_VM_NAME--root

My bad, I forgot that the root volume has multiple partitions.
Before running fsck you need to run kpartx.
Check what’s the root partition number on your VM storage:
sudo fdisk -l /dev/mapper/qubes_dom0-vm--2nd_VM_NAME--root
For example, say it’s the 3rd partition. Then run kpartx:
sudo kpartx -a /dev/mapper/qubes_dom0-vm--2nd_VM_NAME--root
Then run fsck for your 3rd partition in this example:
sudo fsck -f /dev/mapper/qubes_dom0-vm--2nd_VM_NAME--root3
After this run:
sudo kpartx -d /dev/mapper/qubes_dom0-vm--2nd_VM_NAME--root
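
The four steps above can be collected into one dry-run sketch for dom0. It assumes the qube is shut down and that partition 3 is the Linux filesystem, as in the example. `fsck_root_partition` and `DRY_RUN` are made-up names, and the partition device name simply follows the pattern used above.

```shell
#!/bin/sh
# Sketch only, for dom0. Prints the commands by default (DRY_RUN=1).
# Assumptions: the qube is shut down and partition 3 is the Linux
# filesystem, as in the example above; confirm with fdisk first.

run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

fsck_root_partition() {
    # $1 = device-mapper path of the root volume, $2 = partition number
    run sudo fdisk -l "$1"     # confirm which partition is the Linux filesystem
    run sudo kpartx -a "$1"    # create /dev/mapper entries for the partitions
    run sudo fsck -f "$1$2"    # check only the chosen partition
    run sudo kpartx -d "$1"    # remove the partition mappings again
}

fsck_root_partition /dev/mapper/qubes_dom0-vm--2nd_VM_NAME--root 3
```

If in doubt about the exact partition device name, `sudo kpartx -l` on the root volume lists the mappings it would create.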

I can see three partitions:
/dev/mapper/qubes_dom0-vm--2nd_VM_NAME--root-part1 EFI System
/dev/mapper/qubes_dom0-vm--2nd_VM_NAME--root-part2 BIOS boot
/dev/mapper/qubes_dom0-vm--2nd_VM_NAME--root-part3 Linux filesystem
Which one is the root partition? Is it the Linux filesystem?

This one. So run this command:
sudo fsck -f /dev/mapper/qubes_dom0-vm--2nd_VM_NAME--root3
