System Crash: No Boot Device - Crash happened after or during "Qubes Backup"

Hi, an hour ago I came back to my office and saw on my laptop’s screen, in big letters: “No Boot Device”. That means the system crashed during last night’s Qubes backup, but I don’t know why. Before I started the backup I shut down all qubes except sys-usb and, of course, dom0. I use sys-usb to mount the external USB disk that I back up to. This has never happened before in the last 3 months.

I think grub got corrupted for some reason. I started Tails, and the partitions are all there. So what I think I have to do is re-install grub. I found this document, http://qubes-os.org/doc/mount-from-other-os/, which tells me how to mount the partition so that I can chroot into it. But since I’m not very familiar with Xen, I don’t know how to re-install grub.

Any help would be highly appreciated.

System in question is described in the below post, which is the HCL + support files I contributed a couple of days ago:

Disk-layout:

root@amnesia:~# lsblk -f
NAME FSTYPE LABEL UUID FSAVAIL FSUSE% MOUNTPOINT
loop0 squashfs 0 100% /lib/live/mount/rootfs/filesystem.squashfs
loop1 squashfs 0 100% /lib/live/mount/rootfs/4.10.squashfs
sda
└─sda1 crypto_LUKS 5f0e319c-69a8-4e05-b37b-02ad1d5e4857
sdb
├─sdb1 vfat TAILS 0DBD-4495 6.6G 18% /lib/live/mount/medium
└─sdb2 crypto_LUKS 3ada973a-c2c0-487a-ab91-4b4afde32a2c
└─TailsData_unlocked ext4 TailsData b086637e-f95f-4a8c-8556-565955d03936 45.5G 1% /live/persistence/TailsData_unlocked
nvme0n1
├─nvme0n1p1 vfat DD8A-3C1A
├─nvme0n1p2 ext4 c48bdfce-3a74-43c8-996f-b3a7cd9021a2
└─nvme0n1p3 crypto_LUKS 66de607a-b7e1-4bde-a3db-a4f6d7100a8c

It must have happened during backup because the latest backup file is corrupt. I can’t untar it.

I could rebuild the fs tree, but rebuilding the EFI partition didn’t help: still no bootable device. So I did a fresh install and am now trying to restore.

I still have no idea what could have caused such a system crash. Only dom0 and sys-usb were running: sys-usb to hold the backup medium, and dom0 ran the backup. So there was only read access on the internal disk. How can that corrupt the boot record?

Is it possible that you ran out of disk space (on the main drive on which Qubes is installed)?

I think that’s what happened. After digging into the source code, I think that’s the most likely cause. I found that

python -m qubes.tarwriter

runs in dom0 and hence writes the collected data to disk there (/tmp). At a certain point qubes-backup splits the stream, and scrypt starts to write the data to the backup medium, which in my case is a 4 TB USB disk. That is quite cool, because only encrypted data flows through sys-usb and hence to the USB controller. But what I couldn’t find out is how qubes-backup calculates the actual limit at which it splits the stream — in other words, how much data is written in dom0 before the split occurs.
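To make the question concrete, here is a toy sketch of what such a chunking step looks like. This is my own illustration, not the real qubes-backup code: the function name, buffer handling, and the 100 MB limit are all assumptions.

```python
# Toy sketch of a backup pipeline's chunking step (NOT the real
# qubes-backup code; CHUNK_LIMIT is an assumed value for illustration).
import io

CHUNK_LIMIT = 100 * 1024 * 1024  # assumed split point (~100 MB)

def split_stream(stream, limit=CHUNK_LIMIT, bufsize=64 * 1024):
    """Yield successive chunks of at most `limit` bytes from `stream`."""
    chunk = bytearray()
    while True:
        buf = stream.read(min(bufsize, limit - len(chunk)))
        if not buf:
            if chunk:
                yield bytes(chunk)
            return
        chunk.extend(buf)
        if len(chunk) == limit:
            yield bytes(chunk)
            chunk = bytearray()
```

If each yielded chunk is handed to scrypt for encryption before leaving dom0, then at most `limit` bytes would need to sit in /tmp at any one time — which is exactly the quantity I’m trying to pin down in the real code.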

I also couldn’t find the source code for qubes.tarwriter. I found where it gets called, though:

https://dev.qubes-os.org/projects/core-admin/en/latest/_modules/qubes/backup.html#Backup

    file_stat = os.stat(path)
    if stat.S_ISBLK(file_stat.st_mode) or \
            file_info.name != os.path.basename(path):
        # tar doesn't handle content of block device, use our
        # writer
        # also use our tar writer when renaming file
        assert not stat.S_ISDIR(file_stat.st_mode), \
            "Renaming directories not supported"
        tar_cmdline = ['python3', '-m', 'qubes.tarwriter',
            '--override-name=%s' % (
                os.path.join(file_info.subdir, os.path.basename(
                    file_info.name))),
            path]

Do you know where the tarwriter source is hiding? :wink:

Found tarwriter:

https://github.com/QubesOS/qubes-core-admin/blob/master/qubes/tarwriter.py


I monitored the tmpfs, and it turns out that qubes-backup writes 101 MB and then sends it off to the backup medium, so there couldn’t have been a space problem on the root fs (/). I’ll keep digging…
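A quick back-of-the-envelope check based on that measurement (the 101 MB figure is my observation above, not a documented constant):

```python
# Estimate how many encrypted pieces a backup is split into, given the
# observed ~101 MB split size (an observation, not a documented constant).
import math

CHUNK_MB = 101  # observed split size in dom0's tmpfs

def backup_pieces(total_mb):
    """Number of chunks a backup of `total_mb` MB would be split into."""
    return math.ceil(total_mb / CHUNK_MB)
```

So even a 40 GB backup (40960 MB, about 406 pieces) would never keep more than ~101 MB resident in dom0’s /tmp at once — which is why I now doubt the out-of-space theory.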