Rescue / Re-Install the EFI and Boot (GRUB, initramfs) Partitions

Maybe it was your dual boot. Made a mistake with GPU passthru. Or killed an update in progress. There’s a dozen ways to bork your boot. I see a lot of people asking,

HOW DO I REGENERATE MY EFI AND/OR BOOT PARTITIONS??

You’ve searched. You’ve read the threads. You tried some commands, but whatever you did just made it worse. And no amount of reading or re-reading is getting you closer to your goal.

I’ve done this dance many times, each time resolving that I’ll finally figure it out, rather than just reinstall like a dumb pleb. And EVERY SINGLE TIME, I end up reinstalling like a dumb pleb. Coz despite being reasonably proficient in a terminal, it’s faster/easier to reinstall than the endless trial/error nightmare of trying to recover the few megabytes that boot the system.

This is a travesty.

I’m sure that for some arcane technical reasons, the following sentiment is wrong, but I just installed this system a few days ago, and SOMETHING ran a few lines of instructions that populated those partitions with valid boot files. So what the hell needs to be run to recreate what the installer did? I propose:

A step-by-step guide to recover both the EFI and boot partitions to a bootable state.

I’ll put what I think is correct below, doing my best to help n00bs that know little/nothing. Assuming that enough smart people come to the boot rescue, I will edit the guide below until it’s correct and my boot is restored.

This is for a UEFI system. Qubes 4.2.

Make a bootable USB

Boot from the USB stick’s UEFI entry, and select
Rescue a Qubes OS system ; then
3) Skip to shell

Find the disk and partitions

lsblk -f
NAME         FSTYPE      FSVER  LABEL UUID
nvme0n1
  nvme0n1p1  vfat        FAT32        <basically-a-serialnum>
  nvme0n1p2  ext4        1.0          <another-serialnum>
  nvme0n1p3  crypto_LUKS 2            <another-serialnum>

“p1” is the EFI partition. It is readable by your motherboard’s UEFI firmware and is responsible for getting you to the screen where you can select a particular kernel.
“p2” is the actual BOOT partition that boots into your OS
“p3” is the encrypted operating system, dom0, and your VMs
Important: If you’re using a SATA SSD, you’ll have disks named sda, sdb, sdc [and so on] … each with numbered partitions that look like sda1 sda2 sda3 [and so on]. Obviously you’ll need to substitute those partitions in place of nvme0n1p(1, 2, 3).
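
To make the substitution less error-prone, here is a minimal sketch of a helper that builds the partition path from a disk name. The function name part_path and the variable names are my own, not anything Qubes provides. It relies on the kernel's naming rule: a "p" goes before the partition number only when the disk name itself ends in a digit.

```shell
# part_path DISK NUM: print the device path for partition NUM of DISK.
# Linux inserts a "p" before the partition number when the disk name
# ends in a digit (nvme0n1 -> nvme0n1p3) and not otherwise (sda -> sda3).
part_path() {
    case "$1" in
        *[0-9]) printf '/dev/%sp%s\n' "$1" "$2" ;;
        *)      printf '/dev/%s%s\n'  "$1" "$2" ;;
    esac
}

DISK=nvme0n1   # assumption: replace with your own disk name from lsblk
EFI_PART=$(part_path "$DISK" 1)
BOOT_PART=$(part_path "$DISK" 2)
LUKS_PART=$(part_path "$DISK" 3)
echo "EFI=$EFI_PART BOOT=$BOOT_PART LUKS=$LUKS_PART"
```

With DISK=sda the same three variables come out as /dev/sda1, /dev/sda2 and /dev/sda3. This assumes the default three-partition layout shown above.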

If lsblk -f didn’t help you figure out which disk, then fdisk -l might also help you narrow it down (if you’re like me and have a lot of disks).
[inserts obligatory warning about storing nuclear secrets on a Qubes system that shares a dual boot and multiple separately purposed disks]
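
If you have many disks, a small filter can narrow the list down. This is only a sketch: guess_roles is a made-up helper, and its labels are guesses based purely on filesystem type. Any ext4 data partition will also be flagged as "boot?", so double-check against sizes and disk names.

```shell
# guess_roles: read "NAME FSTYPE" pairs on stdin (the format printed by
# `lsblk -rno NAME,FSTYPE`) and label the likely role of each partition.
# Rows without a filesystem type (whole disks) are skipped.
guess_roles() {
    awk '
        $2 == "vfat"        { print $1, "EFI?";  next }
        $2 == "ext4"        { print $1, "boot?"; next }
        $2 == "crypto_LUKS" { print $1, "encrypted system?" }
    '
}

# On the rescue shell you would run:
#   lsblk -rno NAME,FSTYPE | guess_roles
```

A disk that shows one partition of each kind, in that order, is a good candidate for the Qubes disk.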

Decrypt and Mount Your Data Partition

cryptsetup luksOpen /dev/nvme0n1p3 qubes_rescue
Enter passphrase for /dev/nvme0n1p3: 
vgchange -ay qubes_dom0
mkdir /mnt/rescue
mount /dev/qubes_dom0/root /mnt/rescue

dom0 of your encrypted partition will now be mounted at /mnt/rescue. Side note for the curious: “qubes_rescue” is arbitrary. You could call it anything you wanted, and the decrypted device would appear under that name in /dev/mapper. The same goes for our arbitrary directory “/mnt/rescue”.
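
Since a typo here can point at the wrong disk, it can help to print the commands before running them. The following is a sketch under my own assumptions: the DRY_RUN switch, the run helper and the variable names are illustrative and are not part of cryptsetup or Qubes. The commands themselves are the four from above (with mkdir -p so a second run does not fail).

```shell
# Assumptions: DRY_RUN, run() and the variable names are illustrative only.
DRY_RUN=${DRY_RUN:-1}          # 1 = only print the commands, 0 = execute them
LUKS_PART=${LUKS_PART:-/dev/nvme0n1p3}
MAPPER_NAME=${MAPPER_NAME:-qubes_rescue}
RESCUE_ROOT=${RESCUE_ROOT:-/mnt/rescue}

# run CMD ARGS...: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Stop at the first step that fails instead of mounting on top of an error.
open_and_mount() {
    run cryptsetup luksOpen "$LUKS_PART" "$MAPPER_NAME" &&
    run vgchange -ay qubes_dom0 &&
    run mkdir -p "$RESCUE_ROOT" &&
    run mount /dev/qubes_dom0/root "$RESCUE_ROOT"
}

open_and_mount
```

Once the printed commands look right, set DRY_RUN=0 and run it again to execute them for real.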

Mount EFI, BOOT, and Crucial System Directories

mount /dev/nvme0n1p2 /mnt/rescue/boot
mount /dev/nvme0n1p1 /mnt/rescue/boot/efi
mount --rbind /dev /mnt/rescue/dev
mount --rbind /proc /mnt/rescue/proc
mount --rbind /sys /mnt/rescue/sys

rbind is necessary because the commands used to rebuild the EFI and boot partitions rely on a number of special files and devices which only exist on a live/running system. The /mnt/rescue root environment is not a live/running system, so we “bind” the respective directories to the ones created by the live USB.
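
The three bind mounts follow one pattern, so they can be generated in a loop and then sanity-checked. This is a sketch: bind_cmds and binds_look_ok are hypothetical helpers, and the three files checked are just examples of things that should exist once the binds are in place.

```shell
# bind_cmds ROOT: print the mount commands that bind the live system's
# /dev, /proc and /sys into ROOT. It only prints; pipe to `sh` to execute.
bind_cmds() {
    for d in dev proc sys; do
        printf 'mount --rbind /%s %s/%s\n' "$d" "$1" "$d"
    done
}

# binds_look_ok ROOT: succeed if ROOT/dev, ROOT/proc and ROOT/sys look
# populated, judged by one well-known entry in each.
binds_look_ok() {
    for p in dev/null proc/mounts sys/class; do
        if [ ! -e "$1/$p" ]; then
            echo "missing: $1/$p" >&2
            return 1
        fi
    done
    return 0
}

bind_cmds /mnt/rescue
```

Running binds_look_ok /mnt/rescue before the chroot gives a quick yes/no on whether the binds took effect.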

Rebuild the BOOT and EFI

chroot /mnt/rescue
grub2-mkconfig -o /boot/efi/EFI/qubes/grub.cfg

Here is where I’m running into problems:

I’m seeing two different commands, and I’m not sure which is right (or both), and moreover, I’m getting errors:

grub2-install --target=x86_64-efi --efi-directory=/boot/efi --bootloader-id=qubes
grub2-install: error: /usr/lib/grub/x86_64-efi/modinfo.sh doesn't exist. Please specify --target or --directory.

efibootmgr -v -c -u -L QubesOS -l /EFI/qubes/xen.efi -d /dev/nvme0n1 -p 1
Could not prepare Boot variable: Success
error trace:
   <6 lines of tracing which I won’t rewrite here>

I think I’m close, but not quite sure what to do here. Does a dnf package need to be reinstalled? Will that require bringing up the network connection? One person who was particularly helpful mentioned copying the efi/EFI/qubes/* files to efi/EFI/BOOT and changing some filenames, but that didn’t work either.
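
Two checks that might narrow the errors down; this is a sketch, not a fix. The first looks for the exact file that grub2-install says is missing. On Fedora-based systems that directory is normally shipped by the grub2-efi-x64-modules package, but I have not confirmed that this holds for Qubes 4.2 dom0. The second checks whether the EFI variables are visible, since efibootmgr reads and writes boot entries through /sys/firmware/efi/efivars. Whether that is what causes the "Could not prepare Boot variable" message is an assumption on my part. Both function names are made up.

```shell
# grub_modules_present ROOT TARGET: succeed if the GRUB module directory that
# grub2-install complained about contains modinfo.sh under ROOT.
grub_modules_present() {
    [ -f "$1/usr/lib/grub/$2/modinfo.sh" ]
}

# efivars_visible ROOT: succeed if ROOT/sys/firmware/efi/efivars exists and
# contains at least one entry.
efivars_visible() {
    dir="$1/sys/firmware/efi/efivars"
    [ -d "$dir" ] && [ -n "$(ls -A "$dir" 2>/dev/null)" ]
}

# Inside the chroot, ROOT is empty, so the paths start at /.
grub_modules_present "" x86_64-efi || echo "GRUB x86_64-efi modules not found"
efivars_visible ""                  || echo "EFI variables not visible"
```

From the live shell (outside the chroot) the same checks work with /mnt/rescue as ROOT. If the second check fails there too, the USB was probably booted in legacy/BIOS mode rather than UEFI mode.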

Unmount everything, close up, reboot

exit
umount -R -l /mnt/rescue/dev
umount -R -l /mnt/rescue/proc
umount -R -l /mnt/rescue/sys
umount -R -l /mnt/rescue/run
umount /mnt/rescue/boot/efi
umount /mnt/rescue/boot
umount /mnt/rescue
vgchange -an qubes_dom0
cryptsetup luksClose qubes_rescue

I had to include -R and -l to get everything to unmount. On one occasion I had to manually unmount /mnt/rescue/run in order to unmount /mnt/rescue.
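
Unmounts tend to fail when a parent is unmounted before something nested inside it, so it can help to list what is actually mounted under /mnt/rescue, deepest first. A sketch, with umount_order being a hypothetical helper of my own:

```shell
# umount_order ROOT: read mount points on stdin (one per line, e.g. from
# `findmnt -rn -o TARGET`) and print those at or below ROOT, deepest first.
# A parent path is always a prefix of its children, so a reverse sort in
# the C locale puts every child before its parent.
umount_order() {
    awk -v root="$1" '$0 == root || index($0, root "/") == 1' | LC_ALL=C sort -r
}

# On the rescue shell:
#   findmnt -rn -o TARGET | umount_order /mnt/rescue
```

Anything that shows up in that list but not in the umount commands above (such as /mnt/rescue/run) is a likely reason the final umount /mnt/rescue refuses to work.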

Conclusion

Like I said, I think I’m close. I really hope to make this a comprehensive boot rescue for 90% of people running qubes on a single UEFI disk, regardless of whatever problems they have (or created while trying to recover).

Note that what I’m doing here really isn’t a full recreation of the EFI and BOOT partitions, but it would fix the most common problems. I still feel, well honestly, a bit dismayed that recovering boot on Linux is so difficult. How is there not an inbuilt command that just rebuilds the proper partitions? This isn’t a Qubes-specific problem. I’ve never been able to recover a boot once it’s gone bad, on any Linux.

Well anyways, hoping that some SmartGuys can help me here. Thanks in advance.