I have finished the --all-pre-reboot stages smoothly. After restarting Qubes, I am getting the 4.3 screen with only the linux 7.0.9 entry marked f41, while all the others, 7.0.9 (again), 7.0.6 and 6.18.35, are f37(?)
When I choose the linux 7.0.9 f41 entry, it can’t get past that screen and keeps restarting back to the menu.
I can boot the f37 entries, but I get a lot of errors and nothing works except dom0; even qubes-qube-manager cannot start.
How do I start debugging why the system keeps restarting when I choose the linux 7.0.9 f41 entry? I guess that is the one I have to boot to finish the in-place upgrade.
I would first separate the boot problem from the post-reboot upgrade step. Since the older entries still boot, use one of those only to inspect logs and confirm the new 7.0.9/f41 kernel/initramfs entry was generated correctly, not to make more package changes immediately.
Useful first checks from dom0 are the previous boot logs and the exact grub entry: journalctl -b -1 -p warning..alert, journalctl -k -b -1, and whether /boot has matching vmlinuz/initramfs files for that 7.0.9 entry. If it reboots so fast that logs are empty, try removing quiet/rhgb from the grub entry for one boot so the last message is visible. Then paste that last error here before rerunning the upgrade command.
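The "matching vmlinuz/initramfs" check can be scripted. This is a minimal sketch, not an official Qubes tool: check_boot_files is a hypothetical helper name, and it assumes the Fedora-style file names vmlinuz-<version> and initramfs-<version>.img, which may not match every setup.

```shell
# Report whether a kernel version has both a kernel image and an initramfs.
# Assumption: Fedora-style names vmlinuz-<version> and initramfs-<version>.img.
# check_boot_files is a hypothetical helper, not part of Qubes OS.
check_boot_files() {
    boot_dir=$1
    version=$2
    status=0
    for f in "vmlinuz-$version" "initramfs-$version.img"; do
        # -s is true only if the file exists and is not empty
        if [ -s "$boot_dir/$f" ]; then
            echo "ok: $f"
        else
            echo "missing or empty: $f"
            status=1
        fi
    done
    return "$status"
}
```

From a working entry, check_boot_files /boot "$(uname -r)" checks the running kernel. For the failing entry, pass the version string exactly as it appears in the file names listed by ls /boot.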
Hey @NuageQubes81, thanks for your response. I have no means to copy/paste log parts here, but basically what I see during boot is a lot of Firmware Bug errors on CPU cores about APIC ID mismatches, which I never saw before. Then, in order: tmpfs: Unsupported parameter 'huge'; an i915 error saying it failed to disable qgv points 0x307; some block groups have wrong amount of free space; ACPI BIOS Error (bug): AE_AML_BUFFER_LIMIT… is beyond end of object; a bunch of Failed to start qubesd.service - Qubes OS daemon errors with result ‘exit code’; and finally Device luks-xxxx is still in use, with a failure to deactivate it.
I never saw these errors earlier, and I was proud to have had a log without errors except for proc_thermal_pci xxx: error: proc_thermal_add, will continue. But at some point I realized, I think, that this error was about sensors I don’t have, so it is benign.
Thanks for the hint @parulin. So these APIC ID mismatches are safe to ignore for now… Any idea why the qubesd service cannot be started? Or does it not matter while the system resets at the blue screen on the linux 7.0.9x f41 entry, since those log parts are from before resetting the computer to get to STAGE 4 (--all-post-reboot)?
When I edit grub for each entry, I see that the f37 entries have root=UUID=xxxx and rd.luks.uuid=yyyyyy,
while the f41 entry has root=/dev/mapper/yyyyyy and rd.luks.uuid=yyyyyy. So both “yyyyyy” values are the same as in the f37 entries, and there is no “xxxx” in the f41 entry.
I removed rhgb quiet and I’m getting:
/dev/root: can’t open blockdev
VFS: Cannot open root device “root=/dev/mapper/yyyyy” or unknown block (0,0): error -6
Please append a correct “root=” boot option: here are the available partitions:
ext3
ext2
ext4
btrfs (this is how the Qubes was installed)
Kernel panic: not syncing: VFS: unable to mount root fs on unknown block (0,0):
… and a few more lines after that, not helpful for debugging anymore.
Any idea? I tried to manually enter root=UUID=xxxx instead of root=/dev/mapper/yyyyy, but to no avail: it can’t be mounted either.
That VFS line makes the f41 entry look like a boot/initramfs/root-argument problem, not just a qubesd problem. I would not rerun the upgrade yet. From a bootable old entry, compare the working and failing grub stanzas and also check what names actually exist with lsblk -f and ls /dev/mapper after boot. If the old entry reaches the system because it uses a different root= form, the safer next step is probably regenerating the f41 initramfs/grub entry so it matches the real encrypted/root layout, rather than hand-editing the menu each time. Posting the redacted working vs failing linux lines would help a lot.
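To compare the working and failing linux lines without eyeballing them, the relevant arguments can be pulled out one at a time. This is a minimal sketch: cmdline_param is a hypothetical helper, and the command lines are passed in as plain strings that you copy from the grub editor or from /proc/cmdline.

```shell
# Print the value of the first "key=value" argument on a kernel command line.
# cmdline_param is a hypothetical helper; it returns 1 if the key is absent.
cmdline_param() {
    key=$1
    cmdline=$2
    # Unquoted on purpose: split the command line on whitespace.
    for arg in $cmdline; do
        case $arg in
            "$key"=*)
                # Strip only the leading "key=", so root=UUID=x yields UUID=x
                printf '%s\n' "${arg#"$key"=}"
                return 0
                ;;
        esac
    done
    return 1
}
```

On a booted f37 entry, cmdline_param root "$(cat /proc/cmdline)" prints the root= form that actually works. Running it for root and rd.luks.uuid against the f41 line copied from the grub editor shows the difference side by side, which can then be checked against the names reported by lsblk -f and ls /dev/mapper.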
Thanks for the tip @quietbadger27. I tried to regenerate the initramfs, but it complained about the (infamous) lack of space (thanks a bunch for the tip! Nothing further could have been done without that!), so I cleaned up and managed to regenerate it, but again to no avail. I couldn’t get past the grub screen and got the same error messages as stated above.
Meanwhile, I have tried to recover qubesd service. I have finally traced it (service by service) to the first missing point:
libvirtd.socket: Failed to queue service startup job: Unit virtlogd.socket not found
Since I always keep a copy of dom0’s /etc and /usr directories in user@dom0 too, not only in other VMs/external drives (yep, I know I am smart), I just copied the missing systemd files from the backup into /etc/systemd, then manually enabled and started service by service, and finally managed to recover the ‘qubesd’ service. That of course didn’t resolve the f41 entry problem.
But I got an idea. I booted the now fully working f37 entry, and all I needed to wait for was… a new kernel update, in this case 7.0.12 from the current-testing repo.
First I ran sudo qubes-dom0-update --action=upgrade, just to get the new kernel.
After that, I manually ran STAGE 1-3, which passed smoothly with the updated kernel. This time grub was rebuilt and the initramfs regenerated properly, and after the reboot, here I am writing from 4.3.1 f41!
I will not mark any post as the solution, because it was a rather complex process, with much appreciated helpful tips from many of you guys. I will edit the topic subject to [solved] instead.
So, as usual, at the end of the day it’s not about