Stuck in login loop

Setup

  • OS: Qubes OS 4.0.4
  • Hardware: Purism Librem 14
  • Hypervisor: Xen 4.8.5-34
  • Firmware: PureBoot 17.1

Issue

Qubes is stuck in a loop at the login prompt. Entering an incorrect password produces the expected incorrect-password message. Entering the correct password makes the GUI cut to a red screen showing what appear to be boot logs for less than a second, and then cut back to the login prompt. Attempting to log in via a TTY (using Ctrl+Alt+F2) results in the same behavior.

Context

I recently installed Qubes OS 4.0.4 on a recently acquired Librem 14. Everything was working fine for a few days. I admittedly made a fair number of changes yesterday. Most of the changes were within AppVMs (setting autostart to true for a number of qubes, changing initial memory settings, adding autostart files, installing oh-my-zsh, etc.). I also installed devilspie2 in dom0 (I’m aware of the risks of installing software in dom0) and configured it to autostart via ~/.config/autostart. Some of my templates also received updates via the regular update mechanism. I then rebooted and immediately ran into the above issue. The boot got past the LUKS disk password prompt without a problem.

This login loop issue was also reported in a post on the Qubes subreddit close to a year ago.

Debugging

I am able to mount the OS disk manually by booting into a Qubes live USB installer, opening the recovery prompt, and running these commands:

$ blkid | grep LUKS
…
$ cryptsetup luksOpen -v /dev/nvme0n1p2 qubes_dom0
…
$ cd /dev/mapper/
$ vgchange -ay qubes_dom0
…
$ vgscan --mknodes
…
$ mkdir -p /mnt/media
$ mount /dev/mapper/qubes_dom0-root /mnt/media/
…
$ umount /mnt/media
$ vgchange -an qubes_dom0
…
$ cryptsetup close qubes_dom0
$

With the dom0-root volume mounted this way, I have tried the following:

  • Removed the devilspie2 autostart config file from dom0 home directory and then rebooted into Qubes. This did not solve the issue.

  • Removed the autostart of all qubes on boot (including sys-net, sys-firewall, sys-usb, and sys-whonix) by deleting the /etc/systemd/system/multi-user.target.wants/qubes-vm@*.service symlinks, as suggested in Qubes issue 4312, and then rebooted into Qubes. This did not solve the issue, and I confirmed via journalctl --directory=/mnt/media/var/log/journal/ that Qubes did not start these qubes on boot.
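
For anyone repeating the second step, here is a minimal sketch for listing which qubes are still set to autostart before deleting anything. It assumes the dom0 root is mounted at /mnt/media as in the commands above; the function name list_autostart_qubes is my own and not part of Qubes.

```shell
# Sketch: print the names of qubes that have an autostart symlink
# in a mounted dom0 root. The default mount point /mnt/media is an
# assumption taken from the mount commands earlier in this post.
list_autostart_qubes() {
    wants="${1:-/mnt/media}/etc/systemd/system/multi-user.target.wants"
    for link in "$wants"/qubes-vm@*.service; do
        # -L is true for a symlink even when its target does not exist
        # in the rescue environment; it also skips an unmatched glob.
        [ -L "$link" ] || continue
        name=${link##*/qubes-vm@}          # strip directory and prefix
        printf '%s\n' "${name%.service}"   # strip the .service suffix
    done
}
```

Running list_autostart_qubes /mnt/media prints one qube name per line, and prints nothing once the symlinks are gone.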

I am able to look at the Qubes logs by mounting the dom0-root volume, but nothing is jumping out at me. Is there a specific log I should be looking at? Any other tips for debugging this?

Most recent boot journal logs

$ journalctl -b --directory=/mnt/media/var/log/journal/ | egrep -i "warn|error|fatal|fail"
Jun 19 00:14:47 dom0 kernel: PM-Timer failed consistency check  (0xffffff) - aborting.
Jun 19 00:14:47 dom0 kernel: RAS: Correctable Errors collector initialized.
Jun 19 00:14:49 dom0 kernel: i915 0000:00:02.0: Failed to program MOCS registers; expect performance issues.
Jun 19 00:15:52 dom0 kernel: xen_acpi_processor: (CX): Hypervisor error (-14) for ACPI CPU1
Jun 19 00:15:52 dom0 kernel: xen_acpi_processor: (CX): Hypervisor error (-14) for ACPI CPU3
Jun 19 00:15:52 dom0 kernel: xen_acpi_processor: (CX): Hypervisor error (-14) for ACPI CPU5
Jun 19 00:15:53 dom0 kernel: platform regulatory.0: Direct firmware load for regulatory.db failed with error -2
Jun 19 00:15:53 dom0 kernel: cfg80211: failed to load regulatory.db
Jun 19 00:15:53 dom0 fedora-dmraid-activation[2225]: ERROR: asr: seeking device "/dev/dm-38" to 18446744073709551104
Jun 19 00:15:53 dom0 fedora-dmraid-activation[2225]: ERROR: ddf1: seeking device "/dev/dm-38" to 18446744073709551104
Jun 19 00:15:53 dom0 fedora-dmraid-activation[2225]: ERROR: ddf1: seeking device "/dev/dm-38" to 18446744073709420032
Jun 19 00:15:53 dom0 fedora-dmraid-activation[2225]: ERROR: hpt45x: seeking device "/dev/dm-38" to 18446744073709545984
Jun 19 00:15:53 dom0 fedora-dmraid-activation[2225]: ERROR: isw: seeking device "/dev/dm-38" to 18446744073709550592
Jun 19 00:15:53 dom0 fedora-dmraid-activation[2225]: ERROR: jmicron: seeking device "/dev/dm-38" to 18446744073709551104
Jun 19 00:15:53 dom0 fedora-dmraid-activation[2225]: ERROR: lsi: seeking device "/dev/dm-38" to 18446744073709551104
Jun 19 00:15:53 dom0 fedora-dmraid-activation[2225]: ERROR: nvidia: seeking device "/dev/dm-38" to 18446744073709550592
Jun 19 00:15:53 dom0 fedora-dmraid-activation[2225]: ERROR: pdc: seeking device "/dev/dm-38" to 137438913024
Jun 19 00:15:53 dom0 fedora-dmraid-activation[2225]: ERROR: pdc: seeking device "/dev/dm-38" to 137438920192
Jun 19 00:15:53 dom0 fedora-dmraid-activation[2225]: ERROR: pdc: seeking device "/dev/dm-38" to 137438927360
Jun 19 00:15:53 dom0 fedora-dmraid-activation[2225]: ERROR: pdc: seeking device "/dev/dm-38" to 137438934528
Jun 19 00:15:53 dom0 fedora-dmraid-activation[2225]: ERROR: sil: seeking device "/dev/dm-38" to 18446744073709551104
Jun 19 00:15:53 dom0 fedora-dmraid-activation[2225]: ERROR: via: seeking device "/dev/dm-38" to 18446744073709551104
Jun 19 00:15:54 dom0 libvirtd[2402]: 2021-06-19 00:15:54.346+0000: 2434: error : virConnectOpenInternal:1118 : no connection driver available for qemu:///system
Jun 19 00:16:06 dom0 lightdm[2532]: Could not chown user data directory /var/lib/lightdm-data/user1: Error creating directory /var/lib/lightdm-data/user1: File exists
Jun 19 00:16:07 dom0 lightdm[2532]: Could not chown user data directory /var/lib/lightdm-data/lightdm: Error creating directory /var/lib/lightdm-data/lightdm: File exists
Jun 19 00:18:10 dom0 lvmetad[1381]: Failed to accept connection errno 11.
Jun 19 00:18:14 dom0 systemd-cryptsetup[3313]: Failed to deactivate: Device or resource busy
Jun 19 00:18:14 dom0 audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-cryptsetup@luks<...> comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=failed'
Jun 19 00:18:14 dom0 systemd[1]: systemd-cryptsetup@luks<...>.service: Unit entered failed state.
Jun 19 00:18:14 dom0 kernel: audit: type=1130 audit(1624061894.822:175): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-cryptsetup@luks<...> comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=failed'
Jun 19 00:18:14 dom0 systemd[1]: systemd-cryptsetup@luks<...>.service: Failed with result 'exit-code'.
$

Thank you for any help!


You made too many changes at once without confirming that they would work and not cause any issues. You really should do things step by step instead of all at once. To isolate the issue you need to know what you changed last, but you don’t. Now you have made more work for yourself.

Wipe and reinstall Qubes…

Sure, I admit that I made a number of changes, because I was setting up my system. Suggesting that I reboot after every change is not practical. It would take an order of magnitude longer to set up.

Your comment is not helpful to debugging this quite severe problem, which already impacted another Qubes user (see Reddit post). If we all just wiped and reinstalled every time we hit a critical problem then these sorts of bugs would never get resolved and then we are left with a building full of broken windows.

When looking for bugs, or for that matter making system changes or installing software, one must exercise “caution” and test before going forward with anything else. That is how you “isolate” issues or bugs and can then address known problems, which is what you did not do.

But maybe you and the other user over on Reddit suffer from the same problem: do everything at once and hope for the best, or that everything will work out. And that never happens, my friend… test, test, test, confirm, confirm, confirm.

Took a slow motion video of a login attempt to TTY and noticed the following line this time:

no shell: No such file or directory

Interestingly, I can’t find this error message grepping all of /var/log/. Regardless, it made me realize that I had configured a shell that did not exist. I modified /etc/passwd to change the default shell for the user back to /usr/bin/bash and everything is now working fine.
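
In case it helps someone else, here is a minimal sketch of a check that would have caught this from the recovery prompt. It assumes the dom0 root is mounted at /mnt/media as in my earlier commands; the function name check_login_shells is made up for illustration.

```shell
# Sketch: report accounts in a mounted root whose login shell is missing.
# The default mount point /mnt/media is an assumption from the mount
# commands earlier in this thread.
check_login_shells() {
    root="${1:-/mnt/media}"
    # /etc/passwd fields: name:password:uid:gid:gecos:home:shell
    while IFS=: read -r name pw uid gid gecos home shell; do
        [ -n "$name" ] || continue
        # An empty shell field means /bin/sh.
        [ -n "$shell" ] || shell=/bin/sh
        # Caveat: -e follows symlinks, so an absolute symlink inside the
        # mounted root is resolved against the rescue system instead.
        if [ ! -e "$root$shell" ]; then
            printf '%s (uid %s): missing shell %s\n' "$name" "$uid" "$shell"
        fi
    done < "$root/etc/passwd"
}
```

Running check_login_shells /mnt/media prints one line per account whose shell path does not exist under the mounted root. The fix is then to edit the last field of that account's line in /mnt/media/etc/passwd, as I did above.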