Template update breaks all Apps / CLI commands

I have a template based on Debian-10 called d10-social. In this template, I have some social apps installed, like Slack, Element, Zoom. Then I have AppVMs using this template, like work-social, and personal-social.

My most recent backup is from March 27. At that time, I had no problems with any of my template or app VMs.

On March 28, I updated all VMs using the Qubes update manager. Now, when I try to open any app from an AppVM using the d10-social template, no windows/GUI features will launch. For example, even gnome-terminal will not open. The Qube Manager shows the work-social AppVM is running, but trying to open any apps on there does nothing. Please see this reply which elaborates on the issue, where I discovered that no terminal commands (e.g., whoami) will execute on AppVMs based on this updated template. Perhaps the services are starting but I don’t actually see any windows. For what it’s worth, I also cannot access these apps from the d10-social template VM itself after updating.

Restoring the d10-social template VM to the backup I created on March 27 is a work-around - I can then open and use apps once again. But after an update of the template VM and restart, all AppVMs based on that template become useless.

I have other VMs based on Debian-10, like a d10-dev VM that I use for development. This continues to work fine after the most recent updates. This leads me to believe that something unique to the d10-social template VM is causing issues with Qubes.

How do I resolve this? What additional info can I provide to help triage? Is this an issue introduced by some Debian package maintainers? Or more likely an issue that can be resolved with Qubes?

Hi @0x9060 ,

My first guess is that a Debian app/lib was updated while your social applications depend on the previous app/lib version (but this doesn’t explain the gnome-terminal issue!).

I suggest you run this from a dom0 terminal:

qvm-run personal-social xterm

This tries to launch the xterm application (a simpler terminal than gnome-terminal) in your personal-social AppVM (and starts the AppVM if it is not yet running).

If it fails, please report the error message here.
If it works, try to launch your social application from that terminal, and if the application fails to launch, please report the error message here.

Thanks for the reply, @ludovic.

Unfortunately, I have no errors to report. Just more “phantom” apps:

From dom0, running qvm-run personal-social xterm does not result in an error, but the xterm window also does not open. Running the command causes the dom0 terminal to “hang” (which is normal), as it is likely waiting for me to interact with the xterm session and then terminate it before returning the prompt on dom0. So this appears to be working fine from dom0 (in that dom0 is not reporting any errors), but I still can’t see or access the launched app window from the personal-social AppVM.

Same result from running the command against the templateVM directly (e.g., qvm-run d10-social xterm) - the dom0 terminal doesn’t return an error, and appears to just be waiting for me to interact with the launched xterm session.

weird…

Which Qubes OS version? 4.0.4 or 4.1 alpha ?

Check the TemplateVM and AppVM logs (with the vm-troubleshooting documentation).

I think R4.0.3

I performed the install using the R4.0.3 image, and have been diligently updating all qubes (including dom0) since installing 9 months ago. From dom0, cat /etc/fedora-release just returns “Qubes release 4.0 (R4.0)”. I can’t tell what patch version I’m on.

From the Qube Manager, there is a Version Information prompt in the About section which includes a list of installed packages/versions on dom0, but I don’t see any reference to an OS version there.

Definitely not using any alpha or beta versions. Stable all the way.

Ok, so you are in 4.0.4 (= 4.0.3 + updates).

And did you check the logs?

Yes, I’m not seeing anything different from other working AppVMs (based on different Debian templates).

I’m tailing the logs from dom0, and don’t see any logs flowing when launching apps. In particular, reproduction:

  • From dom0, run tail -f -n 30 /var/log/xen/console/guest-personal-social.log
  • Restart the personal-social AppVM. Logs flow in the dom0 tail session, with the last few lines:
[    14.672653] fbcon: Taking over console

Debian GNU/Linux 10 personal-social hvc0

personal-social login:
  • Run qvm-run personal-social xterm from dom0, and no logs are output to this file (and the xterm window does not appear)

Rerun the above from my fully functioning personal-dev AppVM, which is based on a different template, and the log output is the same.

Grepping for errors in this logfile, I only see 1 (consistent with other VM logfiles):

[   8.220031] Error: Driver 'pcspkr' is already registered, aborting...

This error message is apparent in both working and non-working AppVM logfiles.
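
A rough sketch of how that comparison could be made less manual, assuming the console logs follow the guest-<name>.log naming shown above. The helper name errors_in is made up for illustration. It strips the kernel timestamp so that error lines from two VMs can be compared even though their boot timings differ:

```shell
#!/bin/sh
# Hypothetical helper: print the unique error lines of a console log,
# with the leading "[ 8.220031] " kernel timestamp removed.
errors_in() {
    grep -i 'error' "$1" | sed 's/^\[ *[0-9.]*] *//' | sort -u
}

# Assumed usage in dom0 (paths follow the naming pattern from this thread):
#   errors_in /var/log/xen/console/guest-personal-social.log > /tmp/social.err
#   errors_in /var/log/xen/console/guest-personal-dev.log    > /tmp/dev.err
#   diff /tmp/social.err /tmp/dev.err
```

An empty diff would match what I saw by eye: the console log alone does not distinguish the broken AppVM from a working one.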

Update on the problematic Template/AppVM:

It actually appears that all commands “hang”. For example, when running qvm-run personal-social whoami from dom0, the dom0 terminal is “hung” (apparently waiting for the command to execute). The terminal looks like so:

[user@dom0 ~]$ qvm-run personal-social whoami
Running 'whoami' on personal-social

and will remain this way until I ctrl-c cancel the command.

Running this command against other working AppVMs executes the command quickly:

[user@dom0 ~]$ qvm-run personal-dev whoami
Running 'whoami' on personal-dev
[user@dom0 ~]$ 

(it executes instantly and returns the cursor).
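
To tell a hung command apart from a merely slow one without waiting and pressing ctrl-c, something like the following could be used. This is only a sketch: the probe helper is a made-up name, and it relies on the coreutils timeout command, which exits with status 124 when the time limit is hit.

```shell
#!/bin/sh
# Hypothetical helper: run a command with a time limit (in seconds)
# and report whether it finished, failed, or had to be killed.
probe() {
    limit=$1
    shift
    timeout "$limit" "$@"
    rc=$?
    if [ "$rc" -eq 124 ]; then
        echo "hung"            # timeout killed the command
    elif [ "$rc" -eq 0 ]; then
        echo "ok"
    else
        echo "failed ($rc)"
    fi
}

# Assumed usage in dom0:
#   probe 10 qvm-run --pass-io personal-social whoami
#   probe 10 qvm-run --pass-io personal-dev whoami
```

With the behaviour described above, the first call would be expected to report "hung" and the second "ok" after printing the user name.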

I just wanted to propose using the --pass-io parameter and doing
something like

qvm-run --pass-io personal-social whoami

Anyway, since it’s hung we can assume that something with qrexec is not
working anymore.

My proposal:

  1. restore template version from March 27 from backup
  2. make a clone of it for testing
  3. qvm-run -u root cloned-template xterm

In there do:

apt update
apt full-upgrade

… and see if there are any warnings about conflicts. At the very least
we should see what gets updated and thereby have a better idea what to
look at next.

Basically you are doing the updating manually to see the details.

Thanks, @Sven. Solved. Root cause: Template was out of disk space (on /dev/xvda3).

I should have suspected this, but manually performing the update made it obvious. Details:

  • My template has a max system storage of 12240 MiB. Qubes Manager reports 11704 MiB of disk usage. So >500 MiB disk space free.
  • Running apt full-upgrade requires me to confirm the operation, reporting “After this operation, 326 MB of additional disk space will be used.” Seems like I should be safe.
  • After confirming and letting this run, I see 4 general stages of apt full-upgrade: 1) “Gets”, 2) “Preparing & Unpacking”, 3) “Setting up”, 4) “Processing triggers”. The device runs out of disk space in the “Processing triggers” phase, with the following few lines of messages:
...
Processing triggers for libc-bin (2.28-10) ...
Processing triggers for systemd (241-7~deb10u7) ...
Processing triggers for man-db (2.8.5-2) ...
/usr/bin/mandb: can't write to /var/cache/man/13358: No space left on device
/usr/bin/mandb: can't create index cache /var/cache/man/13358: No space left on device
Processing triggers for gnome-icon-theme (3.12.0-3) ...
Processing triggers for qubes-core-agent (4.0.61-1+deb10u1) ...
...
Processing triggers for initramfs-tools (0.133+deb10u1) ...
update-initramfs: Generating /boot/initrd.img-4.19.0-16-amd64
cryptsetup: WARNING: The initramfs image may not contain cryptsetup binaries
    nor crypto modules. If that's on purpose, you may want to uninstall the
    'cryptsetup-initramfs' package in order to disable the cryptsetup initramfs
    integration and avoid this warning.
I: The initramfs will attempt to resume from /dev/xvdc1
I: (UUID=blah blah blah)
I: Set the RESUME variable to override this.
  • I created a new clone of the template as it was before the update. I increased the max system storage to 15000 MiB and manually performed apt full-upgrade on the template again. The last several lines from the update are as follows (exactly the same, but without the “No space left on device” error messages from the man-db triggers):
...
Processing triggers for libc-bin (2.28-10) ...
Processing triggers for systemd (241-7~deb10u7) ...
Processing triggers for man-db (2.8.5-2) ...
Processing triggers for gnome-icon-theme (3.12.0-3) ...
Processing triggers for qubes-core-agent (4.0.61-1+deb10u1) ...
...
Processing triggers for initramfs-tools (0.133+deb10u1) ...
update-initramfs: Generating /boot/initrd.img-4.19.0-16-amd64
cryptsetup: WARNING: The initramfs image may not contain cryptsetup binaries
    nor crypto modules. If that's on purpose, you may want to uninstall the
    'cryptsetup-initramfs' package in order to disable the cryptsetup initramfs
    integration and avoid this warning.
I: The initramfs will attempt to resume from /dev/xvdc1
I: (UUID=blah blah blah)
I: Set the RESUME variable to override this.
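
One way to catch this before upgrading, as a rough sketch: compare the free space on the root filesystem against apt's estimate plus a safety margin. The helper names (has_headroom, root_avail_mib) and the 500 MiB margin are made up for illustration and are not Qubes or apt features. The margin is there because the "additional disk space" figure apparently does not cover everything written along the way (likely the downloaded package archives, the regenerated initramfs and the man-db cache), which would explain why about 536 MiB free (12240 - 11704) was not enough for a 326 MB estimate.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the free space covers the
# estimate plus a margin. All three arguments are in MiB.
has_headroom() {
    avail_mib=$1
    needed_mib=$2
    margin_mib=$3
    [ "$avail_mib" -ge $((needed_mib + margin_mib)) ]
}

# Free space on the root filesystem in MiB, from POSIX df output
# (second line, fourth column is the available space in KiB).
root_avail_mib() {
    df -Pk / | awk 'NR==2 {print int($4/1024)}'
}

# 326 is the estimate apt printed in this thread; 500 is an arbitrary margin.
if has_headroom "$(root_avail_mib)" 326 500; then
    echo "enough headroom for the upgrade"
else
    echo "low on space: increase the template's system storage first"
fi
```

Run inside the template before apt full-upgrade, this would have flagged my original 12240 MiB template, since 536 MiB is below 326 + 500.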

Thanks everyone for the help on this!

I wonder why such an important error message, i.e. “No space left on device”, does not stop the update process and inform the user about it!?

It is amazing how the community works. I suffered a similar but not identical failure with Debian 10. It worked fine last night, but today no Debian 10 qubes would start. I searched Goo*** but as usual ended up with 10000000 hits that were pointless. I come here and ten minutes later I find this, increase the storage on my Debian template, and what happens? Problem solved…
Thanks to all of those who share and support

Welcome @hammerhead8599
Yes indeed! The Qubes Community is superb.

I had a problem with @discobot which @Sven & @deeplow solved in the blink of an eye :clap: and it required changes in @discobot code.
