Preloaded disposables in 4.3

Measure startup time:

qvm-shutdown default-dvm
time qvm-run --no-gui --service -- default-dvm qubes.WaitForRunningSystem
qvm-shutdown default-dvm

If it takes too long to start:

systemd-analyze blame
systemd-analyze plot

Measure preload usage time:

qvm-features default-dvm preload-dispvm-max 1
disp="$(qvm-features default-dvm preload-dispvm)"
while ! qvm-check --paused -- "$disp"; do sleep 1; done

time qvm-run --no-gui --service -- default-dvm qubes.WaitForRunningSystem


Increase `preload-dispvm-max` and loop the `qvm-run` calls. I recommend
experimenting with different templates (Debian and Fedora).
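Looping the timed calls could be sketched as a small dom0 script. This is a hedged sketch only: the qube name, the run count, and the `date +%s.%N` timing method are my own choices, not part of the original instructions.

```shell
#!/bin/sh
# Sketch: time N consecutive disposable requests and print the mean.
# Assumptions: run in dom0; "default-dvm" and RUNS=10 are placeholders.
QUBE="${1:-default-dvm}"
RUNS="${2:-10}"

# mean of newline-separated seconds read from stdin
avg() { awk '{ s += $1 } END { if (NR) printf "%.3f\n", s / NR }'; }

if command -v qvm-run >/dev/null 2>&1; then
    i=0
    while [ "$i" -lt "$RUNS" ]; do
        i=$((i + 1))
        start=$(date +%s.%N)
        qvm-run --no-gui --service --dispvm="$QUBE" -- qubes.WaitForRunningSystem
        end=$(date +%s.%N)
        awk -v s="$start" -v e="$end" 'BEGIN { printf "%.3f\n", e - s }'
    done | avg
fi
```

Note that each iteration consumes one preloaded disposable, so later iterations also measure how quickly the preload pool is refilled.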

@unman @kenosen

I just noticed my instructions were incorrect.

time qvm-run --no-gui --service --dispvm=default-dvm -- qubes.WaitForRunningSystem

This is the correct one. Notice that the $disp variable only works for a single preloaded disposable. If there are more, a more elaborate script is needed.
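Such a "more elaborate script" might look like the sketch below. It assumes the preload-dispvm feature returns the preloaded qube names as a space-separated list; wait_all_paused is a hypothetical helper of my own, not an existing tool.

```shell
#!/bin/sh
# Sketch: block until every preloaded disposable of a template is paused.
# Assumption: "qvm-features <template> preload-dispvm" prints the
# preloaded qube names as a space-separated list.
QUBE="${1:-default-dvm}"

# wait_all_paused NAMES - poll qvm-check until each named qube is paused
wait_all_paused() {
    for disp in $1; do
        while ! qvm-check --paused -- "$disp" >/dev/null 2>&1; do
            sleep 1
        done
    done
}

if command -v qvm-features >/dev/null 2>&1; then
    wait_all_paused "$(qvm-features "$QUBE" preload-dispvm)"
fi
```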

They do have more than 1 loaded.

I’m not only preloading default-dvm :thinking: (I rarely use it), but rather a modified default-dvm with various browsers.

Substitute --dispvm=default-dvm with any or every disposable template you prefer and try again.

qvm-features default-dvm preload-dispvm-max 1
disp="$(qvm-features default-dvm preload-dispvm)"
while ! qvm-check --paused -- "$disp"; do sleep 1; done
time qvm-run --no-gui --service -- default-dvm qubes.WaitForRunningSystem

I don't see the point in this - surely that time call should be
--dispvm default-dvm to test spawning a disposable? Or am I missing
something?

Also, I think it is a mistake to exclude any time taken waiting for a
preloaded disposable to become available - this is a crucial part of the
user experience.

On the basis of my testing, under moderate load, if there is a call
to a disposable, the preloaded version is significantly slower, and
sometimes neither the disposable nor the preloaded disposable would
start. Starting a NON preloaded disposable under the same load
was fine.

  1. I ran a simple script to time starting a disposable based on the
    default-dvm and an identical disposable template without preloading set -
    deb-dvm. Both are vanilla debian-13-xfce.

Average over 10 runs, in seconds:
WaitForRunningSystem:
normal - 21.076
preload - 27.772

WaitForSession:
normal - 21.584
preload - 28.021

  2. With reduced load, starting a second disposable 30 seconds after the
    first completed. The load consisted of 12 running qubes, 6 of them part
    of infrastructure. All have mem=400, bar qubes-builder where mem=800. The
    only qube directly generating activity is a disposable running Tor
    Browser, auto-reloading 3 tabs every 2-3 minutes.
    Timing the second start, average over 5 runs, in seconds:
    WaitForRunningSystem:
    normal - 18.6
    preload - 23.12

WaitForSession:
normal - 21.152
preload -
No result is shown for preloaded because the disposable did not start on
request for the GUI call.

I never presume to speak for the Qubes team. When I comment in the Forum I speak for myself.
qvm-features default-dvm preload-dispvm-max 1
disp="$(qvm-features default-dvm preload-dispvm)"
while ! qvm-check --paused -- "$disp"; do sleep 1; done
time qvm-run --no-gui --service -- default-dvm qubes.WaitForRunningSystem

I don't see the point in this - surely that time call should be
--dispvm default-dvm to test spawning a disposable? Or am I missing
something?

Yes, as per my most recent comments, I missed adding --dispvm=default-dvm.

Also, I think it is a mistake to exclude any time taken waiting for a
preloaded disposable to become available - this is a crucial part of the
user experience.

It was intended to establish how much time counts as a good result on the tested hardware. You can wait for the first to pause and then loop qvm-run 10 times, but the first iteration would show the best result.

I still need the qubesd journal logs. It is difficult to debug without them.

I don't remember seeing that comment. That's fine.

I’m not sure what you mean by the best result? Is it not worth
averaging multiple runs?

It’s a little difficult for me to take them at the moment.
I’ll set things up for a run later and load up the logs.

user@dom0:~$ time qvm-run --no-gui --service --dispvm=default-dvm -- qubes.WaitForSession
Running 'qubes.WaitForSession' on $dispvm:default-dvm

real	0m9.824s
user	0m0.086s
sys	0m0.034s
user@dom0:~$ time qvm-run --no-gui --service --dispvm=default-dvm -- qubes.WaitForRunningSystem
Running 'qubes.WaitForRunningSystem' on $dispvm:default-dvm

real	0m18.501s
user	0m0.087s
sys	0m0.038s
user@dom0:~$ 

One reason that it is difficult to get results is that under load the
preloaded disposable repeatedly fails to load, together with the next
in line. This leaves the disposable template with no preload-dispvm.

Very often, it is then almost impossible to start a disposable from the
default-dvm, whereas it is possible to start a normal disposable. I
believe that the reason for this is that the system under load cannot
effectively start two disposables in short order, as it must in the
preloaded case.

Attached is part of the qubesd journal from a failed test session under load.


(Attachment 30_qubesd is missing)

I’m not sure attachments work on the web forum.

Apparently not.

Excerpt from qubesd:

Nov 21 00:24:21 dom0 qubesd[2853]: INFO: vm.disp3630: Starting qube disp3630
Nov 21 00:24:23 dom0 runuser[99066]: pam_unix(runuser:session): session closed for user user
Nov 21 00:24:34 dom0 qubesd[2853]: INFO: vm.disp3630: Setting Qubes DB info for the qube
Nov 21 00:24:34 dom0 qubesd[2853]: INFO: vm.disp3630: Starting Qubes DB
Nov 21 00:24:35 dom0 runuser[99366]: pam_unix(runuser:session): session opened for user user(uid=1000) by (uid=0)
Nov 21 00:24:35 dom0 runuser[99366]: pam_unix(runuser:session): session closed for user user
Nov 21 00:24:35 dom0 qubesd[2853]: INFO: vm.disp3630: Activating qube
Nov 21 00:24:35 dom0 runuser[99377]: pam_unix(runuser:session): session opened for user user(uid=1000) by (uid=0)
Nov 21 00:25:00 dom0 runuser[98811]: pam_unix(runuser:session): session closed for user user
Nov 21 00:25:00 dom0 qubesd[2853]: ERROR: vm.disp3803: Start failed: Cannot connect to qrexec agent for 60 seconds, see /var/log/xen/console/guest-disp3803.log for details
Nov 21 00:25:04 dom0 runuser[99377]: pam_unix(runuser:session): session closed for user user
Nov 21 00:25:08 dom0 qubesd[2853]: INFO: vm.default-dvm: Removing qube(s) from preloaded list because automatic cleanup was called: 'disp3803'
Nov 21 00:25:08 dom0 qubesd[2853]: INFO: Removing appmenus for 'disp3803' in 'dom0'
Nov 21 00:25:08 dom0 runuser[99551]: pam_unix(runuser:session): session opened for user user(uid=1000) by (uid=0)
Nov 21 00:25:17 dom0 runuser[99551]: pam_unix(runuser:session): session closed for user user
Nov 21 00:25:17 dom0 qubesd[2853]: INFO: Removing appmenus for 'disp3630' in 'dom0'
Nov 21 00:25:17 dom0 runuser[99713]: pam_unix(runuser:session): session opened for user user(uid=1000) by (uid=0)
Nov 21 00:25:17 dom0 qubesd[2853]: INFO: vm.disp3803: Removing volume root: qubes_dom0/vm-disp3803-root
Nov 21 00:25:17 dom0 qubesd[2853]: INFO: vm.disp3803: Removing volume private: qubes_dom0/vm-disp3803-private
Nov 21 00:25:17 dom0 qubesd[2853]: INFO: vm.disp3803: Removing volume volatile: qubes_dom0/vm-disp3803-volatile
Nov 21 00:25:17 dom0 qubesd[2853]: INFO: vm.disp3803: Removing volume kernel: 6.17.4-1.fc41
Nov 21 00:25:18 dom0 qubesd[2853]: ERROR: Task exception was never retrieved
Nov 21 00:25:18 dom0 qubesd[2853]: future: <Task finished name='Task-295666' coro=<Emitter.fire_event_async() done, defined at /usr/lib/python3.13/site-packages/qubes/events.py:211> exception=ExceptionGroup('unhandled errors in a TaskGroup', [QubesVMError('Cannot connect to qrexec agent for 60 seconds, see /var/log/xen/console/guest-disp3803.log for details')])>
Nov 21 00:25:18 dom0 qubesd[2853]:   + Exception Group Traceback (most recent call last):
Nov 21 00:25:18 dom0 qubesd[2853]:   |   File "/usr/lib/python3.13/site-packages/qubes/events.py", line 243, in fire_event_async
Nov 21 00:25:18 dom0 qubesd[2853]:   |     effect = task.result()
Nov 21 00:25:18 dom0 qubesd[2853]:   |   File "/usr/lib/python3.13/site-packages/qubes/vm/mix/dvmtemplate.py", line 496, in on_domain_preload_dispvm_used
Nov 21 00:25:18 dom0 qubesd[2853]:   |     async with asyncio.TaskGroup() as task_group:
Nov 21 00:25:18 dom0 qubesd[2853]:   |                ~~~~~~~~~~~~~~~~~^^
Nov 21 00:25:18 dom0 qubesd[2853]:   |   File "/usr/lib64/python3.13/asyncio/taskgroups.py", line 71, in __aexit__
Nov 21 00:25:18 dom0 qubesd[2853]:   |     return await self._aexit(et, exc)
Nov 21 00:25:18 dom0 qubesd[2853]:   |            ^^^^^^^^^^^^^^^^^^^^^^^^^^
Nov 21 00:25:18 dom0 qubesd[2853]:   |   File "/usr/lib64/python3.13/asyncio/taskgroups.py", line 173, in _aexit
Nov 21 00:25:18 dom0 qubesd[2853]:   |     raise BaseExceptionGroup(
Nov 21 00:25:18 dom0 qubesd[2853]:   |     ...<2 lines>...
Nov 21 00:25:18 dom0 qubesd[2853]:   |     ) from None
Nov 21 00:25:18 dom0 qubesd[2853]:   | ExceptionGroup: unhandled errors in a TaskGroup (1 sub-exception)
Nov 21 00:25:18 dom0 qubesd[2853]:   +-+---------------- 1 ----------------
Nov 21 00:25:18 dom0 qubesd[2853]:     | Traceback (most recent call last):
Nov 21 00:25:18 dom0 qubesd[2853]:     |   File "/usr/lib/python3.13/site-packages/qubes/vm/qubesvm.py", line 2185, in start_qrexec_daemon
Nov 21 00:25:18 dom0 qubesd[2853]:     |     await self.start_daemon(
Nov 21 00:25:18 dom0 qubesd[2853]:     |     ...<4 lines>...
Nov 21 00:25:18 dom0 qubesd[2853]:     |     )
Nov 21 00:25:18 dom0 qubesd[2853]:     |   File "/usr/lib/python3.13/site-packages/qubes/vm/qubesvm.py", line 2142, in start_daemon
Nov 21 00:25:18 dom0 qubesd[2853]:     |     raise subprocess.CalledProcessError(
Nov 21 00:25:18 dom0 qubesd[2853]:     |         p.returncode, command, output=stdout, stderr=stderr
Nov 21 00:25:18 dom0 qubesd[2853]:     |     )
Nov 21 00:25:18 dom0 qubesd[2853]:     | subprocess.CalledProcessError: Command '['runuser', '-u', 'user', '--', '/usr/sbin/qrexec-daemon', '-q', '-u', '2b188b41-b702-46c9-b512-fecff52b671d', '--', '177', 'disp3803', 'user']' returned non-zero exit status 3.
Nov 21 00:25:18 dom0 qubesd[2853]:     |
Nov 21 00:25:18 dom0 qubesd[2853]:     | During handling of the above exception, another exception occurred:
Nov 21 00:25:18 dom0 qubesd[2853]:     |
Nov 21 00:25:18 dom0 qubesd[2853]:     | Traceback (most recent call last):
Nov 21 00:25:18 dom0 qubesd[2853]:     |   File "/usr/lib/python3.13/site-packages/qubes/vm/dispvm.py", line 735, in from_appvm
Nov 21 00:25:18 dom0 qubesd[2853]:     |     await dispvm.start()
Nov 21 00:25:18 dom0 qubesd[2853]:     |   File "/usr/lib/python3.13/site-packages/qubes/vm/dispvm.py", line 838, in start
Nov 21 00:25:18 dom0 qubesd[2853]:     |     await super().start(**kwargs)
Nov 21 00:25:18 dom0 qubesd[2853]:     |   File "/usr/lib/python3.13/site-packages/qubes/vm/qubesvm.py", line 1549, in start
Nov 21 00:25:18 dom0 qubesd[2853]:     |     await self.start_qrexec_daemon()
Nov 21 00:25:18 dom0 qubesd[2853]:     |   File "/usr/lib/python3.13/site-packages/qubes/vm/qubesvm.py", line 2193, in start_qrexec_daemon
Nov 21 00:25:18 dom0 qubesd[2853]:     |     raise qubes.exc.QubesVMError(
Nov 21 00:25:18 dom0 qubesd[2853]:     |     ...<5 lines>...
Nov 21 00:25:18 dom0 qubesd[2853]:     |     )
Nov 21 00:25:18 dom0 qubesd[2853]:     | qubes.exc.QubesVMError: Cannot connect to qrexec agent for 60 seconds, see /var/log/xen/console/guest-disp3803.log for details
Nov 21 00:25:18 dom0 qubesd[2853]:     +------------------------------------
Nov 21 00:25:19 dom0 runuser[99713]: pam_unix(runuser:session): session closed for user user
Nov 21 00:25:19 dom0 qubesd[2853]: INFO: vm.disp3630: Removing volume root: qubes_dom0/vm-disp3630-root
Nov 21 00:25:19 dom0 qubesd[2853]: INFO: vm.disp3630: Removing volume private: qubes_dom0/vm-disp3630-private
Nov 21 00:25:19 dom0 qubesd[2853]: INFO: vm.disp3630: Removing volume volatile: qubes_dom0/vm-disp3630-volatile
Nov 21 00:25:19 dom0 qubesd[2853]: INFO: vm.disp3630: Removing volume kernel: 6.12.58-1.fc41

If it is not apparent, both the preloaded disposable and the newly
preloaded one time out and are reaped.
After this it is not possible to manually start a disposable from the
default-dvm, which has preloaded set, whereas it is possible to start
a normal disposable. The only way to start a disposable from the
default-dvm is to manually reduce system load.


Thanks. Qubesd logs would allow me to see the cause.

I don’t have the full story with the shared log. I see 2 disposables starting: one fails to connect to qrexec, and I don’t know why the other was cleaned up. I don’t see them being requested, nor any attempt to preload. I see disp3803 was created by the “used” event of some other disposable; I'm unsure about this.

I don’t fully grasp “whereas it is possible to start a normal disposable”, since the previous sentence says that it is not possible to start one manually.

When a preloaded disposable is used, it starts another one, regardless of whether the used one has already shut down. This means that there may be a time when 2 disposables are running.

Correct.

Now this makes more sense: if your system can’t have two qubes running at the same time, the time saving is not marginal for the first request, but much worse on the second request, as no qube could be preloaded.

You can increase the preload-dispvm-threshold dom0 feature; it can also be done from Global Config. This prevents preloading when there is not enough available memory on the system, so it falls back to starting normal disposables when necessary and preloads when possible. The intended purpose of this feature was to avoid normal qubes failing to start when there is not enough memory on the system, but I guess it makes sense in this case also.
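If I read the above correctly, the threshold can be inspected and raised from dom0 roughly as follows. The 2000 MiB value is only an example of mine, not a recommendation from the post.

```shell
#!/bin/sh
# Sketch: inspect and raise the preload-dispvm-threshold dom0 feature.
# Assumption: the value is in MiB; 2000 is an arbitrary example.
if command -v qvm-features >/dev/null 2>&1; then
    qvm-features dom0 preload-dispvm-threshold        # show current value
    qvm-features dom0 preload-dispvm-threshold 2000   # raise the threshold
fi
```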

The minimum on install is 1000 MiB; it is adjusted according to how much RAM you have, and the maximum is 6000 MiB. You can suggest another minimum value on

Maybe it is better to experiment with different values in your or your users' setups, increasing by 100-200 MiB between tests. This, of course, is difficult to generalize; each workflow requires a different amount of resources, so I don't think we can get to a value everyone can agree on (some templates and tasks are more resource intensive than others).

I’m sorry this was not clear. The default-dvm has preloaded set. It is
not possible to start a disposable from the default-dvm disposable
template.
A disposable template without preloaded set creates what I term a normal
disposable. While it is not possible to create a disposable from the
default-dvm, it IS possible to start a normal disposable.
Is that clearer?

I think you still do not understand what happens in real use. The
time saving for a preloaded disposable to run an application IS marginal
on the first request. Indeed the timing is much worse on the second
request.

I will try this and see what is the result. I remain uncertain about
rolling out a feature in the default that makes the user experience
worse without manual intervention. (Of course, I recognise that the
effects will vary depending on available memory, qube usage, etc.)
