Survey: CPU and VM boot time

With the work you and others are doing, a sub-1-second boot seems to be a fast-approaching reality. Hopefully with some template-side tweaks (or even a native specialized template like debian-15-quick) this should be attainable within a few years.

Maybe I should start a thread asking people for ideas about what could be done with this, i.e. how to leverage the almost-negligible cost of booting (and shutting down) a VM to make Qubes more secure and more usable.

@fiftyfourthparallel Hi FFP! Feel free to open an issue when something like this happens. FWIW, when I upgraded to 4.1, I just kept using the same tools that were already there. Qubes-VM-hardening did get an update to the dom0 instructions.

Perhaps these two possibilities are relevant:

The idea of preloaded dispVMs is nice, but both threads seem to indicate that the idea isn’t actively being worked on.

Either way, the thread I’m proposing is another one of my idea threads where people just throw things out there and hopefully something inspires someone. Kind of like my “Now you’re thinking with Qubes” thread. Might as well just go ahead and post it instead of navel-gazing.

Thanks, I’ll be sure to look at the changes once I have time.

I think the findings I published separately in Speeding up VM startups will also be interesting to people here.

The “have minimal RAM to speed up startup” advice works because there’s wasteful (CPU-time-wise) clearing of already-clear VM pages. I filed a ticket: Wasteful memory clearing at start of every (standard) qube kernel · Issue #8228 · QubesOS/qubes-issues · GitHub
Add the kernel option init_on_free=off to your VM kernel options with something like this (this overwrites the existing options, but I imagine nobody has them set nowadays anyway):

qvm-prefs -s YOUR-VM-NAME kernelopts "init_on_free=off"

I measured savings of 3+ seconds in boot time on a gen8 i7 CPU using the default 4G maxmem qube. Slower CPUs will benefit more than faster ones.
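
Since the command above overwrites whatever kernelopts are already set, here is a minimal sketch of appending the option instead. `append_kernelopt` is a hypothetical helper (not a Qubes tool), and the usage lines in the comments assume that `qvm-prefs YOUR-VM-NAME kernelopts` prints the current value when run in dom0.

```shell
# Hypothetical helper (not part of Qubes): append one kernel option to an
# existing option string unless that exact option is already present.
append_kernelopt() {
    existing="$1"   # current kernelopts string, may be empty
    new_opt="$2"    # option to add, e.g. init_on_free=off
    if [ -z "$existing" ]; then
        printf '%s\n' "$new_opt"
    else
        case " $existing " in
            *" $new_opt "*) printf '%s\n' "$existing" ;;           # already set
            *)              printf '%s\n' "$existing $new_opt" ;;  # append
        esac
    fi
}

# Usage in dom0 (assumption: qvm-prefs prints the current value):
#   current="$(qvm-prefs YOUR-VM-NAME kernelopts)"
#   qvm-prefs -s YOUR-VM-NAME kernelopts "$(append_kernelopt "$current" "init_on_free=off")"
```

The helper only compares whole options, so an existing `init_on_free=on` would not be replaced; that case would still need editing by hand.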

The “don’t include in memory balancing” seems to be related to this:

[    2.327810] xen:balloon: Waiting for initial ballooning down having finished.
[    3.058150] xen:balloon: Initial ballooning down finished.

So you basically don’t load the balloon driver and save the time its init takes. Somebody could look into what the init does and actually optimize it, I guess. Again, slower CPUs benefit more here, but other factors seem to have a big impact too. Free Xen RAM, perhaps?

[    1.309210] xen:balloon: Waiting for initial ballooning down having finished.
[    7.153167] xen:balloon: Initial ballooning down finished.
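
The time spent waiting on the balloon driver can be read straight off those two timestamps. A small sketch, assuming the kernel log arrives on stdin in the format shown above; `balloon_delay` is a hypothetical helper name, not an existing tool.

```shell
# Hypothetical helper: print the seconds between the "Waiting for initial
# ballooning down" line and the matching "finished" line of a kernel log.
balloon_delay() {
    awk '
        # Extract the numeric timestamp from a line like "[    2.327810] msg".
        function stamp(line) {
            sub(/^\[ */, "", line)   # drop leading "[" and padding spaces
            sub(/\].*$/, "", line)   # drop "]" and the rest of the message
            return line + 0
        }
        /xen:balloon: Waiting for initial ballooning down/ {
            start = stamp($0); seen = 1
        }
        /xen:balloon: Initial ballooning down finished/ && seen {
            printf "%.6f\n", stamp($0) - start; seen = 0
        }
    '
}

# Usage inside the qube (dmesg may need root):
#   sudo dmesg | balloon_delay
```

For the two excerpts above this works out to about 0.73 s and 5.84 s respectively.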

There are preset kernelopts for sys-gui and I set them for all qubes (zswap.enabled=0, for example, while using zram-tools), but it’s not a problem to add your option to the existing ones. Disabling ballooning indeed dramatically increases overall performance, including boot time.

My best debian-11-minimal boot time so far is 4.05s

While hero numbers are good, for day-to-day usability the normal times should be consistently low too.

Disabling the balloon driver (if you don’t increase the initial RAM) also cuts the memory to initialize from 4G to 400M, but that’s not a very realistic amount for running something like Firefox nowadays, I imagine.

Of course. Manually setting each qube’s RAM is part of the Qubes setup routine. My services run with 300-800MB, while browser qubes run on 2048MB. Nothing more than that, except a Win11 qube with 2118MB.
Sys-gui-gpu runs with 1000MB.

Well, manually setting RAM is quite limiting in many cases. Sharing memory in general is good, as that makes things faster.

Now of course, with every VM running its own kernel, they cannot share crucial stuff like caches, so there’s quite a bit of waste. With fast NVMe, though, maybe the cache is not needed all that much and could be dialed down substantially inside the VMs. (I just looked at a random sample of my VMs and they have substantial page caches, sometimes gigabytes, which probably means the Xen balloon driver assumes the VM needs a lot more memory than it actually does?) The “free RAM is wasted RAM” adage might not apply as well to the Qubes situation, and keeping a healthy amount of Xen RAM free goes a long way toward speeding up VM startup. I’ll probably have to experiment some in this area, and this is probably getting somewhat off-topic for this particular thread.

Is it the Snapshot operation or the blkdiscard? Or both?

What Tasket’s describing matches what I experience whenever I shut down a large VM (>100GB). The shutdown operation takes several times longer to finish than for VMs with negligible storage (e.g. sys-net). However, I think this can be sidestepped by compartmentalizing VMs or treating larger VMs as exceptions.

@kindagreen I don’t think your theses and experiments are off-topic at all (see this post). I eagerly await your experiment results!

Also, this thread has reached 100 posts after 2.5 years. That’s a staggering 40 posts per year! Hopefully, with a combination of hardware advances, optimizations, and ingenuity, we can get boot times down to 1 second within the next hundred posts, and show the moderators what a good return on moderating investment looks like.

I would say blkdiscard has little or nothing to do with it.

Because Wyng Backup also does snapshot management, I was faced with snapshot deletion causing long delays during the backup process. Or, at least they are long compared to the nearly instant decision Wyng makes when there are no changes in a volume to back up… it can’t decide without the snapshot, but deleting the comparison snap might take >10 seconds.

I determined that the deletion process was not critical to data integrity (it’s just cleanup), so I just spawned ‘lvremove’ background processes using ‘ionice’ without checking them. Wyng works much faster that way, although the rename (also slow) part of snapshot rotation can’t use a similar workaround.

OTOH, backups run faster on Btrfs even with that LVM-specific optimization, because deleting and renaming image files is so much faster than with thin LVs.
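
The background-removal trick for thin LVs could look roughly like this. It is only a sketch and not Wyng’s actual code: `remove_snapshot_bg` is a hypothetical helper, and the idle I/O class (`ionice -c 3`) is an assumption, since the priority class used isn’t stated above.

```shell
# Hypothetical helper: delete an LVM snapshot in the background at idle I/O
# priority, without waiting for it or checking its exit status.
remove_snapshot_bg() {
    vg="$1"   # volume group name
    lv="$2"   # snapshot logical volume name
    # ionice -c 3 selects the "idle" scheduling class (assumed choice);
    # lvremove -f skips the interactive confirmation prompt.
    ionice -c 3 lvremove -f "$vg/$lv" >/dev/null 2>&1 &
}

# Usage (volume names are placeholders):
#   remove_snapshot_bg YOUR-VG YOUR-SNAPSHOT-LV
```

Because nothing waits on the job, a failed removal goes unnoticed, which is only acceptable when the deletion is pure cleanup as described.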

As for Qubes VM start/stop, perhaps there are snapshot rotation steps that can be left in the background and considered non-critical. For example, if VM startup always checks for the situation where a volume has been left in its active unsettled state (which it already does), and that unsettled volume name has the newest timestamp, then you can recover that situation easily with zero risk. So there could be a benefit to treating VM shutdown rotations as an unimportant background process (also using ‘ionice’).

I added an entry using debian-11-minimal on 4.2-RC4, I hope it’s fine :smiley:

Added my R4.2 results to the top. Times have returned back down to R4.0 levels.

Also, from here on assume that people reporting their times are using the latest available version of Debian for their release unless otherwise stated.

E.g. if someone posts for R4.1, then it’s assumed they’re using Debian 11. If for R4.2, Debian 12 until a template for Debian 13 is released.

I added my R4.2 result to the bottom of the list.
I was between 3.4 and 3.9, so I decided to go with 3.6 since most of my tests were close to it.

Updated my result for v4.2 with Debian 12
(Specs: i9-13900K, DDR5 memory, Samsung evo 990 Pro NVMe drive)

Fastest boot time was 3.22
Fastest 10 run average was 3.38
Total average of 30 runs ~3.44

I tried doing the test with the Debian 11 template, and it’s a lot faster.

Fastest D11 boot time was 2.61
Total average of 20 runs ~2.93

Updated my results for Qubes R4.2.0 with Debian 12.

Fastest boot time was 3.390
10 run average was 3.496

The R4.2 final release really has an astonishing impact on individual qube boot time. Qmemman memory hotplug support since core-admin v4.2.18 (core-admin v4.2.18 (r4.2) · Issue #4121 · QubesOS/updates-status · GitHub) indeed boosted boot speed significantly for qubes that have memory balancing on.

If we can figure out what makes debian-12 slower than debian-11, we may gain even further improvements.

I suggest grouping the big table in the original post by Qubes version, like three separate tables, for Qubes R4.0, R4.1, R4.2, respectively.

I took your suggestion and reformatted the entire post while I was at it. Thanks

Every release we get a step closer to the 1-second boot
