Survey: CPU and VM boot time

renehoj · April 24, 2023, 3:26pm

Try enabling memory balancing and lowering the max and initial memory. I was playing around with the memory settings, so mine might have been lower than the default.

augsch · April 24, 2023, 3:39pm

Disabling memory balancing does more for me. I can achieve an average of 3.2s with memory balancing turned off. However, I think the OP’s test scenario should be with memory balancing turned on.
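For anyone wanting to reproduce this kind of average, here is a rough sketch (not the OP's script). It assumes you are in dom0, that wall-clock time of `qvm-start` is an acceptable stand-in for "boot time", and that GNU `date` is available. The helper names `mean` and `time_boots` and the qube name `my-test-qube` are made up for illustration.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, and timing qvm-start is an
# assumed stand-in for however the survey defines "boot time".

# mean N1 N2 ... : print the arithmetic mean of the arguments, 2 decimals.
mean() {
    [ "$#" -gt 0 ] || return 1
    printf '%s\n' "$@" | awk '{ sum += $1 } END { printf "%.2f\n", sum / NR }'
}

# time_boots VM RUNS : start and shut down VM RUNS times, printing one
# wall-clock start time (in seconds) per line. Needs dom0 and GNU date.
time_boots() {
    vm=$1
    runs=$2
    i=0
    while [ "$i" -lt "$runs" ]; do
        t0=$(date +%s.%N)
        qvm-start "$vm" || return 1
        t1=$(date +%s.%N)
        awk -v a="$t0" -v b="$t1" 'BEGIN { printf "%.2f\n", b - a }'
        qvm-shutdown --wait "$vm" || return 1
        i=$((i + 1))
    done
}

# Example (dom0 only, not run here):
#   qvm-prefs my-test-qube maxmem 0     # maxmem 0 excludes the qube from memory balancing
#   mean $(time_boots my-test-qube 5)
```

Marking each result with the settings used (balancing on or off, initial memory) keeps the numbers comparable with everyone else's.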

fiftyfourthparallel · April 25, 2023, 2:45pm

I’m fine with variations as long as they’re clearly marked so people can compare, like what @51lieal did with SSD-4k.

In fact, it’s nice to see this topic slowly morph into a thread on how to min-max qube startup times. I think as Qubes evolves, this metric will become increasingly important, since the logical conclusion of its compartmentalization-via-VM approach to security is to have a fresh VM for every little task–so how quickly a computer can spin up a new instance will be key. Even without this extreme, the usability of Qubes (as your daily driver at least) is strongly connected to how fast your computer can spin up VMs.

So a big thank you to all who have contributed so far–please keep at it! I’d contribute if I had anything to give, but this isn’t my field

augsch · May 1, 2023, 3:48pm

I got interesting results running R4.2 Alpha on BTRFS and a 4kn drive. The R4.2 installer has a new enough cryptsetup that it can automatically select the appropriate sector size, so I just needed to switch my NVMe drive to 4kn mode (using another LBA format sector size).

So, booting debian-11-minimal without memory balancing, with its memory set to 400MB, takes ~2.5s. This is better than any previous results with different NVMe drives on my machine. Specifically, it’s ~0.6s faster than another BTRFS R4.2, which is installed on an identical drive, but without 4kn.

Now, the time between me clicking on the application launcher and chromium’s window showing up, is less than 5 seconds. And my laptop’s specs are far from “powerful”!

So for anyone interested in testing R4.2 and BTRFS, enabling 4kn mode may speed up your VM boot times!

There will be BTRFS optimization landing in kernel 6.3, and I’m hoping for another leap in performance.
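For reference, a quick way to see which logical sector size a drive currently exposes is to read it from sysfs. This is a minimal sketch, assuming a Linux system and a device named `nvme0n1` (adjust to yours). The helper names `logical_block_size` and `is_4kn`, and the optional sysfs-root argument, are just for illustration. Switching the LBA format itself is done with the drive's own tooling and typically destroys all data on the drive, so it is not shown here.

```shell
#!/bin/sh
# Sketch only: helper names and the optional SYSFS_ROOT argument are hypothetical.

# logical_block_size DEV [SYSFS_ROOT] : print the logical sector size in bytes.
logical_block_size() {
    dev=$1
    root=${2:-/sys}
    cat "$root/block/$dev/queue/logical_block_size"
}

# is_4kn DEV [SYSFS_ROOT] : succeed if the device reports 4096-byte logical sectors.
is_4kn() {
    [ "$(logical_block_size "$@")" = "4096" ]
}

# Example:
#   is_4kn nvme0n1 && echo "4kn" || echo "not 4kn"
```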

51lieal · May 1, 2023, 4:56pm

Also try a 4kn template. I wrote a guide somewhere here; it should speed up VMs somewhat.

Also, since this is a fresh install, can you try a 4kn drive with LVM? Is it still not fixed?
I also found that AMD is way faster than Intel for boot speed, comparing my current i7-10750H against a Ryzen 5 5600H.

I haven't done any benchmarks, though.

augsch · May 1, 2023, 5:34pm

Thanks for your guide! I’d love to try 4kn+lvm, but I just restored a ton of VMs (to fully transition from 4.1 to 4.2 and enjoy the lightning speed), so maybe I’ll try 4kn+lvm when I get a spare NVMe drive.

Anyway, my previous install was lvm without 4kn, and its VM boot time was 1.5 seconds longer than on my btrfs install without 4kn. So I doubt that 4kn+lvm will have performance gains compared to 4kn+btrfs.

I’ll build 4kn templates to test whether they boot even faster. I read somewhere that btrfs treats the underlying fs’s sector size differently from how lvm does. Given that 4kn templates do help in booting faster on lvm (from your posts), it’s really worth investigating whether 4kn templates will boot faster than legacy templates on btrfs.

tasket · May 1, 2023, 7:26pm

I did some benchmarks on my main laptop before I converted it from tLVM to Btrfs-on-4k-luks. They are crude in that the only startup time measured is for Debian VM start + Firefox. However, I did also record VM shutdown times, and my experience is these really affect system performance when you are busily working on your system… the tLVM snapshot operations at shutdown hammer the dom0 storage layer often for 7+ seconds for medium sized volumes.

My plan is to re-do the benchmarks before long, so the restored VM volumes are still similar enough to make a fair comparison.

fiftyfourthparallel · May 2, 2023, 1:10am

@tasket Great to see you again! Sorry for (temporarily) hijacking the thread, but I’m a long-time user of the tools you’ve written for Qubes–e.g. halt-vm-by-window, system-stats-xen, and more importantly, vm-boot-protect. I recommend the various scripts in this Github repo to just about anyone looking to have a more fluid Qubes experience–especially halt-vm-by-window.

Thank you very much for making tools that I use tens–maybe hundreds of times a day.

Anyways, the reason I’m reaching out is because I’ve noticed that since R4.1, vm-boot-protect tends to fail when booting up. I know I’m not supposed to use it with disposables, but I have since R4.0 without issue… until R4.1. Now, there’s a high probability that a terminal window appears with a warning that vm-boot-protect was triggered and that the bad private files are in /mnt/xdvb (not accurate; recalling from memory).

I’d hugely appreciate it if you updated your tools for R4.1 (or even R4.2).

fiftyfourthparallel · May 2, 2023, 1:20am

With the work you and others are doing, a sub-1-second boot seems to be a fast-approaching reality. Hopefully with some template-side-tweaks (or even a native specialized template like debian-15-quick) this should be attainable within a few years.

Maybe I should start a thread asking people for ideas about what could be done with this–i.e. how to leverage the almost-negligible cost of booting (and shutting down) a VM to make Qubes more secure and more usable.

tasket · May 2, 2023, 2:54am

@fiftyfourthparallel Hi FFP! Feel free to open an issue when something like this happens. FWIW, when I upgraded to 4.1, I just kept using the same tools that are already there. Qubes-VM-hardening did get an update to the dom0 instructions.

fsflover · May 2, 2023, 1:14pm

Perhaps these two possibilities are relevant:

github.com/QubesOS/qubes-issues

Proposal and code for instantaneously started disposable VMs

opened 09:02AM - 13 Dec 15 UTC

qubesuser

T: enhancement help wanted C: core P: major release notes community dev S: needs review

Starting disposable VMs is faster than normal VMs, but it can often still take several seconds and be a noticeable delay in the user experience. This proposes to solve this issue by keeping one or more disposable VMs always around running, but without qubes-guid started and thus "invisible". When the user requests a disposable VMs, the system takes one of those cached disposable VMs, adjusts them if necessary and starts qubes-guid, and then starts another cached disposable VMs for the next request. This allows instantaneously started DispVMs at the cost of losing 1.5-6 GB of RAM, which can be a good tradeoff at least for machines with >= 16GB RAM.

There are two ways of doing this: the most flexible way would be to support any DispVM usage by starting the appropriate service on the cached DVM, and there is an inflexible but faster way that pre-starts the application as well, but only supports a limited number of DispVM applications started from dom0 (typically a web browser and a terminal).

My code implements the "inflexible" way and offers two modes: a faster "separate" mode that keeps around a DispVM for each configured application, and a slower but less RAM hungry "unified" mode that keeps a DispVM with all the applications running, and kills the ones not needed at user request.

You can find the implementation at: https://github.com/qubesuser/qubes-core-admin/tree/insta_dvm

You'll need to create a configuration file in /etc/qubes/dvms like the one provided in the branch. The mode is chosen automatically depending on available RAM, but can be configured in /etc/qubes/cached-dvm-mode

The branch is missing packaging for qubes-start-cached-dvm and the dvms config file, systemd integration for starting it at boot, and making dom0 start menu entries use it. It's also somewhat hackish overall and might need a rewrite in Python and adjustment to the new core code if shipped after that.

github.com/QubesOS/qubes-issues

Preloaded DisposableVMs

opened 08:28PM - 13 May 20 UTC

ejose19

T: enhancement C: core P: default

**The problem you're addressing (if any)**
Disposable vms are very useful for the intent they're made, however one drawback they have is that usage is not instant as other appvms, and one need to wait for it to load before using it (varying from 7-20s depending on hardware)

**Describe the solution you'd like**
It would be great if there was an option to "preload" the dispvm (quantity to preload would be defined by the user and limited by hardware specs) so whenever you need to use a dispvm the target program launches automatically.

**Where is the value to a user, and who might that user be?**
It would be a great benefit in terms of speed and convenience depending on how much the user relies on dispvms

**Additional context**
Currently behaviour would be preserved, if you launch a program for a dispvm using the qubes menu, each call would use a different preloaded dispvm, no reuse would be made. Also when one dispvm is "used", another one would be preloaded to keep the defined amount always ready.

fiftyfourthparallel · May 2, 2023, 5:05pm

The idea of preloaded dispVMs is nice, but both issues seem to indicate that it isn’t actively being worked on.

Either way, the thread I’m proposing is another one of my idea threads where people just throw things out there and hopefully something inspires someone. Kind of like my “Now you’re thinking with Qubes” thread. Might as well just go ahead and post it instead of navel-gazing.

fiftyfourthparallel · May 2, 2023, 5:06pm

Thanks, I’ll be sure to look at the changes once I have time.

kindagreen · May 26, 2023, 4:07am

I think my findings that I separately published in Speeing up VM startups will also be interesting to people here.

The “have minimal RAM to speed up startup” effect exists because the kernel wastes CPU time clearing VM pages that are already clear. I filed a ticket: Wasteful memory clearing at start of every (standard) qube kernel · Issue #8228 · QubesOS/qubes-issues · GitHub
Add the kernel option init_on_free=off to your VM’s kernel options with something like this (this overwrites the existing options, but I imagine nobody has them set nowadays anyway):

qvm-prefs -s YOUR-VM-NAME kernelopts "init_on_free=off"

I measured savings of 3+ seconds in boot time on a gen8 i7 CPU using the default 4G maxmem qube. Slower CPUs will benefit more than faster ones.
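If a qube does already have kernelopts set, something like the sketch below could append the option instead of overwriting. It assumes that `qvm-prefs VMNAME kernelopts` prints the current value in dom0; the helper name `merge_kernelopts` is made up for illustration, and it only skips exact duplicates (it does not resolve a conflicting value such as an existing init_on_free=on).

```shell
#!/bin/sh
# Sketch only: merge_kernelopts is a hypothetical helper, not a Qubes tool.

# merge_kernelopts EXISTING NEW : print EXISTING with NEW appended,
# unless NEW is already present as a whole word.
merge_kernelopts() {
    existing=$1
    new=$2
    case " $existing " in
        *" $new "*) printf '%s\n' "$existing" ;;
        *)
            if [ -n "$existing" ]; then
                printf '%s %s\n' "$existing" "$new"
            else
                printf '%s\n' "$new"
            fi
            ;;
    esac
}

# Example (dom0 only, not run here):
#   current=$(qvm-prefs YOUR-VM-NAME kernelopts)
#   qvm-prefs YOUR-VM-NAME kernelopts "$(merge_kernelopts "$current" init_on_free=off)"
```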

The “don’t include in memory balancing” seems to be related to this:

[    2.327810] xen:balloon: Waiting for initial ballooning down having finished.
[    3.058150] xen:balloon: Initial ballooning down finished.

So you basically don’t load the balloon driver and save the time its init takes. Somebody could look into what the init does and actually optimize it, I guess. Again, slower CPUs benefit more here, but other factors seem to have a big impact too. Free Xen RAM, perhaps?

[    1.309210] xen:balloon: Waiting for initial ballooning down having finished.
[    7.153167] xen:balloon: Initial ballooning down finished.
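To put a number on it, the wait is just the difference between the two timestamps: about 0.73 s in the first log and about 5.84 s in the second. A small sketch for pulling that out of the kernel log inside a qube follows; the function name `balloon_wait` is made up, and it assumes the two messages appear with the wording shown above.

```shell
#!/bin/sh
# Sketch only: balloon_wait is a hypothetical helper. It reads kernel log
# text on stdin and prints the seconds spent in initial ballooning down.
balloon_wait() {
    awk '
        # Return the first "seconds.micros" number found in a log line.
        function stamp(line) {
            if (match(line, /[0-9]+\.[0-9]+/))
                return substr(line, RSTART, RLENGTH) + 0
            return -1
        }
        /xen:balloon: Waiting for initial ballooning down/ { start = stamp($0) }
        /xen:balloon: Initial ballooning down finished/    { printf "%.6f\n", stamp($0) - start }
    '
}

# Example (inside the qube, not run here):
#   sudo dmesg | balloon_wait
```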

tempmail · May 26, 2023, 5:05am

There are preset kernelopts for sys-gui, and I set them for all qubes (zswap.enabled=0, for example, while using zram-tools), but it’s not a problem to add your option to the existing ones. Disabling ballooning does indeed dramatically increase overall performance, including boot time.

My best debian-11-minimal boot time so far is 4.05s

kindagreen · May 26, 2023, 5:11am

While hero numbers are good, for day-to-day usability the normal times should be consistently low too.

Disabling the balloon (if you don’t increase the initial RAM) also cuts the memory available at init from 4G to 400M, and that’s not a very realistic amount for running something like Firefox nowadays, I imagine.

tempmail · May 26, 2023, 5:14am

Of course. Manually setting each qube’s RAM is part of the Qubes setup routine. My services run with 300-800MB, while browser qubes run with 2048MB. Nothing uses more than that except a Win11 qube with 2118MB.
Sys-gui-gpu runs with 1000MB.

kindagreen · May 26, 2023, 5:38am

Well, manually setting RAM is quite limiting in many cases. Sharing memory in general is good, as that makes things faster.

Now, of course, with every VM running its own kernel, they cannot share crucial things like caches, so there’s quite a bit of waste. With a fast NVMe drive, though, the cache may not be needed all that much and could be dialed down substantially inside the VMs. (I just looked at a random sample of my VMs and found substantial page caches there, sometimes gigabytes, which probably means the Xen balloon driver assumes the VM needs a lot more memory than it actually does?) Despite the “free RAM is wasted RAM” adage, it might not apply as well to the Qubes situation, and keeping a healthy amount of Xen RAM free goes a long way toward speeding up VM startup. I’ll probably have to experiment some in this area, and this is probably getting somewhat off-topic for this particular thread.
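If anyone wants to do the same sampling, a rough way is to compare the Cached and MemTotal fields of /proc/meminfo inside each qube. A minimal sketch, with the helper name `cache_share` made up for illustration; it only looks at the Cached line, so it is an approximation of the page cache and nothing more.

```shell
#!/bin/sh
# Sketch only: cache_share is a hypothetical helper.
# cache_share [MEMINFO_FILE] : print the Cached size in MiB and its share
# of MemTotal. /proc/meminfo reports both values in kB.
cache_share() {
    awk '
        $1 == "MemTotal:" { total = $2 }
        $1 == "Cached:"   { cached = $2 }
        END {
            if (total > 0)
                printf "%d MiB cached (%.0f%% of RAM)\n", cached / 1024, 100 * cached / total
        }
    ' "${1:-/proc/meminfo}"
}

# Example (inside a qube, not run here):
#   cache_share
```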

brendanhoar · May 27, 2023, 4:15am

Is it the Snapshot operation or the blkdiscard? Or both?

B

fiftyfourthparallel · May 27, 2023, 1:20pm

What Tasket’s describing matches what I experience whenever I shut down a large VM (>100GB). The shutdown operation takes several times longer than it does for VMs with negligible storage (e.g. sys-net). However, I think this can be sidestepped by compartmentalizing VMs or treating larger VMs as exceptions.

@kindagreen I don’t think your theses and experiments are off-topic at all (see this post)–I eagerly await your experiment results!

Also, this thread has reached 100 posts after 2.5 years–That’s a staggering 40 posts per year! Hopefully, with a combination of hardware advances, optimizations, and ingenuity, we can get boot times down to 1 second within the next hundred posts, and show the moderators what a good return on moderating investment looks like.