Survey: CPU and VM boot time

This is not really true. They even have a paid designer to improve the UX. Slow loading time is a UX-related bug like this one.

I’m pretty bummed. I have an HP Z8 G4 which I originally bought specifically to run Qubes, but I ultimately settled on an Intel NUC as my Qubes-horse over the last few years. I just dusted the HP off and installed 4.0.4 on it. It is a dual-CPU system with moderately specced Xeon CPUs

Model name: Intel(R) Xeon(R) Silver 4114 CPU @ 2.20GHz

but with 64GB ECC memory and four 500GB PCIe-attached NVMe SSDs.
Each CPU has 10 cores, and I have HT off, but with HT on I could have 40 cores total, rather than 20 total.

I am using just two of the four 500GB SSDs for Qubes, with btrfs on top of the two LUKS-encrypted devices. Under btrfs, data is raid0 and metadata is raid1, so reads should be PDQ. Running the script, I get an average time of 10.9 s, with wild variance: min was 6.8, max was 16. Yes, all VMs were shut down.

I had been feeling that this was a bit less snappy than I was hoping. It takes forever to boot compared to the NUC. Perhaps I need to try without btrfs.

Maybe we need a perf test where we spin up N disposable VMs from the Debian minimal template and use qvm-run to run ls -Rl / or something. Or give the template as many VCPUs as there are physical cores and qvm-run /bin/some-embarrassingly-parallel-programme to validate my choices.
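A rough sketch of what such a timing harness could look like, in plain shell. The function name time_runs is made up for illustration, and it assumes GNU date (for %N), so treat it as a starting point rather than the script used in this thread.

```shell
# time_runs N CMD [ARGS...]
# Run CMD N times and print the wall-clock time of each run as tI=SECONDS.
# Hypothetical helper; assumes GNU date, which supports %N (nanoseconds).
time_runs() (
    n=$1
    shift
    i=0
    while [ "$i" -lt "$n" ]; do
        start=$(date +%s.%N)
        "$@" >/dev/null 2>&1      # discard the command's own output
        end=$(date +%s.%N)
        LC_ALL=C awk -v i="$i" -v s="$start" -v e="$end" \
            'BEGIN { printf "t%d=%.2f\n", i, e - s }'
        i=$((i + 1))
    done
)
```

For the disposable-VM idea, the command passed in could be a qvm-run call with --dispvm pointing at a DVM template based on Debian minimal. The exact invocation and template name depend on the local setup and are not verified here.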

Running Fedora 33 Workstation with root on btrfs on the other two 500GB NVMe drives, also on LUKS-encrypted devices, is indeed nice and snappy.

3 Likes

As a non-technical person I don’t have a clue about btrfs, raid, and PDQ, but welcome the idea of scripts running variants of the VM boot test.

Since results from these tests are collated using wiki-posts, it might be easier to have separate threads for each major variant. If there are enough variants, maybe group them under a separate portal thread or a new topic label.

I’m also curious about the cause of massive variances in some systems (usually not R4.1). Even though dom0 should be the only thing running (and there’s usually little going on in its background AFAIK), I saw spikes of nearly double my previous boot times in R4.0.4. Does anyone have any idea?

 


Scrolling up as I’m typing this, I just noticed @GWeck just added a trove of data to the wiki. Thank you!

Things are getting slightly better.
I diddled with the BIOS settings:

  • enabled Turbo mode
  • turned on NUMA, and selected some more performant NUMA setting
  • I think there was one other setting I enabled.

Now I am getting:

t0=8.87
t1=8.83
t2=8.61
t3=9.92
t4=9.61
mu=9.168 var=0.319 sigma=0.565

t0=6.18
t1=8.92
t2=9.47
t3=9.52
t4=8.47
mu=8.512 var=1.885 sigma=1.373
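For anyone who wants to recompute these summary lines from raw timings, here is a minimal sketch. The function name stats is hypothetical. The posted figures are reproduced when var is taken as the sample variance (divisor n-1); that the benchmark script computes it this way is inferred from the numbers, not from its source.

```shell
# stats: read one timing per line on stdin (an optional "tN=" prefix is
# stripped) and print mean, sample variance (n-1 divisor) and std deviation.
# Hypothetical helper; the n-1 divisor is an assumption that matches the
# mu/var/sigma lines posted above.
stats() {
    sed 's/^t[0-9]*=//' | LC_ALL=C awk '
        NF { x[n++] = $1; sum += $1 }
        END {
            if (n < 2) { print "need at least 2 samples"; exit 1 }
            mu = sum / n
            for (i = 0; i < n; i++) ss += (x[i] - mu) ^ 2
            var = ss / (n - 1)
            printf "mu=%.3f var=%.3f sigma=%.3f\n", mu, var, sqrt(var)
        }'
}
```

Feeding it the first run above (t0 to t4) prints mu=9.168 var=0.319 sigma=0.565.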

I made some adjustments to the script that was posted here and placed it on GitHub.

It is a work in progress. I intend to add some other tests. Ignore the xentrace stuff for now. I actually had my best times to date when I was collecting xentrace data at the same time, but figuring out what to do with it…

By the way, the script can be slurped into dom0 using this handy script:

dom0# cat bin/passio
# Print a file from a VM on dom0's stdout; redirect it to save a copy.
Usage="passio vm src"
VM=${1?$Usage}
SRC=${2?$Usage}

qvm-run --pass-io "${VM}" "cat ${SRC}"

As in bin/passio fromVM /home/user/Downloads/bench >./bench

1 Like

Here are some results from an Intel NUC7i5BNH.

i5-7260U @ 2.20GHz
kernel: 5.4.88-1
16GB mem

t0=5.19
t1=6.90
t2=7.42
t3=8.70
t4=6.46

mu=6.934 var=1.656 sigma=1.287

1 Like

I assume this might be of interest to the people here: regarding the performance issues in R4.1, @Jarrah gave the useful hint that the cause is the CPU running at lower frequencies under Xen.

1 Like

@GWeck Case closed?

I have run the tests again with xen_acpi_processor enabled:

System

Dom0 Kernel: 5.10.21-1
CPU: AMD Ryzen Embedded V1605B (4 Cores / 8 Threads @ 2.0 GHz with Turbo @ 3.6 GHz)
Storage: Samsung 970 Evo Plus (M.2 NVMe PCIe 3.0)
RAM: 32 GB DDR4 2400 MHz (ECC)
Release: R4.1

Results

6.80
6.07
6.57
6.32
6.80
6.18
6.74
6.16
6.14
6.78
----------
Median: 6.445 (was: 6.305)
Mean: 6.456 (was: 6.488)
Variance: 0.096 (was: 0.258)
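The median line can be recomputed from the raw timings with a few lines of shell. This is only a sketch: the function name median is made up, and the comma-to-dot conversion is a convenience for timings pasted with decimal commas.

```shell
# median: read one timing per line on stdin and print the median.
# Hypothetical helper. Decimal commas (e.g. 7,34) are converted to dots first.
median() {
    tr ',' '.' | LC_ALL=C sort -n | LC_ALL=C awk '
        NF { x[n++] = $1 }
        END {
            if (n == 0) exit 1
            if (n % 2) m = x[(n - 1) / 2]                 # odd count: middle value
            else m = (x[n / 2 - 1] + x[n / 2]) / 2        # even count: mean of the two middle values
            printf "%.3f\n", m
        }'
}
```

With the ten values above it prints 6.445, matching the posted median.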

(Looking at the frequencies, I realized that it probably worked before.)

2 Likes

I presume it’s for R4.1?

If so, there’s probably something else slowing it down. I didn’t modify my xen_acpi_processor and I doubt @GWeck did either.

Since I no longer have R4.1, I sadly can’t test this theory.

That’s of course on R4.1. I have added that fact to my post, because I wrongly presumed that this was implied.

Maybe AMD works better than, or just differently from, Intel?

I knew; I just wanted to be explicit for the sake of clarity. Thanks for fixing that.

2 Likes

Given Intel’s generally poor performance in recent years, it wouldn’t surprise me.

I just enabled xen_acpi_processor for R4.1 (thanks to @gust in "Major known 4.1 gaps?") and got the following values:

7,34
7,54
7,61
7,45
7,34
7,25
7,55
7,44
7,57
7,20
Mean: 7.43
Median: 7.45
Variance: 0.02

That’s pretty close to the values for R4.0, but without the large variances.

4 Likes

@GWeck

Do you have it running on a Raspberry Pi? I know Xen has already been ported to and deployed on the Raspberry Pi, but I didn’t know anyone had ported Qubes to it.

The answer is in the post above yours:
"I just enabled xen_acpi_processor for R4.1"

No, it’s an HP EliteBook 840 G4: just a standard PC, originally intended for Windows 10.

I ran the test for R4.1.0 using my i7-1065G7. The results weren’t as good as the original test on R4.0, but better than the R4.1 alpha tests.

I’ve updated the tables so that all results from the R4.1 Alpha are clearly marked as such. Going forward, ‘R4.1’ refers to the stable release. I’ve also taken the liberty of removing minor version numbers for both kernels and releases, since they just needlessly complicate things.

5.98
6.01
6.04
6.08
6.09
6.09
6.10
6.12
6.12
6.15

Mean: 6.08
Median: 6.09
Range: 5.98 - 6.15

Interesting thread, I wasn’t aware of it before.

But yes, 4.1 is slower according to my analysis as well [1], unfortunately for various reasons.

Actually I even wrote a tool just to measure it in more detail [2].

[1] 4.1 VM startup & qrexec performance issues, Issue #7075, QubesOS/qubes-issues (GitHub)
[2] 3hhh/qubes-performance (GitHub): Analyze Qubes OS VM startup performance.

1 Like

I’ve run several benchmarks using Fedora with several configurations; check here for an interesting discussion.

CPU : I7-10750H
Storage : WD SN 730 512GB
File System : LVM-XFS
Sector Size : 512b

Linux dom0 5.10.90-1.fc32.qubes.x86_64 #1 SMP Thu Jan 13 20:46:58 CET 2022 x86_64 x86_64 x86_64 GNU/Linux

# Boot speed
Startup finished in 4.897s (firmware) + 2.523s (loader) + 2.946s (kernel) + 8.787s (initrd) + 3.705s (userspace) = 22.861s
Startup finished in 4.868s (firmware) + 2.513s (loader) + 2.938s (kernel) + 8.817s (initrd) + 3.765s (userspace) = 22.902s
Startup finished in 4.874s (firmware) + 2.511s (loader) + 2.945s (kernel) + 8.255s (initrd) + 3.732s (userspace) = 22.318s

512b template fedora-34-full
# VM Boot 
6.24
4.81
4.68
5.14
4.84

# Cryptsetup-reencrypt
Finished, time 15:24.011, 486745 MiB written, speed 526.8 MiB/s

CPU : I7-10750H
Storage : WD SN 730 512GB
File System : BTRFS+blake2b
Sector Size : 4kn

Linux dom0 5.10.90-1.fc32.qubes.x86_64 #1 SMP Thu Jan 13 20:46:58 CET 2022 x86_64 x86_64 x86_64 GNU/Linux

Startup finished in 4.898s (firmware) + 2.499s (loader) + 2.878s (kernel) + 7.922s (initrd) + 3.489s (userspace) = 21.688s
Startup finished in 4.878s (firmware) + 1.405s (loader) + 2.882s (kernel) + 7.936s (initrd) + 3.523s (userspace) = 20.626s
Startup finished in 4.889s (firmware) + 1.405s (loader) + 2.881s (kernel) + 7.817s (initrd) + 3.524s (userspace) = 20.518s

512b template fedora-34-full
#VM Boot
5.72
4.48
4.62
4.48
4.62

# directio
Finished, time 11:17.770, 486745 MiB written, speed 718.2 MiB/s

# no directio
Finished, time 12:32.743, 486745 MiB written, speed 646.6 MiB/s

CPU : I7-10750H
Storage : WD SN 730 512GB
File System : LVM+XFS
Sector Size : 4kn

Linux dom0 5.10.90-1.fc32.qubes.x86_64 #1 SMP Thu Jan 13 20:46:58 CET 2022 x86_64 x86_64 x86_64 GNU/Linux

Startup finished in 8.740s (firmware) + 2.472s (loader) + 2.947s (kernel) + 7.879s (initrd) + 3.588s (userspace) = 25.628s
Startup finished in 8.720s (firmware) + 2.469s (loader) + 2.947s (kernel) + 7.896s (initrd) + 3.679s (userspace) = 25.713s
Startup finished in 5.331s (firmware) + 2.479s (loader) + 2.947s (kernel) + 8.438s (initrd) + 3.619s (userspace) = 22.816s 

512b template fedora-34-full
#VM Boot
5.75
4.60
4.59
4.61
4.59

4kn template fedora-35
#full-4096
3.77
3.89
4.03
3.86
3.90

#minimal-4096
3.68
3.67
3.62
3.79
3.68


#full-512b
3.62
3.70
3.82
3.58
3.62

# directio
Finished, time 13:27.041, 486745 MiB written, speed 603.1 MiB/s

# no directio
Finished, time 12:29.073, 486745 MiB written, speed 649.8 MiB/s
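
As a sanity check on the reencrypt figures, the reported speed is just MiB written divided by elapsed seconds. A small sketch follows; the function name throughput is hypothetical, and it assumes the time is given as MM:SS.mmm or HH:MM:SS.mmm.

```shell
# throughput MIB TIME
# Print MiB/s for MIB written in TIME, where TIME is [HH:]MM:SS.mmm.
# Hypothetical helper for cross-checking the "Finished, time ..." lines.
throughput() (
    LC_ALL=C awk -v mib="$1" -v t="$2" 'BEGIN {
        n = split(t, part, ":")
        secs = 0
        for (i = 1; i <= n; i++) secs = secs * 60 + part[i]   # base-60 accumulate
        if (secs <= 0) exit 1
        printf "%.1f\n", mib / secs
    }'
)
```

For example, throughput 486745 15:24.011 gives 526.8, which agrees with the first LVM-XFS run.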
2 Likes