Survey: CPU and VM boot time

I should clarify for the record that both versions were run on the exact same hardware (default internal NVMe storage). Is it possible that your USB is fast enough (3.0 and above?) to eliminate the difference?

To add to unman: with SSDs so cheap nowadays and the performance gains so dramatic, upgrading storage should be the first thing anyone on any platform looks at. It gives the most bang for the buck if you’re still on an HDD.

You don’t even have to get a high-capacity one: just move all ‘chunky’ files like videos onto an HDD or a thumb drive.

Here, for the stats, are the results of my test inside VMware Player on Windows 10.
I didn’t put it in the list as it runs inside a VM.

CPU: i7-4720HQ
HDD
Qubes 4.1

With Kernel 5.4:
12.5 sec
12.7 sec
12.6 sec
18 sec
12.4 sec

With Kernel 5.10:
24 sec
30 sec


My USB connection is 3.0, as far as the documentation says.

Does 3.0 offer speeds fast enough to make an OS run as though it’s using SATA?

I’m asking because I have no idea. I know that latency shouldn’t be an issue for this test, but there might be other factors.


I have now made several tests and entered the results into the table above. The results for R4.0.4 are pretty close together. Neither the CPU type, nor the type of disk connection (SATA or USB 3.0), nor the kernel version makes any big difference.

For R4.1, Qube startup is a lot (nearly 30 %) slower. The start times for R4.1 are close together and do not show the large variances that occurred with R4.0.4:

9.21
9.38
9.34
9.24
9.18
9.22
9.16
9.31
9.00
9.01
Mean: 9.21
Median: 9.22
Variance: 0.02


Thanks for taking the time to test this. The next step would be to figure out if it’s the newer Xen that’s causing this. The easiest way to go about this is to install Xen 4.14 on R4.0.4 using --enablerepo=current-testing (I think). I’ll get around to it sometime this week.
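For anyone who wants to try that before I do, here is a minimal sketch, assuming the newer Xen has been staged in the testing repository (verify the repo name first; qubes-dom0-current-testing is the usual one in dom0):

dom0# qubes-dom0-update --enablerepo=qubes-dom0-current-testing xen

That enables the testing repository for just this one transaction, so the rest of dom0 stays on the stable repo.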

That being said, I think the devs won’t worry too much about UX, since the main purpose of the OS is security, with comfort as an added bonus. Though if R4.1 pushes load times to unbearable levels on older computers, especially the X230, I think they might get on the case.

This is not really true. They even have a paid designer to improve the UX. Slow loading time is a UX-related bug like this one.

I’m pretty bummed. I have an HP Z8 G4 which I originally bought specifically to run Qubes, but I ultimately settled on an Intel NUC as my Qubes-horse over the last few years. I just dusted the HP off and installed 4.0.4 on it. It is a dual-CPU system with moderately specced Xeon CPUs:

Model name: Intel(R) Xeon(R) Silver 4114 CPU @ 2.20GHz

but with 64 GB of ECC memory and four 500 GB NVMe PCIe-attached SSDs.
Each CPU has 10 cores, and I have HT off; with HT on I could have 40 logical cores total, rather than 20.

I am using just two of the four 500 GB SSDs for Qubes, with btrfs handling the two LUKS-encrypted devices. Under btrfs, data are raid0 and metadata are raid1, so reads should be PDQ. Running the script I get an average time of 10.9, with a wild variance: min was 6.8, max was 16. Yes, all VMs were shut down.
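For anyone curious, a minimal sketch of such a layout (the device-mapper names and the mount point below are hypothetical, not my exact setup):

dom0# mkfs.btrfs -d raid0 -m raid1 /dev/mapper/luks-nvme0 /dev/mapper/luks-nvme1
dom0# btrfs filesystem df /mnt/qubes    # should report Data, RAID0 and Metadata, RAID1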

I had been feeling that this was a bit less snappy than I was hoping. It takes forever to boot compared to the NUC. Perhaps I need to try without btrfs.

Maybe we need a perf test where we spin up N disposable VMs from the Debian minimal template, and use qvm-run to run ls -Rl / or something, like the sketch below. Or give the template the same vCPU count as the physical core count and qvm-run /bin/some-embarrassingly-parallel-programme to validate my choices.
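A minimal sketch of the first idea, assuming a disposable template based on the Debian minimal template and named deb-min-dvm (that name is hypothetical); run this in dom0:

N=5
time (
  # launch N disposables in parallel, each running the same workload
  for i in $(seq $N); do
    qvm-run --dispvm=deb-min-dvm -- 'ls -Rl / >/dev/null 2>&1' &
  done
  wait    # report total wall-clock time once all N have finished
)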

Running Fedora 33 Workstation with root on btrfs on the other two 500 GB NVMe drives, also on LUKS-encrypted devices, is indeed nice and snappy.


As a non-technical person I don’t have a clue about btrfs, raid, and PDQ, but welcome the idea of scripts running variants of the VM boot test.

Since results from these tests are collated using wiki-posts, it might be easier to have separate threads for each major variant. If there are enough variants, maybe group them under a separate portal thread or a new topic label.

I’m also curious about the cause of massive variances in some systems (usually not R4.1). Even though dom0 should be the only thing running (and there’s usually little going on in its background AFAIK), I saw spikes of nearly double my previous boot times in R4.0.4. Does anyone have any idea?

Scrolling up as I’m typing this, I noticed that @GWeck just added a trove of data to the wiki. Thank you!

Things are getting slightly better.
I diddled with the BIOS settings:

  • enabled Turbo mode
  • turned on NUMA, and selected a more performant NUMA setting
  • I think there was one other setting I enabled.

Now I am getting:

t0=8.87
t1=8.83
t2=8.61
t3=9.92
t4=9.61
mu=9.168 var=0.319 sigma=0.565

t0=6.18
t1=8.92
t2=9.47
t3=9.52
t4=8.47
mu=8.512 var=1.885 sigma=1.373

I made some adjustments to the script that was posted here, and placed it on GitHub.
It is a work in progress; I intend to add some other tests. Ignore the xentrace stuff for now. I actually had my best times to date when I was collecting xentrace data at the same time, but I’m still figuring out what to do with it…

By the way, the script can be slurped into dom0 using this handy script:

dom0# cat bin/passio
#!/bin/sh
# Copy a file from a VM to stdout in dom0 via qvm-run --pass-io.
Usage="passio vm src"
VM=${1?$Usage}
SRC=${2?$Usage}

qvm-run --pass-io "${VM}" "cat ${SRC}"

As in bin/passio fromVM /home/user/Downloads/bench >./bench


Here are some results from an Intel NUC7i5BNH.

i5-7260U @ 2.20GHz
kernel: 5.4.88-1
16GB mem

t0=5.19
t1=6.90
t2=7.42
t3=8.70
t4=6.46

mu=6.934 var=1.656 sigma=1.287


I just assume that this might be of interest to the people here: regarding the performance issues in R4.1, @Jarrah gave the useful hint that this is about the CPU running at lower frequencies under Xen.
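For anyone who wants to check the frequencies on their own machine, a hedged sketch using Xen’s power-management tool in dom0 (the exact output format varies by Xen version):

dom0# xenpm get-cpufreq-para    # shows the scaling governor and current frequency for each CPU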


@GWeck Case closed?

I have run the tests again with xen_acpi_processor enabled (a sketch of how to enable it follows the results):

System

Dom0 Kernel: 5.10.21-1
CPU: AMD Ryzen Embedded V1605B (4 cores / 8 threads @ 2.0 GHz, Turbo @ 3.6 GHz)
Storage: Samsung 970 Evo Plus (M.2 NVMe PCIe 3.0)
RAM: 32 GB DDR4 2400 MHz (ECC)
Release: R4.1

Results

6.80
6.07
6.57
6.32
6.80
6.18
6.74
6.16
6.14
6.78
----------
Median: 6.445 (was: 6.305)
Mean: 6.456 (was: 6.488)
Variance: 0.096 (was: 0.258)

(Looking at the frequencies, I realized that it probably worked before.)
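
For reference, a minimal sketch of enabling the module in dom0 (the module name comes from this thread; persisting it via modules-load.d is my assumption, using the standard systemd mechanism):

dom0# modprobe xen_acpi_processor
dom0# echo xen_acpi_processor > /etc/modules-load.d/xen_acpi_processor.conf    # assumption: load on every boot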


I presume it’s for R4.1?

If so, there’s probably something else slowing it down. I didn’t modify my xen_acpi_processor and I doubt @GWeck did either.

Since I no longer have R4.1, I sadly can’t test this theory.

That’s of course on R4.1. I have added that fact to my post, because I wrongly presumed that this was implied.

Maybe AMD works better or differently than Intel?

I knew; I just wanted to be explicit for the sake of clarity. Thanks for fixing that.


Given Intel’s general poor performance these years, it wouldn’t surprise me.

I just enabled xen_acpi_processor for R4.1 (thanks to @gust in “Major known 4.1 gaps?”) and got the following values:

7.34
7.54
7.61
7.45
7.34
7.25
7.55
7.44
7.57
7.20
Mean: 7.43
Median: 7.45
Variance: 0.02

That’s pretty close to the values for R4.0, but without the large variances.
