In fact, my goal is to find a better use for the E-cores among the qubes, so that they still do their job, but without sacrificing overall performance.
You just have to pick qubes that can't take advantage of the P-cores and pin them to the E-cores, freeing up more resources on the P-cores for qubes that can use them.
I put whonix, dom0, templates, sys-net, sys-firewall, sys-vpn, and disp-mgmt on the E-cores; I don't think there is any meaningful way they can take advantage of running on faster cores.
It frees up the P-cores for running my "user" qubes.
Yes, you are right; using P-cores for dom0 and the test VM, I got the best time:
| Min | Max | Mean | Median | pstdev |
|---|---|---|---|---|
| 2.97 | 3.47 | 3.20 | 3.19 | 0.10 |
I have a few questions:
- How many vCPUs are you giving dom0 (dom0_max_vcpus)?
- Are you pinning to specific E-cores, or dynamically to 16-23?
- Do you turn on the performance governor for the E-cores?
I'm using 4 cores for dom0 and 4 cores for the other system qubes.
Dom0 is on 20-23 and the rest on 16-19; within those ranges I just let Xen do the placement, it's too much work micromanaging the pinning at the core level.
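For reference, that kind of split can be expressed with plain Xen tooling; a minimal sketch, assuming the 20-23 layout above (the sys-net line is just an example qube):

```
# Xen boot option (e.g. GRUB_CMDLINE_XEN_DEFAULT in /etc/default/grub): dom0_max_vcpus=4
# After boot, pin all of dom0's vcpus to E-cores 20-23:
xl vcpu-pin Domain-0 all 20-23
# Pin a system qube to E-cores 16-19:
xl vcpu-pin sys-net all 16-19
```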
I'm doing some tests to see what the power consumption difference is between ondemand and performance; until I have an idea of the difference, I don't want to just run everything in performance.
I think running dom0 in performance mode makes sense, but then you might also want to reduce it to just 2 cores; I don't think it really benefits from 4 dedicated cores.
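For the governor part, note that under Xen it is controlled per physical core with xenpm rather than through the Linux cpufreq sysfs files; a minimal sketch, assuming dom0 sits on cores 20-23 as above:

```
# Inspect the current cpufreq settings of core 20:
xenpm get-cpufreq-para 20
# Performance governor on dom0's cores only:
for cpu in 20 21 22 23; do xenpm set-scaling-governor $cpu performance; done
# Ondemand everywhere (no cpuid = all cores):
xenpm set-scaling-governor ondemand
```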
I have been looking at cpumask and cpupool to control on what cores a new VM is created, the idea being that you could make a small group of cores that are running in performance mode to allow fast start-up of new VMs, and once they are running they are moved to ondemand cores.
You can't use cpumask; it doesn't allow you to pin/use cores outside the mask.
It seems to work with cpupools: all new VMs are created in Pool-0, and the only limitation I have found is that dom0 also has to be in Pool-0.
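If useful, you can check the result with xl itself (flags from memory, verify with `xl help`):

```
xl cpupool-list -c   # pools and the CPUs assigned to them
xl list -c           # domains with their cpupool column
```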
I was thinking about doing something like this:
- 0-3: pool0, always in performance mode, for dom0 and starting VMs
- 4-15: P-cores
- 16-23: E-cores
It requires a bit of configuration in the boot scripts: you need to set up the pools every time the system boots, and you need to migrate each VM to the pool you want it to run in.
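The migration part has a dedicated xl subcommand; a quick sketch with made-up pool and qube names:

```
# Move an already-running qube out of Pool-0 into another pool:
xl cpupool-migrate work e-cores
```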
I've tested xl cpupool a bit; overall it's not a bad solution. But I can't figure out in which configuration file I can assign the default pool to a VM, because you have to assign the pool before starting the VM, otherwise the whole thing would be pointless.
You can use xen-user.xml to assign the pool to the qubes; I haven't tested it, but I think this should work.
I don't think there is any other way to change the default pool; Pool-0 is the default pool used when no other pool is specified.
I want to configure pool0 to be just 4 P-cores in performance mode, use that pool to start all new VMs, and when a VM is ready, migrate it to another pool with cores running ondemand.
I don't know if this improves start-up time, but ideally it gives all qubes the improved start-up time without the power consumption of running all P-cores in performance mode.
Slightly off-topic: I haven't had time to measure performance vs. ondemand power consumption yet, but since I'm not always at my desk I've hacked together a python program that monitors logind's IdleHint (via dbus) and sets the governor accordingly (performance when IdleHint=False, powersave when True). That should definitely save some Wh.
I remember that logind took a bit of time to set that flag, but I usually turn off my monitor with a keyboard shortcut when I know I'll be away for some time (but not enough time to suspend/power off the PC); in that case "dpms off" immediately triggers an IdleHint=True dbus event and the python dbus listener immediately sets the governor to "powersave".
I can paste the program and systemd unit if someone's interested.
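For anyone curious about the general shape of such a listener, here is a minimal, untested sketch (not the program mentioned above): it assumes dbus-python plus GLib, that dom0's governors are switched through xenpm, and that logind emits PropertiesChanged for IdleHint on its manager object (you may need to listen on your session object instead).

```python
#!/usr/bin/env python3
# Sketch: switch the Xen scaling governor based on logind's IdleHint.
import subprocess

import dbus
from dbus.mainloop.glib import DBusGMainLoop
from gi.repository import GLib


def set_governor(governor):
    # Without an explicit cpuid, xenpm applies the governor to all cores.
    subprocess.run(["xenpm", "set-scaling-governor", governor], check=False)


def on_properties_changed(interface, changed, invalidated):
    # PropertiesChanged carries a dict of the properties that changed.
    if "IdleHint" in changed:
        set_governor("powersave" if bool(changed["IdleHint"]) else "performance")


def main():
    DBusGMainLoop(set_as_default=True)
    bus = dbus.SystemBus()
    bus.add_signal_receiver(
        on_properties_changed,
        signal_name="PropertiesChanged",
        dbus_interface="org.freedesktop.DBus.Properties",
        bus_name="org.freedesktop.login1",
        path="/org/freedesktop/login1",
    )
    GLib.MainLoop().run()


if __name__ == "__main__":
    main()
```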
I've made the top post a wiki, you can add it there if you want; that could make it easier for people to find it in the future.
By the way - more on topic - while monitoring dbus events and inadvertently starting a qube, I noticed a bunch of dbus events just after qvm-start. If that wasn't a coincidence, it means you may be able to program whatever pinning logic immediately after startup instead of using cpupool, xen-user.xml, etc., and then re-pin the qube, eg. to E-cores, once it's started (or after a delay), by integrating @noskb's functions.
(I can give you a hand if you're not comfortable with python)
[edit - that means it should also be possible to set the governor to performance only when a qube is starting]
[edit2] - it is probably possible to do the same with qubes python bindings actually.
I have no idea how to do that in Python; you would also need to get the VM object, or you would be missing important data.
If you just want to pin as fast as possible, xen-user.xml seems like the best option; with that method you are telling Xen which cores to use when the VM is created.
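For anyone who wants to experiment with that, the per-installation override lives in /etc/qubes/templates/libvirt/xen-user.xml; below is a rough, untested sketch of hard-coding a cpuset there, assuming the <vcpu> element sits in the block shown (check the real block contents in /usr/share/qubes/templates/libvirt/xen.xml and copy them over, changing only the <vcpu> line; whether the libvirt libxl driver honours cpuset for your VM mode is also an assumption to verify):

```xml
{% extends 'libvirt/xen.xml' %}
{% block basic %}
    <!-- copy the rest of this block verbatim from
         /usr/share/qubes/templates/libvirt/xen.xml,
         then change only the <vcpu> line, e.g.: -->
    <vcpu placement="static" cpuset="4-15">{{ vm.vcpus }}</vcpu>
{% endblock %}
```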
I personally prefer the method @noskb showed, because it happens after the VM has booted, which is advantageous when you move from fast to slow cores. You can also add more commands to the same script if you want to change the governor when specific VMs start, and expand the script to revert the state when the VM shuts down.
I've done some testing on the power consumption
6 hours ondemand: 0.735 kWh
6 hours performance: 0.985 kWh
This is just from working on the PC for 6 hours, doing similar but not identical work.
It is not the most precise test, but the difference in consumption seems to be somewhere around 25%.
How did you measure the consumption? For some reason my P-cores sit at 5.2 GHz at idle even in ondemand mode. There is no overclocking as such in the BIOS. The E-cores run at 800 MHz at idle.
> If you just want to pin as fast as possible, xen-user.xml seems like the best option; with that method you are telling Xen which cores to use when the VM is created.
xen-user.xml is likely "as fast as possible", but the difference with pinning just after qvm-start should be minimal (the "qube is starting" GUI notification is shown almost immediately after qvm-start, so I expect the difference to be < 1s).
> I personally prefer the method @noskb showed […]
It's still used for pinning a qube once it's started; the idea was to avoid having pinning configuration in several places. I see there's a ~~"domain-spawned"~~ "domain-pre-start" qubes handler, so dbus isn't even needed. So:
- the python program handles ~~"domain-spawned"~~ "domain-pre-start" and "domain-start" events
- qvm-start triggers a ~~"domain-spawned"~~ "domain-pre-start" event → the python program pins the qube to performance P-cores so that it starts faster.
- a "domain-start" event is triggered once qrexec is functional → the python program pins the qube to whatever it's configured to do (eg. pin to E-cores), optionally with a delay to let the qube start additional stuff after qrexec.
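A minimal, untested sketch of how such an event listener could look with the qubesadmin bindings in dom0 (the core ranges, the plain-xl pinning and the retry logic are my assumptions, not the actual PoC):

```python
#!/usr/bin/env python3
# Sketch: pin starting qubes to P-cores, re-pin them to E-cores once started.
import asyncio
import subprocess
import time

import qubesadmin
import qubesadmin.events

P_CORES = "0-5"    # example ranges, adjust to your topology
E_CORES = "6-13"


def pin(vm_name, cores, retries=10):
    # At domain-pre-start the Xen domain may not exist yet, so retry briefly.
    for _ in range(retries):
        r = subprocess.run(["xl", "vcpu-pin", vm_name, "all", cores])
        if r.returncode == 0:
            return
        time.sleep(0.5)


def on_pre_start(vm, event, **kwargs):
    pin(vm.name, P_CORES)   # fast cores while the qube boots


def on_start(vm, event, **kwargs):
    pin(vm.name, E_CORES)   # move it once qrexec is up


def main():
    app = qubesadmin.Qubes()
    dispatcher = qubesadmin.events.EventsDispatcher(app)
    dispatcher.add_handler("domain-pre-start", on_pre_start)
    dispatcher.add_handler("domain-start", on_start)
    asyncio.run(dispatcher.listen_for_events())


if __name__ == "__main__":
    main()
```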
I'll try that a bit later today
I used a wattmeter to measure the total consumption of the PC.
I do have some extra hardware (2x PCI USB controllers, an Nvidia GPU, 2x HDD, 2x NVMe), so the total usage wouldn't be the same on a system with different hardware, but the difference between performance and ondemand should be similar.
If adding the pool via xen-user.xml does not work, you can remove all CPUs from Pool-0 when the host boots, leaving only the CPUs that run in performance mode. All VMs will use these CPUs on first start, and afterwards we can do migration or pinning with the @noskb script. This is a theory, I haven't tested it, but it should work as a workaround if there is no way to set the pool in xen-user.xml.
You can't just remove the cores.
You need to create the P-core and E-core pools, then remove all cores from Pool-0 except for the cores you want to keep in Pool-0. Then you can add the P- and E-cores to their pools, and they are ready for use. You have to do this each time you reboot the system.
I easily removed them with:
```
xl cpupool-cpu-remove Pool-0 16
xl cpupool-cpu-remove Pool-0 17
xl cpupool-cpu-remove Pool-0 18
xl cpupool-cpu-remove Pool-0 19
xl cpupool-cpu-remove Pool-0 20
xl cpupool-cpu-remove Pool-0 21
xl cpupool-cpu-remove Pool-0 22
xl cpupool-cpu-remove Pool-0 23
```
I then added them to a new pool:
```
xl cpupool-cpu-add testing 16
xl cpupool-cpu-add testing 17
xl cpupool-cpu-add testing 18
xl cpupool-cpu-add testing 19
xl cpupool-cpu-add testing 20
xl cpupool-cpu-add testing 21
xl cpupool-cpu-add testing 22
xl cpupool-cpu-add testing 23
```
The pool itself was pre-created with:
```
xl cpupool-create name=\"testing\"
```
All this can be done at host startup, before starting any VM, with systemd for example. All VMs will initially use Pool-0; you can leave the necessary CPUs there. Then the pinning script will do its job by assigning the right CPUs by tag, by P-cores, E-cores, etc.
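A minimal, untested sketch of what that systemd glue could look like (file names and paths are made up; depending on your setup you may also need to order the unit before qube autostart):

```
#!/bin/sh
# /usr/local/bin/cpupool-setup.sh (hypothetical path)
# Recreate the "testing" pool at boot and move cores 16-23 into it.
xl cpupool-create 'name="testing"'
for cpu in $(seq 16 23); do
    xl cpupool-cpu-remove Pool-0 "$cpu"
    xl cpupool-cpu-add testing "$cpu"
done
```

```
# /etc/systemd/system/cpupool-setup.service (hypothetical)
[Unit]
Description=Recreate Xen cpupools at boot

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/usr/local/bin/cpupool-setup.sh

[Install]
WantedBy=multi-user.target
```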
FWIW the "domain-pre-start" / "domain-start" approach seems to work well, without needing to tweak xml files, cpu pools, etc.; PoC:
dom0 1:1 E-core pinning + dynamic qube vcpu->core pinning · GitHub
Algorithm: pin any starting qube to P-cores; then pin started qubes that don't have the performance tag to E-cores, after an optional 20 s timeout to allow apps to start. Obviously any other logic can be implemented.
Eg. start a dispMedium qube that has the "performance" tag set, and then immediately start vault (which doesn't have the tag set):
```
...:19 dom0 cpu_pinning.py[12309]: INFO:root:dispMedium: domain-pre-start
...:20 dom0 cpu_pinning.py[12309]: INFO:root:dispMedium: pinned to cores 0-5
...:22 dom0 cpu_pinning.py[12309]: INFO:root:vault: domain-pre-start
...:23 dom0 cpu_pinning.py[12309]: INFO:root:vault: pinned to cores 0-5
...:25 dom0 cpu_pinning.py[12309]: INFO:root:dispMedium: domain-start
...:30 dom0 cpu_pinning.py[12309]: INFO:root:vault: domain-start
...:50 dom0 cpu_pinning.py[12309]: INFO:root:vault: pinned to cores 6-13
```
btw here's a gist of the dbus listener that sets performance/ondemand according to user presence/IdleHint:
(I'll wait a little bit before updating the wiki page - looks like this whole thread will eventually morph into a full blown guide)
I'm a little confused as to what you mean. Are you explaining the @noskb script? Or are you already using a modified one? The @noskb script just uses xl vcpu-pin, and with it you can't assign vcpus before the VM starts, so the VM won't actually be on the correct (performance) vcpus the moment it starts, since it uses the domain-start event.