Patches seem to be applied correctly.
Will try to upgrade the stubdom-linux dependencies to the latest available versions and port the required patches.
Upgraded the stubdom dependencies to the latest available.
I am now encountering an Out Of Memory issue when I try to start any HVM or PV.
From the logs, it seems to OOM when trying to copy 'rootfs'.
My 'rootfs' is much larger than the one in the original Qubes stubdom-linux (around 2 times bigger).
The reason it is bigger is that a lot of options have been added to QEMU since the last upgrade.
I could disable them to reduce the size of the rootfs (I suspect the issue is that the script initializing the rootfs fails because the rootfs is too big), but I am not sure that is a good idea. I am still learning, or trying to learn, how Xen and QEMU interact with each other. Many new QEMU options seem interesting by their name (vfio/rdma/avx2/…), but I do not understand yet whether they have an impact, or if somehow all of that is passed to Xen, with Xen doing all the work.
So for the moment, I am looking at how/whether it is possible to use a big rootfs (77 MB uncompressed, compared to the 32 MB uncompressed of the original Qubes OS rootfs).
The issue was indeed that the rootfs was too big. The solution was to strip the embedded binaries. Now a new issue: still not able to launch an HVM or PV after the stubdom upgrade.
Now back to the original issue of PCI passthrough; upgrading the stubdom dependencies wasn't the solution.
I can change the crash error message by disabling or enabling this patch: qubes-vmm-xen-stubdom-linux/0008-xen-fix-stubdom-PCI-addr.patch at master · QubesOS/qubes-vmm-xen-stubdom-linux · GitHub.
Without this patch, the error when trying to do PCI passthrough is "could not open '/sys/bus/pci/devices/0000:01:00.0/config': No such file or directory". It seems the patch is designed to avoid this specific error.
When enabling this patch, it crashes with the error I previously posted: "Domain 4:Offset 0x000e:0x49090000 expands past register size (1)", "xen_pt_config_reg_init: Offset 0x000e mismatch! Emulated=0x0080, host=0x49090000, syncing to 0x49090000".
Still no idea what the solution to this issue is, but "what is the issue" seems a bit clearer to me.
From the logs, a difference appears between standard Qubes and the new Xen.
The flag PCI_BASE_ADDRESS_MEM_TYPE_64 (0x04) seems to be used: I see type 0x04 in my custom build, while standard Qubes OS seems to use PCI_BASE_ADDRESS_MEM_TYPE_32. To be confirmed. Still no idea what this means for the fix I need to do (see the sketch below for what these type bits are).
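For reference, those type values come from the PCI BAR layout. A minimal sketch decoding them, with the constants as defined in Linux's include/uapi/linux/pci_regs.h (the raw BAR value here is made up for illustration):

#include <stdio.h>

/* BAR "type" bits, as in Linux's include/uapi/linux/pci_regs.h.
 * A 64-bit memory BAR (type 0x04) spans two consecutive BAR slots,
 * unlike a 32-bit one (type 0x00). */
#define PCI_BASE_ADDRESS_MEM_TYPE_MASK 0x06
#define PCI_BASE_ADDRESS_MEM_TYPE_32   0x00
#define PCI_BASE_ADDRESS_MEM_TYPE_64   0x04

int main(void)
{
    unsigned int bar = 0xf0000004; /* hypothetical raw BAR value */

    if ((bar & PCI_BASE_ADDRESS_MEM_TYPE_MASK) == PCI_BASE_ADDRESS_MEM_TYPE_64)
        printf("64-bit memory BAR (takes two BAR slots)\n");
    else if ((bar & PCI_BASE_ADDRESS_MEM_TYPE_MASK) == PCI_BASE_ADDRESS_MEM_TYPE_32)
        printf("32-bit memory BAR\n");
    return 0;
}

So if the two builds really disagree here, they are mapping the same device's BARs as different widths, which could plausibly relate to a config-space mismatch like the one above.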
Update: This specific issue is fixed; I made some mistakes when upgrading the RPM spec for qubes-vmm-xen. PCI passthrough still doesn't work, but it crashes a bit later in the initialization steps. It is now talking about an "rdm check flag"; will try to learn what that is.
Major update:
The libvirt error message was a bit misleading.
However, the Xen error message was quite explicit and directly suggested that I try to set the 'permissive' attribute.
I posted this message from my custom Qubes build, with Xen 4.16.2, libvirt 8.9.0, QEMU 7.1.
(A lot more work is still required: testing, a lot of testing. Cleaning the code, trying to reduce the size of the diff between my fork and the official Qubes OS. Rewriting the git commit history (don't look at it, it was my try&die workflow), and many other things. But now I am certain that I will make it work as I want.)
builder.conf:
# vim: ft=make ts=4 sw=4
# Ready to use config for full build of the latest development version Qubes OS (aka "master").
GIT_BASEURL ?= https://github.com
GIT_PREFIX ?= QubesOS/qubes-
NO_SIGN ?= 1
#BRANCH ?= release4.1
BACKEND_VMM=xen
DIST_DOM0 ?= fc37
DISTS_VM ?= fc36
VERBOSE ?= 1
DEBUG ?= 1
#DISTS_VM ?= bullseye fc36
MGMT_COMPONENTS = \
mgmt-salt \
mgmt-salt-base \
mgmt-salt-base-topd \
mgmt-salt-base-config \
mgmt-salt-dom0-qvm \
mgmt-salt-dom0-virtual-machines \
mgmt-salt-dom0-update
COMPONENTS ?= \
vmm-xen \
core-libvirt \
core-vchan-xen \
core-qubesdb \
core-qrexec \
linux-utils \
python-cffi \
python-xcffib \
python-hid \
python-u2flib-host \
python-qasync \
python-panflute \
rpm-oxide \
core-admin \
core-admin-client \
core-admin-addon-whonix \
core-admin-linux \
core-agent-linux \
intel-microcode \
linux-firmware \
linux-kernel \
artwork \
grub2-theme \
gui-common \
gui-daemon \
gui-agent-linux \
gui-agent-xen-hvm-stubdom \
app-linux-split-gpg \
app-thunderbird \
app-linux-pdf-converter \
app-linux-img-converter \
app-linux-input-proxy \
app-linux-usb-proxy \
app-linux-snapd-helper \
app-shutdown-idle \
app-yubikey \
app-u2f \
screenshot-helper \
$(MGMT_COMPONENTS) \
infrastructure \
repo-templates \
meta-packages \
pykickstart \
vmm-xen-stubdom-linux \
manager \
desktop-linux-common \
desktop-linux-kde \
desktop-linux-xfce4 \
desktop-linux-xfce4-xfwm4 \
desktop-linux-i3 \
desktop-linux-i3-settings-qubes \
desktop-linux-awesome \
desktop-linux-manager \
grubby-dummy \
dummy-psu \
dummy-backlight \
linux-gbulb \
linux-scrypt \
xdotool \
linux-template-builder \
installer-qubes-os \
qubes-release \
blivet \
lorax \
lorax-templates \
anaconda \
anaconda-addon \
linux-yum \
linux-deb \
tpm-extra \
trousers-changer \
antievilmaid \
xscreensaver \
remote-support \
builder \
builder-debian \
builder-rpm
#python-objgraph
#grub2
# vmm-xen-stubdom-legacy
# seabios
# linux-pvgrub2
# lvm2
# efitools
# tpm2-tss
# tpm2-tools
# sbsigntool
# windows-tools-cross
#
#
# alsa-lib
# alsa-utils
# alsa-sof-firmware
# xorg-x11-drv-intel
# xorg-x11-drv-amdgpu
BUILDER_PLUGINS = builder-rpm
#BUILDER_PLUGINS = builder-rpm builder-debian
BUILDER_PLUGINS += mgmt-salt
WINDOWS_COMPONENTS = \
vmm-xen-windows-pvdrivers \
windows-utils \
core-agent-windows \
gui-agent-windows \
installer-qubes-os-windows-tools \
builder-windows
# Uncomment this to enable windows tools build
#DISTS_VM += win7x64
#COMPONENTS += $(WINDOWS_COMPONENTS)
#BUILDER_PLUGINS += builder-windows
INSECURE_SKIP_CHECKING = linux-kernel vmm-xen core-libvirt core-qrexec vmm-xen-stubdom-linux anaconda installer-qubes-os qubes-release meta-packages core-admin lorax lorax-templates blivet linux-firmware pykickstart core-admin-linux core-vchan-xen anaconda-addon mgmt-salt-dom0-qvm mgmt-salt-base-topd mgmt-salt-base manager gui-agent-xen-hvm-stubdom
GIT_URL_gui_agent_xen_hvm_stubdom = https://github.com/neowutran/qubes-gui-agent-xen-hvm-stubdom.git
BRANCH_gui_agent_xen_hvm_stubdom = master
GIT_URL_manager = https://github.com/neowutran/qubes-manager.git
BRANCH_manager = master
GIT_URL_mgmt_salt_dom0_qvm = https://github.com/neowutran/qubes-mgmt-salt-dom0-qvm.git
BRANCH_mgmt_salt_dom0_qvm = master
GIT_URL_mgmt_salt_base_topd = https://github.com/neowutran/qubes-mgmt-salt-base-topd.git
BRANCH_mgmt_salt_base_topd = master
GIT_URL_mgmt_salt_base = https://github.com/neowutran/qubes-mgmt-salt-base.git
BRANCH_mgmt_salt_base = master
GIT_URL_core_vchan_xen = https://github.com/neowutran/qubes-core-vchan-xen.git
BRANCH_core_vchan_xen = master
GIT_URL_core_admin_linux = https://github.com/neowutran/qubes-core-admin-linux.git
BRANCH_core_admin_linux = master
GIT_URL_blivet = https://github.com/neowutran/qubes-blivet.git
BRANCH_blivet = master
GIT_URL_pykickstart = https://github.com/neowutran/qubes-pykickstart.git
BRANCH_pykickstart = master
GIT_URL_lorax = https://github.com/neowutran/qubes-lorax.git
BRANCH_lorax = master
GIT_URL_lorax_templates = https://github.com/neowutran/qubes-lorax-templates.git
BRANCH_lorax_templates = master
GIT_URL_installer_qubes_os = https://github.com/neowutran/qubes-installer-qubes-os.git
BRANCH_installer_qubes_os = master
GIT_URL_core_admin = https://github.com/neowutran/qubes-core-admin.git
BRANCH_core_admin = master
GIT_URL_qubes_release = https://github.com/neowutran/qubes-qubes-release.git
BRANCH_qubes_release = master
GIT_URL_meta_packages = https://github.com/neowutran/qubes-meta-packages.git
BRANCH_meta_packages = master
GIT_URL_vmm_xen_stubdom_linux = https://github.com/neowutran/qubes-vmm-xen-stubdom-linux.git
BRANCH_vmm_xen_stubdom_linux = master
#BRANCH_vmm_xen_stubdom_linux = alternative_try
GIT_URL_anaconda = https://github.com/neowutran/qubes-anaconda.git
BRANCH_anaconda = master
GIT_URL_anaconda_addon = https://github.com/neowutran/qubes-anaconda-addon.git
BRANCH_anaconda_addon = master
GIT_URL_core_qrexec = https://github.com/neowutran/qubes-core-qrexec.git
BRANCH_core_qrexec = master
GIT_URL_core_libvirt = https://github.com/neowutran/qubes-core-libvirt.git
BRANCH_core_libvirt = master
GIT_URL_vmm_xen = https://github.com/neowutran/qubes-vmm-xen.git
BRANCH_vmm_xen = xen-4.14
#INSECURE_SKIP_CHECKING = linux-kernel
GIT_URL_linux_kernel = https://github.com/neowutran/qubes-linux-kernel.git
BRANCH_linux_kernel = master
#GIT_URL_linux_firmware = https://github.com/neowutran/qubes-linux-firmware.git
#BRANCH_linux_firmware = master
BRANCH_linux_template_builder = master
BRANCH_linux_yum = master
BRANCH_linux_deb = master
BRANCH_app_linux_split_gpg = master
BRANCH_app_linux_tor = master
BRANCH_app_thunderbird = master
BRANCH_app_linux_pdf_converter = master
BRANCH_app_linux_img_converter = master
BRANCH_app_linux_input_proxy = master
BRANCH_app_linux_usb_proxy = master
BRANCH_app_linux_snapd_helper = master
BRANCH_app_shutdown_idle = master
BRANCH_app_yubikey = master
BRANCH_app_u2f = master
BRANCH_builder = master
BRANCH_builder_rpm = master
BRANCH_builder_debian = master
BRANCH_builder_archlinux = master
BRANCH_builder_github = master
BRANCH_builder_windows = master
BRANCH_infrastructure = master
BRANCH_template_whonix = master
BRANCH_template_kali = master
BRANCH_grubby_dummy = master
BRANCH_xorg_x11_drv_intel = master
BRANCH_linux_pvgrub2 = master
BRANCH_linux_scrypt = master
BRANCH_linux_gbulb = master
BRANCH_python_cffi = master
BRANCH_python_xcffib = master
BRANCH_python_quamash = master
BRANCH_python_objgraph = master
BRANCH_python_hid = master
BRANCH_python_u2flib_host = master
BRANCH_python_qasync = master
BRANCH_python_panflute = master
BRANCH_intel_microcode = master
BRANCH_xdotool = master
BRANCH_rpm_oxide = main
BRANCH_alsa_lib = main
BRANCH_alsa_utils = main
BRANCH_alsa_sof_firmware = main
BRANCH_efitools = main
BRANCH_sbsigntools = main
BRANCH_tpm2_tss = main
BRANCH_tpm2_tools = main
TEMPLATE_ROOT_WITH_PARTITIONS = 1
TEMPLATE_LABEL ?=
# Fedora
TEMPLATE_LABEL += fc34:fedora-34
TEMPLATE_LABEL += fc35:fedora-35
TEMPLATE_LABEL += fc36:fedora-36
TEMPLATE_LABEL += fc34+minimal:fedora-34-minimal
TEMPLATE_LABEL += fc35+minimal:fedora-35-minimal
TEMPLATE_LABEL += fc36+minimal:fedora-36-minimal
TEMPLATE_LABEL += fc34+xfce:fedora-34-xfce
TEMPLATE_LABEL += fc35+xfce:fedora-35-xfce
TEMPLATE_LABEL += fc36+xfce:fedora-36-xfce
# Debian
TEMPLATE_LABEL += stretch:debian-9
TEMPLATE_LABEL += stretch+standard:debian-9
TEMPLATE_LABEL += stretch+xfce:debian-9-xfce
TEMPLATE_LABEL += buster:debian-10
TEMPLATE_LABEL += buster+standard:debian-10
TEMPLATE_LABEL += buster+xfce:debian-10-xfce
TEMPLATE_LABEL += bullseye:debian-11
TEMPLATE_LABEL += bullseye+standard+firmware:debian-11
TEMPLATE_LABEL += bullseye+xfce:debian-11-xfce
TEMPLATE_LABEL += bookworm:debian-12
TEMPLATE_LABEL += bookworm+standard:debian-12
TEMPLATE_LABEL += bookworm+xfce:debian-12-xfce
# Ubuntu
TEMPLATE_LABEL += bionic+standard:bionic
TEMPLATE_LABEL += focal+standard:focal
# Whonix
TEMPLATE_LABEL += buster+whonix-gateway+minimal+no-recommends:whonix-gw-15
TEMPLATE_LABEL += buster+whonix-workstation+minimal+no-recommends:whonix-ws-15
TEMPLATE_LABEL += bullseye+whonix-gateway+minimal+no-recommends:whonix-gw-16
TEMPLATE_LABEL += bullseye+whonix-workstation+minimal+no-recommends:whonix-ws-16
# CentOS
TEMPLATE_LABEL += centos7:centos-7
TEMPLATE_LABEL += centos7+minimal:centos-7-minimal
TEMPLATE_LABEL += centos7+xfce:centos-7-xfce
TEMPLATE_LABEL += centos-stream8:centos-stream-8
TEMPLATE_LABEL += centos-stream8+minimal:centos-stream-8-minimal
TEMPLATE_LABEL += centos-stream8+xfce:centos-stream-8-xfce
TEMPLATE_ALIAS ?=
# Debian
TEMPLATE_ALIAS += stretch:stretch+standard
TEMPLATE_ALIAS += stretch+gnome:stretch+gnome+standard
TEMPLATE_ALIAS += stretch+minimal:stretch+minimal+no-recommends
TEMPLATE_ALIAS += buster:buster+standard
TEMPLATE_ALIAS += buster+gnome:buster+gnome+standard
TEMPLATE_ALIAS += buster+minimal:buster+minimal+no-recommends
TEMPLATE_ALIAS += bullseye:bullseye+standard+firmware
TEMPLATE_ALIAS += bullseye+gnome:bullseye+gnome+standard+firmware
TEMPLATE_ALIAS += bullseye+minimal:bullseye+minimal+no-recommends
TEMPLATE_ALIAS += bookworm:bookworm+standard
TEMPLATE_ALIAS += bookworm+gnome:bookworm+gnome+standard
TEMPLATE_ALIAS += bookworm+minimal:bookworm+minimal+no-recommends
# Ubuntu
TEMPLATE_ALIAS += bionic:bionic+standard
TEMPLATE_ALIAS += focal:focal+standard
# Whonix
TEMPLATE_ALIAS += whonix-gateway-15:buster+whonix-gateway+minimal+no-recommends
TEMPLATE_ALIAS += whonix-workstation-15:buster+whonix-workstation+minimal+no-recommends
TEMPLATE_ALIAS += whonix-gateway-16:bullseye+whonix-gateway+minimal+no-recommends
TEMPLATE_ALIAS += whonix-workstation-16:bullseye+whonix-workstation+minimal+no-recommends
# Uncomment this lines to enable CentOS template build
#DISTS_VM += centos-stream8
# Uncomment this lines to enable Whonix template build
#DISTS_VM += whonix-gateway whonix-workstation
#COMPONENTS += template-whonix
#BUILDER_PLUGINS += template-whonix
# Uncomment this lines to enable Debian 9 template build
#DISTS_VM += stretch
#COMPONENTS += builder-debian
#BUILDER_PLUGINS += builder-debian
# Uncomment this line to enable Archlinux template build
#DISTS_VM += archlinux
#COMPONENTS += builder-archlinux
#BUILDER_PLUGINS += builder-archlinux
about::
@echo "qubes-os-r4.1.conf"
Build instructions: just the standard get-sources + qubes + iso.
When installing, at the first boot, the anaconda addons will crash.
You need to issue the needed commands manually: qubes-anaconda-addon/qubes.py at master · QubesOS/qubes-anaconda-addon · GitHub
https://neowutran.ovh/qubes_xen4.16_v2.iso
md5sum 39b23367269631044c8439c94bd4bdae
(only for dev & testing, of course)
Wow, I hope a maintainer sees this and we can get these changes pushed in an officially supported ISO.
Marmarek recently submitted a PR to QubesOS/qubes-vmm-xen on GitHub. The PR upgrades the Xen version to 4.17-rc3, which I think is what the next release of Qubes OS will rely on.
Interesting, is there a test ISO with the new version of Xen yet? I understand that Qubes often has ISOs under testing that can be downloaded.
Some update on my progress.
I was also able to build another ISO using builderv2, using only the official Qubes repos plus the marmarek repositories mentioned in the issue.
However, I still have the same issue regarding the TSC clocksource: Ryzen 7000 serie - #19 by neowutran.
On my ASUS X670 Strix-F + 7950X, I first need to add 'x2apic=false' to the kernel options to boot into Qubes. For the TSC issue, the frequency found by the system is wrong.
In dom0, the TSC is calibrated to 4491.520 MHz, which is roughly correct (approximately the frequency of the CPU; I need to read a bit more about TSC and why it tries a static frequency on a CPU with dynamic frequency).
In domU, the TSC is calibrated to 196 MHz, and printing /proc/cpuinfo, it seems the domU system believes the 7950X is running at 196 MHz. It is wrong, and the VM runs unusably slow.
A workaround I found is to manually override the configuration file used by libvirt/Xen to start a domU.
Copy the libvirt configuration file to the Qubes directory to override the configuration used:
cp /etc/virsh/libxl/DOMU_NAME.xml /etc/qubes/templates/libvirt/xen/by-name/
(create the directories if they don't exist yet)
Then, in the XML, search for the 'clock' tag and force the TSC mode to 'emulate' instead of 'native':
<clock offset='utc' adjustment='reset'>
<timer name='tsc' mode='emulate' />
</clock>
For a real fix for this issue, I have no idea yet.
I am not sure where the issue is; my first guess would be a bug in Xen or libvirt.
It could also be a bug in the BIOS, I think; a lot of things are broken in this BIOS.
Will continue to dig deeper.
In an already installed Qubes you can set this via kernelopts:
qvm-prefs -s VMName kernelopts 'clocksource=tsc'
Hello, by default every VM is using the tsc clocksource (clocksource=tsc has recently been added by default to the kernel options).
After spending a bit more time on the issue, the root cause seems to be that the CPU information provided to the domU is wrong (the CPU frequency).
From some chat on the Xen IRC with a maintainer:
"so this is a massive rats nest with virt. By default, VMs are created to be migrateable, and that means no Invariant TSC feature. Guests work fine, but report wonky values"
"if you don't plan to migrate the VM, you can set itsc=1 in your vm config file, and then the TSC clocksource ought to be happier"
From my understanding, Qubes OS is already using invariant TSC with this option in libvirt: <feature policy='require' name='invtsc'/>.
For the moment, no real progress on finding what exactly is broken.
Given what 'invariant TSC' should be, and given my issue, I am asking myself whether it is not the invariant TSC itself that is broken.
I am now doing a bit of reading:
- Processor Programming Reference for AMD CPUs, family 25 (0x19): https://www.amd.com/en/support/tech-docs?keyword=PPR
Invariant TSC is a feature of the CPU itself.
Qubes has never worked with an AMD CPU of family 25 before; can a bug specific to Xen + family 25 + invariant TSC exist?
Reading a bit of the Xen source code, like this part: xen/xen/arch/x86/cpu/amd.c at master · xen-project/xen · GitHub.
c->x86 is the CPU family. Ryzen 1 is family 0x17 (one of my computers is a Ryzen 1 and it works perfectly with Qubes). So I am searching for suspicious things related to the CPU family for AMD CPUs.
Still no new answer from the ASUS support about the BIOS, except that the problem is a bit more complex than expected and that it will take more time to understand.
A lot of new things to learn.
Some more tests:
On my Ryzen 1 computer, the policy <feature policy='require' name='invtsc'/> seems to have no influence: TSC is happy, and /proc/cpuinfo is always correct (tried policy='require' and policy='disable').
On my Zen 4 machine, it also seems to have no influence: TSC is not happy, and /proc/cpuinfo is always wrong.
I modified the BIOS parameters a bit to see what they do. After modification, the frequency reported in /proc/cpuinfo changed from 196 MHz to 205.166 MHz. I don't know which specific parameter is responsible for that.
I found the error. There is an integer overflow, most probably in the Xen hypervisor.
Hunting it down.
Nice. I'll probably switch to a Zen 3 CPU in the near future (once Zen 4 starts pushing down Zen 3 (second hand) prices), but it's nice to know Zen 4 support will be there, so thanks. Sadly, even though AMD IIRC is a partner of Xen, there seem to be a few issues with the speed at which they add actual support to the HV, plus Xen isn't exactly good about communicating about this sort of stuff.
For the moment, no progress on my side.
From my IRC comment:
For my issue, it seems to be an integer overflow. Somewhere there is an unsigned 32-bit integer storing the CPU frequency in Hz, and this variable is responsible for passing the CPU frequency information to the domU. When I downclock my CPU to below 4,294,967,295 Hz, the correct CPU frequency is passed to the domU; above that, it starts back at 0 Hz. That explains why my domU shows ~205 MHz when my real CPU is running at ~4500 MHz. I am hunting for this integer so it can be switched to a 64-bit integer. I am starting with the Xen codebase; if someone has a hint on where to look specifically, that would help. If not, I will probably be able to find it, but it is going to take me a few days, I think.
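A quick sanity check of this hypothesis with the numbers from this thread: a 32-bit unsigned integer wraps at 2^32 = 4,294,967,296. A CPU running at ~4,500,000,000 Hz stored in such a variable becomes 4,500,000,000 − 4,294,967,296 = 205,032,704 Hz ≈ 205 MHz, which matches the bogus frequency the domU reports.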
Trying to understand 'what does what' in the Xen source, but it is going to take a while. Trying to find the part of the code that gives the vCPU information to a domU.
Also, another issue for later: the tool 'xenoprof' doesn't support AMD family 25 (explicit statement in the logs).
I don't remember if it is because of the things I tried to patch or because I never tested it, but PV works as expected, with the correct CPU frequency.
Only PVH and HVM are problematic.
I am not so sure now that the issue is in the Xen hypervisor codebase. Maybe it is in the Linux kernel directly, in the Xen-specific part: linux/arch/x86/xen at master · torvalds/linux · GitHub.
Going to take some more time TT
Update: Another funny thing to note, and to understand or fix later: when starting a PVH Linux domU, the Linux kernel initially identifies it as an HVM and not a PVH. This is already the case in a standard Qubes on supported hardware.
This line prints 'Hypervisor detected: Xen HVM' in the case of Xen PVH.
Related code:
We see this global variable being reassigned
just before calling 'xen_pvh_domain()', which is defined as reading the global variable 'xen_pvh' (see the paraphrased definition below).
'CONFIG_XEN_PVH' is defined in the Qubes Linux kernel configuration, from what I see.
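For context, xen_pvh_domain() is essentially just a macro over that global. Paraphrasing Linux's include/xen/xen.h from memory, so double-check against the exact kernel tree you build:

/* Paraphrased from Linux's include/xen/xen.h */
#ifdef CONFIG_XEN_PVH
extern bool xen_pvh;              /* set once the kernel realizes it is PVH */
#define xen_pvh_domain() (xen_pvh)
#else
#define xen_pvh_domain() 0
#endif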
I don't know if it is an issue or not, but it feels weird that when using PVH, the Linux kernel explicitly states that it thinks it is an HVM.
Update 2: The kernel later understands that it is a PVH. So nothing to see here.
Anyway, that was not what I was trying to debug. The rabbit hole is deep.
My patches have nothing to do with the Linux guest working correctly in PV mode.
After some more tests, this issue is specific to Linux guests in HVM or PVH mode:
- Windows guests work correctly in HVM mode; the frequency is correct.
- Linux guests work correctly in PV mode; the frequency is correct.
- Linux guests in HVM and PVH mode do not display the correct frequency; there is an integer overflow, as mentioned previously.
I certainly have no idea what I'm doing, but searching 'frequency' in the Xen codebase pulled up this:
Also here.
Both are using unsigned int to store the CPU frequency, AFAIK (I don't program in C).
Did some more testing.
I tracked the CPU frequency back to here.
In PV mode (dom0 or guest): tsc_shift = -2; tsc_to_system_mul = 3,824,888,891.
In PVH or HVM mode: tsc_shift = 3; tsc_to_system_mul = 2,730,337,484.
The calculation done by pvclock_tsc_khz to determine the CPU frequency seems correct and without overflow. The input data (tsc_to_system_mul and tsc_shift) seem to be the source of the issue.
More debugging is needed to reach the root cause.
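To double-check that the guest-side math is fine, here is a small userspace sketch mirroring the computation done by Linux's pvclock_tsc_khz() (paraphrased from arch/x86/kernel/pvclock.c), fed with the two input pairs observed above:

#include <stdint.h>
#include <stdio.h>

/* Mirrors the frequency computation of Linux's pvclock_tsc_khz(). */
static uint64_t pvclock_khz(uint32_t tsc_to_system_mul, int8_t tsc_shift)
{
    uint64_t khz = 1000000ULL << 32;   /* scaled constant used by the kernel */

    khz /= tsc_to_system_mul;
    if (tsc_shift < 0)
        khz <<= -tsc_shift;
    else
        khz >>= tsc_shift;
    return khz;
}

int main(void)
{
    /* PV inputs -> ~4491600 kHz, i.e. the correct ~4.49 GHz */
    printf("PV:      %llu kHz\n", (unsigned long long)pvclock_khz(3824888891u, -2));
    /* PVH/HVM inputs -> ~196631 kHz, i.e. the bogus ~196 MHz */
    printf("PVH/HVM: %llu kHz\n", (unsigned long long)pvclock_khz(2730337484u, 3));
    return 0;
}

Both results match what the guests report, so the kernel faithfully converts whatever Xen hands it: the bad PVH/HVM inputs are the problem.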
Difference between PVH and HVM mode:
In the HVM case, the CPU is correctly calibrated using the PIT method (the correct frequency is found using this method), so the calculated CPU frequency and the TSC frequency are different.
Later in the code, the Linux kernel prefers to use the TSC frequency instead of the CPU frequency.
That may explain why a Windows HVM guest works correctly while a Linux HVM guest does not.
UPDATE, more debug information:
Getting closer.
By applying those 3 lines (to reproduce the same behavior as PV in this function), PVH and HVM now start with the correct frequency. So, getting much closer to the root cause.
In PVH and HVM mode, the method
void set_time_scale(struct time_scale *ts, u64 ticks_per_sec)
receives an incorrect value for 'ticks_per_sec'.
UPDATE 2
I think I found it:
'd->arch.tsc_khz' is an unsigned (32-bit) integer, while set_time_scale expects a u64 for its ticks_per_sec parameter.
Since there is no cast from u32 to u64, when it gets multiplied by 1000 (kHz to Hz), it overflows.
With an explicit cast to u64 it should work.
Testing it. Going to take some hours.
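The subtlety is C's arithmetic conversions: d->arch.tsc_khz * 1000 is computed in the 32-bit type of the operands and only then widened to the u64 parameter, so the wraparound has already happened. A minimal illustration, using dom0's calibrated frequency from earlier in this thread:

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint32_t tsc_khz = 4491520; /* ~4491.520 MHz, in kHz */

    /* Multiplied in 32 bits, then widened: the result has already wrapped. */
    uint64_t wrong = tsc_khz * 1000;
    /* Widened first, then multiplied in 64 bits: correct. */
    uint64_t right = (uint64_t)tsc_khz * 1000;

    printf("wrong: %llu Hz (~196.5 MHz)\n", (unsigned long long)wrong);
    printf("right: %llu Hz (~4491.5 MHz)\n", (unsigned long long)right);
    return 0;
}

196,552,704 Hz is almost exactly the 196 MHz the domU was reporting.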
UPDATE 3
I confirm that this was the root cause. I fixed it on my side, and all seems to work as expected.
Now I need to make a nice patch and speak with the Xen developers to integrate it.
UPDATE 4
The patch has been sent to the xen-devel mailing list.
Copy here:
From c1535eba0bba6fc1b91f975f434af0929d9d7c96 Mon Sep 17 00:00:00 2001
Message-Id: <c1535eba0bba6fc1b91f975f434af0929d9d7c96.1671298409.git.xen@neowutran.ovh>
From: Neowutran <xen@neowutran.ovh>
Date: Sat, 17 Dec 2022 17:17:03 +0100
Subject: [Patch v1] Bug fix - Integer overflow when cpu frequency > u32 max value.
xen/arch/x86/time.c: Bug fix - Integer overflow when cpu frequency > u32 max value.
What I was trying to do: I was trying to install QubesOS on my new computer
(AMD zen4 processor). Guest VMs were unusably slow / unusable.
What is the issue: The cpu frequency reported is wrong for linux guests in HVM
and PVH mode, and it causes issues with the TSC clocksource (for example).
Why this patch solved my issue:
The root cause is that "d->arch.tsc_khz" is an unsigned integer storing
the cpu frequency in khz. It gets multiplied by 1000, so if the cpu frequency
is over ~4,294 Mhz (u32 max value), then it overflows.
I am solving the issue by adding an explicit cast to u64 to avoid the overflow.
---
xen/arch/x86/time.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/xen/arch/x86/time.c b/xen/arch/x86/time.c
index b01acd390d..7c77ec8902 100644
--- a/xen/arch/x86/time.c
+++ b/xen/arch/x86/time.c
@@ -2585,7 +2585,7 @@ int tsc_set_info(struct domain *d,
case TSC_MODE_ALWAYS_EMULATE:
d->arch.vtsc_offset = get_s_time() - elapsed_nsec;
d->arch.tsc_khz = gtsc_khz ?: cpu_khz;
- set_time_scale(&d->arch.vtsc_to_ns, d->arch.tsc_khz * 1000);
+ set_time_scale(&d->arch.vtsc_to_ns, (u64)d->arch.tsc_khz * 1000);
/*
* In default mode use native TSC if the host has safe TSC and
--
2.38.1
Now, on to the next issue: GPU passthrough.
Thanks for your continued work; judging by the number of 'hearts' on this thread, there are several other people interested in this as well. It would not be an exaggeration to say I look at this a couple of times a day to gauge the progress you and others have been making! Thanks again.
Long story short: which Ryzen generation is the highest that works perfectly (including its iGPU) with an up-to-date, current Qubes OS 4.1.1 (let's imagine the user can install and update Qubes OS on a different PC)? 5***, 4***, or what, and how does one select a Ryzen for this?
Is there any sense in buying the 6*** or 7*** series at this point if the user wants it to work almost out of the box on Qubes OS?