Ryzen 7000 serie

Patches seems to be applyied correctly.
Will try to upgrade stubdom-linux dependencies to the latest available and port the required patches

Upgraded the stubdom dependencies to the latest available.
I am now encoutering a Out Of Memory issue when I try to start any HVM or PV.
From the logs it seems it OOM when trying to copy “rootfs”.
My “rootfs” is much larger than the original qubes stubdom-linux (around 2 time bigger).
The reason it is bigger is that a lot of options have been added to qemu since the last upgrade.
I could disable them to reduce the size of rootfs (I suspect the issue is that the scripts initializing the rootfs is failing because the rootfs is too big), but not sure it is a good idea (I am still learning, or trying to learn how xen and qemu are interacting with each other. Many new qemu options seems interesting by their name (vfio/rdma/avx2/…), but I do not understand yet if it have an impact of if somehow all of that is passed to xen and xen doing all this work).

So for the moment, looking how/if it is possible to use a big rootfs (77Mo uncompressed. Compared to the 32Mo uncompressed of the original qubesos rootfs)

The issue was indeed that rootfs was too big. Solution was to strip the embedded binary. Now new issues, still not able to launch a HVM or PV after the stubdom upgrade

1 Like

Now back to the original issue of PCI passthrough, upgrading stubdom dependencies wasn’t the solution.

I can change the crash error message by disabling or enabling this patch qubes-vmm-xen-stubdom-linux/0008-xen-fix-stubdom-PCI-addr.patch at master ¡ QubesOS/qubes-vmm-xen-stubdom-linux ¡ GitHub .
Without this patch the error when trying to do pci passthrough is “could not open '/sys/bus/pci/devices/0000:01:00.0/config: No such file or directory”. Seems like the patch is designed to avoid this specific error.

When enabling this patch it crash with the error I previously posted “Domain 4:Offset 0x000e:0x49090000 expands past register size (1)”, “xen_pt_config_reg_init: Offset 0x000e mismatch! Emulated=0x0080, host=0x49090000, syncing to 0x49090000”.

Still no idea of what is the solution to fix this issue, but “what is the issue” seems a bit clearer to me.

From the log a difference seems to appear between standard qubes & new xen.
The flag “PCI_BASE_ADDRESS_MEM_TYPE_64 0x04 " seems to be used. ( I see the type 0x04 in my custom build while on standard qubes os it seems to use PCI_BASE_ADDRESS_MEM_TYPE_32”. To be confirmed. Still no idea on what it means for the fix I need to do.

Update: This specific issue is fixed, I made some mistakes when upgrading the rpm spec for qubes-vmm-xen. Pci passthrough still doesn’t work, but it crash a bit later in the initialization steps. Speaking about “rdm check flag”, will try to learn what it is

1 Like

Major update:
The libvirt error message was a bit misleading.
However the xen error message was quite explicit and directly suggested me to try to set the “permissive” attribute.

I posted this message from my custom qubes build, with xen 4.16.2, libvirt 8.9.0, qemu 7.1.

( A lot more work is still required: testing, a lot of testing. Cleaning the code, trying to reduce the size of the diff between my fork and the official qubes os. Rewriting the git commit history (don’t look at it, it was my try&die workflow ), and many other thing. But now I am certain that I will make it work as I want).

2 Likes

builder.conf:

# vim: ft=make ts=4 sw=4

# Ready to use config for full build of the latest development version Qubes OS (aka "master").

GIT_BASEURL ?= https://github.com
GIT_PREFIX ?= QubesOS/qubes-
NO_SIGN ?= 1
#BRANCH ?= release4.1

BACKEND_VMM=xen

DIST_DOM0 ?= fc37
DISTS_VM ?= fc36

VERBOSE ?= 1
DEBUG ?= 1
#DISTS_VM ?= bullseye fc36

MGMT_COMPONENTS = \
	mgmt-salt \
	mgmt-salt-base \
	mgmt-salt-base-topd \
	mgmt-salt-base-config \
	mgmt-salt-dom0-qvm \
	mgmt-salt-dom0-virtual-machines \
	mgmt-salt-dom0-update

COMPONENTS ?= \
    vmm-xen \
    core-libvirt \
    core-vchan-xen \
    core-qubesdb \
    core-qrexec \
    linux-utils \
    python-cffi \
    python-xcffib \
    python-hid \
    python-u2flib-host \
    python-qasync \
    python-panflute \
    rpm-oxide \
    core-admin \
    core-admin-client \
    core-admin-addon-whonix \
    core-admin-linux \
    core-agent-linux \
    intel-microcode \
    linux-firmware \
    linux-kernel \
    artwork \
    grub2-theme \
    gui-common \
    gui-daemon \
    gui-agent-linux \
    gui-agent-xen-hvm-stubdom \
    app-linux-split-gpg \
    app-thunderbird \
    app-linux-pdf-converter \
    app-linux-img-converter \
    app-linux-input-proxy \
    app-linux-usb-proxy \
    app-linux-snapd-helper \
    app-shutdown-idle \
    app-yubikey \
    app-u2f \
    screenshot-helper \
    $(MGMT_COMPONENTS) \
    infrastructure \
    repo-templates \
    meta-packages \
	pykickstart \
	vmm-xen-stubdom-linux \
    manager \
    desktop-linux-common \
    desktop-linux-kde \
    desktop-linux-xfce4 \
    desktop-linux-xfce4-xfwm4 \
    desktop-linux-i3 \
    desktop-linux-i3-settings-qubes \
    desktop-linux-awesome \
    desktop-linux-manager \
    grubby-dummy \
    dummy-psu \
    dummy-backlight \
    linux-gbulb \
    linux-scrypt \
    xdotool \
    linux-template-builder \
    installer-qubes-os \
    qubes-release \
    blivet \
    lorax \
    lorax-templates \
    anaconda \
    anaconda-addon \
    linux-yum \
    linux-deb \
    tpm-extra \
    trousers-changer \
    antievilmaid \
    xscreensaver \
    remote-support \
    builder \
    builder-debian \
    builder-rpm

#python-objgraph
#grub2
# vmm-xen-stubdom-legacy
# seabios
# linux-pvgrub2
# lvm2 
# efitools 
# tpm2-tss 
# tpm2-tools 
# sbsigntool
# windows-tools-cross 
#
#
# alsa-lib 
# alsa-utils 
# alsa-sof-firmware 
# xorg-x11-drv-intel 
# xorg-x11-drv-amdgpu 

BUILDER_PLUGINS = builder-rpm
#BUILDER_PLUGINS = builder-rpm builder-debian
BUILDER_PLUGINS += mgmt-salt

WINDOWS_COMPONENTS = \
                     vmm-xen-windows-pvdrivers \
                     windows-utils \
                     core-agent-windows \
                     gui-agent-windows \
                     installer-qubes-os-windows-tools \
                     builder-windows

# Uncomment this to enable windows tools build
#DISTS_VM += win7x64
#COMPONENTS += $(WINDOWS_COMPONENTS)
#BUILDER_PLUGINS += builder-windows


INSECURE_SKIP_CHECKING = linux-kernel vmm-xen core-libvirt core-qrexec vmm-xen-stubdom-linux anaconda installer-qubes-os qubes-release meta-packages core-admin lorax lorax-templates blivet linux-firmware pykickstart core-admin-linux core-vchan-xen anaconda-addon mgmt-salt-dom0-qvm mgmt-salt-base-topd mgmt-salt-base manager gui-agent-xen-hvm-stubdom


GIT_URL_gui_agent_xen_hvm_stubdom = https://github.com/neowutran/qubes-gui-agent-xen-hvm-stubdom.git
BRANCH_gui_agent_xen_hvm_stubdom = master

GIT_URL_manager = https://github.com/neowutran/qubes-manager.git
BRANCH_manager = master

GIT_URL_mgmt_salt_dom0_qvm = https://github.com/neowutran/qubes-mgmt-salt-dom0-qvm.git
BRANCH_mgmt_salt_dom0_qvm = master
GIT_URL_mgmt_salt_base_topd = https://github.com/neowutran/qubes-mgmt-salt-base-topd.git
BRANCH_mgmt_salt_base_topd = master
GIT_URL_mgmt_salt_base = https://github.com/neowutran/qubes-mgmt-salt-base.git
BRANCH_mgmt_salt_base = master














GIT_URL_core_vchan_xen = https://github.com/neowutran/qubes-core-vchan-xen.git
BRANCH_core_vchan_xen = master

GIT_URL_core_admin_linux = https://github.com/neowutran/qubes-core-admin-linux.git
BRANCH_core_admin_linux = master

GIT_URL_blivet = https://github.com/neowutran/qubes-blivet.git
BRANCH_blivet = master

GIT_URL_pykickstart = https://github.com/neowutran/qubes-pykickstart.git
BRANCH_pykickstart = master

GIT_URL_lorax = https://github.com/neowutran/qubes-lorax.git
BRANCH_lorax = master

GIT_URL_lorax_templates = https://github.com/neowutran/qubes-lorax-templates.git
BRANCH_lorax_templates = master

GIT_URL_installer_qubes_os = https://github.com/neowutran/qubes-installer-qubes-os.git
BRANCH_installer_qubes_os = master

GIT_URL_core_admin = https://github.com/neowutran/qubes-core-admin.git
BRANCH_core_admin = master

GIT_URL_qubes_release = https://github.com/neowutran/qubes-qubes-release.git
BRANCH_qubes_release = master

GIT_URL_meta_packages = https://github.com/neowutran/qubes-meta-packages.git
BRANCH_meta_packages = master

GIT_URL_vmm_xen_stubdom_linux = https://github.com/neowutran/qubes-vmm-xen-stubdom-linux.git
BRANCH_vmm_xen_stubdom_linux = master
#BRANCH_vmm_xen_stubdom_linux = alternative_try

GIT_URL_anaconda = https://github.com/neowutran/qubes-anaconda.git
BRANCH_anaconda = master

GIT_URL_anaconda_addon = https://github.com/neowutran/qubes-anaconda-addon.git
BRANCH_anaconda_addon = master

GIT_URL_core_qrexec = https://github.com/neowutran/qubes-core-qrexec.git
BRANCH_core_qrexec = master

GIT_URL_core_libvirt = https://github.com/neowutran/qubes-core-libvirt.git
BRANCH_core_libvirt = master

GIT_URL_vmm_xen = https://github.com/neowutran/qubes-vmm-xen.git
BRANCH_vmm_xen = xen-4.14

#INSECURE_SKIP_CHECKING = linux-kernel
GIT_URL_linux_kernel = https://github.com/neowutran/qubes-linux-kernel.git
BRANCH_linux_kernel = master

#GIT_URL_linux_firmware = https://github.com/neowutran/qubes-linux-firmware.git
#BRANCH_linux_firmware = master

BRANCH_linux_template_builder = master
BRANCH_linux_yum = master
BRANCH_linux_deb = master
BRANCH_app_linux_split_gpg = master
BRANCH_app_linux_tor = master
BRANCH_app_thunderbird = master
BRANCH_app_linux_pdf_converter = master
BRANCH_app_linux_img_converter = master
BRANCH_app_linux_input_proxy = master
BRANCH_app_linux_usb_proxy = master
BRANCH_app_linux_snapd_helper = master
BRANCH_app_shutdown_idle = master
BRANCH_app_yubikey = master
BRANCH_app_u2f = master
BRANCH_builder = master
BRANCH_builder_rpm = master
BRANCH_builder_debian = master
BRANCH_builder_archlinux = master
BRANCH_builder_github = master
BRANCH_builder_windows = master
BRANCH_infrastructure = master
BRANCH_template_whonix = master
BRANCH_template_kali = master
BRANCH_grubby_dummy = master
BRANCH_xorg_x11_drv_intel = master
BRANCH_linux_pvgrub2 = master
BRANCH_linux_scrypt = master
BRANCH_linux_gbulb = master
BRANCH_python_cffi = master
BRANCH_python_xcffib = master
BRANCH_python_quamash = master
BRANCH_python_objgraph = master
BRANCH_python_hid = master
BRANCH_python_u2flib_host = master
BRANCH_python_qasync = master
BRANCH_python_panflute = master
BRANCH_intel_microcode = master
BRANCH_xdotool = master

BRANCH_rpm_oxide = main

BRANCH_alsa_lib = main
BRANCH_alsa_utils = main
BRANCH_alsa_sof_firmware = main

BRANCH_efitools = main
BRANCH_sbsigntools = main
BRANCH_tpm2_tss = main
BRANCH_tpm2_tools = main

TEMPLATE_ROOT_WITH_PARTITIONS = 1

TEMPLATE_LABEL ?=
# Fedora
TEMPLATE_LABEL += fc34:fedora-34
TEMPLATE_LABEL += fc35:fedora-35
TEMPLATE_LABEL += fc36:fedora-36
TEMPLATE_LABEL += fc34+minimal:fedora-34-minimal
TEMPLATE_LABEL += fc35+minimal:fedora-35-minimal
TEMPLATE_LABEL += fc36+minimal:fedora-36-minimal
TEMPLATE_LABEL += fc34+xfce:fedora-34-xfce
TEMPLATE_LABEL += fc35+xfce:fedora-35-xfce
TEMPLATE_LABEL += fc36+xfce:fedora-36-xfce

# Debian
TEMPLATE_LABEL += stretch:debian-9
TEMPLATE_LABEL += stretch+standard:debian-9
TEMPLATE_LABEL += stretch+xfce:debian-9-xfce
TEMPLATE_LABEL += buster:debian-10
TEMPLATE_LABEL += buster+standard:debian-10
TEMPLATE_LABEL += buster+xfce:debian-10-xfce
TEMPLATE_LABEL += bullseye:debian-11
TEMPLATE_LABEL += bullseye+standard+firmware:debian-11
TEMPLATE_LABEL += bullseye+xfce:debian-11-xfce
TEMPLATE_LABEL += bookworm:debian-12
TEMPLATE_LABEL += bookworm+standard:debian-12
TEMPLATE_LABEL += bookworm+xfce:debian-12-xfce

# Ubuntu
TEMPLATE_LABEL += bionic+standard:bionic
TEMPLATE_LABEL += focal+standard:focal

# Whonix
TEMPLATE_LABEL += buster+whonix-gateway+minimal+no-recommends:whonix-gw-15
TEMPLATE_LABEL += buster+whonix-workstation+minimal+no-recommends:whonix-ws-15
TEMPLATE_LABEL += bullseye+whonix-gateway+minimal+no-recommends:whonix-gw-16
TEMPLATE_LABEL += bullseye+whonix-workstation+minimal+no-recommends:whonix-ws-16

# CentOS
TEMPLATE_LABEL += centos7:centos-7
TEMPLATE_LABEL += centos7+minimal:centos-7-minimal
TEMPLATE_LABEL += centos7+xfce:centos-7-xfce
TEMPLATE_LABEL += centos-stream8:centos-stream-8
TEMPLATE_LABEL += centos-stream8+minimal:centos-stream-8-minimal
TEMPLATE_LABEL += centos-stream8+xfce:centos-stream-8-xfce

TEMPLATE_ALIAS ?=
# Debian
TEMPLATE_ALIAS += stretch:stretch+standard
TEMPLATE_ALIAS += stretch+gnome:stretch+gnome+standard
TEMPLATE_ALIAS += stretch+minimal:stretch+minimal+no-recommends
TEMPLATE_ALIAS += buster:buster+standard
TEMPLATE_ALIAS += buster+gnome:buster+gnome+standard
TEMPLATE_ALIAS += buster+minimal:buster+minimal+no-recommends
TEMPLATE_ALIAS += bullseye:bullseye+standard+firmware
TEMPLATE_ALIAS += bullseye+gnome:bullseye+gnome+standard+firmware
TEMPLATE_ALIAS += bullseye+minimal:bullseye+minimal+no-recommends
TEMPLATE_ALIAS += bookworm:bookworm+standard
TEMPLATE_ALIAS += bookworm+gnome:bookworm+gnome+standard
TEMPLATE_ALIAS += bookworm+minimal:bookworm+minimal+no-recommends

# Ubuntu
TEMPLATE_ALIAS += bionic:bionic+standard
TEMPLATE_ALIAS += focal:focal+standard

# Whonix
TEMPLATE_ALIAS += whonix-gateway-15:buster+whonix-gateway+minimal+no-recommends
TEMPLATE_ALIAS += whonix-workstation-15:buster+whonix-workstation+minimal+no-recommends
TEMPLATE_ALIAS += whonix-gateway-16:bullseye+whonix-gateway+minimal+no-recommends
TEMPLATE_ALIAS += whonix-workstation-16:bullseye+whonix-workstation+minimal+no-recommends


# Uncomment this lines to enable CentOS template build
#DISTS_VM += centos-stream8

# Uncomment this lines to enable Whonix template build
#DISTS_VM += whonix-gateway whonix-workstation
#COMPONENTS += template-whonix
#BUILDER_PLUGINS += template-whonix

# Uncomment this lines to enable Debian 9 template build
#DISTS_VM += stretch
#COMPONENTS += builder-debian
#BUILDER_PLUGINS += builder-debian

# Uncomment this line to enable Archlinux template build
#DISTS_VM += archlinux
#COMPONENTS += builder-archlinux
#BUILDER_PLUGINS += builder-archlinux

about::
	@echo "qubes-os-r4.1.conf"

build instruction: just the standard get-sources + qubes + iso

When installing, at the first boot, the anaconda addons will crash.
Need to issue the needed command manually qubes-anaconda-addon/qubes.py at master ¡ QubesOS/qubes-anaconda-addon ¡ GitHub

https://neowutran.ovh/qubes_xen4.16_v2.iso
md5sum 39b23367269631044c8439c94bd4bdae

( only for dev & testing ofc)

4 Likes

Wow, I hope a maintainer sees this and we can get these changes pushed in a officially supported iso.

Marmarek recently submitted a PR to QubesOS/qubes-vmm-xen at github. The PR upgrades Xen version to 4.17-rc3, which I think is what next release of QubesOS will rely on.

1 Like

Interesting, is there a test .iso with the new version of XEN yet? I understand that Qubes often has ISO’s under testing that can be downloaded.

Some update on my progress.

I was also able to build another ISO using builderv2 and only using official qubes repo + the marmarek repositories mentionned in the issue.

However I still have the same issue regarding the TSC clocksource Ryzen 7000 serie - #19 by neowutran .

On my asus x670 strix F + 7950x , I first need to add the “x2apic=false” in the kernel options to boot to qubes. For the TSC issue, the frequency found by the system is wrong.
In dom0, the TSC is calibrated to 4491.520 Mhz which is kind of correct (~ approximatly the frequency of the CPU. I need to read a bit more about TSC and why it try a static frequency on a CPU with dynamic frequency ).
In domU, the TSC is calibrated to 196Mhz, and printing “/dev/cpuinfo” it seems that the domU system believe that 7950X is running at 196Mhz. It is wrong, it will run unusuably slow.

A work around I found it to manually override the configuration file used by libvirt/xen to start a domU.
Copy the libvirt configuration file to the qubes directory to override the configuration used:
cp /etc/virsh/libxl/DOMU_NAME.xml /etc/qubes/templates/libvirt/xen/by-name/
(create the directories if it doesn’t exist yet)

The in the xml search for the “clock” balise and force the TSC mode to “emulate” instead of “native”.

<clock offset='utc' adjustment='reset'>
<timer name='tsc' mode='emulate' />
</clock>

For a real fix for this issue, I have not idea yet.
I am not sure where is the issue, my first guess would be a bug in xen or libvirt.
It could also be a bug in the bios I think, a lot of things are broken in the bios

Will continue to dig deeper.

3 Likes

In an already installed Qubes you can set this via kernelopts

qvm-prefs -s VMName kernelopts ‘clocksource=tsc’

Hello, by default every vm is using the tsc clocksource (clocksource=tsc has recently be added by default in the kernel option).

After spending a bit more time on the issue, the root cause seems to be because the cpu information provided to the domU are wrong (cpu frequency).
From some chat on xen IRC with a maintainer,

so this is a massive rats nest with virt. By default, VMs are created to be migrateable, and that means no Invariant TSC feature. Guests work fine, but report wonky values
if you don’t plan to migrate the VM, you can set itsc=1 in your vm config file, and then the TSC clocksource ought to be happier

From my understanding QubesOS is already using invariant TSC with this option in libvirt <feature policy='require' name='invtsc'/>.

For the moment not any real progress on finding what is the thing that are broken.
By what is “invarient TSC” should be and my issue, I am asking myself if it is not the invarient TSC itself that is broken.

I am now doing a bit of reading:

Still no new answer from ASUS support about the BIOS, except that the problem is a bit more complex than expected and that it will take more time to understand.

A lot of new things to learn :slight_smile:

Some more tests:
On my ryzen 1 computer the policy <feature policy='require' name='invtsc'/> seems to have no influence, TSC is happy, /proc/cpuinfo is always correct. ( tried policy='require' and policy='disable')
On my ryzen 4, it also seems to have no influence, TSC is not happy, /proc/cpuinfo is always wrong

Modified the BIOS parameters a bit to try to see what it do. After modification, frequency reported in /proc/cpuinfo have been modified from 196Mhz to 205.166Mhz. Don’t know what specific parameter is responsible for that

3 Likes

I found the error. There is an integer overflow, most probably in xen hypervisor.
Hunting the thing

3 Likes

Nice. I’ll probably switch to a zen 3 cpu in the near future (once zen4 starts pushing down zen3 (second hand) prices), but it’s nice to know zen4 support will be there, so thanks. :slight_smile: Sadly even though AMD iirc is a partner of Xen, there seem to be a few issues with the speed at which they add actual support to the HV, plus xen isn’t exactly good about communicating about this sort of stuff.

2 Likes

For the moment no progress on my side.
From my IRC comment

For my issue, it seems to be a integer overflow. Somewhere there is a unsigned 32 bits integer storing the cpu frequency in Hz, this variable is responsible for passing the cpu frequency information to domU. When I downclock my CPU to below 4,294,967,295 Hz, the correct cpu frequency is passed to domU. After that it start back at 0 Hz. It explain why my domU is showing ~205 Mhz when my real CPU is running at ~4500Mhz. I am hunting for this integer to be switched to 64 bits integer. I am starting with the xen codebase, if someone have some hint on where to look specifically :slight_smile: If not I will probably be able to find it, but going to take me a few days I think

Trying to understand “what does what” in the xen source, but it is going to take a while. Trying to find what is the part of the code that give the vcpu informations to a domU.

Also another issue for later, the tool “xenoprof” doesn’t support AMD family 25 ( explicit statement in the logs ).

2 Likes

Don’t remember if it is because of the things I tryied to patch or if it was because I never tested it, but “PV” work as expected with the correct cpu frequency.

Only the PVH and HVM are problematics.
I am not so sure now that the issue is in the xen hypervisor code base. Maybe it is in the linux kernel directly, in the xen specific part linux/arch/x86/xen at master ¡ torvalds/linux ¡ GitHub

Going to take some more time TT

Update: Another funny thing to note and to understand or fix later, when starting a PVH linux domU, the linux kernel understand it as being a HVM and not a PVH. This is already the case in a standard qubes on a supported hardware.

This line is printing “Hypervisor detected: Xen HVM” in case of Xen PVH.

Related code:

We see this global variable being reassigned:

just before calling “xen_pvh_domain()” which is defined as being a reading the global variable “xen_pvh”:

“CONFIG_XEN_PVH” is defined in the qubes kernel linux configuration, from what I see.
I don’t know if it is an issue or not, but it feel weird that when using a PVH the linux kernel explicitly state that he think it is a HVM.

Update2: The kernel later understand that it is a PVH. So nothing to see here.

Anyway, that was not what I was trying to debug. The rabbit hole is deep.

3 Likes

My patches have nothing to do with the linux guest working correctly in PV mode

After some more tests:

This issue is specific to linux guest in HVM or PVH mode.

  • Windows guest are working correctly in HVM mode, frequency is correct.
  • Linux guest are working correctly in PV mode, frequency is correct
  • Linux guest in HVM and PVH mode are not displaying the correct frequency, there is a integer overflow as mentionned previously
1 Like

I certainly have no idea what I’m doing. But searching “frequency” in the xen codebase pulled up this:

Also here

Both are using unsigned int to store cpu frequency AFAIK (I don’t program in C).

2 Likes

Did some more testing.
Tracked back the cpu frequency to here:

In case of PV mode (dom0 or guest):
tsc_shift = -2 ; tsc_to_system_mul: 3_824_888_891

In case of PVH or HVM mode:
tsc_shift = 3; tsc_to_system_mul: 2_730_337_484

The calculation done by pvclock_tsc_khz to determine the CPU frequency seems to be correct and without overflow. The input data (tsc_to_system_mul and tsc_shift) seems to be source of the issue.

More debug is needed to reach the source issue.

Difference between PVH and HVM mode:
In case of HVM, the CPU is correctly calibrated using the PIT method (correct frequency found using this method):

So calculated cpu frequency and tsc frequency are different

later in the code, the linux kernel prefere to use the tsc frequency instead of the cpu frequency.
That may explain why a Windows HVM guest work correctly and a linux HVM guest does not

UPDATE, more debug information:
Getting closer.

By applying thoses 3 lines (to reproduce the same behavior as PV in this function), PVH and HVM now start with the correct frequency. So getting way closer to the source issue.

For the PVH and HVM mode, the method
void set_time_scale(struct time_scale *ts, u64 ticks_per_sec)

receive an incorrect value for “ticks_per_sec”

UPDATE 2
I think I found it:

“d->arch.tsc_khz” is a unsigned integer. The value expected by set_time_scale is a u64.
Since there is no cast from u32 to u64, when it get multiplied by 1000 (from KHZ to HZ), it overflow.
With explicit cast to u64 it should work.

Testing it. Going to take some hours.

UPDATE 3

I confirm that this is the source issue. I fixed it on my side, all seems to work as expected.
Now need to make a nice patch and speak with xen developer to integrate it

UPDATE 4
Patch normally sent to the xen-devel mailing list.
Copy here:

From c1535eba0bba6fc1b91f975f434af0929d9d7c96 Mon Sep 17 00:00:00 2001
Message-Id: <c1535eba0bba6fc1b91f975f434af0929d9d7c96.1671298409.git.xen@neowutran.ovh>
From: Neowutran <xen@neowutran.ovh>
Date: Sat, 17 Dec 2022 17:17:03 +0100
Subject: [Patch v1] Bug fix - Integer overflow when cpu frequency > u32 max value.

xen/arch/x86/time.c: Bug fix - Integer overflow when cpu frequency > u32 max value.

What is was trying to do: I was trying to install QubesOS on my new computer
(AMD zen4 processor). Guest VM were unusably slow / unusable.

What is the issue: The cpu frequency reported is wrong for linux guest in HVM
and PVH mode, and it cause issue with the TSC clocksource (for example).

Why this patch solved my issue:
The root cause it that "d->arch.tsc_khz" is a unsigned integer storing
the cpu frequency in khz. It get multiplied by 1000, so if the cpu frequency
is over ~4,294 Mhz (u32 max value), then it overflow.
I am solving the issue by adding an explicit cast to u64 to avoid the overflow.

---
 xen/arch/x86/time.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/xen/arch/x86/time.c b/xen/arch/x86/time.c
index b01acd390d..7c77ec8902 100644
--- a/xen/arch/x86/time.c
+++ b/xen/arch/x86/time.c
@@ -2585,7 +2585,7 @@ int tsc_set_info(struct domain *d,
     case TSC_MODE_ALWAYS_EMULATE:
         d->arch.vtsc_offset = get_s_time() - elapsed_nsec;
         d->arch.tsc_khz = gtsc_khz ?: cpu_khz;
-        set_time_scale(&d->arch.vtsc_to_ns, d->arch.tsc_khz * 1000);
+        set_time_scale(&d->arch.vtsc_to_ns, (u64)d->arch.tsc_khz * 1000);

         /*
          * In default mode use native TSC if the host has safe TSC and
--
2.38.1

Now, next issue, GPU passthrough :smiley:

5 Likes

Thanks for your continued work, judging by the amount of “hearts” on this thread there are several other people interested in this as well. It would not be an exaggeration to say I look at this a couple times a day to gauge the progress you and other have been making! thanks again.

1 Like

Long story short - which Ryzen version is the highest that works perfectly (including its iGPU) with up-to-date current Qubes OS 4.1.1 (lets imaging user can install and update Qubes OS on different PC)? 5***, 4*** or what and how to select a Ryzen for this?

Is there any sense to buy 6*** or 7*** series at this point if user wants to make it work almost out of box on Qubes OS?