Patches seem to be applied correctly.
Will try to upgrade the stubdom-linux dependencies to the latest available versions and port the required patches.
Upgraded the stubdom dependencies to the latest available.
I am now encountering an Out Of Memory issue when I try to start any HVM or PV.
From the logs, it seems to OOM when trying to copy 'rootfs'.
My 'rootfs' is much larger than the one in the original Qubes stubdom-linux (around 2 times bigger).
The reason it is bigger is that a lot of options have been added to QEMU since the last upgrade.
I could disable them to reduce the size of the rootfs (I suspect the issue is that the script initializing the rootfs fails because the rootfs is too big), but I am not sure that is a good idea. I am still learning, or trying to learn, how Xen and QEMU interact with each other. Many new QEMU options seem interesting by their name (vfio/rdma/avx2/…), but I do not understand yet whether they have an impact, or if somehow all of that is passed to Xen, with Xen doing all the work.
So for the moment, I am looking at how/whether it is possible to use a big rootfs (77 MB uncompressed, compared to the 32 MB uncompressed of the original Qubes OS rootfs).
The issue was indeed that the rootfs was too big. The solution was to strip the embedded binaries. Now a new issue: still not able to launch an HVM or PV after the stubdom upgrade.
Now back to the original issue of PCI passthrough; upgrading the stubdom dependencies wasn't the solution.
I can change the crash error message by disabling or enabling this patch: qubes-vmm-xen-stubdom-linux/0008-xen-fix-stubdom-PCI-addr.patch at master · QubesOS/qubes-vmm-xen-stubdom-linux · GitHub.
Without this patch, the error when trying to do PCI passthrough is "could not open '/sys/bus/pci/devices/0000:01:00.0/config': No such file or directory". It seems the patch is designed to avoid this specific error.
When enabling this patch, it crashes with the error I previously posted: "Domain 4:Offset 0x000e:0x49090000 expands past register size (1)", "xen_pt_config_reg_init: Offset 0x000e mismatch! Emulated=0x0080, host=0x49090000, syncing to 0x49090000".
Still no idea what the solution to this issue is, but "what is the issue" seems a bit clearer to me.
From the logs, a difference appears between standard Qubes and the new Xen.
The flag PCI_BASE_ADDRESS_MEM_TYPE_64 (0x04) seems to be used: I see type 0x04 in my custom build, while standard Qubes OS seems to use PCI_BASE_ADDRESS_MEM_TYPE_32. To be confirmed. Still no idea what this means for the fix I need to do (see the sketch below for what these type bits are).
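For reference, those type values come from the PCI BAR layout. A minimal sketch decoding them, with the constants as defined in Linux's include/uapi/linux/pci_regs.h (the raw BAR value here is made up for illustration):

#include <stdio.h>

/* BAR "type" bits, as in Linux's include/uapi/linux/pci_regs.h.
 * A 64-bit memory BAR (type 0x04) spans two consecutive BAR slots,
 * unlike a 32-bit one (type 0x00). */
#define PCI_BASE_ADDRESS_MEM_TYPE_MASK 0x06
#define PCI_BASE_ADDRESS_MEM_TYPE_32   0x00
#define PCI_BASE_ADDRESS_MEM_TYPE_64   0x04

int main(void)
{
    unsigned int bar = 0xf0000004; /* hypothetical raw BAR value */

    if ((bar & PCI_BASE_ADDRESS_MEM_TYPE_MASK) == PCI_BASE_ADDRESS_MEM_TYPE_64)
        printf("64-bit memory BAR (takes two BAR slots)\n");
    else if ((bar & PCI_BASE_ADDRESS_MEM_TYPE_MASK) == PCI_BASE_ADDRESS_MEM_TYPE_32)
        printf("32-bit memory BAR\n");
    return 0;
}

So if the two builds really disagree here, they are mapping the same device's BARs as different widths, which could plausibly relate to a config-space mismatch like the one above.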
Update: This specific issue is fixed; I made some mistakes when upgrading the RPM spec for qubes-vmm-xen. PCI passthrough still doesn't work, but it crashes a bit later in the initialization steps. It is now talking about an "rdm check flag"; will try to learn what that is.
Major update:
The libvirt error message was a bit misleading.
However, the Xen error message was quite explicit and directly suggested that I try to set the 'permissive' attribute.
I posted this message from my custom Qubes build, with Xen 4.16.2, libvirt 8.9.0, QEMU 7.1.
(A lot more work is still required: testing, a lot of testing. Cleaning the code, trying to reduce the size of the diff between my fork and the official Qubes OS. Rewriting the git commit history (don't look at it, it was my try&die workflow), and many other things. But now I am certain that I will make it work as I want.)
builder.conf:
# vim: ft=make ts=4 sw=4
# Ready to use config for full build of the latest development version Qubes OS (aka "master").
GIT_BASEURL ?= https://github.com
GIT_PREFIX ?= QubesOS/qubes-
NO_SIGN ?= 1
#BRANCH ?= release4.1
BACKEND_VMM=xen
DIST_DOM0 ?= fc37
DISTS_VM ?= fc36
VERBOSE ?= 1
DEBUG ?= 1
#DISTS_VM ?= bullseye fc36
MGMT_COMPONENTS = \
mgmt-salt \
mgmt-salt-base \
mgmt-salt-base-topd \
mgmt-salt-base-config \
mgmt-salt-dom0-qvm \
mgmt-salt-dom0-virtual-machines \
mgmt-salt-dom0-update
COMPONENTS ?= \
vmm-xen \
core-libvirt \
core-vchan-xen \
core-qubesdb \
core-qrexec \
linux-utils \
python-cffi \
python-xcffib \
python-hid \
python-u2flib-host \
python-qasync \
python-panflute \
rpm-oxide \
core-admin \
core-admin-client \
core-admin-addon-whonix \
core-admin-linux \
core-agent-linux \
intel-microcode \
linux-firmware \
linux-kernel \
artwork \
grub2-theme \
gui-common \
gui-daemon \
gui-agent-linux \
gui-agent-xen-hvm-stubdom \
app-linux-split-gpg \
app-thunderbird \
app-linux-pdf-converter \
app-linux-img-converter \
app-linux-input-proxy \
app-linux-usb-proxy \
app-linux-snapd-helper \
app-shutdown-idle \
app-yubikey \
app-u2f \
screenshot-helper \
$(MGMT_COMPONENTS) \
infrastructure \
repo-templates \
meta-packages \
pykickstart \
vmm-xen-stubdom-linux \
manager \
desktop-linux-common \
desktop-linux-kde \
desktop-linux-xfce4 \
desktop-linux-xfce4-xfwm4 \
desktop-linux-i3 \
desktop-linux-i3-settings-qubes \
desktop-linux-awesome \
desktop-linux-manager \
grubby-dummy \
dummy-psu \
dummy-backlight \
linux-gbulb \
linux-scrypt \
xdotool \
linux-template-builder \
installer-qubes-os \
qubes-release \
blivet \
lorax \
lorax-templates \
anaconda \
anaconda-addon \
linux-yum \
linux-deb \
tpm-extra \
trousers-changer \
antievilmaid \
xscreensaver \
remote-support \
builder \
builder-debian \
builder-rpm
#python-objgraph
#grub2
# vmm-xen-stubdom-legacy
# seabios
# linux-pvgrub2
# lvm2
# efitools
# tpm2-tss
# tpm2-tools
# sbsigntool
# windows-tools-cross
#
#
# alsa-lib
# alsa-utils
# alsa-sof-firmware
# xorg-x11-drv-intel
# xorg-x11-drv-amdgpu
BUILDER_PLUGINS = builder-rpm
#BUILDER_PLUGINS = builder-rpm builder-debian
BUILDER_PLUGINS += mgmt-salt
WINDOWS_COMPONENTS = \
vmm-xen-windows-pvdrivers \
windows-utils \
core-agent-windows \
gui-agent-windows \
installer-qubes-os-windows-tools \
builder-windows
# Uncomment this to enable windows tools build
#DISTS_VM += win7x64
#COMPONENTS += $(WINDOWS_COMPONENTS)
#BUILDER_PLUGINS += builder-windows
INSECURE_SKIP_CHECKING = linux-kernel vmm-xen core-libvirt core-qrexec vmm-xen-stubdom-linux anaconda installer-qubes-os qubes-release meta-packages core-admin lorax lorax-templates blivet linux-firmware pykickstart core-admin-linux core-vchan-xen anaconda-addon mgmt-salt-dom0-qvm mgmt-salt-base-topd mgmt-salt-base manager gui-agent-xen-hvm-stubdom
GIT_URL_gui_agent_xen_hvm_stubdom = https://github.com/neowutran/qubes-gui-agent-xen-hvm-stubdom.git
BRANCH_gui_agent_xen_hvm_stubdom = master
GIT_URL_manager = https://github.com/neowutran/qubes-manager.git
BRANCH_manager = master
GIT_URL_mgmt_salt_dom0_qvm = https://github.com/neowutran/qubes-mgmt-salt-dom0-qvm.git
BRANCH_mgmt_salt_dom0_qvm = master
GIT_URL_mgmt_salt_base_topd = https://github.com/neowutran/qubes-mgmt-salt-base-topd.git
BRANCH_mgmt_salt_base_topd = master
GIT_URL_mgmt_salt_base = https://github.com/neowutran/qubes-mgmt-salt-base.git
BRANCH_mgmt_salt_base = master
GIT_URL_core_vchan_xen = https://github.com/neowutran/qubes-core-vchan-xen.git
BRANCH_core_vchan_xen = master
GIT_URL_core_admin_linux = https://github.com/neowutran/qubes-core-admin-linux.git
BRANCH_core_admin_linux = master
GIT_URL_blivet = https://github.com/neowutran/qubes-blivet.git
BRANCH_blivet = master
GIT_URL_pykickstart = https://github.com/neowutran/qubes-pykickstart.git
BRANCH_pykickstart = master
GIT_URL_lorax = https://github.com/neowutran/qubes-lorax.git
BRANCH_lorax = master
GIT_URL_lorax_templates = https://github.com/neowutran/qubes-lorax-templates.git
BRANCH_lorax_templates = master
GIT_URL_installer_qubes_os = https://github.com/neowutran/qubes-installer-qubes-os.git
BRANCH_installer_qubes_os = master
GIT_URL_core_admin = https://github.com/neowutran/qubes-core-admin.git
BRANCH_core_admin = master
GIT_URL_qubes_release = https://github.com/neowutran/qubes-qubes-release.git
BRANCH_qubes_release = master
GIT_URL_meta_packages = https://github.com/neowutran/qubes-meta-packages.git
BRANCH_meta_packages = master
GIT_URL_vmm_xen_stubdom_linux = https://github.com/neowutran/qubes-vmm-xen-stubdom-linux.git
BRANCH_vmm_xen_stubdom_linux = master
#BRANCH_vmm_xen_stubdom_linux = alternative_try
GIT_URL_anaconda = https://github.com/neowutran/qubes-anaconda.git
BRANCH_anaconda = master
GIT_URL_anaconda_addon = https://github.com/neowutran/qubes-anaconda-addon.git
BRANCH_anaconda_addon = master
GIT_URL_core_qrexec = https://github.com/neowutran/qubes-core-qrexec.git
BRANCH_core_qrexec = master
GIT_URL_core_libvirt = https://github.com/neowutran/qubes-core-libvirt.git
BRANCH_core_libvirt = master
GIT_URL_vmm_xen = https://github.com/neowutran/qubes-vmm-xen.git
BRANCH_vmm_xen = xen-4.14
#INSECURE_SKIP_CHECKING = linux-kernel
GIT_URL_linux_kernel = https://github.com/neowutran/qubes-linux-kernel.git
BRANCH_linux_kernel = master
#GIT_URL_linux_firmware = https://github.com/neowutran/qubes-linux-firmware.git
#BRANCH_linux_firmware = master
BRANCH_linux_template_builder = master
BRANCH_linux_yum = master
BRANCH_linux_deb = master
BRANCH_app_linux_split_gpg = master
BRANCH_app_linux_tor = master
BRANCH_app_thunderbird = master
BRANCH_app_linux_pdf_converter = master
BRANCH_app_linux_img_converter = master
BRANCH_app_linux_input_proxy = master
BRANCH_app_linux_usb_proxy = master
BRANCH_app_linux_snapd_helper = master
BRANCH_app_shutdown_idle = master
BRANCH_app_yubikey = master
BRANCH_app_u2f = master
BRANCH_builder = master
BRANCH_builder_rpm = master
BRANCH_builder_debian = master
BRANCH_builder_archlinux = master
BRANCH_builder_github = master
BRANCH_builder_windows = master
BRANCH_infrastructure = master
BRANCH_template_whonix = master
BRANCH_template_kali = master
BRANCH_grubby_dummy = master
BRANCH_xorg_x11_drv_intel = master
BRANCH_linux_pvgrub2 = master
BRANCH_linux_scrypt = master
BRANCH_linux_gbulb = master
BRANCH_python_cffi = master
BRANCH_python_xcffib = master
BRANCH_python_quamash = master
BRANCH_python_objgraph = master
BRANCH_python_hid = master
BRANCH_python_u2flib_host = master
BRANCH_python_qasync = master
BRANCH_python_panflute = master
BRANCH_intel_microcode = master
BRANCH_xdotool = master
BRANCH_rpm_oxide = main
BRANCH_alsa_lib = main
BRANCH_alsa_utils = main
BRANCH_alsa_sof_firmware = main
BRANCH_efitools = main
BRANCH_sbsigntools = main
BRANCH_tpm2_tss = main
BRANCH_tpm2_tools = main
TEMPLATE_ROOT_WITH_PARTITIONS = 1
TEMPLATE_LABEL ?=
# Fedora
TEMPLATE_LABEL += fc34:fedora-34
TEMPLATE_LABEL += fc35:fedora-35
TEMPLATE_LABEL += fc36:fedora-36
TEMPLATE_LABEL += fc34+minimal:fedora-34-minimal
TEMPLATE_LABEL += fc35+minimal:fedora-35-minimal
TEMPLATE_LABEL += fc36+minimal:fedora-36-minimal
TEMPLATE_LABEL += fc34+xfce:fedora-34-xfce
TEMPLATE_LABEL += fc35+xfce:fedora-35-xfce
TEMPLATE_LABEL += fc36+xfce:fedora-36-xfce
# Debian
TEMPLATE_LABEL += stretch:debian-9
TEMPLATE_LABEL += stretch+standard:debian-9
TEMPLATE_LABEL += stretch+xfce:debian-9-xfce
TEMPLATE_LABEL += buster:debian-10
TEMPLATE_LABEL += buster+standard:debian-10
TEMPLATE_LABEL += buster+xfce:debian-10-xfce
TEMPLATE_LABEL += bullseye:debian-11
TEMPLATE_LABEL += bullseye+standard+firmware:debian-11
TEMPLATE_LABEL += bullseye+xfce:debian-11-xfce
TEMPLATE_LABEL += bookworm:debian-12
TEMPLATE_LABEL += bookworm+standard:debian-12
TEMPLATE_LABEL += bookworm+xfce:debian-12-xfce
# Ubuntu
TEMPLATE_LABEL += bionic+standard:bionic
TEMPLATE_LABEL += focal+standard:focal
# Whonix
TEMPLATE_LABEL += buster+whonix-gateway+minimal+no-recommends:whonix-gw-15
TEMPLATE_LABEL += buster+whonix-workstation+minimal+no-recommends:whonix-ws-15
TEMPLATE_LABEL += bullseye+whonix-gateway+minimal+no-recommends:whonix-gw-16
TEMPLATE_LABEL += bullseye+whonix-workstation+minimal+no-recommends:whonix-ws-16
# CentOS
TEMPLATE_LABEL += centos7:centos-7
TEMPLATE_LABEL += centos7+minimal:centos-7-minimal
TEMPLATE_LABEL += centos7+xfce:centos-7-xfce
TEMPLATE_LABEL += centos-stream8:centos-stream-8
TEMPLATE_LABEL += centos-stream8+minimal:centos-stream-8-minimal
TEMPLATE_LABEL += centos-stream8+xfce:centos-stream-8-xfce
TEMPLATE_ALIAS ?=
# Debian
TEMPLATE_ALIAS += stretch:stretch+standard
TEMPLATE_ALIAS += stretch+gnome:stretch+gnome+standard
TEMPLATE_ALIAS += stretch+minimal:stretch+minimal+no-recommends
TEMPLATE_ALIAS += buster:buster+standard
TEMPLATE_ALIAS += buster+gnome:buster+gnome+standard
TEMPLATE_ALIAS += buster+minimal:buster+minimal+no-recommends
TEMPLATE_ALIAS += bullseye:bullseye+standard+firmware
TEMPLATE_ALIAS += bullseye+gnome:bullseye+gnome+standard+firmware
TEMPLATE_ALIAS += bullseye+minimal:bullseye+minimal+no-recommends
TEMPLATE_ALIAS += bookworm:bookworm+standard
TEMPLATE_ALIAS += bookworm+gnome:bookworm+gnome+standard
TEMPLATE_ALIAS += bookworm+minimal:bookworm+minimal+no-recommends
# Ubuntu
TEMPLATE_ALIAS += bionic:bionic+standard
TEMPLATE_ALIAS += focal:focal+standard
# Whonix
TEMPLATE_ALIAS += whonix-gateway-15:buster+whonix-gateway+minimal+no-recommends
TEMPLATE_ALIAS += whonix-workstation-15:buster+whonix-workstation+minimal+no-recommends
TEMPLATE_ALIAS += whonix-gateway-16:bullseye+whonix-gateway+minimal+no-recommends
TEMPLATE_ALIAS += whonix-workstation-16:bullseye+whonix-workstation+minimal+no-recommends
# Uncomment this lines to enable CentOS template build
#DISTS_VM += centos-stream8
# Uncomment this lines to enable Whonix template build
#DISTS_VM += whonix-gateway whonix-workstation
#COMPONENTS += template-whonix
#BUILDER_PLUGINS += template-whonix
# Uncomment this lines to enable Debian 9 template build
#DISTS_VM += stretch
#COMPONENTS += builder-debian
#BUILDER_PLUGINS += builder-debian
# Uncomment this line to enable Archlinux template build
#DISTS_VM += archlinux
#COMPONENTS += builder-archlinux
#BUILDER_PLUGINS += builder-archlinux
about::
@echo "qubes-os-r4.1.conf"
Build instructions: just the standard get-sources + qubes + iso.
When installing, at the first boot, the anaconda addons will crash.
You need to issue the needed commands manually: qubes-anaconda-addon/qubes.py at master · QubesOS/qubes-anaconda-addon · GitHub
https://neowutran.ovh/qubes_xen4.16_v2.iso
md5sum 39b23367269631044c8439c94bd4bdae
(only for dev & testing, of course)
Wow, I hope a maintainer sees this and we can get these changes pushed in an officially supported ISO.
Marmarek recently submitted a PR to QubesOS/qubes-vmm-xen on GitHub. The PR upgrades the Xen version to 4.17-rc3, which I think is what the next release of Qubes OS will rely on.
Interesting, is there a test ISO with the new version of Xen yet? I understand that Qubes often has ISOs under testing that can be downloaded.
Some update on my progress.
I was also able to build another ISO using builderv2, using only the official Qubes repos plus the marmarek repositories mentioned in the issue.
However, I still have the same issue regarding the TSC clocksource: Ryzen 7000 serie - #19 by neowutran.
On my ASUS X670 Strix-F + 7950X, I first need to add 'x2apic=false' to the kernel options to boot into Qubes. For the TSC issue, the frequency found by the system is wrong.
In dom0, the TSC is calibrated to 4491.520 MHz, which is roughly correct (approximately the frequency of the CPU; I need to read a bit more about TSC and why it tries a static frequency on a CPU with dynamic frequency).
In domU, the TSC is calibrated to 196 MHz, and printing /proc/cpuinfo, it seems the domU system believes the 7950X is running at 196 MHz. It is wrong, and the VM runs unusably slow.
A workaround I found is to manually override the configuration file used by libvirt/Xen to start a domU.
Copy the libvirt configuration file to the Qubes directory to override the configuration used:
cp /etc/virsh/libxl/DOMU_NAME.xml /etc/qubes/templates/libvirt/xen/by-name/
(create the directories if they don't exist yet)
Then, in the XML, search for the 'clock' tag and force the TSC mode to 'emulate' instead of 'native':
<clock offset='utc' adjustment='reset'>
<timer name='tsc' mode='emulate' />
</clock>
For a real fix for this issue, I have no idea yet.
I am not sure where the issue is; my first guess would be a bug in Xen or libvirt.
It could also be a bug in the BIOS, I think; a lot of things are broken in this BIOS.
Will continue to dig deeper.
In an already installed Qubes you can set this via kernelopts:
qvm-prefs -s VMName kernelopts 'clocksource=tsc'
Hello, by default every VM is using the tsc clocksource (clocksource=tsc has recently been added by default to the kernel options).
After spending a bit more time on the issue, the root cause seems to be that the CPU information provided to the domU is wrong (the CPU frequency).
From some chat on the Xen IRC with a maintainer:
"so this is a massive rats nest with virt. By default, VMs are created to be migrateable, and that means no Invariant TSC feature. Guests work fine, but report wonky values"
"if you don't plan to migrate the VM, you can set itsc=1 in your vm config file, and then the TSC clocksource ought to be happier"
From my understanding, Qubes OS is already using invariant TSC with this option in libvirt: <feature policy='require' name='invtsc'/>.
For the moment, no real progress on finding what exactly is broken.
Given what 'invariant TSC' should be, and given my issue, I am asking myself whether it is not the invariant TSC itself that is broken.
I am now doing a bit of reading:
- Processor Programming Reference for AMD CPUs, family 25 (0x19): https://www.amd.com/en/support/tech-docs?keyword=PPR
Invariant TSC is a feature of the CPU itself.
Qubes has never worked with an AMD CPU of family 25 before; can a bug specific to Xen + family 25 + invariant TSC exist?
Reading a bit of the Xen source code, like this part: xen/xen/arch/x86/cpu/amd.c at master · xen-project/xen · GitHub.
c->x86 is the CPU family. Ryzen 1 is family 0x17 (one of my computers is a Ryzen 1 and it works perfectly with Qubes). So I am searching for suspicious things related to the CPU family for AMD CPUs.
Still no new answer from the ASUS support about the BIOS, except that the problem is a bit more complex than expected and that it will take more time to understand.
A lot of new things to learn.
Some more tests:
On my Ryzen 1 computer, the policy <feature policy='require' name='invtsc'/> seems to have no influence: TSC is happy, and /proc/cpuinfo is always correct (tried policy='require' and policy='disable').
On my Zen 4 machine, it also seems to have no influence: TSC is not happy, and /proc/cpuinfo is always wrong.
I modified the BIOS parameters a bit to see what they do. After modification, the frequency reported in /proc/cpuinfo changed from 196 MHz to 205.166 MHz. I don't know which specific parameter is responsible for that.
I found the error. There is an integer overflow, most probably in the Xen hypervisor.
Hunting it down.
Nice. I'll probably switch to a Zen 3 CPU in the near future (once Zen 4 starts pushing down Zen 3 (second hand) prices), but it's nice to know Zen 4 support will be there, so thanks. Sadly, even though AMD IIRC is a partner of Xen, there seem to be a few issues with the speed at which they add actual support to the HV, plus Xen isn't exactly good about communicating about this sort of stuff.
For the moment, no progress on my side.
From my IRC comment:
For my issue, it seems to be an integer overflow. Somewhere there is an unsigned 32-bit integer storing the CPU frequency in Hz, and this variable is responsible for passing the CPU frequency information to the domU. When I downclock my CPU to below 4,294,967,295 Hz, the correct CPU frequency is passed to the domU; above that, it starts back at 0 Hz. That explains why my domU shows ~205 MHz when my real CPU is running at ~4500 MHz. I am hunting for this integer so it can be switched to a 64-bit integer. I am starting with the Xen codebase; if someone has a hint on where to look specifically, that would help. If not, I will probably be able to find it, but it is going to take me a few days, I think.
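A quick sanity check of this hypothesis with the numbers from this thread: a 32-bit unsigned integer wraps at 2^32 = 4,294,967,296. A CPU running at ~4,500,000,000 Hz stored in such a variable becomes 4,500,000,000 − 4,294,967,296 = 205,032,704 Hz ≈ 205 MHz, which matches the bogus frequency the domU reports.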
Trying to understand 'what does what' in the Xen source, but it is going to take a while. Trying to find the part of the code that gives the vCPU information to a domU.
Also, another issue for later: the tool 'xenoprof' doesn't support AMD family 25 (explicit statement in the logs).
I don't remember if it is because of the things I tried to patch or because I never tested it, but PV works as expected, with the correct CPU frequency.
Only PVH and HVM are problematic.
I am not so sure now that the issue is in the Xen hypervisor codebase. Maybe it is in the Linux kernel directly, in the Xen-specific part: linux/arch/x86/xen at master · torvalds/linux · GitHub.
Going to take some more time TT
Update: Another funny thing to note, and to understand or fix later: when starting a PVH Linux domU, the Linux kernel initially identifies it as an HVM and not a PVH. This is already the case in a standard Qubes on supported hardware.
This line prints 'Hypervisor detected: Xen HVM' in the case of Xen PVH.
Related code:
We see this global variable being reassigned
just before calling 'xen_pvh_domain()', which is defined as reading the global variable 'xen_pvh' (see the paraphrased definition below).
'CONFIG_XEN_PVH' is defined in the Qubes Linux kernel configuration, from what I see.
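For context, xen_pvh_domain() is essentially just a macro over that global. Paraphrasing Linux's include/xen/xen.h from memory, so double-check against the exact kernel tree you build:

/* Paraphrased from Linux's include/xen/xen.h */
#ifdef CONFIG_XEN_PVH
extern bool xen_pvh;              /* set once the kernel realizes it is PVH */
#define xen_pvh_domain() (xen_pvh)
#else
#define xen_pvh_domain() 0
#endif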
I don't know if it is an issue or not, but it feels weird that when using PVH, the Linux kernel explicitly states that it thinks it is an HVM.
Update 2: The kernel later understands that it is a PVH. So nothing to see here.
Anyway, that was not what I was trying to debug. The rabbit hole is deep.
My patches have nothing to do with the Linux guest working correctly in PV mode.
After some more tests, this issue is specific to Linux guests in HVM or PVH mode:
- Windows guests work correctly in HVM mode; the frequency is correct.
- Linux guests work correctly in PV mode; the frequency is correct.
- Linux guests in HVM and PVH mode do not display the correct frequency; there is an integer overflow, as mentioned previously.
I certainly have no idea what I'm doing, but searching 'frequency' in the Xen codebase pulled up this:
Also here.
Both are using unsigned int to store the CPU frequency, AFAIK (I don't program in C).
Did some more testing.
I tracked the CPU frequency back to here.
In PV mode (dom0 or guest): tsc_shift = -2; tsc_to_system_mul = 3,824,888,891.
In PVH or HVM mode: tsc_shift = 3; tsc_to_system_mul = 2,730,337,484.
The calculation done by pvclock_tsc_khz to determine the CPU frequency seems correct and without overflow. The input data (tsc_to_system_mul and tsc_shift) seem to be the source of the issue.
More debugging is needed to reach the root cause.
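To double-check that the guest-side math is fine, here is a small userspace sketch mirroring the computation done by Linux's pvclock_tsc_khz() (paraphrased from arch/x86/kernel/pvclock.c), fed with the two input pairs observed above:

#include <stdint.h>
#include <stdio.h>

/* Mirrors the frequency computation of Linux's pvclock_tsc_khz(). */
static uint64_t pvclock_khz(uint32_t tsc_to_system_mul, int8_t tsc_shift)
{
    uint64_t khz = 1000000ULL << 32;   /* scaled constant used by the kernel */

    khz /= tsc_to_system_mul;
    if (tsc_shift < 0)
        khz <<= -tsc_shift;
    else
        khz >>= tsc_shift;
    return khz;
}

int main(void)
{
    /* PV inputs -> ~4491600 kHz, i.e. the correct ~4.49 GHz */
    printf("PV:      %llu kHz\n", (unsigned long long)pvclock_khz(3824888891u, -2));
    /* PVH/HVM inputs -> ~196631 kHz, i.e. the bogus ~196 MHz */
    printf("PVH/HVM: %llu kHz\n", (unsigned long long)pvclock_khz(2730337484u, 3));
    return 0;
}

Both results match what the guests report, so the kernel faithfully converts whatever Xen hands it: the bad PVH/HVM inputs are the problem.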
Difference between PVH and HVM mode:
In the HVM case, the CPU is correctly calibrated using the PIT method (the correct frequency is found using this method), so the calculated CPU frequency and the TSC frequency are different.
Later in the code, the Linux kernel prefers to use the TSC frequency instead of the CPU frequency.
That may explain why a Windows HVM guest works correctly while a Linux HVM guest does not.
UPDATE, more debug information:
Getting closer.
By applying those 3 lines (to reproduce the same behavior as PV in this function), PVH and HVM now start with the correct frequency. So, getting much closer to the root cause.
In PVH and HVM mode, the method
void set_time_scale(struct time_scale *ts, u64 ticks_per_sec)
receives an incorrect value for 'ticks_per_sec'.
UPDATE 2
I think I found it:
'd->arch.tsc_khz' is an unsigned (32-bit) integer, while set_time_scale expects a u64 for its ticks_per_sec parameter.
Since there is no cast from u32 to u64, when it gets multiplied by 1000 (kHz to Hz), it overflows.
With an explicit cast to u64 it should work.
Testing it. Going to take some hours.
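The subtlety is C's arithmetic conversions: d->arch.tsc_khz * 1000 is computed in the 32-bit type of the operands and only then widened to the u64 parameter, so the wraparound has already happened. A minimal illustration, using dom0's calibrated frequency from earlier in this thread:

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint32_t tsc_khz = 4491520; /* ~4491.520 MHz, in kHz */

    /* Multiplied in 32 bits, then widened: the result has already wrapped. */
    uint64_t wrong = tsc_khz * 1000;
    /* Widened first, then multiplied in 64 bits: correct. */
    uint64_t right = (uint64_t)tsc_khz * 1000;

    printf("wrong: %llu Hz (~196.5 MHz)\n", (unsigned long long)wrong);
    printf("right: %llu Hz (~4491.5 MHz)\n", (unsigned long long)right);
    return 0;
}

196,552,704 Hz is almost exactly the 196 MHz the domU was reporting.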
UPDATE 3
I confirm that this was the root cause. I fixed it on my side, and all seems to work as expected.
Now I need to make a nice patch and speak with the Xen developers to integrate it.
UPDATE 4
The patch has been sent to the xen-devel mailing list.
Copy here:
From c1535eba0bba6fc1b91f975f434af0929d9d7c96 Mon Sep 17 00:00:00 2001
Message-Id: <c1535eba0bba6fc1b91f975f434af0929d9d7c96.1671298409.git.xen@neowutran.ovh>
From: Neowutran <xen@neowutran.ovh>
Date: Sat, 17 Dec 2022 17:17:03 +0100
Subject: [Patch v1] Bug fix - Integer overflow when cpu frequency > u32 max value.
xen/arch/x86/time.c: Bug fix - Integer overflow when cpu frequency > u32 max value.
What I was trying to do: I was trying to install QubesOS on my new computer
(AMD zen4 processor). Guest VMs were unusably slow / unusable.
What is the issue: The cpu frequency reported is wrong for linux guests in HVM
and PVH mode, and it causes issues with the TSC clocksource (for example).
Why this patch solved my issue:
The root cause is that "d->arch.tsc_khz" is an unsigned integer storing
the cpu frequency in khz. It gets multiplied by 1000, so if the cpu frequency
is over ~4,294 Mhz (u32 max value), then it overflows.
I am solving the issue by adding an explicit cast to u64 to avoid the overflow.
---
xen/arch/x86/time.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/xen/arch/x86/time.c b/xen/arch/x86/time.c
index b01acd390d..7c77ec8902 100644
--- a/xen/arch/x86/time.c
+++ b/xen/arch/x86/time.c
@@ -2585,7 +2585,7 @@ int tsc_set_info(struct domain *d,
case TSC_MODE_ALWAYS_EMULATE:
d->arch.vtsc_offset = get_s_time() - elapsed_nsec;
d->arch.tsc_khz = gtsc_khz ?: cpu_khz;
- set_time_scale(&d->arch.vtsc_to_ns, d->arch.tsc_khz * 1000);
+ set_time_scale(&d->arch.vtsc_to_ns, (u64)d->arch.tsc_khz * 1000);
/*
* In default mode use native TSC if the host has safe TSC and
--
2.38.1
Now, on to the next issue: GPU passthrough.
Thanks for your continued work; judging by the number of 'hearts' on this thread, there are several other people interested in this as well. It would not be an exaggeration to say I look at this a couple of times a day to gauge the progress you and others have been making! Thanks again.
Long story short: which Ryzen generation is the highest that works perfectly (including its iGPU) with an up-to-date, current Qubes OS 4.1.1 (let's imagine the user can install and update Qubes OS on a different PC)? 5***, 4***, or what, and how does one select a Ryzen for this?
Is there any sense in buying the 6*** or 7*** series at this point if the user wants it to work almost out of the box on Qubes OS?