4.3 RC2 badly broken

Sorry for the poor bug report but I’ll summarize best I can:

  1. In place upgrade path is broken, doesn’t work following instructions even on a fresh 4.2.4 install. This should be reproducible by anyone.
  2. Installing the 4.3 RC2 ISO resulted in serious instability on hardware which works perfectly on Qubes 4.2.4 (HCL: AMD Ryzen 9 9950X with ASUS ROG Strix X670E-A), specifically:
    a) numerous kernel panics which didn’t even journal any logs to disk
    b) filesystem instability, resulting in the root volume being remounted read-only (ext4), “failed to convert unwritten extents to written extents”, I/O errors
    c) restores from backups sometimes didn’t work, filling up the root volume resulting in “no space left on device” and “unexpected EOF”, leaving partially restored files in QubesIncoming
    d) actual data loss: under btrfs, after a kernel crash, /var/lib/qubes/vm-templates/debian-12/root.img was deleted. I spent time trying to learn how to troubleshoot btrfs; there was something about two different states or heights, but it was impossible to go further with an online volume
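
For anyone who does manage to save kernel or journal output from a failing run, a rough triage script could count how often each of the symptoms listed above appears. This is only a sketch: the "unwritten extents", "no space left on device" and "I/O error" strings are taken from the report, while the remount pattern and the label names are my own assumptions about typical kernel wording and may need adjusting.

```python
import re
from collections import Counter

# Hypothetical symptom labels mapped to patterns. The remount pattern is an
# assumption about how the kernel words that message; the others reuse
# strings quoted in the report.
SYMPTOMS = {
    "extent_conversion": re.compile(
        r"failed to convert unwritten extents to written extents", re.I
    ),
    "remount_ro": re.compile(r"remount\w*.*read-only", re.I),
    "io_error": re.compile(r"I/O error", re.I),
    "no_space": re.compile(r"no space left on device", re.I),
}


def triage(lines):
    """Count how many lines match each symptom pattern."""
    counts = Counter()
    for line in lines:
        for label, pattern in SYMPTOMS.items():
            if pattern.search(line):
                counts[label] += 1
    return counts
```

Calling `triage(open("saved.log"))` on a saved log file (the file name is a placeholder) returns a `Counter` keyed by the labels above. It obviously cannot help with panics that never reached the disk.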

Anyway I’ve spent half a week doing 6-7 reinstalls, thinking it was my hardware, running prime95/memtest/disk benchmarks, switching back and forth with Ubuntu, and stress testing under Qubes 4.2.4 which is rock solid on the same hw.

I wish I’d noted down more specifics, but I don’t have a separate development build or much time to dedicate, so it’s going to be hard: I can’t keep nuking my main machine when backup/restore takes 8 hours.

What’s a recommended monthly donation for a long term Qubes power user? Got a lot of benefit from the project, paying less than a MSFT licence seems rude. Thanks for all the work.

Did you try to use the same kernel version in both Qubes OS 4.2 and Qubes OS 4.3?
Maybe it’s a hardware-specific bug in kernel that caused all these issues.

That’s quite possible, I installed kernel-latest on all but my uname -a is showing the original kernel, not latest, on 4.2.4. So it’s possible I was running the stock kernel in 4.3.

(CC @marmarek: RC2 feedback)

My stable kernel is: Linux dom0 6.15.10-1.qubes.fc37.x86_64 #1 SMP PREEMPT_DYNAMIC Tue Aug 19 00:47:23 GMT 2025 x86_64 x86_64 x86_64 GNU/Linux

Have just installed 4.3RC2 on another machine and kernel latest here is:
Linux dom0 6.15.10-1.qubes.fc41.x86_64 #1 SMP PREEMPT_DYNAMIC Tue Aug 19 01:01:39 UTC 2025 x86_64 GNU/Linux

So yes, Linux 6.15.10-1 was installed in both 4.2.4 which is stable and 4.3 RC2 which was not.
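
A small sketch of that comparison, for anyone who wants to check their own machines the same way. It assumes the release string always has the shape seen in the two `uname -a` lines above (`<version>-<release>.qubes.fc<N>.<arch>`); the function names are made up for illustration.

```python
import re

# Assumed layout, inferred only from the two uname lines quoted above:
# <kernel version>-<package release>.qubes.fc<Fedora release>.<arch>
RELEASE_RE = re.compile(
    r"(?P<version>\d+(?:\.\d+)+)-(?P<release>\d+)"
    r"\.qubes\.fc(?P<fedora>\d+)\.(?P<arch>\w+)"
)


def parse_release(uname_line):
    """Extract version, package release, Fedora base and arch from uname output."""
    match = RELEASE_RE.search(uname_line)
    if match is None:
        raise ValueError("no Qubes kernel release found in: " + uname_line)
    return match.groupdict()


def same_kernel(line_a, line_b):
    """True if both lines report the same kernel version and package release."""
    a, b = parse_release(line_a), parse_release(line_b)
    return (a["version"], a["release"]) == (b["version"], b["release"])
```

Note that a match here only means the version and package release agree. The two builds above still differ in Fedora base (fc37 vs fc41) and build time, so they are not necessarily identical binaries.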

It’s very badly broken: bad drivers in many 4.3 packages cannot communicate properly with any of my backup USB HDDs, so I cannot restore my system to its former glory. Just try to fix that in Release Candidate 3, mate.

FWIW I’ve just been able to take snapshots using wyng from a Qubes 4.2.4 machine and restore them to 4.3RC2. But I have an open question on how we clean up the excess metadata in wyng-util-qubes as it doesn’t have the “monitor” command. Will post it tomorrow as my account is rate limited for being new.

Haven’t tested Bluetooth or fingerprint, but everything else works on 4.3RC2. I didn’t try a Qubes restore, as that has failed a fair few times, but got wyng working instead to transfer ~700GB.

---
layout:
  'hcl'
type:
  'Notebook'
hvm:
  'yes'
iommu:
  'yes'
slat:
  'yes'
tpm:
  'unknown'
remap:
  'yes'
brand: |
  LENOVO
model: |
  20W1S30V00
bios: |
  N34ET65W (1.65 )
cpu: |
  11th Gen Intel(R) Core(TM) i5-1145G7 @ 2.60GHz
cpu-short: |
  FIXME
chipset: |
  Intel Corporation Tiger Lake-UP3/H35 4 cores Host Bridge/DRAM Registers [8086:9a14] (rev 01)
chipset-short: |
  FIXME
gpu: |
  Intel Corporation TigerLake-LP GT2 [Iris Xe Graphics] [8086:9a49] (rev 01) (prog-if 00 [VGA controller])
gpu-short: |
  FIXME
network: |
  Intel Corporation Wi-Fi 6 AX201 [8086:a0f0] (rev 20)
memory: |
  40663
scsi: |

usb: |
  4
certified:
  'no'
versions:
  - works:
      'yes'
    qubes: |
      R4.3-rc2
    xen: |
      4.19.3
    kernel: |
      6.15.10-1
    remark: |
      FIXME
    credit: |
      FIXAUTHOR
    link: |
      FIXLINK

It would be useful to have more details here, as it definitely works for some. At which stage does it fail? Do you remember any details?

Based on the described symptoms, it indeed looks like something is terribly wrong there. Since you tested the same kernel in 4.2 and 4.3, it’s less likely to be related to the kernel version. This is a Zen 5 CPU, which I think hasn’t had much testing yet, but it could also be related to some other component.
Regarding kernel panics, any chance of capturing some, maybe with a photo or something? If you have a serial port on your board (and another computer to capture it), that would be the most reliable way, but IIUC this board doesn’t have this feature… There are also some other ideas in the GitHub issue “Ease debugging Xen issues” (QubesOS/qubes-issues #6834), especially the comment I’m linking here.

@Ryzen9950x … Oooof this is scary!

Using the same kernel and cpu/graphics, except i7 instead of i5, and rc1 instead of rc2. I was planning on upgrading to rc2 today, but it looks like I dodged a bullet!

Have some graphics and video playback issues but have not noticed any total kernel panics or segfaults yet, except in Atril Document Viewer, which segfaults when reading an epub and attempting to switch to dark mode (on the Qubes readthedocs epub export). I was going to recreate a GitHub account and submit a bug but have not got around to it yet.

@Ryzen9950x, you mentioned you upgraded from 4.2.2 to 4.3rc2? (I did rc1 ISO install)

  • Does it kernel panic before Xorg/GUI loads? What drivers is LoadModule using in dom0:/var/log/Xorg.0.log? (Intel/Iris or other?)
  • What module is the kernel using on dom0? i915?
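
If it helps, here is a rough way to pull the loaded module names out of a copy of that log. It assumes the usual Xorg log line shape `(II) LoadModule: "name"`; the function name is just for illustration.

```python
import re

# Assumes Xorg log entries of the form: (II) LoadModule: "modesetting"
LOADMODULE_RE = re.compile(r'\(II\) LoadModule: "([^"]+)"')


def loaded_modules(log_text):
    """Return module names from LoadModule lines, in first-seen order, no repeats."""
    seen = []
    for name in LOADMODULE_RE.findall(log_text):
        if name not in seen:
            seen.append(name)
    return seen
```

Run it on the contents of /var/log/Xorg.0.log from dom0 and paste the resulting list here. The kernel-side module (i915 or otherwise) would still need to be checked separately.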

If I had to guess, it might be something graphics related, like an old config file retained from 4.2.2 that breaks in 4.3rc2.

The in place upgrade didn’t work, so all the RC2 issues I encountered were on a clean install. It’s my main workstation so had to revert to 4.2.4 so I’m limited in what I can try to troubleshoot. Does seem to be hardware specific though because no problems so far on my Lenovo T14.

I wonder how many of these issues are caused by BTRFS. I use BTRFS in production on my non-qubes work machine, but it is a very finicky filesystem that can go (and at my workplace at least a couple of times has gone) very wrong if not handled properly. One has to be very careful to avoid exhausting disk space, disk space can become exhausted even when it looks like you have gigabytes of space left, performance issues can result if files become fragmented even on an SSD, space on the disk can become “unreachable” and very difficult to recover, and snapshots can cause disk space usage to increase massively during defragmentation. My workplace spent months taming it to the point where we were comfortable with it and didn’t have to think about it much. I’ve also seen some action in the kernel surrounding BTRFS indicating there have been some worrying bugs there.

If you do a Qubes R4.3 setup with LVM instead, do you have the same issues? My primary Qubes machine is LVM-based, runs R4.3, and it works like a charm. Granted, it’s a very different machine than yours (Intel i5-13500H rather than Ryzen 9 9950X), but it does work.

LVM appears to be optimized for servers, which results in several shortcomings:

  • Space exhaustion is handled poorly, requiring manual recovery. This recovery may sometimes fail.
  • It is not possible to shrink a thin pool.
  • Thin pools slow down system startup and shutdown.

Additionally, LVM thin pools do not support checksums. This can be achieved via dm-integrity, but that does not support TRIM.

Hmm, it looks like the Qubes developers have a lot of my concerns with BTRFS already taken into consideration. At least based on my experience working with BTRFS, I’d still prefer LVM by far because of some of the horror stories I’ve heard and ran into while working with it, but I guess I can see this working.

In any event, I’d still recommend that OP try using LVM and see if things get any better. BTRFS support doesn’t seem to be fully smoothed out yet, and if OP is running into data loss, that could very well be the culprit.

I will keep using ext4+lvm until I have enough evidence from wide-scale use by other QubesOS users that btrfs is good to use.

Been a QubesOS user for 4 years now, and never had a problem with ext4+lvm. QubesOS is a complicated enough system itself. I have the notion that at least I gotta keep the filesystem base stable and rock solid. Ext4 has been battle tested for ages now.

While I don’t have the exact machine mentioned by the OP, I have two machines with Qubes that use BTRFS. I have updated one of them to r4.3-rc1 when it came out and the second one to r4.3-rc2 recently and I’ve had no problems that could be pinned on the fact that I use BTRFS for my rootfs (including varlibqubes).

To be clear, I’m not blaming the fs: ext4 crashed too (failed writes, remounting itself ro; it just didn’t delete a file).

That sounds bad and could indeed be caused by the new kernel. It could also be caused by failing hardware (coincidentally after an upgrade). Have you checked SMART for your OS drive?

Yes, I checked SMART and all the NVMe tools, read and wrote several TB from Ubuntu using ‘fio’ (?), then installed Qubes 4.2.4 and restored over 1T of data, did a bunch of snapshots and I/O-heavy stuff, and the hardware is rock solid.

I didn’t get to note down all the details and logs, especially kernel panics, when I was running RC2 but the experience was so awful I thought I ought to post it here for some further investigation if possible.