[qubes-users] qubes update -- how to hold an old kernel ??

haaber · June 10, 2022, 6:55am

A recent QSB made me run qubes-update. Regrettably, it wants to remove
a kernel version that I need to hold (in case of foreseeable problems
with newer ones). How can I freeze that older version and prevent its
removal?

best, Bernhard

Demi · June 10, 2022, 9:10pm

Which kernel version do you need to hold? You can update a subset of
packages by giving them as arguments to qubes-dom0-update, but I would
like to know what the foreseeable problems are. I am not aware of any
way to do what you want given how DNF works.

--
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab

haaber · June 10, 2022, 11:09pm

> Which kernel version do you need to hold? You can update a subset of
> packages by giving them as arguments to qubes-dom0-update, but I would
> like to know what the foreseeable problems are.

The reason is simple: all (!) 5.x Xen kernels I have tested so far
crash/freeze my system in less than 5 minutes, often within seconds (an
issue has been open on GitHub for 18 months). Therefore I keep a 4.19
kernel for Xen (only). Until now the updater respected that: it installed
some new 5.x kernel and kernel-latest. Every single time, I bravely try
them out, and each time they crash; each time I can revert to 4.19 with a
live-Linux USB hack.

The latest kernel update wants to remove my 4.19 kernel, and there is no
way I can accept that, given the history. (Again, a curse on Intel and
Dell for their buggy hardware.)

best, Bernhard

Demi · June 11, 2022, 6:55am

Try removing one of the newer kernels on your system. Also, would you
be willing to try disabling panic_on_oops? That won’t fix the bug, but
it has a chance of leaving the system running afterwards. Adding
kernel.panic_on_oops=0 and kernel.panic_on_warn=0 to /etc/sysctl.conf
should do the trick.
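
As a minimal sketch of the sysctl change suggested above: the helper name `set_sysctl` and its optional file argument are assumptions made for illustration, not part of any Qubes tool. It replaces an existing line for a key or appends a new one, so it can be run more than once.

```shell
#!/bin/sh
# Hypothetical helper: set KEY=VALUE in a sysctl-style file.
# $1 = key, $2 = value, $3 = file (defaults to /etc/sysctl.conf).
set_sysctl() {
    key=$1
    value=$2
    file=${3:-/etc/sysctl.conf}
    touch "$file"
    if grep -q "^${key}[ ]*=" "$file"; then
        # The key is already present: rewrite that line in place.
        sed -i "s|^${key}[ ]*=.*|${key}=${value}|" "$file"
    else
        # The key is missing: append it.
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}

# Example (as root in dom0):
#   set_sysctl kernel.panic_on_oops 0
#   set_sysctl kernel.panic_on_warn 0
#   sysctl -p    # reload /etc/sysctl.conf without rebooting
```

Trying the function on a copy of the file first (third argument) is a cheap way to check the result before touching the real configuration.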

--
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab

Peter_Palensky · June 11, 2022, 10:00am

Same here (Dell XPS13). The only usable dom0 kernels are 4.x and 5.4.88 (already gone :-0) and 5.4.175 (please let me keep that!).

Everything else either crashes dom0 (e.g., 5.15) or stalls sys-usb (e.g. 5.12.).

It says “00:14.0 USB controller problem”, might be a usb3.0 problem, tried various things, nothing helped, my BIOS has no option to disable xHCI.

Demi · June 11, 2022, 5:16pm

(FYI, it looks like you forgot to CC the mailing list)

> Dear Marie,
>
> > > > Which kernel version do you need to hold? You can update a subset of
> > > > packages by giving them as arguments to qubes-dom0-update, but I would
> > > > like to know what the foreseeable problems are.
> >
> > > The reason is simple: all (!) 5.x xen kernels I tested so far
> > > crash/freeze my system in less than 5 minutes, often only seconds (open
> > > issue on github since 18 months). Therefore I keep a 4.19 kernel for xen
> > > (only) -- until now the updater respected that: it installed some new
> > > 5.x kernel and kernel-latest. Every single time, I bravely try them out,
> > > and each time they crash: each time I can revert back to 4.19 by a
> > > live-linux usb hack.
> >
> > > Last kernel update wants to remove my 4.19 kernel, and no way I can
> > > accept that, given the history. ( again a curse on Intel and Dell for
> > > their buggy hardware ).
> >
> > Try removing one of the newer kernels on your system.
>
> That did not help. But I could take the output and manually install all
> the packages qubes-update lists, without removing the old kernel.
> qubes-update did download them all. I would have to make sure not to
> introduce a security flaw (by not checking signatures), perhaps by
> invoking dnf directly with the correct full package name? The (terminal
> qubes-update) line

qubes-dom0-update unpacks the packages in a temporary directory. They
are then copied to an anonymous temporary file in the final directory by
rpmcanon, and only linked into the filesystem if rpmcanon verifies the
signature successfully. The signature is then checked *again* by
rpmkeys before you even get the “Is this ok [y/N]” prompt.

In short, no, you will not introduce a security flaw doing this.

> microcode_ctl.x86_64 2.1-35.qubes1.fc25 qubes-dom0-cached
>
> would have to become one word, but I do not know the "linking char": a
> dot? like
>
> microcode_ctl.x86_64.2.1-35.qubes1.fc25
>
> or underscore? like
>
> microcode_ctl.x86_64_2.1-35.qubes1.fc25

You need to move the architecture to the end, and then use a dash to
join the name and epoch/version. In your example, you would run

$ sudo dnf upgrade microcode_ctl-2.1-35.qubes1.fc25.x86_64
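
The renaming rule described above (architecture moved to the end, name and epoch/version joined with a dash) can be sketched as a small shell function. The name `to_nevra` is made up for this illustration; the two arguments are the first two columns of the listing line.

```shell
#!/bin/sh
# Hypothetical helper: turn "name.arch" plus "[epoch:]version-release"
# (the first two columns of a dnf listing) into name-version-release.arch.
to_nevra() {
    name_arch=$1
    evr=$2
    name=${name_arch%.*}     # everything before the last dot
    arch=${name_arch##*.}    # everything after the last dot
    printf '%s-%s.%s\n' "$name" "$evr" "$arch"
}

# Example:
#   to_nevra microcode_ctl.x86_64 2.1-35.qubes1.fc25
```

Splitting at the last dot keeps package names that themselves contain dots intact.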

> But I am unsure if that is a "safe way" to go ....

It is.

> > Also, would you
> > be willing to try disabling panic_on_oops? That won’t fix the bug, but
> > it has a chance of leaving the system running afterwards. Adding
> > kernel.panic_on_oops=0 and kernel.panic_on_warn=0 to /etc/sysctl.conf
> > should do the trick.
>
> I'll try that, of course. I'll keep you informed ...

That would be great.

--
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab

Demi · June 11, 2022, 5:18pm

That would likely leave you without the needed kernel module packages,
which would be bad.

--
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab

Demi · June 11, 2022, 5:21pm

I am hesitant to ask, since it would require running unsigned code
(yuck!), but would you be comfortable doing a kernel git bisection?
That would allow figuring out exactly which commit caused the problem,
and would vastly improve the likelihood of the bug being fixed.

--
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab

Peter_Palensky · June 12, 2022, 9:57am

Aaehm… It is my work computer, I need it every day and cannot risk anything…

Is there a safe/standard procedure in Qubes to compile the bisection kernels, add them to grub without removing the working kernel, etc.?

unman · June 12, 2022, 12:37pm

Hi Bernhard

There are a number of things you can do: the simplest -
Increase the number of kernel packages that are retained:
In /etc/dnf/dnf.conf change installonly_limit=3 to some higher number.
Then manually delete kernel packages that are intermediate.
That way you keep the working version *and* get the updates so you can
try them as they come in.
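
A rough sketch of the "delete the intermediate packages" step, assuming you raised installonly_limit first as described above. The function name `removable_kernels` and the version strings in the example are placeholders, not taken from a real system. It prints every installed version except the one to hold and the newest one.

```shell
#!/bin/sh
# Hypothetical helper: read kernel versions (one per line) on stdin and print
# those that are neither the held version ($1) nor the newest one.
# Relies on GNU sort -V for version ordering.
removable_kernels() {
    hold=$1
    sort -V | awk -v hold="$hold" '
        { v[NR] = $0 }
        END {
            # v[NR] is the newest version after sorting; never print it.
            for (i = 1; i < NR; i++)
                if (v[i] != hold) print v[i]
        }
    '
}

# Example in dom0 (the held version is a placeholder):
#   rpm -q kernel --qf '%{VERSION}-%{RELEASE}\n' | removable_kernels 4.19.250-1
# Each printed version could then be removed by hand, e.g.
#   sudo dnf remove kernel-<version>
```

The function only prints candidates; removing packages stays a manual, reviewed step.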

There used to be a plugin to lock dnf updates to a specific version, but
I think that disappeared a few years ago.

You can try `dnf mark install kernel-VERSION` which *should* hold that
package version on the system, but that hasn't always worked for me.

There is another simple approach: run the update while booted into the
kernel you want to hold. dnf won't remove the running kernel, and will
uninstall newer versions to stay within the installonly_limit you have
set.
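
To avoid starting the update from the wrong kernel by accident, a guard along these lines could be used. This is only a sketch: `running_kernel_matches` is a made-up name, and it does a plain prefix match on the output of `uname -r`.

```shell
#!/bin/sh
# Hypothetical guard: succeed only if the running kernel version starts with
# the given prefix. $2 overrides `uname -r`, which is handy for trying it out.
running_kernel_matches() {
    want=$1
    running=${2:-$(uname -r)}
    case $running in
        "$want"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Example in dom0: only update while booted into a 4.19 kernel.
#   running_kernel_matches 4.19. && sudo qubes-dom0-update
```

Passing the prefix with a trailing dot (`4.19.`) keeps it from also matching something like `4.190`.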

Some combination of these should allow you to hold (some) older kernel
version while still allowing you to try updated kernels.

unman

haaber · June 12, 2022, 8:01pm

I did that. The 5.16.18 kernel freezes, like all 5.x ones, but here is a
funny detail: it froze at login, and I just kept typing the password and
hit enter. Nothing happened, so I forced a cold boot. BUT: the journal
contains the line

Jun 12 21:27:49 dom0 runuser[3526]: pam_unix(runuser:session): session
opened for user USER by (uid=0)

and some other lines. This means that the password was recognised and
accepted, and only the screen froze. That moves the suspicion closer to
the f*cked up Intel graphics card. However, using the modesetting driver
alone does not save the 5.x kernels, so it is more complicated than that.

Bernhard

Demi · June 12, 2022, 9:05pm

Not that I am aware of, sadly. Marek (CCd) might have suggestions.

--
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab

marmarek · June 12, 2022, 10:05pm

> > > > > > Which kernel version do you need to hold? You can update a subset of
> > > > > > packages by giving them as arguments to qubes-dom0-update, but I would
> > > > > > like to know what the foreseeable problems are.
> > > > >
> > > > > The reason is simple: all (!) 5.x xen kernels I tested so far
> > > > > crash/freeze my system in less than 5 minutes, often only seconds (open
> > > > > issue on github since 18 months). Therefore I keep a 4.19 kernel for xen
> > > > > (only) -- until now the updater respected that: it installed some new
> > > > > 5.x kernel and kernel-latest. Every single time, I bravely try them out,
> > > > > and each time they crash: each time I can revert back to 4.19 by a
> > > > > live-linux usb hack.
> > > > >
> > > > > Last kernel update wants to remove my 4.19 kernel, and no way I can
> > > > > accept that, given the history. ( again a curse on Intel and Dell for
> > > > > their buggy hardware ).
> > > > >
> > > > > best, Bernhard
> > > >
> > > > Same here (Dell XPS13). The only usable dom0 kernels are 4.x and 5.4.88
> > > > (already gone :-0) and 5.4.175 (please let me keep that!).

There are a couple more options to choose from - for LTS kernels we keep
some of them updated, even after the default is switched to the next
one. For R4.0 there is for example kernel-419. You can check available
options via `qubes-dom0-update --action=search kernel`.

> > > > Everything else either crashes dom0 (e.g., 5.15) or stalls sys-usb (e.g.
> > > > 5.12.).
> > > >
> > > > It says "00:14.0 USB controller problem", might be a usb3.0 problem, tried
> > > > various things, nothing helped, my BIOS has no option to disable xHCI.
> >
> > > I am hesitant to ask, since it would require running unsigned code
> > > (yuck!), but would you be comfortable doing a kernel git bisection?
> > > That would allow figuring out exactly which commit caused the problem,
> > > and would vastly improve the likelihood of the bug being fixed.
> >
> > Aaehm... It is my work computer, i need it every day and can not risk
> > anything...
> > Is there a safe/standard procedure in qubes to compile the bisects, add
> > them to grub without removing the working kernel, etc.?
>
> Not that I am aware of, sadly. Marek (CCd) might have suggestions.

For any tests, I usually place kernel+initramfs under some arbitrary
name that does not interfere with version-based entries. And do that by
installing kernel "manually", exactly to avoid dnf/rpm removing older
packages. For the grub entry, I usually edit
/boot/efi/EFI/qubes/grub.cfg manually (copy existing section and just
replace file names). But regenerating it with grub2-mkconfig should be
safe too.
This does require manual cleaning after testing is finished,
though...

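Here is an example script to build and install a kernel in dom0:

    #!/bin/sh
    # Build the kernel in the current source tree and install it under a
    # fixed "test" name, so that dnf/rpm never touches it.
    set -e
    make olddefconfig                 # update .config non-interactively
    make -j2
    kver=$(make -s kernelrelease)     # version string of the built kernel
    sudo make modules_install         # modules go to /lib/modules/$kver
    sudo cp arch/x86/boot/bzImage /boot/vmlinuz-test
    sudo dracut -f --kver="$kver" /boot/initramfs-test.img

It can be launched from the top of the kernel source tree.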
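
The grub step described above (copy an existing section and just replace the file names) could be sketched as follows. This is an illustration under assumptions, not a tested Qubes procedure: the function name `make_test_entry` and the entry title are made up, and it assumes the first `menuentry` block in the file ends with a `}` at the start of a line. It only prints the new entry; adding it to the grub configuration remains a manual step.

```shell
#!/bin/sh
# Hypothetical helper: print a copy of the first menuentry block of a
# grub.cfg-style file, with the kernel and initramfs file names replaced by
# the fixed "test" names and the entry title changed.
# $1 = path to grub.cfg, $2 = title for the new entry (optional).
make_test_entry() {
    cfg=$1
    title=${2:-Qubes (test kernel)}
    awk -v q="'" -v title="$title" '
        /^menuentry / && !seen {
            seen = 1; inblock = 1
            # Replace the first single-quoted string (the entry title).
            sub(q "[^" q "]*" q, q title q)
        }
        inblock {
            sub(/vmlinuz-[^ ]+/, "vmlinuz-test")
            sub(/initramfs-[^ ]+/, "initramfs-test.img")
            print
            if ($0 ~ /^}/) inblock = 0
        }
    ' "$cfg"
}

# Example in dom0: review the output before adding it anywhere.
#   make_test_entry /boot/efi/EFI/qubes/grub.cfg
```

Because the original entry is left untouched, the known-good kernel stays selectable from the menu if the test kernel fails to boot.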

--
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab

haaber · June 13, 2022, 8:28pm

Dear Marek,

kernel testing would be so much easier if the xen.cfg would allow an
option like

default=menuselect

to get a boot menu -- instead of

default=[5.16.whatever]

which makes it actually necessary to "hack" the xen.cfg via a
live-linux-usb intrusion if a kernel should fail to work ... that
produces an attack-vector & is annoying.

Maybe such a function exists already? If not that would be a feature
request!

Thank you, Bernhard

marmarek · June 13, 2022, 8:32pm

That's the main reason why Qubes 4.1 doesn't use xen.cfg at all. There
is standard grub, where you have a menu, an editor, etc.

--
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab

haaber · June 14, 2022, 11:51am

Dear Marek,

> > kernel testing would be so much easier if the xen.cfg would allow an
> > option like default=menuselect
> > to get a boot menu -- instead of [...]
> > Maybe such a function exists already? If not that would be a feature
> > request!
>
> That's the main reason why Qubes 4.1 doesn't use xen.cfg at all. There
> is standard grub, where you have menu, editor etc.

Brilliant. And I'd love to re-install 4.1 for that. But the 5.x kernel
on the ISO fails either on boot or, at the latest, during installation.
Is there a grub on the 4.1 ISO as well (i.e. a possibility to manually
add a kernel like 4.19)? If so: is the procedure explained somewhere?
'Cause grub-hacking is very unpleasant as well.

Thank you, Bernhard

Peter_Palensky · August 5, 2022, 3:00pm

Update: I can use newer kernels if I remove device “Realtek Semiconductor Co., Ltd. RTS525A PCI Express Card Reader” from sys-usb VM.
If it is attached to that VM, the entire computer crashes upon sys-usb start (when newer kernels are in use, it is fine with [very] old ones).

So I guess not having the card reader is my solution towards upgrading to 4.1.1…
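
For anyone wanting to try the same workaround from the command line, a small sketch: Qubes names dom0 PCI devices as `dom0:BB_DD.F`, i.e. the bus address shown by `lspci` with the colon replaced by an underscore. The helper name `to_qubes_pci` is made up, and the card reader's address has to be looked up on the machine in question.

```shell
#!/bin/sh
# Hypothetical helper: convert an lspci address (BB:DD.F, optionally with a
# leading 0000: domain) into the dom0:BB_DD.F form used by qvm-pci.
to_qubes_pci() {
    bdf=${1#0000:}
    printf 'dom0:%s\n' "$(printf '%s' "$bdf" | tr ':' '_')"
}

# Example in dom0 (replace <BDF> with the card reader's address from lspci):
#   lspci | grep -i 'card reader'
#   qvm-pci detach sys-usb "$(to_qubes_pci <BDF>)"
```

Checking `qvm-pci list` before and after shows whether the device is still assigned to sys-usb.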

tripleh · August 6, 2022, 9:45am

Hm interesting. That's quite similar to my experience [1].

Maybe some Xen code wrt attaching Express Card Readers broke?
Last somewhat good kernel for me was 5.10.109-1.

[1] https://github.com/QubesOS/qubes-issues/issues/7637