[qubes-users] fedora-40-minimal install - message about fstrim

Boryeu_Mao · October 14, 2024, 3:40pm

For the template install command on Qubes release 4.2.3

sudo qubes-dom0-update qubes-template-fedora-40-minimal

I received a message that

fstrim: /var/tmp/tmpsd1ns61v/var/lib/qubes/vm-template: the discard operation is not supported

The template appears to be running normally, so perhaps this is a warning message. Please advise if there is anything to be done or to watch out for. Thanks.

rustybird · October 15, 2024, 10:59am

Boryeu Mao:

> For the template install command on Qubes release 4.2.3
>
> sudo qubes-dom0-update qubes-template-fedora-40-minimal
>
> I received a message that
>
> fstrim: /var/tmp/tmpsd1ns61v/var/lib/qubes/vm-template: the discard
> operation is not supported

Did you maybe mount a tmpfs at /var/tmp? That would explain fstrim not
working. It also wouldn't matter then.

> The template appears to be running normally, so perhaps this is a warning
> message.

Pretty much. The fstrim invocation was added to inform the underlying
storage (LVM Thin by default) of the filesystem hosting /var/tmp that
the space previously used for temporary image files extracted during
the installation process can be freed:

https://github.com/QubesOS/qubes-core-admin-client/commit/4a9b57f91fdf3a2b35a5cf707970d05bf9cadba7

But it doesn't affect the installed template.

Rusty

Boryeu_Mao · October 15, 2024, 6:22pm

> Boryeu Mao:
> > For the template install command on Qubes release 4.2.3
> >
> > sudo qubes-dom0-update qubes-template-fedora-40-minimal
> >
> > I received a message that
> >
> > fstrim: /var/tmp/tmpsd1ns61v/var/lib/qubes/vm-template: the discard
> > operation is not supported
>
> Did you maybe mount a tmpfs at /var/tmp? That would explain fstrim not
> working. It also wouldn’t matter then.

It was the stock qubes-dom0-update, so all tmpfs operations are what the stock script would do, no manual tmpfs mount.

> > The template appears to be running normally, so perhaps this is a warning
> > message.
>
> Pretty much. The fstrim invocation was added to inform the underlying
> storage (LVM Thin by default) of the filesystem hosting /var/tmp that
> the space previously used for temporary image files extracted during
> the installation process can be freed:
>
> https://github.com/QubesOS/qubes-core-admin-client/commit/4a9b57f91fdf3a2b35a5cf707970d05bf9cadba7
>
> But it doesn’t affect the installed template.

In the qvm_template_postprocess.py (which the above link points to), fstrim is called only if the root user does the template install. So I’d need to figure out:

(1) why is it that there is no need to free space in the lvm thin volume if a non-root user does the install
(2) how would a root-installed template be different from one installed by a non-root user

And if fstrim warning means the space didn’t get freed as it should, then perhaps I need to keep an eye on lvm volume usage.

> Rusty

Thank you very much for helping.

rustybird · October 16, 2024, 3:03pm

Boryeu Mao:

> > Boryeu Mao:
> > > For the template install command on Qubes release 4.2.3
> > >
> > > sudo qubes-dom0-update qubes-template-fedora-40-minimal
> > >
> > > I received a message that
> > >
> > > fstrim: /var/tmp/tmpsd1ns61v/var/lib/qubes/vm-template: the discard
> > > operation is not supported
> >
> > Did you maybe mount a tmpfs at /var/tmp?
>
> [...] no manual tmpfs mount.

I assume you're seeing the same "not supported" message if you run:

$ sudo fstrim /var/tmp/

The only thing I can think of is that you have custom partitioning,
and the storage layer immediately underneath the filesystem hosting
/var/tmp/ is dm-crypt (unusual for an LVM Thin installation), and
dm-crypt has been mapped with discard disabled.

Your storage tree (showing discard support) can be printed with:

$ lsblk --output +DISC-MAX
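
To find the layer that blocks discards without reading the tree by eye, the same information can be walked programmatically. The following is only a sketch: it assumes lsblk's JSON output names the column key "disc-max" and that a value of 0 means "no discard support"; the helper names (`discard_report`, `read_lsblk`) and the sample device names are made up for illustration.

```python
import json
import subprocess

# Illustrative sample only; the device names here are hypothetical.
SAMPLE = """
{"blockdevices": [
  {"name": "nvme0n1", "type": "disk", "disc-max": 2199023255040, "children": [
    {"name": "luks-example", "type": "crypt", "disc-max": 0, "children": [
      {"name": "qubes_dom0-root", "type": "lvm", "disc-max": 0}
    ]}
  ]}
]}
"""


def read_lsblk():
    """Return lsblk's JSON tree as text (not called below; needs lsblk installed)."""
    cmd = ["lsblk", "--json", "--bytes", "--output", "NAME,TYPE,DISC-MAX"]
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout


def _walk(devices, depth=0):
    """Yield (depth, name, type, supports_discard) for every node in the tree."""
    for dev in devices:
        # Assumption: with --bytes the value is an integer or a numeric string.
        disc_max = int(dev.get("disc-max") or 0)
        yield depth, dev["name"], dev.get("type", "?"), disc_max > 0
        yield from _walk(dev.get("children", []), depth + 1)


def discard_report(lsblk_json):
    """Flatten the lsblk tree into a list of per-device discard results."""
    return list(_walk(json.loads(lsblk_json)["blockdevices"]))


for depth, name, kind, ok in discard_report(SAMPLE):
    state = "supported" if ok else "NOT supported"
    print("  " * depth + f"{name} ({kind}): discard {state}")
```

On a real system, `discard_report(read_lsblk())` would produce the same kind of listing; the first layer reported as not supporting discard is the one to look at.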

> > qvm-template-postprocess: call fstrim after removing image file · QubesOS/qubes-core-admin-client@4a9b57f · GitHub
>
> In the qvm_template_postprocess.py (which the above link points to), fstrim
> is called only if the root user does the template install.

To me this looks like something that was missed in the move to
qvm-template:

Previously, qubes-dom0-update (which had to be run as root) would
install templates as normal RPM packages. I guess the logic to skip
fstrim for non-root users might have been put there to ease testing
the qvm-template-postprocess tool? CCing Marek

Then qvm-template was created (which like other qvm- tools usually
runs as a regular user) and now fstrim is skipped unless someone
happens to invoke qvm-template as root. Skipping seems like a bug, but
on R4.2 systems it's mitigated by the installer adding the 'discard'
mount option for the dom0 root filesystem, making fstrim redundant.
Except for people who installed via qubes-dist-upgrade or removed the
mount option. For those, there's still the systemd fstrim.timer that
should release the space to LVM, hopefully soon enough (weekly).

Finally, you've used qubes-dom0-update, which nowadays calls
qvm-template for template related stuff. For this, qubes-dom0-update
can actually be run as non-root, but you ran it with sudo, so fstrim
was *not* skipped. (Which then failed on your system.)

> Thank you very much for helping.

Happy to. It's interesting

Rusty

marmarek · October 16, 2024, 3:36pm

> Boryeu Mao:
> > > Boryeu Mao:
> > > > For the template install command on Qubes release 4.2.3
> > > >
> > > > sudo qubes-dom0-update qubes-template-fedora-40-minimal
> > > >
> > > > I received a message that
> > > >
> > > > fstrim: /var/tmp/tmpsd1ns61v/var/lib/qubes/vm-template: the discard
> > > > operation is not supported
> > >
> > > Did you maybe mount a tmpfs at /var/tmp?
>
> > [...] no manual tmpfs mount.
>
> I assume you're seeing the same "not supported" message if you run:
>
> $ sudo fstrim /var/tmp/
>
> The only thing I can think of is that you have custom partitioning,
> and the storage layer immediately underneath the filesystem hosting
> /var/tmp/ is dm-crypt (unusual for an LVM Thin installation), and
> dm-crypt has been mapped with discard disabled.
>
> Your storage tree (showing discard support) can be printed with:
>
> $ lsblk --output +DISC-MAX
>
> > > qvm-template-postprocess: call fstrim after removing image file · QubesOS/qubes-core-admin-client@4a9b57f · GitHub
>
> > In the qvm_template_postprocess.py (which the above link points to), fstrim
> > is called only if the root user does the template install.
>
> To me this looks like something that was missed in the move to
> qvm-template:
>
> Previously, qubes-dom0-update (which had to be run as root) would
> install templates as normal RPM packages. I guess the logic to skip
> fstrim for non-root users might have been put there to ease testing
> the qvm-template-postprocess tool? CCing Marek

Maybe? You do need root for calling fstrim. And not calling it isn't
really a huge deal, as you explain below. And its failure shouldn't
interrupt the install anyway (subprocess.call, not subprocess.check_call).
But the error message may indeed be confusing.
Theoretically, sudo could be used for this call and that would be fine
in dom0, but possibly less so in a qube (yes, you can install templates
via Admin API from a qube), especially if the passwordless-root package is
not installed...
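
The behaviour described here (run fstrim only as root, and never let its failure abort the install) can be sketched roughly as below. This is not the actual code from qvm_template_postprocess.py: the function name `trim_after_cleanup` and the injectable `geteuid`/`call` parameters are hypothetical and exist only so the logic can be exercised without root or a real fstrim binary.

```python
import os
import subprocess


def trim_after_cleanup(path, geteuid=os.geteuid, call=subprocess.call):
    """Best-effort fstrim of the filesystem containing `path`.

    Returns None when skipped (not root), otherwise fstrim's exit status.
    """
    if geteuid() != 0:
        # fstrim needs root, so non-root callers skip the trim entirely.
        return None
    # subprocess.call returns the exit status instead of raising on a
    # non-zero exit (as subprocess.check_call would), so a failing
    # fstrim only prints its own error message and the caller carries on.
    return call(["fstrim", path])


# Example run with stand-ins (hypothetical values): pretend to be root
# and pretend fstrim exited with status 1, as on a device without
# discard support.
status = trim_after_cleanup("/var/tmp", geteuid=lambda: 0, call=lambda argv: 1)
print("fstrim exit status:", status)
```

Because the return value is ignored rather than checked, the "discard operation is not supported" message seen in this thread is cosmetic: it comes from fstrim itself, and the template installation proceeds regardless.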

> Then qvm-template was created (which like other qvm- tools usually
> runs as a regular user) and now fstrim is skipped unless someone
> happens to invoke qvm-template as root. Skipping seems like a bug, but
> on R4.2 systems it's mitigated by the installer adding the 'discard'
> mount option for the dom0 root filesystem, making fstrim redundant.
> Except for people who installed via qubes-dist-upgrade or removed the
> mount option. For those, there's still the systemd fstrim.timer that
> should release the space to LVM, hopefully soon enough (weekly).
>
> Finally, you've used qubes-dom0-update, which nowadays calls
> qvm-template for template related stuff. For this, qubes-dom0-update
> can actually be run as non-root, but you ran it with sudo, so fstrim
> was *not* skipped. (Which then failed on your system.)
>
> > Thank you very much for helping.
>
> Happy to. It's interesting
>
> Rusty

--
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab

rustybird · October 16, 2024, 4:28pm

Marek Marczykowski-Górecki:

> Maybe? You do need root for calling fstrim. And not calling it isn't
> really a huge deal, as you explain below. And its failure shouldn't
> interrupt the install anyway (subprocess.call, not subprocess.check_call).
> But the error message may indeed be confusing.
> Theoretically, sudo could be used for this call and that would be fine
> in dom0, but possibly less so in a qube (yes, you can install templates
> via Admin API from a qube), especially if the passwordless-root package is
> not installed...

Ah okay, that answers why not 'sudo fstrim'. Also, on file-reflink
systems, where the dom0 root filesystem is storing (possibly many
terabytes worth of) VM volumes, fstrim can take really long. E.g.
here on my main Btrfs system, which is otherwise quite fast:

# time fstrim /var/tmp/
real 4m29.240s

So now I'm thinking fstrim is overkill just to install a template.
Instead, maybe Salt or something could ensure that everyone (including
people who installed via qubes-dist-upgrade) has the 'discard' mount
option (or 'discard=async' for Btrfs, where that would be the default
on modern kernels if not overridden by 'discard[=sync]') unless a user
has explicitly added 'nodiscard'.
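
Whatever tool ends up enforcing this, the check itself is simple. The sketch below assumes the /proc/mounts field layout (device, mountpoint, fstype, options, ...) and uses made-up helper names (`mount_options`, `discard_mode`) and a made-up sample; it only reports which discard-related option, if any, is active for a mountpoint.

```python
# Hypothetical /proc/mounts excerpt, for illustration only.
SAMPLE_MOUNTS = """\
/dev/mapper/qubes_dom0-root / ext4 rw,relatime,discard 0 0
tmpfs /tmp tmpfs rw,nosuid,nodev 0 0
"""


def mount_options(mounts_text, mountpoint):
    """Return the option list of the last entry mounted at `mountpoint`, or None."""
    found = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[1] == mountpoint:
            # Later entries shadow earlier ones mounted at the same path.
            found = fields[3].split(",")
    return found


def discard_mode(options):
    """Return 'discard', 'discard=<mode>' or 'nodiscard' if present, else None."""
    for opt in options or []:
        if opt in ("discard", "nodiscard") or opt.startswith("discard="):
            return opt
    return None


print("/ :", discard_mode(mount_options(SAMPLE_MOUNTS, "/")))
print("/tmp :", discard_mode(mount_options(SAMPLE_MOUNTS, "/tmp")))
```

On a live dom0 the text would come from `open("/proc/mounts").read()`. A result of None for the root filesystem would be the case where either fstrim or the weekly fstrim.timer is still needed to return space to the pool.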

> > Then qvm-template was created (which like other qvm- tools usually
> > runs as a regular user) and now fstrim is skipped unless someone
> > happens to invoke qvm-template as root. Skipping seems like a bug,
> > but on R4.2 systems it's mitigated by the installer adding the
> > 'discard' mount option for the dom0 root filesystem, making fstrim
> > redundant. Except for people who installed via qubes-dist-upgrade
> > or removed the mount option. For those, there's still the systemd
> > fstrim.timer that should release the space to LVM, hopefully soon
> > enough (weekly).

Rusty

marmarek · October 16, 2024, 5:53pm

> Marek Marczykowski-Górecki:
> > Maybe? You do need root for calling fstrim. And not calling it isn't
> > really a huge deal, as you explain below. And its failure shouldn't
> > interrupt the install anyway (subprocess.call, not subprocess.check_call).
> > But the error message may indeed be confusing.
> > Theoretically, sudo could be used for this call and that would be fine
> > in dom0, but possibly less so in a qube (yes, you can install templates
> > via Admin API from a qube), especially if the passwordless-root package is
> > not installed...
>
> Ah okay, that answers why not 'sudo fstrim'. Also, on file-reflink
> systems, where the dom0 root filesystem is storing (possibly many
> terabytes worth of) VM volumes, fstrim can take really long. E.g.
> here on my main Btrfs system, which is otherwise quite fast:
>
> # time fstrim /var/tmp/
> real 4m29.240s

But that takes long only if there is really a lot of data to discard,
no? If there is fstrim.timer, there shouldn't be that much to discard
(can be some GB, but that shouldn't take this long). Is it different on
btrfs?

> So now I'm thinking fstrim is overkill just to install a template.
> Instead, maybe Salt or something could ensure that everyone (including
> people who installed via qubes-dist-upgrade) has the 'discard' mount
> option (or 'discard=async' for Btrfs, where that would be the default
> on modern kernels if not overridden by 'discard[=sync]') unless a user
> has explicitly added 'nodiscard'.

If anything, maybe the dist-upgrade tool could take care of it.

> > > Then qvm-template was created (which like other qvm- tools usually
> > > runs as a regular user) and now fstrim is skipped unless someone
> > > happens to invoke qvm-template as root. Skipping seems like a bug,
> > > but on R4.2 systems it's mitigated by the installer adding the
> > > 'discard' mount option for the dom0 root filesystem, making fstrim
> > > redundant. Except for people who installed via qubes-dist-upgrade
> > > or removed the mount option. For those, there's still the systemd
> > > fstrim.timer that should release the space to LVM, hopefully soon
> > > enough (weekly).
>
> Rusty

--
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab

rustybird · October 16, 2024, 7:42pm

Marek Marczykowski-Górecki:

> > Also, on file-reflink
> > systems, where the dom0 root filesystem is storing (possibly many
> > terabytes worth of) VM volumes, fstrim can take really long. E.g.
> > here on my main Btrfs system, which is otherwise quite fast:
> >
> > # time fstrim /var/tmp/
> > real 4m29.240s
>
> But that takes long only if there is really a lot of data to discard,
> no?

    # for i in 1 2 3; do time fstrim /var/tmp/; done 2>&1 | grep real
    real 4m24.308s
    real 4m34.060s
    real 4m29.806s

I don't see anything in Btrfs tracking which unused blocks it has
already issued discards for. Or in ext4, but it doesn't matter with
the small ext4 dom0 root fs in an LVM Thin installation. So a large fs
that's neither almost empty nor almost full has to at least generate a
gigantic list of (due to fragmentation) probably rather small ranges
of blocks to be discarded in response to every fstrim and forward it
through the block subsystem (which I don't think is keeping track
either?) to the drive. Only after all of that overhead, I guess the
drive might respond faster if it had already done most of the work
last time.

Rusty

rustybird · October 16, 2024, 8:03pm

'Rusty Bird' via qubes-users:

> Marek Marczykowski-Górecki:
> > > Also, on file-reflink
> > > systems, where the dom0 root filesystem is storing (possibly many
> > > terabytes worth of) VM volumes, fstrim can take really long. E.g.
> > > here on my main Btrfs system, which is otherwise quite fast:
> > >
> > > # time fstrim /var/tmp/
> > > real 4m29.240s
> >
> > But that takes long only if there is really a lot of data to discard,
> > no?
>
>     # for i in 1 2 3; do time fstrim /var/tmp/; done 2>&1 | grep real
>     real 4m24.308s
>     real 4m34.060s
>     real 4m29.806s
>
> I don't see anything in Btrfs tracking which unused blocks it has
> already issued discards for. Or in ext4, but it doesn't matter with
> the small ext4 dom0 root fs in an LVM Thin installation.

Actually ext4 does keep track:

> So a large fs
> that's neither almost empty nor almost full has to at least generate a
> gigantic list of (due to fragmentation) probably rather small ranges
> of blocks to be discarded in response to every fstrim and forward it
> through the block subsystem (which I don't think is keeping track
> either?) to the drive. Only after all of that overhead, I guess the
> drive might respond faster if it had already done most of the work
> last time.

Rusty

Ulrich_Windl1 · October 17, 2024, 8:31am

Hi!

Of course, if fstrim fails, it has the same number of blocks to trim on the next run.

Ulrich

rustybird · October 17, 2024, 4:52pm

Ulrich Windl:

> Of course, if fstrim fails, it has the same number of blocks to trim
> on the next run.

But if 'fstrim --verbose' prints a number of trimmed bytes at all and
not an error, then apparently the trimming didn't fail (this time).

To test the different behavior of filesystems like ext4 that keep
track of already discarded blocks, and filesystems like XFS and Btrfs
that don't (or not fully), here's a little script:

https://gist.github.com/rustybird/750a5b28e7b285669fe90851e6f48b32

fstrimtest (excerpt; the full script is in the gist):

    #!/bin/bash

    set -euo pipefail

    fstype=$1

    _du() { du -BM img; }
    _fstrim() { fstrim --verbose mnt; }
    _sync() { sync; sync; }

    [...]

It creates a filesystem on a 5 GiB loop device, writes three 1 GiB
files inside the mountpoint, deletes two of them, and runs fstrim
three times, checking the disk usage of the backing file after
each run. Results:

# ./fstrimtest ext4
3139M img
mnt: 3.8 GiB (4122611712 bytes) trimmed
1091M img
mnt: 0 B (0 bytes) trimmed
1091M img
mnt: 0 B (0 bytes) trimmed
1091M img

# ./fstrimtest xfs
3137M img
mnt: 3.9 GiB (4227661824 bytes) trimmed
1089M img
mnt: 3.9 GiB (4227661824 bytes) trimmed
1089M img
mnt: 3.9 GiB (4227661824 bytes) trimmed
1089M img

# ./fstrimtest btrfs
3084M img
mnt: 3.5 GiB (3766091776 bytes) trimmed
1028M img
mnt: 3 GiB (3255435264 bytes) trimmed
1028M img
mnt: 3 GiB (3255435264 bytes) trimmed
1028M img
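
The pattern in these results can also be checked mechanically. The following is a small sketch, not part of the fstrimtest gist: it assumes the `fstrim --verbose` line format shown above ("<mountpoint>: <size> (<N> bytes) trimmed"), and the helper names `trimmed_bytes` and `tracks_discards` are invented here.

```python
import re

# Matches lines such as "mnt: 3.8 GiB (4122611712 bytes) trimmed".
TRIM_RE = re.compile(r"^(?P<mnt>.+): .* \((?P<bytes>\d+) bytes\) trimmed$")

# Copied from the ext4 and xfs runs above.
EXT4_RUN = """\
3139M img
mnt: 3.8 GiB (4122611712 bytes) trimmed
1091M img
mnt: 0 B (0 bytes) trimmed
1091M img
mnt: 0 B (0 bytes) trimmed
1091M img
"""

XFS_RUN = """\
3137M img
mnt: 3.9 GiB (4227661824 bytes) trimmed
1089M img
mnt: 3.9 GiB (4227661824 bytes) trimmed
1089M img
mnt: 3.9 GiB (4227661824 bytes) trimmed
1089M img
"""


def trimmed_bytes(output):
    """Return the byte count of each fstrim --verbose line, in order."""
    counts = []
    for line in output.splitlines():
        match = TRIM_RE.match(line.strip())
        if match:
            counts.append(int(match.group("bytes")))
    return counts


def tracks_discards(counts):
    """True if every run after the first had nothing left to trim."""
    return len(counts) > 1 and all(n == 0 for n in counts[1:])


for name, run in (("ext4", EXT4_RUN), ("xfs", XFS_RUN)):
    counts = trimmed_bytes(run)
    print(name, counts, "tracks discards:", tracks_discards(counts))
```

A filesystem that reports 0 bytes on repeated runs remembers what it has already discarded, while one that reports the same large figure each time re-issues discards for all free space, which is what makes repeated fstrim runs slow on a large filesystem.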

Rusty

Boryeu_Mao · October 23, 2024, 6:41pm

> I assume you’re seeing the same “not supported” message if you run:
>
> $ sudo fstrim /var/tmp/

Yes

> The only thing I can think of is that you have custom partitioning,
> and the storage layer immediately underneath the filesystem hosting
> /var/tmp/ is dm-crypt (unusual for an LVM Thin installation), and
> dm-crypt has been mapped with discard disabled.

My Qubes OS was pretty much a vanilla install from the R4.1 iso (then updated in place to 4.2). During installation, I opted to encrypt the main partition, which I thought was rather standard as well. In any event, it seems that my situation can’t easily be rectified without perhaps a system re-installation from scratch.

> Your storage tree (showing discard support) can be printed with:
>
> $ lsblk --output +DISC-MAX

Yes this is helpful. Thx.

> Finally, you’ve used qubes-dom0-update, which nowadays calls
> qvm-template for template related stuff. For this, qubes-dom0-update
> can actually be run as non-root, but you ran it with sudo, so fstrim
> was not skipped. (Which then failed on your system.)

So it looks like I’d just stick with running the template install as root, as I won’t gain much as non-root.

Thank you again!