Fstrim not permanent even if discard seems to be enabled everywhere

I got hit with a “phantom” low-space warning in a couple of VMs, where the space occupied in the LV is much higher than the space actually used in the filesystem.
After some investigation, it seems that for some reason filesystem trimming is not working, and space deleted in the filesystem is not being reclaimed by the thin-provisioned LV.

The recommended way to solve this was to run “fstrim /rw” inside the VM, but I found that while this trims the filesystem, the space is not reclaimed in the LV. If I trim again, I get the same amount of space trimmed every time I boot up the VM.

I tried trimming from dom0 to see if that made a difference, and it did not. Every time the filesystem is mounted again, the supposedly trimmed space is trimmed again! Like this:

[@dom0 ~]$ sudo mount /dev/qubes_dom0/vm-vmname-private /mnt
[@dom0 ~]$ sudo fstrim -v /mnt
/mnt: 371 MiB (388997120 bytes) trimmed
[@dom0 ~]$ sudo umount /mnt
[@dom0 ~]$ sudo mount /dev/qubes_dom0/vm-vmname-private /mnt
[@dom0 ~]$ sudo fstrim -v /mnt
/mnt: 371 MiB (388997120 bytes) trimmed
[@dom0 ~]$ sudo umount /mnt

(this could be repeated ad infinitum with no change).

I checked with lsblk -D, and every device in the stack seems to have discards enabled:

NAME                                           DISC-ALN DISC-GRAN DISC-MAX DISC-ZERO
└─nvme0n1pX                                           0      512B       2T         0
  └─luks-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx         0      512B       2T         0
    ├─qubes_dom0-vm--pool_tmeta                       0      512B       2T         0
    │ └─qubes_dom0-vm--pool-tpool                     0      512B       2T         0
    │   ├─qubes_dom0-vm--vmname--private              0        2M       2G         0
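
For reference, the thin pool's own discard mode can be checked as well. A minimal check, assuming the pool is the qubes_dom0/vm-pool shown above (it should report “passdown” for trims to be passed on to the underlying device):

[@dom0 ~]$ sudo lvs -o +discards qubes_dom0/vm-pool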

I also enabled discards in /etc/lvm/lvm.conf in dom0 (although, based on what I read, that should not affect this).
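
(For clarity, “enabling discards” in /etc/lvm/lvm.conf normally means the issue_discards option in the devices section, which only issues discards when an LV is removed or shrunk, not when fstrim runs inside a volume, so it indeed should not matter here. A sketch of the setting, for anyone comparing:)

devices {
    issue_discards = 1
}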

Nevertheless, the filesystems seem to never get trimmed, and this leads to VMs starting to run out of space. I must be missing something within LVM, but I can’t figure out what. Any help will be much appreciated!

Thanks!!

Interested in general trim functionality as well. Even though I have been using Qubes OS for 4 years now, I am still not sure whether I ever got trim working, or whether it is enabled by default.

That’s normal. ext4, the filesystem used inside of VM volumes, does not keep track of already trimmed space across mounts. I think it wouldn’t even make sense to do that, because between mounts you might have imaged the filesystem and transferred it to different storage hardware. So it redoes the trimming, just in case.

If you’re not seeing an immediate increase in free disk space in the storage pool, it’s usually due to old revisions of the volume still being present on disk. You can check how many revisions are kept using qvm-volume info VMNAME:private. Shutting down the VM will delete the oldest revision and create a new one from the now trimmed data.
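
For example (VMNAME being a placeholder), you can inspect the volume and, if you like, reduce the number of kept revisions from dom0; as far as I know the relevant commands are:

[@dom0 ~]$ qvm-volume info VMNAME:private
[@dom0 ~]$ qvm-volume config VMNAME:private revisions_to_keep 1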

I made sure the VM was not running at the time, and the trimming does not seem to have harmed the volume, as the VM still boots (and while it is true that doing this from dom0 could be a security risk, I am not too worried about that particular VM).

The system is keeping two revisions (the default). But I tried booting up the VM, running ‘fstrim -v /rw’, and shutting down, three times in a row. That should free the space, as by then all of the old revisions should contain already-trimmed data. However, the “usage” reported by qvm-volume info does not change at all.
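
(In dom0 terms, each cycle was essentially something like the following; this is a sketch of the procedure rather than a literal transcript:)

[@dom0 ~]$ qvm-start vmname
[@dom0 ~]$ qvm-run -u root -p vmname 'fstrim -v /rw'
[@dom0 ~]$ qvm-shutdown --wait vmname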

This is qvm-volume info before running the trimming cycle:

pool               vm-pool
vid                qubes_dom0/vm-vmname-private
rw                 True
source             
save_on_stop       True
snap_on_start      False
size               2147483648
usage              1988140361
revisions_to_keep  2
ephemeral          False
is_outdated        False
List of available revisions (for revert):
  1756993370-back
  1756993390-back

This is the qvm-volume info after the 3 fstrim cycles:

pool               vm-pool
vid                qubes_dom0/vm-vmname-private
rw                 True
source             
save_on_stop       True
snap_on_start      False
size               2147483648
usage              1988140361
revisions_to_keep  2
ephemeral          False
is_outdated        False
List of available revisions (for revert):
  1756993469-back
  1756993500-back

You can see that the revisions changed, but the usage is still the same. And the filesystem has far less data on it than the reported usage:

Filesystem         1K-blocks     Used Available Use% Mounted on
/dev/xvdb            1992552   786356   1189812  40% /rw

My guess is that somehow the trimming is not releasing the filesystem's free space back to the thin pool, but I am at a loss as to what I am missing.

Does systemctl restart qubesd change the usage reported by qvm-volume info?
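
I.e., something like this in dom0, using the volume name from your output:

[@dom0 ~]$ sudo systemctl restart qubesd
[@dom0 ~]$ qvm-volume info vmname:private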

It’s also interesting that fstrim -v only requested trimming of 371 MiB, even though the available space is 1162 MiB. Does it store many small files? I could maybe see that resulting in an allocation pattern where most of the holes in the filesystem are too small (less than 4 MiB) for LVM Thin to reclaim.

It does not change the reported usage (I ran it in dom0; there was no such service in the VM).

That is not impossible. The VM is used only for web browsing, so I assume the data “churn” is the web cache and other information such as cookies. But I wouldn’t have expected that kind of use to lead to such a large number of small holes.

I found the e2freefrag utility and ran a report:

Blocksize: 4096 bytes
Total blocks: 524288
Free blocks: 301549 (57.5%)

Min. free extent: 4 KB 
Max. free extent: 51608 KB
Avg. free extent: 196 KB
Num. free extent: 6054

HISTOGRAM OF FREE EXTENT SIZES:
Extent Size Range :  Free extents   Free Blocks  Percent
    4K...    8K-  :           479           479    0.16%
    8K...   16K-  :           983          2414    0.80%
   16K...   32K-  :          1104          6006    1.99%
   32K...   64K-  :          1422         15385    5.10%
   64K...  128K-  :           773         17032    5.65%
  128K...  256K-  :           457         20010    6.64%
  256K...  512K-  :           281         24600    8.16%
  512K... 1024K-  :           235         41808   13.86%
    1M...    2M-  :           216         79353   26.32%
    2M...    4M-  :            93         59496   19.73%
    4M...    8M-  :             8         11186    3.71%
    8M...   16M-  :             1          2166    0.72%
   32M...   64M-  :             2         21614    7.17%
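
(In case anyone wants to reproduce this: e2freefrag just takes the filesystem's block device, so inside the VM it would be something along these lines; from dom0 it can equally be pointed at the unmounted /dev/qubes_dom0/vm-vmname-private volume.)

[user@vmname ~]$ sudo e2freefrag /dev/xvdb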

It would seem that the majority of free extents are on the small side. I wonder if there is any way to defragment the free space; I see that e4defrag can defragment files, but not free space. In any case, the trimming should recover at least 300 MB, but it is not doing it. :frowning:

Hmm, it’s requesting it, but dom0 might still not be able to fulfill the request very well if a lot of it is not aligned to full 4 MiB chunks.

I’m out of ideas, sorry to say. As a workaround, you could try copying the VM data to a new VM.
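
A rough sketch of that, with vmname-new being a hypothetical freshly created AppVM based on the same template (the copy arrives under ~/QubesIncoming/vmname in the new VM and would still need to be moved into place):

[user@vmname ~]$ qvm-copy-to-vm vmname-new /home/user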

Is your browser installed from Flatpak by any chance? I was able to “work around” a similar situation by copying / deleting a directory in the AppVM => forum link

No, just a “clean” qube with the browser installed in the template.
I assume that copying the directory would work, but your experiments with the “.var” directory made me curious.

I tried the same thing you did to the “.cache” directory, and it freed up some space in the LV. That is quite interesting…

Edited to add: I did this to some of the other hidden dirs in the /home/user directory, and it cleaned up a lot of space in the LV. So, while not a clean solution, this is a decent workaround.
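
For reference, the cycle I mean is roughly the following, run inside the AppVM (shown for ~/.cache; adjust the directory name):

[user@vmname ~]$ cp -a ~/.cache ~/.cache.copy
[user@vmname ~]$ rm -rf ~/.cache
[user@vmname ~]$ mv ~/.cache.copy ~/.cache
[user@vmname ~]$ sudo fstrim -v /rw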

I wonder if this is a bug (where? QubesOS? LVM?) or some LVM quirk I don’t understand.

Thanks for the tip!

Btrfs should be able to discard smaller holes; they only need to be at least 4 KiB.

It’s definitely more for LVM. 4 MiB is LVM’s physical extent size by default, and it would make sense that freeing a physical extent is all or nothing. Assuming correct alignment, the math checks out:

136 MiB in free/discardable (ext4) extents (the histogram rows of 4 MiB and up: 11186 + 2166 + 21614 = 34966 blocks × 4 KiB ≈ 136 MiB), plus a few more MiB in never-written filesystem housekeeping areas, plausibly adds up to your volume’s size - usage = 152 MiB.

What does cat /sys/block/xvdb/queue/discard_granularity show in the fragmented VM? Somehow that always seems to be less than 4 MiB, which apparently causes fstrim to request more trimming (371 MiB total) than can be fulfilled. I don’t understand why the discard granularity isn’t also 4 MiB.

It shows 2097152, 2 MiB. So that could be the case. Would this be a mismatch between the LVM discard granularity and the one in the VM?

The workaround of copying the files and deleting the old ones probably works because the copied versions are more “compacted”, and removing the old ones does free up larger chunks of space that could be reclaimed.

Those 2 MiB are what LVM claims is the discard granularity. It’s the same number for the dom0 dm-n block device. The number is passed through to the VM block device. I just don’t get why LVM isn’t claiming that it’s 4 MiB in the first place. (Not that it would make a difference in the amount of actually trimmed data, but the fstrim output would look less confusing.)
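
(If you want to compare, the pool's chunk size and the granularity the kernel exposes for the dom0 device can be checked with something like the following, where dm-N stands for whichever dm device backs the volume:)

[@dom0 ~]$ sudo lvs -o +chunk_size qubes_dom0/vm-pool
[@dom0 ~]$ cat /sys/block/dm-N/queue/discard_granularity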

Yeah I guess so. Does it free up more if you repeat this procedure?