Qubes btrfs VDO (compression + dedup) recipe

Disclaimer

This guide aims to assist anyone interested in exploring the compression and deduplication features of Btrfs within Qubes OS, while also serving as my personal reference.

The descriptions of Btrfs and Qubes OS here are deliberately incomplete and biased; they only summarize what is needed to get compression and deduplication working effectively. Refer to the official documentation if you have any questions.

Please ensure you back up all data before experimenting.

Introduction to Btrfs

Btrfs, despite its numerous features, is more akin to ext4 than LVM. It organizes files and directories in a tree structure, rooted at /, and can be mounted almost anywhere.

Compression and deduplication in Btrfs can be enabled at runtime, rather than during filesystem creation.

Btrfs supports zlib, lzo and zstd compression, with zstd generally giving the best trade-off between ratio and speed. By default, compression is disabled and must be activated via a mount flag.
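As a rough illustration of the mount flag, the sketch below previews what adding compress=zstd to the btrfs entries of an fstab-style file would look like. The function name add_zstd_option is made up for this example. It only prints the modified text and never writes to the file, so the actual edit of /etc/fstab is still up to you.

```shell
# Hypothetical helper: print an fstab-style file with compress=zstd appended
# to the options field (4th column) of every btrfs entry.
# The input file itself is never modified.
add_zstd_option() {
    awk '
        /^[ \t]*#/ || NF < 4 { print; next }   # keep comments and short lines as they are
        $3 == "btrfs" && $4 !~ /(^|,)compress(-force)?(=|,|$)/ {
            $4 = $4 ",compress=zstd"           # entries that already set compression are skipped
        }
        { print }
    ' "$1"
}
```

One way to review the change before touching anything is add_zstd_option /etc/fstab | diff /etc/fstab -. Note that awk rewrites modified lines with single spaces between fields.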

For offline deduplication, Btrfs does not require a mount flag. Instead, a userspace tool (such as duperemove) is necessary to identify duplicate blocks across files using sophisticated hashing algorithms and data structures. This tool then informs the Btrfs module of any duplicates, which, after verification, may be deduplicated, potentially reducing disk usage.
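To make the idea of block-level duplicate detection a bit more concrete, here is a toy sketch. It is not how duperemove works internally, and it does not deduplicate anything; it only hashes fixed-size blocks and counts digests that show up more than once. The function names and the 4096-byte default block size are assumptions chosen for the example.

```shell
# Print one SHA-256 digest per fixed-size block of a file.
block_hashes() {
    file=$1
    bs=${2:-4096}                       # block size in bytes (illustrative default)
    size=$(wc -c < "$file")
    blocks=$(( (size + bs - 1) / bs ))  # round up so a trailing partial block is hashed too
    i=0
    while [ "$i" -lt "$blocks" ]; do
        dd if="$file" bs="$bs" skip="$i" count=1 2>/dev/null | sha256sum | cut -d' ' -f1
        i=$((i + 1))
    done
}

# Count distinct block digests that occur more than once across the given files.
count_shared_digests() {
    for f in "$@"; do
        block_hashes "$f"
    done | sort | uniq -d | wc -l | tr -d ' '
}
```

A non-zero result from count_shared_digests on two image files only hints that there may be something to deduplicate. The real tool also has to deal with extents, alignment and verification by the kernel.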

Btrfs in Qubes OS: An Overview

(Primarily based on my experiences with R4.2)

Qubes OS’s default partition setup with Btrfs includes:

  • An EFI partition (/boot/efi, vfat)
  • A boot partition (/boot, ext4)
  • Encrypted swap (luks)
  • An encrypted Btrfs root partition (/, luks+btrfs)

Dom0 utilizes the entire Btrfs partition.

VM images (root.img, private.img) are stored as regular, albeit typically large, files under /var/lib/qubes and are in raw format.

Unlike with LVM, where dom0 root and VM images are separate volumes, here they are indistinguishable from other regular files in dom0. This distinction makes compression and deduplication in Qubes OS using Btrfs largely similar to their use in other OSes.

Storage driver behavior on btrfs

Template VMs store their root.img and private.img under /var/lib/qubes/vm-templates/<vm_name>/; AppVMs store their private.img under /var/lib/qubes/appvms/<vm_name>/.

Besides root.img, private.img and a few necessary config files, you will usually also see other versions of the corresponding images.

The files ending in Z, like private.img.23@2024-05-06T07:08:09Z, are revisions. They can be restored later with qvm-volume revert (the standard way) or with mv (the non-standard way).

The files ending in -precache.img are image files used by Qubes OS at runtime when a VM starts. They are safe to remove when no VM is running, and should be removed if you manually edited the base image (for example, if you changed private.img, remove private-precache.img).

In my case no version tracking is required, so I remove both types of files before compression and deduplication.
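Before deleting anything, it may help to list what would be removed. The sketch below does only that: it prints revision files and precache images one level below appvms and vm-templates, and deletes nothing. The function name list_stale_images is invented for this example, and the name patterns are inferred from the file names described above.

```shell
# Hypothetical helper: list revision files (…img.<n>@<timestamp>Z) and
# -precache.img files directly inside each VM directory. Nothing is deleted.
list_stale_images() {
    base=${1:-/var/lib/qubes}
    find "$base/appvms" "$base/vm-templates" -mindepth 2 -maxdepth 2 -type f \
        \( -name '*.img.*@*Z' -o -name '*-precache.img' \) 2>/dev/null | sort
}
```

Running sudo sh -c with this function, or pasting it into a root shell and calling list_stale_images, gives a list to review. Remember that revisions hold historical VM data, so only remove them if you are sure you do not need to revert.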

My Recipe

  1. Reinstall Qubes OS (R4.2). Make sure you have backed up any important VMs. It’s recommended to TRIM your disk before the actual installation, since some disks write more slowly to blocks that have already been written.
    After installation, don’t restore the backups immediately. Follow the steps below to optimize the filesystem first.
  2. Change mount flags to enable compression. Amend /etc/fstab to include compress=zstd or compress-force=zstd (the former is recommended; if unsatisfied, you can later use chattr -m file; chattr +c file to force compression on specific files), then reboot.
    Note:
  • Compression will only apply to files created or modified after this flag is set.
  3. Install btrfs utilities. Execute sudo qubes-dom0-update, followed by sudo qubes-dom0-update compsize duperemove, to install the tools for deduplication (duperemove) and for assessing the effectiveness of compression and deduplication (compsize).
  4. Compress existing files in dom0. Run:
sudo btrfs filesystem defragment -v -r -czstd /usr /etc /srv /var/lib/dnf /var/log /var/lib/qubes/vm-kernels

We deliberately exclude VM images for now due to their size.

  5. Recompress files incorrectly marked incompressible. Check the compression status with sudo lsattr <path>. The “m” flag on a file indicates it was deemed incompressible after the initial compression attempt.
  • Force compression if necessary using sudo chattr -m <path> && sudo chattr +c <path>.
  • Recompress them by rerunning btrfs filesystem defragment on the specific directory.
  • Files commonly marked incompressible include various library files (libLLVMGold.so) and the kernel module disk (/var/lib/qubes/vm-kernels/*/modules.img).
  6. Evaluate your compression efforts. Run sudo compsize -x /, which reveals the actual disk usage, including both compressed and uncompressed portions, and indicates any deduplication achieved.
  7. Shut down VMs and clean up stale image files. Prior to further operations, shut down all VMs and clean up the revision and precache image files in /var/lib/qubes/{appvms,vm-templates}/*/*{Z,-precache.img}. Usually the only large files left there should be “root.img” and “private.img”.
    This step is important because:
  • The alternative versions of VM images mostly share data with the mainline image, so they are effectively already deduplicated, but btrfs filesystem defragment undoes that sharing.
  • More files make duperemove take longer, and can produce a larger hash file that takes more space and puts more strain on the disk.
  8. Full compression of the filesystem. By now you are familiar with the compression steps. Compress the entire filesystem with:
sudo btrfs filesystem defragment -v -r -czstd /

If VM images remain uncompressed, force their compression as described in step 5, and repeat the process.
  9. Deduplication. Proceed with deduplication:

sudo duperemove -r -d -v --hashfile=/var/duperemove.hashfile --dedupe-options=same,partial --exclude=/var/duperemove.hashfile /

Note:

  • Avoid using the -A option as it does not work.
  • The same option is crucial, while partial is more time-consuming but can enhance deduplication.
  • The hashfile acts as a cache, useful for subsequent deduplications.
  10. Review deduplication results with sudo compsize -x / or sudo btrfs filesystem df /.
  11. After rebooting, install and update new templates, then deduplicate again (compression is automatic).
  12. Consider defragmenting your template VM roots with e4defrag. Further testing on this is pending: does a template need ext4 defragmentation? Does ext4 defragmentation make the image friendlier to btrfs deduplication?
  13. Finally, restore your backups and perform another round of deduplication.
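For the step that looks for files carrying the “m” flag, checking paths one by one with lsattr gets tedious. The sketch below is one possible way to filter lsattr output for that flag. The function name filter_nocompress is made up, and it assumes each line has the usual "<flags> <path>" shape.

```shell
# Hypothetical helper: read lsattr output lines ("<flags> <path>") on stdin
# and print the paths whose flag field contains "m".
filter_nocompress() {
    while read -r flags path; do
        case $flags in
            *[!A-Za-z-]*) continue ;;   # not a flag field, e.g. a "dir:" header from lsattr -R
            *m*) if [ -n "$path" ]; then printf '%s\n' "$path"; fi ;;
        esac
    done
}
```

A possible invocation is sudo find /usr -xdev -type f -exec lsattr -d {} + 2>/dev/null | filter_nocompress, which lists candidates for the chattr -m / chattr +c treatment. Paths containing newlines are not handled.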

Comment

This guide should be updated later with more experiments.

Reference

https://btrfs.readthedocs.io/en/latest/Introduction.html
https://wiki.archlinux.org/title/Btrfs


The files ending in Z like private.img.23@2024-05-06T07:08:09Z are revisions (that can be restored with qvm-volume revert), rather than temporary files.

It’s true that you can and probably should delete them before defragmenting the volume file. But they contain historical VM data that might still be useful sometimes, e.g. accidentally deleted files.

@rustybird @logoerthiner : why remove those files? If the idea is to deduplicate while keeping feature parity with the OS (revert, rotations), then the cost of running the dedup tool would simply be higher (longer runtime, CPU cost), with the end result of leaving things as the OS expects. I personally wouldn’t choose a shorter deduplication time. I would prefer a daemon (bees) over duperemove, so that deduplication is done in the background when system activity is low, and I would prefer bees’ hash tables over duperemove’s, so that the daemon is aware of changes and applies them live rather than offline, without losing the revert capabilities of QOS.

Unfortunately I have not yet had time to look further into building bees through Qubes Builder v2, but I think bees is definitely more interesting feature-wise than duperemove for this task, while duperemove is available to install today.

You are right. I will add this info in the guide.

  1. btrfs is a CoW filesystem. If I understand it correctly, in principle you could do revisions and lightweight backups manually by copying the private.img and root.img files elsewhere with cp. I do not know whether Qubes OS does anything more sophisticated with its revision files. Qubes OS storage drivers are written in Python anyway.
  2. As long as the VM is shut down, these files are no longer useful, other than for reverting.
  3. Duplicate files incur more time and space cost when deduplicating with duperemove.