[qubes-users] Btrfs (file-reflink): Why is the CoW on a volatile.img enabled?

449f09c92 · March 2, 2023, 7:01pm

I have /dev/xvdc configured as 10 GB of swap, and I had to edit the relevant code to disable CoW when volatile.img is created. Otherwise dom0 gets overloaded by checksum calculation whenever the VM swaps out.

Is there any reason why copy-on-write is enabled on volatile volumes that are mostly used as swap? Just curious.

rustybird · March 4, 2023, 1:05pm

449f09c92:

had to edit the relevant code to disable CoW when volatile.img is
created

file-reflink doesn't inherently do CoW for volatile volumes; it just
defaults to whatever the underlying location on the filesystem does.
On Btrfs, to get nocow, non-checksummed volatile volumes, you could
set that up like this:

# mkdir /var/lib/very-volatile
# chattr +C /var/lib/very-volatile
# qvm-pool add -o dir_path=/var/lib/very-volatile very-volatile file-reflink
# qubes-prefs default_pool_volatile very-volatile

Note that this will only apply to *new* VMs created after that. To
point *existing* VMs' volatile volumes to the new pool, you'd currently
have to shut down qubesd and manually edit /var/lib/qubes/qubes.xml
(because the property is not exposed through 'qvm-volume config').
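
To sanity-check the directory setup above, a small helper along these
lines might be useful. It is only a sketch: the function name has_nocow
is made up for illustration, and it assumes lsattr from e2fsprogs is
available in dom0.

```shell
#!/bin/sh
# Hypothetical helper (not part of Qubes): succeeds if the given path
# carries the nocow ('C') attribute, fails otherwise.
has_nocow() {
    # lsattr -d reports the attributes of the directory itself rather
    # than of its contents. The first output field is the flag string;
    # an uppercase 'C' in it means nocow. Errors (missing path,
    # filesystem without attribute support) count as "not nocow".
    lsattr -d "$1" 2>/dev/null | awk '{ print $1 }' | grep -q 'C'
}
```

For example, `has_nocow /var/lib/very-volatile && echo nocow` should
print "nocow" after the chattr step, and `qubes-prefs
default_pool_volatile` should print very-volatile. Files created in the
directory afterwards inherit the attribute.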

Is there any reason why copy-on-write is enabled on volatile volumes
that are mostly used as swap?

Disabling CoW and hence checksums (besides being specific to Btrfs -
file-reflink is filesystem agnostic) means losing protection against
on-disk bit rot. But storing data on the volatile volume doesn't mean
it is unimportant or even short-lived: it's not that unusual to have a
long-running VM with weeks of uptime. Corruption in its swapped memory
(or in diverged 'root' volume data, which is also stored on the
'volatile' volume) could be devastating.

Rusty

rustybird · March 4, 2023, 2:15pm

Rusty Bird:

Disabling CoW and hence checksums (besides being specific to Btrfs -
file-reflink is filesystem agnostic)

That said, for volatile volumes in particular it might be possible to
get away with attempting to set the nocow flag (optionally, configured
per volume) and ignoring any failures. I'm not sure even that is worth
implementing, though, when it's already possible to configure a
dedicated nocow pool for those volumes.
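
The "attempt and ignore failures" idea could look roughly like the
sketch below. This is not how the file-reflink driver works (the driver
is not a shell script). The function name create_volatile_image and its
arguments are hypothetical, and GNU truncate is assumed.

```shell
#!/bin/sh
# Hypothetical sketch: create an image file, try to mark it nocow, and
# carry on regardless of whether the filesystem supports that.
create_volatile_image() {
    path=$1
    size=$2
    # Start from an empty file: on Btrfs the 'C' attribute should be
    # set on a new or empty file to take full effect.
    : > "$path" || return 1
    # Best effort. Filesystems without a nocow notion reject this,
    # and that failure is deliberately ignored.
    chattr +C "$path" 2>/dev/null || true
    # Grow the (still sparse) file to the requested size.
    truncate -s "$size" "$path"
}
```

Called as `create_volatile_image /some/dir/volatile.img 10G`, this
produces a sparse file that is nocow on Btrfs and an ordinary file
elsewhere. The ordering matters: the flag is applied before any data is
written.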

The filesystem specificity I was thinking of is a bigger issue with
other (snap_on_start or save_on_stop) volume types. E.g. on Btrfs you
can only do a reflink ioctl if the source and destination files have
the same nocow status - a notion that is perfectly captured by making
the whole pool directory nocow or not, without any convoluted logic in
the file-reflink driver.

Rusty

449f09c92 · March 4, 2023, 6:04pm

Thank you for your clarification.
Also, many thanks for maintaining the file-reflink storage driver.