Is Wyng adequate for VM backups?

@tasket

Unfortunately, the current Wyng storage model is not very convenient for file-based cloud syncing. Wyng stores all chunks as individual files, resulting in millions of small files. For example, in wyng issue 179 alvinstarr reported having 224 million chunk files and 937,526 subdirectories in the backup directory.

If I back up my Qubes via Wyng with a chunk size of 64 KB (to improve deduplication efficiency), I normally end up with 5-6+ million files in the Wyng backup folder (including hardlinked ones). That practically means dedicating a separate partition to the backup folder, with a simpler filesystem (ext4 or XFS). When I kept the backup folder on a larger btrfs partition (on a single 7200 rpm HDD), the time to traverse the whole filesystem (i.e., search it) grew out of all proportion. When I was syncing the Wyng backup folder via Syncthing, its on-disk database swelled to 4.6 GB, and I observed its memory usage spike up to 14 GB of RAM. That's not practical for me, unfortunately - not all my NAS instances have that much RAM to spare. Syncing the Wyng backup as a partition isn't practical either: blocksync'ing a 150 GB partition to an image stored on Keybase takes 2+ days, and it isn't safe - an interrupted transfer may leave the filesystem on the remote image replica in an inconsistent state.

I suppose it would be desirable to get a more scalable storage model in Wyng - merging chunks into pack files of configurable size (e.g., 16-100 MB), the way Restic does. (Another, more theoretical option could be storing the chunks not in a filesystem but in a database of some appropriate sort - databases are better suited than filesystems to a use case of millions of small records.) It would also be good to have dynamically adjusting directory depth to ensure a reasonable number of files per directory.
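
To illustrate the pack-file idea, here's a rough, purely hypothetical sketch (the names and layout below are made up for illustration - they have nothing to do with Wyng's actual code or on-disk format) of appending chunk files into packs of a target size while recording each chunk's offset in an index:

```python
import os

PACK_TARGET_SIZE = 64 * 1024 * 1024  # hypothetical target pack size: 64 MB

def pack_chunks(chunk_dir, pack_dir):
    """Append plain chunk files into pack files of roughly PACK_TARGET_SIZE,
    returning an index of chunk name -> (pack file, offset, length)."""
    os.makedirs(pack_dir, exist_ok=True)
    index = {}
    pack_no = 0
    pack = open(os.path.join(pack_dir, "pack-0000"), "wb")
    for name in sorted(os.listdir(chunk_dir)):  # assumes chunk_dir holds only chunk files
        with open(os.path.join(chunk_dir, name), "rb") as f:
            data = f.read()
        # Start a new pack once the current one would exceed the target size.
        if pack.tell() > 0 and pack.tell() + len(data) > PACK_TARGET_SIZE:
            pack.close()
            pack_no += 1
            pack = open(os.path.join(pack_dir, f"pack-{pack_no:04d}"), "wb")
        index[name] = (pack.name, pack.tell(), len(data))
        pack.write(data)
    pack.close()
    return index
```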

For the time being the solution proposed by @solene remains more practical for me, though :man_shrugging:

I sense a raft of unsupported assumptions here. The main issue is that it's Wyng's job to directly manage the archive data, not the user's.

the time to traverse the whole filesystem (i.e., search it) grew out of all proportion.

Don’t include Wyng directories in searches.

When I was syncing the Wyng backup folder via Syncthing, its on-disk database swelled to 4.6 GB

Syncthing seems to have a bug. A simple rsync -aH --delete will do a far better job using a fraction of the memory (and who knows what other resources – probably network – it is mis-managing).

I suppose it would be desirable to get a more scalable storage model in Wyng - merging chunks into pack files of configurable size (e.g., 16-100 MB)

And yet, Syncthing can’t handle a lot of files. That would be the definition of ‘unscalable’. It also follows that if Wyng could efficiently put the files there in the first place, and rsync can handle them well, then there is something missing from the other programs’ approach.

Most filesystems already pack together the ends of multiple small files for efficiency. I saw fit to put the filesystem to work instead of duplicating their efforts for the sake of I really don’t know what.

BTW, I chose 128KB chunks as the default to reduce overhead while still retaining good deduplication rates. The difference in dedup between 128KB and 64KB seems to be about 3% on average.

2 Likes

To help put Wyng's files-in-subfolders method in perspective, it is not uncommon to see Linux filesystems containing millions of files… many of them quite small. To back up such volumes, one can turn to any number of tools that (interestingly) also result in millions of files in the archive, with btrfs-send and zfs-send being two of them, while others like rsync, rsnapshot and even Time Machine would also fall into this category.

When comparing their respective archives alongside Wyng's, you might be surprised to see that the Wyng archive has the fewest files in such cases, in fact fewer than the original filesystem. Of course, when backing up volumes containing large files, the Wyng archive will have many more, smaller files than the source… but that will also make backing up any incremental changes from the source much, much quicker.

1 Like

If we try to manage the “user” here, or the apps dealing with wyng archives, there are workarounds.

One could use a loop block device, mounted and exposed by the remote qube, which dom0 then uses as the destination of its qubes:// or qubes-ssh:// path to that archive. But the qube would still see millions of files when that loop device is mounted. It can be hidden, but that point can be neglected: the workaround only hides the files if the loopback image is then backed up by other means, that's all. In the same way, today's qubes backup archive could hide millions of files under a single archive if it were implemented differently: it just hides implementation details. If qubes backup permitted mounting its archived volumes in a qube, running find in the mounted dir could expose millions of files! The fact that wyng does what it does to do its work efficiently should not be made less efficient because of a find corner case.

The mitigation here could therefore be one raw loopback block device rather than millions of files, if the remote qube is to be used for things other than backup purposes. Of course this would add complexity and some I/O overhead for that qube, but that block device, used as the archive fs, could then be exposed to other tools, including rsync and maybe even syncthing, relying on their internals to deal efficiently with a single large file whose changes need syncing to a destination file. I know rsync handles this properly with its --whole-file, --partial and --inplace command line options, for example. I haven't dug into syncthing internals, but this could be a workaround if millions of files are not desired on the qube-exposed fs, the goal being a single non-mounted fs image file that can be synced externally. Again, agreed that find should exclude the mounted loopback path, for which there is no other workaround.
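
For illustration only, a minimal sketch of that rsync invocation wrapped in Python (the image and destination paths are made up):

```python
import subprocess

IMAGE = "/home/user/wyng-archive.img"       # made-up path to the loopback image
DEST = "backup-host:/srv/backups/wyng.img"  # made-up rsync destination

subprocess.run(
    [
        "rsync",
        "--whole-file",  # send the file without rsync's delta-transfer algorithm
        "--partial",     # keep a partially transferred file so the copy can resume
        "--inplace",     # update the destination file directly, no temporary copy
        "--progress",
        IMAGE,
        DEST,
    ],
    check=True,
)
```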

ext3/ext4 filesystems have a known implementation limitation (a DoS possibility) when it comes to inode exhaustion: the maximum number of inodes is fixed at fs creation time, so millions of small files can exhaust the inodes - and thus the maximum number of files that can be created - without exhausting space. That is a security issue. Again, it's an ext3/ext4 implementation limitation (feature vs bug), hidden from most "users", but a real issue. It's an issue one would never encounter by choosing a more suitable filesystem like BTRFS or XFS, where there is no such thing as inode exhaustion: inodes grow dynamically as needed and are managed transparently by the fs, without "users" having to be aware of a hidden maximum.

In practice - and this is a real thing, people - users (tooling, here) only check for available free space and report on that fact alone; the free inode count is never reported nor checked before it is exhausted. I'm pointing this out because we tend to assume "users" means only humans, whereas in this case the users are all consumers of the fs, and exhaustion jeopardizes their usage, becoming a real DoS vulnerability.
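
As an illustration of what such tooling could check, here is a small sketch using Python's os.statvfs to report free inodes alongside free space (the paths are just examples):

```python
import os

def report_fs_headroom(path):
    """Report free space AND free inodes for the filesystem holding `path`."""
    st = os.statvfs(path)
    free_gib = st.f_bavail * st.f_frsize / 2**30
    print(f"free space : {free_gib:.1f} GiB")
    # On ext3/ext4, f_files is fixed at mkfs time; btrfs reports 0 here because
    # its inodes are allocated dynamically.
    if st.f_files:
        used = st.f_files - st.f_ffree
        print(f"inodes used: {used} / {st.f_files} ({100 * used / st.f_files:.1f}%)")
    else:
        print("inode count not reported (dynamic inodes on this fs)")

# e.g. report_fs_headroom("/var/lib/wyng-backups")   # made-up backup path
report_fs_headroom("/")
```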


TLDR:
@tasket I think this "millions of tiny files" situation for "users" is a real issue, potentially leading to a qube that eventually won't boot. The mitigation is unfortunately documentation-based for the moment: a BTRFS QubesOS installation would resolve the "availability security issue" altogether (DoS by inode exhaustion being impossible there, vs ext3/ext4).

I could provide simple PoC code here to prove the point, but it is quite dangerous, resulting in a non-bootable qube, phone, etc. - hence DoS. A static maximum inode count on a fs is the enemy of small files and should be considered a real potential problem, without question.

2 Likes

Using key-value databases rather than POSIX filesystems for the "millions of small records" scenario is an established pattern in the industry, for example:

* Why use a database instead of just saving your data to disk?

* Which File System or solution is best to store and retrieve (efficiently) tons -millions, maybe billions- of small files?

The rest is just me reporting my measurements.

Syncthing is a useful and popular example of BYOC with attractive properties: for example, it is self-healing, connectivity works even if both endpoints are behind restrictive firewalls (they'll connect over relays), there's an Android client, etc.

Syncthing can scale, according to their stats - people use it to manage up to 78M files and up to 227 TB of data. However, for larger use cases one should be willing to allocate more resources to Syncthing - which in this case I'm not; my use case is backing up a home laptop, not crunching big data )

Well, why am I extolling the virtues of Syncthing? It's a representative example of current-gen cloud sync solutions - a continuous, fire-and-forget sync solution: point it at folders, and it'll take care of syncing them from then on. With more manual solutions like rsync, by contrast, one has to schedule runs, ensure reliable network connectivity between endpoints, etc.

P.S.: Right now there's no middle ground for replicating wyng backup archives - it's either a folder with millions of files or a partition image in the hundreds of GB. From a user POV it would be desirable to have some middle ground: the backup archive stored split into part files of, say, 100 MB - 1 GB (a common, pretty standard feature for backup solutions). One reason would be better protection against the "torn page" problem: if replication of the backups to a remote cloud gets interrupted, some part files will have been fully replicated while others won't; that way the user has a better chance that the remote backup replica is left in a well-defined state, and should be able to recover easily without having to learn BTRFS or XFS filesystem recovery methods.
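
To make the "well-defined state" point concrete, here is a tiny hypothetical sketch of the kind of integrity check that part files would enable - a manifest of SHA-256 hashes written next to the parts, so a replica can tell which parts arrived intact (the file names and layout are made up, not anything wyng actually does):

```python
import hashlib, json, os

def write_manifest(part_dir, manifest="MANIFEST.json"):
    """Record the SHA-256 of every part file so a replica can be verified."""
    digests = {}
    for name in sorted(os.listdir(part_dir)):
        if name == manifest:
            continue
        with open(os.path.join(part_dir, name), "rb") as f:
            digests[name] = hashlib.sha256(f.read()).hexdigest()
    with open(os.path.join(part_dir, manifest), "w") as f:
        json.dump(digests, f, indent=2)

def verify_replica(part_dir, manifest="MANIFEST.json"):
    """Return the list of part files that are missing or corrupt on the replica."""
    with open(os.path.join(part_dir, manifest)) as f:
        expected = json.load(f)
    bad = []
    for name, digest in expected.items():
        path = os.path.join(part_dir, name)
        try:
            with open(path, "rb") as f:
                ok = hashlib.sha256(f.read()).hexdigest() == digest
        except FileNotFoundError:
            ok = False
        if not ok:
            bad.append(name)
    return bad
```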

You mean keeping the storage volume on which the wyng backup is stored formatted as BTRFS, not ext4? (Whether it's directly attached to dom0 or used via qubes://.)

As far as I can see, qubes still get their /dev/xvdb as ext4 by default - whether QubesOS is installed with BTRFS or LVM thin. (Also, XFS may be a better fs than BTRFS for the millions-of-files use case - benchmarking needed.)

1 Like

It is scalable, as you say.

What would be the effect on Wyng backups of such lost or damaged files?

The major issues I have with Wyng are that a qube has to be shut down before backup, and there seems to be no sensible way of file identification and recovery. (Restoring multiple volumes to pick out a specific version of a file does not count as sensible.)
I have repeatedly asked about these without any response.

I never presume to speak for the Qubes team. When I comment in the Forum I speak for myself.

I appreciate your wanting to use Wyng with your favorite tools, and that you are commenting from actual experience (which is refreshing)… However this is the first time I recall someone complaining about this aspect of Wyng’s archives. What I have seen is people using not only rsync but also cp, tar and other backup utilities to replicate archives without issues (note that tar can do this job just fine without creating an actual tar file; just stream it to another tar via stdio).
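
For example, a minimal sketch of that tar-to-tar streaming, here wrapped in Python for clarity (the source and destination paths are made up):

```python
import subprocess

SRC = "/path/to/wyng/archive/dir"   # made-up source directory
DEST = "/mnt/replica"               # made-up destination directory

# "tar -C SRC -cf - ." writes a tar stream to stdout; the second tar unpacks it
# at DEST, so no intermediate .tar file ever touches the disk.
producer = subprocess.Popen(["tar", "-C", SRC, "-cf", "-", "."],
                            stdout=subprocess.PIPE)
subprocess.run(["tar", "-C", DEST, "-xf", "-"],
               stdin=producer.stdout, check=True)
producer.stdout.close()
if producer.wait() != 0:
    raise RuntimeError("source tar failed")
```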

As for using traditional databases to be more proper: who even does that for backing up non-DBMS? In terms of established backup tools, I’m not seeing it.

OTOH, if you want middle-ground-ness to help skirt Syncthing’s shortcomings (or more likely, bug), the best I can recommend is to increase the archive chunk size when creating an archive. The caveat is that having larger file sizes of any sort will have its own downsides.

3 Likes

The major issues I have with Wyng are that a qube has to be shut down before backup

No, it doesn't. Not any more than it would with qvm-backup. OTOH, if you want live data from a running system, then you risk having an inconsistent state or corruption in the archive version.

What would be the effect on Wyng backups of such lost or damaged files?

What is the effect on files due to lost blocks? They would not validate.

I have repeatedly asked about these without any response.

Oh, OK. I generally stop responding to people who lob only pure criticism at me or my projects. Everything, it seems, is completely unacceptable. And why keep demanding answers to questions that are already answered in thread… repeatedly for years? That smacks of sea-lioning.

One has to wonder why Qubes has virtually zero prospects for creating a thriving ecosystem.

I rarely if ever complain. In this case, how could I complain - you provide a useful service to the world for free. alvinstarr said in issue 179: "I am really impressed with the work you have done" - I second that, and I also appreciate your work. The fact that wyng is the only non-scanning option on the table is also not lost on me.

You’re the wyng dev - I brought up the corner case that bit me.

I was thinking more of NoSQL object stores (like Ceph, MongoDB, etc.), but in any case that's more of a theoretical brainstorm.

OTOH, Keybase comes to mind as a real-life example, although it's a network fs, not a backup solution - if configured to cache network folders locally, it stores the cached files inside LevelDB databases (and performance is dismal).

But I was really thinking more in terms of storing chunks packed together in some pack-file format. Since Borg is in Python and its license looks sufficiently open-source, perhaps you could incorporate their repo storage implementation? You might then want to run a helper service in the qube containing the wyng storage (to support fast seeking within pack files); wyng running in dom0 would communicate with it bidirectionally over a qrexec channel. That should improve latency and likely throughput over the current state (at least one qvm-run call per chunk), but it will also likely increase the dom0 attack surface.

Alternatively, one could mount the wyng archive folder from the wyng qube in dom0 over qubes.ConnectTCP, via something like Samba in the qube and rclone (a userspace program written in Golang, a memory-safe language) to mount it in dom0. To raise the barrier, rclone could run in dom0 under a non-privileged account forbidden from sudo'ing. The attack surface in this scenario would include the network stack in the dom0 kernel.

Ack, thank you - will benchmark wyng with large chunks vs qvm-backup | restic once I upgrade to Qubes 4.2.

1 Like

I'll add my 2 c.: I haven't yet used wyng thoroughly, but the mere fact that the initial archive creation (plus full backup) took less than a third of the time of regular qvm-backup, and that subsequent (incremental) backups are done within minutes, is of course a huge selling point. So from me also a big thank you to @tasket; I'll be testing it further and maybe someday replace qvm-backup with it for good (aside from dom0 home backups, I suppose).

5 Likes

Second this. Running 4.2. I now back up exclusively with wyng, to a remote server (self-managed, at my home office) running a proxmox container dedicated to backups. I've completely ditched qvm-backup since I started using wyng two months ago, backing up daily rather than weekly or biweekly the way I was with qvm-backup… which I wasn't diligent about simply because it took so long, and the qvm-restore process doesn't offer a useful time-to-completion counter. Wyng's speed is phenomenal. Restoring a qube is simple and seamless. I have no need to sift through the archive or its containing directory; it's mostly fire-and-forget. I admit I haven't tried to move the archive itself, or to use --remap when I want to back up to a local or external source, which is rare. And at first glance into the archive, the file and directory structure was bewildering.

As for other options discussed in the forum, maiming the script for qvm-backup seemed excessive (--no-compression is a flag worth trying, but it's still a slow process) and not worth having to remember to restore that script for past backups. Just my experience.