If you are willing to sacrifice the x16 PCIe slot, you can use x4/x4/x4/x4 bifurcation and run four PCIe 5.0 drives in RAID 0. In theory, sequential throughput should be up to four times that of a single drive. I personally think it would be a waste of money, and you will not be able to use a GPU as a display device.
With AMD mainboards, you have 24 CPU-connected PCIe lanes: 16 are used for the GPU and 8 are general purpose. Those last 8 lanes are used either for two 5.0 NVMe drives, or for one NVMe drive and one extra x4 PCIe slot.
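As a rough sanity check on the "four times" figure, here is a minimal sketch of the theoretical link ceiling. It assumes the published PCIe 5.0 signalling rate (32 GT/s per lane, 128b/130b encoding) and ignores protocol overhead, controller limits and RAID overhead, so real drives will land below these numbers. The function names are made up for illustration.

```python
# Theoretical PCIe 5.0 link ceilings; real drives deliver less than this.
PCIE5_GT_PER_S = 32.0            # raw transfer rate per lane, in GT/s
ENCODING_EFFICIENCY = 128 / 130  # 128b/130b line encoding


def link_bandwidth_gb_s(lanes: int) -> float:
    """Upper bound on one-direction bandwidth of a PCIe 5.0 link, in GB/s."""
    return PCIE5_GT_PER_S * ENCODING_EFFICIENCY / 8 * lanes


def raid0_ceiling_gb_s(drives: int, lanes_per_drive: int = 4) -> float:
    """Best-case sequential ceiling for a RAID 0 stripe (perfect scaling assumed)."""
    return drives * link_bandwidth_gb_s(lanes_per_drive)


if __name__ == "__main__":
    print(f"one x4 drive:      {link_bandwidth_gb_s(4):.2f} GB/s")
    print(f"four-drive RAID 0: {raid0_ceiling_gb_s(4):.2f} GB/s")
```

This only bounds sequential transfers. Striping does not reduce the latency of a single small random read.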
If you want to use two GPUs, you can only have a single 5.0 drive, limiting the total amount of disk space at 5.0 speed.
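The lane budget above can be checked with a few lines of arithmetic. This is only a sketch: the 24-lane total comes from the post, while the device names and the assumption that a second GPU would sit in the extra x4 slot are my reading of it, not something a specific board guarantees.

```python
# Lane-budget check for the configurations discussed above.
CPU_LANES = 24  # usable CPU-connected lanes, per the post


def remaining_lanes(devices: dict[str, int], total: int = CPU_LANES) -> int:
    """Return the lanes left over, or raise if the layout is oversubscribed."""
    used = sum(devices.values())
    if used > total:
        raise ValueError(f"layout needs {used} lanes, only {total} available")
    return total - used


# Hypothetical layouts; the names are labels only.
one_gpu_two_drives = {"gpu": 16, "nvme0": 4, "nvme1": 4}
two_gpus_one_drive = {"gpu0": 16, "gpu1_in_x4_slot": 4, "nvme0": 4}
bifurcated_no_gpu = {"nvme0": 4, "nvme1": 4, "nvme2": 4, "nvme3": 4,
                     "nvme4": 4, "nvme5": 4}

if __name__ == "__main__":
    for name, layout in [("one GPU, two drives", one_gpu_two_drives),
                         ("two GPUs, one drive", two_gpus_one_drive),
                         ("bifurcated x16, no GPU", bifurcated_no_gpu)]:
        print(f"{name}: {remaining_lanes(layout)} lanes free")
```

Each of these layouts uses all 24 lanes, which is why adding a second GPU costs a drive.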
That’s interesting. I observed that qvm-clone duration is roughly proportional to the size of the VM or template being cloned, and assumed on-disk size was the key cause of the wait. But it sounds like some or much of that extra time is instead because larger VMs/templates tend to have more exposed apps → more qvm-appmenu activity.
A waste because in the general case storage IO tends not to be a substantive performance bottleneck?
One thing you may be overlooking here: alternatives to PCIe 5.0 NVMe drives.
The (now very old) Intel Optane drives are known to have extraordinarily low latency, still unbeaten by any NAND-based NVMe storage as far as I’m aware. I believe this is true of both the PCIe 3.0 and PCIe 4.0 generations of the Optane technology (“3D XPoint”).
These drives (especially the last generation) also have significantly more write endurance than consumer and (many) enterprise/datacenter drives.
Many caveats come along with them.
They’re not cheap - around $1 USD/GB for 400GB models, and much more for larger sizes.
It’s difficult to find any beyond a certain size (400GB models are readily available on online marketplace sites).
You will probably still want your modern drives for sequential access/raw throughput, so you’ll need a scheme to make good use of the Optane drives only where suitable. The conventional approach is ZFS, with Optane drives as “special” vdevs within a pool for small blocks.
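To make the ZFS idea concrete, here is a hedged sketch that only builds and prints the two OpenZFS commands involved, without running them. The pool name `tank`, the dataset `tank/vms`, the device paths and the 32K threshold are all placeholders I picked for illustration. The special vdev is mirrored here because losing it means losing the pool.

```python
import shlex


def special_vdev_commands(pool: str, devices: list[str], dataset: str,
                          small_block_size: str = "32K") -> list[list[str]]:
    """Build (but do not run) the commands to add a mirrored special vdev."""
    if len(devices) < 2:
        # A single-device special vdev is a single point of failure for the pool.
        raise ValueError("use at least two devices so the special vdev is mirrored")
    add_vdev = ["zpool", "add", pool, "special", "mirror", *devices]
    # Metadata goes to the special vdev by default; this property also
    # routes data blocks up to the given size there.
    route_small = ["zfs", "set", f"special_small_blocks={small_block_size}", dataset]
    return [add_vdev, route_small]


if __name__ == "__main__":
    # Placeholder names: substitute your own pool, dataset and Optane devices.
    for cmd in special_vdev_commands("tank", ["/dev/nvme1n1", "/dev/nvme2n1"],
                                     "tank/vms"):
        print(shlex.join(cmd))
```

Review the printed commands before running anything, since adding a vdev to a pool is not easily undone.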
If you’re interested, look into the P4800X and P5800X. You will find many cults online discussing them, including in the context of Xen. I will warn you it can be a rabbit hole.
I’m not an expert in this area, but there is a lot of information out there. You will want to pay attention to the low queue depth random read/write. The difference is not marginal.
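Why latency dominates at low queue depth follows from Little's law: with one request in flight, IOPS is simply the inverse of per-request latency. A minimal sketch follows; the 10 µs and 80 µs figures are illustrative placeholders, not measurements of any particular drive.

```python
# Little's law: requests in flight = throughput * latency,
# so IOPS = queue_depth / latency.

def iops_at_queue_depth(latency_us: float, queue_depth: int = 1) -> float:
    """Steady-state IOPS given mean per-request latency in microseconds."""
    return queue_depth * 1_000_000 / latency_us


def throughput_mb_s(iops: float, block_bytes: int = 4096) -> float:
    """Convert IOPS at a given block size to MB/s (decimal megabytes)."""
    return iops * block_bytes / 1e6


if __name__ == "__main__":
    # Placeholder latencies chosen only to show the shape of the effect.
    for label, latency_us in [("low-latency drive", 10.0),
                              ("higher-latency drive", 80.0)]:
        iops = iops_at_queue_depth(latency_us)
        print(f"{label}: {iops:,.0f} IOPS at QD1, "
              f"{throughput_mb_s(iops):.1f} MB/s with 4 KiB blocks")
```

At QD1 neither case comes anywhere near the link bandwidth, which is why sequential headline numbers say little about this workload.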