SSD maximal performance : native sector size, partition alignment

Only this pull request:

@rustybird Sorry I was not more specific. I meant root and private volume creation: was that tested and working?

So if I understand correctly, I could apply your patch and have the volatile volume fixed. But for root and private volume creation, I would need to build an ISO, or patch the stage 1 and stage 2 installers so that the templates are fixed when they are decompressed. Only then would I have a working system on which to compare performance properly with and without the fixes.

I was looking for next steps to get the main devs’ attention on the actual performance losses and differences shown in this thread.

Otherwise, people are currently trying to move away from the LVM thin provisioning model at install time. Some want ZFS, XFS or Btrfs, since the speed differences are quite significant.

One example is from @Sven at https://forum.qubes-os.org/t/ext4-vs-btrfs-performance-on-qubes-os-installs, showing gains of ~300 MB/s in write speed from choosing Btrfs at install instead of the thin provisioning default.

Fixing LUKS + LVM thin provisioning would be great. As it stands, LVM gets blamed for the performance losses, while other storage layouts simply do not suffer from the flaws in how Qubes creates volatile, private and root volumes on LVM thin provisioning.

@Demi maybe? I think @rustybird showed where love is needed here: SSD maximal performance : native sector size, partition alignment - #30 by rustybird

Not sure that I understand your question, but standard (i.e. not in something like a standalone HVM) private volumes are already sector-size agnostic in their content, so compatibility-wise it doesn’t matter whether they are presented to the VM as 512B or 4KiB block devices.

Standard root volumes have sector-size specific content, and I don’t think it’s feasible to dynamically patch that volume content (specifically, the partition table) in dom0, because it contains untrusted and potentially malicious VM controlled data.

Backward compatibility is a real headache here. It seems like the existing root and private volumes should simply be presented to the VM as 512B devices by default for now. In the case of an LVM installation layout, that might even entail forcing 512B sectors for the whole LUKS device - unless there’s a good way to set an independent sector size for the LVM pool or ideally per LVM volume.
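
To illustrate why a partition table is sector-size specific: a GPT header sits at LBA 1, so its byte offset equals the logical sector size the table was written for. Below is a minimal Python sketch that guesses that size from a disk image. The function names are hypothetical and not part of any Qubes tool, and it only looks for the `EFI PART` signature without validating the header.

```python
import io

GPT_SIGNATURE = b"EFI PART"  # 8-byte signature at the start of a GPT header


def guess_gpt_sector_size(image, candidates=(512, 4096)):
    """Return the sector size whose LBA 1 holds a GPT signature, or None.

    `image` is any seekable binary file object. Hypothetical helper: it
    checks the signature only and does not verify the header CRC.
    """
    for sector_size in candidates:
        image.seek(sector_size)  # LBA 1 starts at byte 1 * sector_size
        if image.read(len(GPT_SIGNATURE)) == GPT_SIGNATURE:
            return sector_size
    return None


def make_fake_image(sector_size):
    """Build an in-memory image holding only a GPT signature at LBA 1."""
    data = bytearray(3 * 4096)
    data[sector_size:sector_size + len(GPT_SIGNATURE)] = GPT_SIGNATURE
    return io.BytesIO(bytes(data))
```

A table written for 512B sectors is simply not found where a 4KiB device expects it, which is why the same root volume content cannot be presented with either sector size.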

Cross-referencing an important post by @tasket (a filesystem-knowledgeable person with a lot of hands-on experience, and the developer behind wyng):

I’d like to share some findings from experimenting with a 4Kn drive.
I followed 51lieal’s instructions (using the stock R4.1.1 ISO). I ended up with two main storage pools: “varlibqubes”, which uses the file-reflink driver, and “vm”, which uses the lvm-thin driver.

When installing templates, I found that the stock 512B templates didn’t boot on “vm” but did boot on “varlibqubes”. My self-built 4Kn template booted on both.

However, since I upgraded dom0 to testing-latest, all my VMs refuse to boot, and the error message is “libxenlight failed to create domain xxx”, just like when I try to boot 512B templates on “vm”.

I can confirm this isn’t a kernel issue, because downgrading the kernel version doesn’t help. I suspect this is something Xen-related.

Edit:
Even a stock “LVM” installation (because I cannot proceed with the “LVM thin” option) on a 4Kn drive, without those modifications to LUKS, leads to the same result: VMs refuse to boot when the testing repos are enabled.

Was that tested with vmm-xen from the testing repos? (With or without the direct-io patch?)

I am confused because that patch was removed in the latest vmm-xen, which is why I’m asking.

Testing thread Xen + xmm-xen fixes to test (suspend/resume fixes, directio loopback devices, sys-usb fails to start, slowness vs bare metal) - #5 by Insurgo

Sorry, I just saw that this issue was referenced under VMs don't boot on 4Kn drive · Issue #7828 · QubesOS/qubes-issues · GitHub

Important discussions happening there.

Important advancements here: VM partition layout expects 512B sector size, VMs fail to start from 4096B drive · Issue #4974 · QubesOS/qubes-issues · GitHub

Props to @rustybird ! Thanks!!!

What is your SSD? 102 MB/s for cryptsetup-reencrypt seems a bit on the low end.

Eh, it’s a SATA Samsung 850 Pro on an old ThinkPad T420. :person_shrugging:

I’m not using XFS + lvm-thin anymore, so I have no idea. I’m currently using Btrfs and there’s no problem with it (I’m using the testing repo).

Yes. Marmarek just reverted “Use direct-io for loop device” in Xen 4.14.5-10, which should be in the testing repo now.

$ cat /sys/block/nvme0n1/queue/physical_block_size
512
and
$ cat /sys/block/nvme1n1/queue/physical_block_size
512
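
For anyone who wants to compare the logical and physical sizes across several drives, here is a small Python sketch that reads the same sysfs attributes. The helper name and the `sys_block` parameter are assumptions made for illustration (the parameter lets it be pointed at a test directory); the two attribute files are the standard ones under `/sys/block/<device>/queue/`.

```python
from pathlib import Path


def read_block_sizes(device, sys_block="/sys/block"):
    """Return (logical, physical) block sizes in bytes for e.g. 'nvme0n1'.

    Hypothetical helper: it just reads the two sysfs attributes that the
    `cat` commands above print by hand.
    """
    queue = Path(sys_block) / device / "queue"
    logical = int((queue / "logical_block_size").read_text().strip())
    physical = int((queue / "physical_block_size").read_text().strip())
    return logical, physical


if __name__ == "__main__":
    # List every block device the kernel exposes, with both sizes.
    for entry in sorted(Path("/sys/block").iterdir()):
        try:
            print(entry.name, read_block_sizes(entry.name))
        except (OSError, ValueError):
            pass  # device without a readable queue directory
```

The logical size is the one that partition tables and the LUKS sector size have to agree with; the physical size is only what the drive chooses to report.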

Qubes OS 4.1 on two NVMe SSDs. Disk model: WDC WDS500G2B0C-00PXH0 (x2)
Notebook model: NH5x_NH7xHP serial: N/A UEFI: INSYDE v: 1.07.01
CPU: Single Core 11th Gen Intel Core i7-11800H speed: 2304 MHz

Extremely slow boot, something like 6 minutes:
@dom0 ~]$ systemd-analyze blame
49.894s qubes-vm@sys-firewall.service >
43.064s qubes-vm@sys-usb.service >
32.900s dracut-initqueue.service >
31.298s systemd-cryptsetup@luks\x2d05b49888\x2df626\x2d47c1\x2d9dde\x2d26bfd00f>
28.157s qubes-vm@sys-net.service >
27.710s systemd-cryptsetup@luks\x2daeb29ae5\x2d9b10\x2d4de2\x2d8665\x2d579831c5>
15.741s qubesd.service >
15.663s upower.service >
15.569s systemd-logind.service >
8.355s systemd-udev-settle.service >
7.327s lvm2-monitor.service >
4.793s plymouth-quit-wait.service >
3.263s lvm2-pvscan@253:0.service >
2.218s systemd-journal-flush.service >
1.680s dracut-cmdline.service >
840ms qubes-qmemman.service

@dom0 ~]$ systemd-analyze critical-chain
The time when unit became active or started is printed after the “@” character.
The time the unit took to start is printed after the “+” character.

graphical.target @1min 18.259s
└─multi-user.target @1min 18.258s
  └─qubes-vm@sys-firewall.service @28.363s +49.894s
    └─qubes-meminfo-writer-dom0.service @28.354s
      └─qubes-core.service @27.996s +330ms
        └─qubesd.service @12.227s +15.741s
          └─lvm2-pvscan@253:0.service @8.957s +3.263s
            └─system-lvm2\x2dpvscan.slice @8.891s
              └─system.slice
                └─-.slice
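
To compare listings like the `systemd-analyze blame` output above before and after a change, the per-unit times can be parsed into seconds. This is a rough Python sketch, not an official tool: `parse_blame` is a made-up name, and it assumes the time spans are built only from `h`, `min`, `s` and `ms` tokens, as in the paste. The trailing `>` (which looks like pager truncation) is ignored.

```python
import re

_UNIT_SECONDS = {"ms": 0.001, "s": 1.0, "min": 60.0, "h": 3600.0}
_SPAN = re.compile(r"(\d+(?:\.\d+)?)(ms|min|h|s)$")


def parse_blame(text):
    """Parse `systemd-analyze blame` lines into (seconds, unit_name) tuples.

    Lines that do not start with a time span (prompts, blanks) are skipped.
    """
    entries = []
    for line in text.splitlines():
        tokens = line.split()
        seconds, consumed = 0.0, 0
        for token in tokens:
            match = _SPAN.match(token)
            if not match:
                break  # first non-time token is the unit name
            seconds += float(match.group(1)) * _UNIT_SECONDS[match.group(2)]
            consumed += 1
        if consumed and consumed < len(tokens):
            entries.append((seconds, tokens[consumed]))
    return entries


def total_for_prefix(entries, prefix):
    """Sum the startup time of all units whose name starts with `prefix`."""
    return sum(seconds for seconds, name in entries if name.startswith(prefix))
```

Applied to the listing above, `total_for_prefix(entries, "qubes-vm@")` adds up the three qube services to roughly 121 seconds. That is a sum of per-unit times, not wall-clock time, since units can start in parallel.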

I am just a user. When this issue is solved by the tech people, I’ll just reinstall Qubes and restore from backup.
Or I’ll stop using Qubes on this machine, because another issue is that I really need to install the rpmfusion repos in dom0, which are not available for the EOL Fedora 32.

Oh! That was online reencryption, never did that!

I lost track of the GitHub issues. The last comment says that openQA fails as expected: https://github.com/QubesOS/qubes-issues/issues/4974#issuecomment-1295866835

What are the next steps @rustybird ?

In connection with your comment at VMs don't boot on 4Kn drive · Issue #7828 · QubesOS/qubes-issues · GitHub

Edit: sorry. I think as I write, and I never seem to manage a complete post in one go, so I always edit several times. Sorry if you reply by email; hopefully the edits land within the 10-minute editing window, before the notification email you would reply to is sent.

It’s pretty neat.

I’m not sure. For the time being, people with 4Kn drives should probably simply install Qubes OS with the Btrfs installation layout.

@rustybird unfortunately, that is not a desirable path for anything that depends on thin LVM, including wyng :confused:

@Insurgo

It seems the reason you were thwarted by gdisk is that it only guards the first sector alignment for you. And if you always make the LUKS partition last, then it will happily make the end ‘ragged’, not sizing the partition to multiples of the alignment value.

Easy fix for this is:

  1. In gdisk eXpert menu, choose ‘L’ and enter ‘8’ if your drive has 512b sectors. This should also work if your drive has 4096b sectors. ‘8’ is 4096 / 512.

  2. Go back to the main menu with ‘M’ and choose ‘N’ for a new partition. The default start should be OK. For the end, specify a relative canonical value like ‘+64g’ or ‘+64800m’. Using ‘g’ or ‘m’ will give you intrinsic multiples of 4096.

After that, cryptsetup with --sector-size 4096 should not complain about alignment.
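
The arithmetic behind those steps can be sketched in a few lines of Python. This is only an illustration with hypothetical helper names; it assumes the first and last sector numbers are inclusive LBAs as gdisk prints them, and that the target alignment is 4096 bytes.

```python
def is_aligned(first_sector, last_sector, logical_sector_size=512,
               alignment_bytes=4096):
    """Check that a partition starts on, and is sized in, 4096-byte units.

    Sector numbers are inclusive LBAs in units of `logical_sector_size`.
    """
    if alignment_bytes % logical_sector_size:
        raise ValueError("alignment must be a multiple of the sector size")
    per_unit = alignment_bytes // logical_sector_size  # 8 for 512-byte sectors
    start_ok = first_sector % per_unit == 0
    size_ok = (last_sector - first_sector + 1) % per_unit == 0
    return start_ok and size_ok


def trim_last_sector(first_sector, last_sector, logical_sector_size=512,
                     alignment_bytes=4096):
    """Round a 'ragged' end down so the size is a multiple of the alignment."""
    per_unit = alignment_bytes // logical_sector_size
    length = last_sector - first_sector + 1
    return first_sector + (length // per_unit) * per_unit - 1
```

For example, a partition spanning sectors 2048 to 2850 on a 512-byte-sector drive is 803 sectors long, which is not a multiple of 8; `trim_last_sector(2048, 2850)` gives 2847, leaving 800 sectors. On a drive with 4096-byte sectors every sector boundary is already aligned, which matches the note that ‘8’ should also work there.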

@rustybird can you draw a clear picture of the current situation?

Hmmm. The discussion just got reignited on the Heads issue Choose stronger encryption by default and/or re-use encryption parameters of LUKS container · Issue #1539 · linuxboot/heads · GitHub

A user reported that the old cryptsetup version used in Heads (2.3.3) was using argon2i when reencrypting, which is suboptimal for security. This is true. Qubes 4.1 ships cryptsetup 2.3.5, and I think argon2i is used there too. Anyway, even when reencrypting a Q4.1 install with cryptsetup 2.6.1, the LUKS container still ends up using argon2i. I guess that’s good: we don’t want to enforce something that breaks compatibility with the OS’s initrd.

So I spent my day trying to version bump Heads to cryptsetup 2.6.1, and was successful.
I fixed the scripts to call cryptsetup reencrypt instead of cryptsetup-reencrypt, removed -B 64, which is no longer supported, and tried reencrypting both with and without --use-directio. The result: the reencryption speed problems (with or without the misalignment errors stemming from the original OS installation’s assumptions, block size, etc.) are apparently still present on Qubes 4.2.

I have posted screenshots and my thought process under Choose stronger encryption by default and/or re-use encryption parameters of LUKS container · Issue #1539 · linuxboot/heads · GitHub

Any input welcome

Related Qubes issue (encryption defaults for installer and upgrader).

According to the cryptsetup reencrypt man page, the --use-directio option is only for LUKS1. In this thread there are cases where it was used and speed was higher, as well as cases where it wasn’t used and speed was higher (with different parameter combinations). So I think there may just be a lot of variation for all sorts of reasons, and --use-directio probably has nothing to do with it (at least if the man page is to be trusted).

Regarding sector size, I’ve looked around a bit and the results are not consistent either: some report better performance with 512, others with 4k, and yet others basically the same performance. The one thing that definitely helps is avoiding a misaligned partition on 4k drives, but my impression is that 512 is very much recommended for Qubes anyway, as 4k is not well supported at the moment.

Maybe an issue should be opened on the cryptsetup gitlab or a question sent to their mailing list?

@Bearillo thank you for your answer.

--use-directio
I removed it in the last commit since I came to the same conclusion. Unless --debug imposes an immense cost on reencryption speed, the latest commit on the branch GitHub - tlaurion/heads at cryptsetup_version_bump-reencryption_cleanup uses:

cryptsetup reencrypt --force-offline-reencrypt --disable-locks "$LUKS" --debug --key-slot 0 --key-file /tmp/luks_current_Disk_Recovery_Key_passphrase

Which is even worse at 30.1 MiB/s (in debug) while still in “online mode”. Will open an issue soon I guess.
