Your cryptsetup-reencrypt tests, run on the same hardware and the same disk but with a different partition table, partition alignment and partition block size, went from 526.8 MiB/s to 718.2 MiB/s with direct-io. That is a big difference for the LUKS container alignment test alone, and it is an important report. As said previously, the reencryption test shouldn't care about the actual content of the container. In my experience, direct-io speeds stayed pretty steady all along the reencryption, which leads to the hypothesis that all blocks are read sequentially, reencrypted and written back to disk, without speeding up on unused blocks or slowing down on used ones. The data seems to be simply translated and rewritten as it goes.
Buffered IO improving massively (from the 50 MiB/s initially reported to 646.6 MiB/s) is also an interesting data point: it shows that better alignment leads to better results, while still showing that something is off. Buffered IO should be faster than direct-io, so something is still not right (what the hardware reports vs. what is real).
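To put those numbers side by side, here is a minimal Python sketch. The throughput figures are the ones quoted above; the helper names `relative_change` and `is_aligned`, and the example offsets and sector sizes, are only my assumptions for illustration and are not taken from the tests.

```python
def relative_change(before: float, after: float) -> float:
    """Percent change going from `before` to `after` (same unit for both)."""
    return (after - before) / before * 100.0


def is_aligned(start_sector: int, sector_size: int, boundary: int) -> bool:
    """True if the partition's byte offset is a multiple of `boundary` bytes."""
    return (start_sector * sector_size) % boundary == 0


# Figures quoted in this thread (MiB/s).
direct_before, direct_after = 526.8, 718.2
buffered_before, buffered_after = 50.0, 646.6

print(f"direct-io gain:     {relative_change(direct_before, direct_after):.1f} %")
print(f"buffered speed-up:  {buffered_after / buffered_before:.1f}x")
print(f"buffered vs direct: {relative_change(direct_after, buffered_after):.1f} %")

# Hypothetical layouts, NOT from the thread: a partition starting at
# sector 2048 with 512-byte sectors sits on a 1 MiB boundary, while the
# old sector-63 start does not even sit on a 4 KiB boundary.
print(is_aligned(2048, 512, 1024 * 1024))
print(is_aligned(63, 512, 4096))
```

The last comparison is the odd one: buffered IO still comes out slower than direct-io even after the alignment fix.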
@51lieal Can you post the recipe of commands that worked for you in the thin LVM scenario?
(A list of the commands that worked for you, just like you did in 4.1 installer LVM partitioning - hard to customize, missing space - #5 by 51lieal, would permit exact reproducibility of the results, internal validity and possibly external validity.) If we come up with proper adjustments, we could open an issue upstream and challenge others.