Why does seamless GPU acceleration only work when using template based HVMs?

I’ve tested this some more and found the cause of the problem you encountered: it was Xorg. Wayland doesn’t have this performance drop.
Here is an example with sway, but it should also work with KDE/GNOME/etc. on Wayland if you prefer one of those:
Install sway in the template (e.g. sudo apt install sway xwayland on Debian).
In the gaming app qube based on this template, create a custom sway config:

mkdir -p ~/.config/sway
cp /etc/sway/config ~/.config/sway/
nano ~/.config/sway/config
echo "seat * pointer_constraint disable" >> ~/.config/sway/config
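
If you end up repeating the config steps above (e.g. after recreating the qube), the echo line will append a duplicate each time. A rough sketch of an idempotent version is below. The function name ensure_sway_config and the SWAY_DEFAULT variable are made up for illustration, and it assumes the default config lives at /etc/sway/config as in the commands above:

```shell
#!/bin/sh
# Sketch only: ensure_sway_config and SWAY_DEFAULT are illustrative names,
# not part of sway or Qubes.
SWAY_DEFAULT="${SWAY_DEFAULT:-/etc/sway/config}"
SWAY_LINE='seat * pointer_constraint disable'

ensure_sway_config() {
    cfg_dir="$1"
    cfg_file="$cfg_dir/config"

    mkdir -p "$cfg_dir"

    # Copy the system default only if there is no user config yet.
    if [ ! -f "$cfg_file" ] && [ -f "$SWAY_DEFAULT" ]; then
        cp "$SWAY_DEFAULT" "$cfg_file"
    fi
    touch "$cfg_file"

    # Append the setting only if that exact line is not already present.
    # -F treats the "*" literally, -x matches the whole line.
    grep -qxF -- "$SWAY_LINE" "$cfg_file" || printf '%s\n' "$SWAY_LINE" >> "$cfg_file"
}

ensure_sway_config "${HOME:-/tmp}/.config/sway"
```

Running it twice should leave a single pointer_constraint line in the config.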

Attach the mouse to the gaming qube, either by running this command in dom0 or by attaching the USB mouse directly to the qube:

qvm-run -u root --pass-io --localcmd="qvm-run -u root --pass-io sys-usb \"input-proxy-sender /dev/input/by-id/usb-MY_MOUSE_ID-event-mouse\"" MyGamingQube "input-proxy-receiver --mouse"
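
The nested quoting in that one-liner is easy to get wrong when you swap in your own names, so here is a sketch that builds the same command from variables. The variable names and the build_sender_cmd helper are just for illustration; sys-usb, MyGamingQube and the MY_MOUSE_ID path are the placeholders from the command above and need to be replaced with your own values. It only calls qvm-run if it is available (i.e. in dom0), otherwise it prints the inner command for review:

```shell
#!/bin/sh
# Sketch only: placeholder values, adjust to your setup.
USB_QUBE="sys-usb"
GAMING_QUBE="MyGamingQube"
MOUSE_DEV="/dev/input/by-id/usb-MY_MOUSE_ID-event-mouse"

# Build the inner command that runs locally in dom0 and starts
# input-proxy-sender inside the USB qube.
build_sender_cmd() {
    printf 'qvm-run -u root --pass-io %s "input-proxy-sender %s"' "$1" "$2"
}

SENDER_CMD=$(build_sender_cmd "$USB_QUBE" "$MOUSE_DEV")

if command -v qvm-run >/dev/null 2>&1; then
    # Same invocation as the one-liner, with the sender passed via --localcmd.
    qvm-run -u root --pass-io --localcmd="$SENDER_CMD" \
        "$GAMING_QUBE" "input-proxy-receiver --mouse"
else
    printf 'qvm-run not found (not dom0?). Inner command would be:\n%s\n' "$SENDER_CMD"
fi
```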

Start sway in the gaming qube’s terminal:

WLR_NO_HARDWARE_CURSORS=1 sway --unsupported-gpu

Run the game inside sway.