Dear Qubers,
I would like to ask whether anyone has had any success with Ollama using
GPU passthrough?
I am using an Arch Linux template that works well for dedicated video
out: I can play media and games (with some stutter), which is already a
great convenience on Qubes.
However, the main reason I built a passthrough setup was for Ollama /
llama.cpp.
I've tried PCI strict reset, disabling dynamic memory balancing, and
both Dasharo and the standard BIOS. Everything works fine if I switch
from Qubes OS to plain Arch.
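Roughly what I mean by those settings, in dom0 (the VM name and PCI
address are just examples; I experimented with the reset/permissive
options both on and off):

```
# Attach the GPU to the qube, with strict reset disabled and permissive mode on
qvm-pci attach --persistent arch-gpu dom0:0a_00.0 -o no-strict-reset=True -o permissive=True

# Disable dynamic memory balancing and pin a fixed amount of RAM
qvm-prefs arch-gpu maxmem 0
qvm-prefs arch-gpu memory 8000

# GPU passthrough requires HVM mode
qvm-prefs arch-gpu virt_mode hvm
```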
I'm using ollama-rocm.
I used the gpu-passthrough guide at:
The GPU is detected, but model loading never progresses beyond:
llm_load_tensors: CPU buffer size = 35.44 MiB
It just hangs there indefinitely.
Full dump below.
If anyone has gotten any further, or has any thoughts, please let me know!