AI with Qubes

Is anybody using LLMs (Ollama, ComfyUI, etc.) locally in Qubes?
If yes, what methods did you use, and what system requirements are needed for those VMs?

I like using LM Studio; you can use Ollama, etc., of course.
Not sure what you mean by method; it is no different from running it on any other Linux system.
Just note you won't get GPU utilization unless you attempt passthrough, so it will only use CPU and RAM.
I think LM Studio will be best for you, as it tells you which models will work based on your system resources.

Thanks, I will look into that.
I tried Jan, but it was failing to download models, so I wondered whether someone here was using one of these.
I will reply after trying.

In /rw/config/qubes-bind-dirs.d/40_ollama.conf of the AppVM I have:

binds+=( '/usr/local/bin' )
binds+=( '/usr/share/ollama' )
binds+=( '/usr/local/lib' )

At this point you have to restart the AppVM before installing Ollama, so that it installs into the now-persistent directories.
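
For completeness, a minimal sketch of the install step after that restart, assuming you use Ollama's standard Linux install script (which places files under /usr/local and /usr/share/ollama, exactly the directories made persistent above):

# Run inside the AppVM after the bind-dirs restart; the installed
# files now land in the persistent bind-mounted directories.
curl -fsSL https://ollama.com/install.sh | sh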

In /rw/config/rc.local I have:

#!/bin/sh
# Recreate the unit symlink on every boot (changes under /etc do not persist in an AppVM)
ln -s /rw/ollama.service /etc/systemd/system/
systemctl start ollama.service

The ollama.service file is copied into /rw/ after installation and modified to your taste.
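
For example, something like this (the suggested tweaks are only illustrations; Ollama's installer creates the unit at /etc/systemd/system/ollama.service):

# Run once inside the AppVM, right after installation:
sudo cp /etc/systemd/system/ollama.service /rw/ollama.service
# Then edit /rw/ollama.service to taste, e.g. the Environment lines
# such as OLLAMA_HOST or OLLAMA_MODELS (standard Ollama variables).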

Memory balancing does not seem to work well while Ollama is loading the model, so I set a fixed amount of RAM instead. I don't have an extra GPU, so it is CPU only. Models like phi4-mini or gemma3:4b work decently on my CPU.
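
If anyone wants to do the same, a sketch of the dom0 side, assuming a hypothetical qube named ollama-vm (in Qubes, setting maxmem to 0 disables memory balancing for that qube):

# In dom0: give the qube a fixed 16 GiB and disable balancing
qvm-prefs ollama-vm memory 16384
qvm-prefs ollama-vm maxmem 0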

I use Ollama on a secondary GPU, using a mount -o bind to move /usr/share/ollama/.ollama/models to another location (symlinking was not working, for unknown reasons).
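
In case it helps, a minimal sketch of that bind mount in /rw/config/rc.local, assuming the models live under a hypothetical /rw/ollama-models directory:

# Bind-mount the persistent model store over Ollama's default path
mkdir -p /rw/ollama-models
mount -o bind /rw/ollama-models /usr/share/ollama/.ollama/models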