Need Help with QubesOS GPU Passthrough for AI-Powered Data Management System (PIVOT & TOOL)

otter2 · October 19, 2024, 11:51am

There is no TL;DR, and my attention span is far too short to keep reading until I get the full concept.

As far as I can see the idea is weird. Users querying your server? How is that better than just asking OpenAI?

Sorry but I don’t think that I can help that much with any of your technical questions because I’m not that knowledgeable, but here’s my take:

Check the IOMMU groups: if the dGPU you pass through is alone in its group, that's good.
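If it helps, here is a rough sketch of how you could list the groups. It assumes a plain Linux system that exposes `/sys/kernel/iommu_groups` (for example a live USB booted on the same machine); as far as I know dom0 under Xen does not show that directory, so treat the path as an assumption. The function names are made up for this example.

```python
from pathlib import Path


def list_iommu_groups(root="/sys/kernel/iommu_groups"):
    """Map each IOMMU group number to the sorted PCI addresses it contains."""
    groups = {}
    base = Path(root)
    if not base.is_dir():
        # Directory missing: IOMMU is off, or the kernel does not expose groups.
        return groups
    for group_dir in base.iterdir():
        devices_dir = group_dir / "devices"
        if group_dir.name.isdigit() and devices_dir.is_dir():
            groups[int(group_dir.name)] = sorted(p.name for p in devices_dir.iterdir())
    return groups


def group_peers(groups, pci_address):
    """Return the other devices in the same group, or None if the address is unknown."""
    for devices in groups.values():
        if pci_address in devices:
            return [d for d in devices if d != pci_address]
    return None


if __name__ == "__main__":
    found = list_iommu_groups()
    for number in sorted(found):
        print(number, " ".join(found[number]))
```

An empty list from `group_peers` means the device is alone in its group. A GPU often shows up with its own audio function (same address, different function number) in the same group, which is usually passed through together with it.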

Idk where they are coming from and where they are going. Probably whatever the best generic network practice is, or something like the clipboard mechanism.

Why not just run these in disposables and flush them for new queries?

I don’t think this is Qubes-related, more like generic network security. Maybe disposable VMs that handle the data transfer?

Don’t give them network access, I guess.

For now, it is one dGPU (or more, if you need them) per VM. In development. There are alternatives, like this:

And also hybrid graphics and OPTIMUS if hardware allows:

Apart from that, CPUs and RAM are very flexible. The main restriction is that the initial memory of a qube (with memory balancing enabled) cannot be less than 10% of that qube's maximum memory.
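To make the 10% rule concrete, here is a tiny sketch of the arithmetic. The function names are made up for this example, and the values are assumed to be in MiB; it only mirrors the rule as I stated it, not the actual Qubes code.

```python
def min_initial_memory_mb(maxmem_mb):
    """Smallest initial memory allowed under the 10% rule, rounded up."""
    return -(-maxmem_mb // 10)  # ceiling division without floats


def memory_settings_ok(initial_mb, maxmem_mb):
    """True if the initial memory is at least 10% of the max memory."""
    return initial_mb * 10 >= maxmem_mb


if __name__ == "__main__":
    # Example: a qube with 4000 MiB max memory needs at least 400 MiB initially.
    print(min_initial_memory_mb(4000))
    print(memory_settings_ok(300, 4000))
```

So with a 4000 MiB maximum, an initial value of 300 MiB would be rejected and 400 MiB would be the lowest accepted value.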

Sorry if what I’m stating here is obvious to you. This is all surface-level stuff, and I’m pretty sure you can find lesser-known solutions and projects related to your questions in the community.