A Qubes-isolated sandbox for AI agents: a FastMCP server in a dedicated mcp-control qube that lets autonomous AI workflows provision qubes,
build templates, run pentests, and move files between them, all inside a
tag-scoped subset of the system. Untagged qubes remain structurally
invisible to the agent.
Stages A and B (below) are tested. Stages C–H are designed in CLAUDE.md.
Trust boundary = the qrexec tag ai-managed. Tag mutation is
hard-denied to AI at the policy layer; only the operator (in dom0) and
the create-time wrapper qmcp.SpawnAIManagedQube apply tags.
Dom0-mediated wrappers (qmcp.*). State-changing calls route through
small Python scripts in /etc/qubes-rpc/ that enforce invariants in
dom0 before touching qubesd: forced tagging on creation, cross-reference
validation on template/netvm/default_dispvm, opaque error responses.
Wrapped reads hide existence. qmcp.GetPropertyAIManaged returns "not found" indistinguishably whether the qube doesn’t exist or simply
isn’t tagged. The MCP-side helper normalises all qrexec failures (policy
deny, no-such-VM, transport error) to the same opaque error so the
lifecycle path doesn’t leak either.
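To make the normalisation concrete, here is a minimal sketch of what an MCP-side helper along these lines could look like. It is not the actual qubes-mcp code: the names call_wrapped, OpaqueQrexecError and OPAQUE_ERROR are hypothetical, and the way the service and target are passed to qrexec-client-vm is an assumption about how the wrappers are invoked.

```python
import subprocess

OPAQUE_ERROR = "not found"  # the only failure message a caller ever sees


class OpaqueQrexecError(Exception):
    """One error type for every failed call; carries no cause details."""

    def __init__(self):
        super().__init__(OPAQUE_ERROR)


def _default_runner(argv):
    # Assumption: the helper shells out to qrexec-client-vm from mcp-control.
    return subprocess.run(argv, capture_output=True, timeout=30, check=False)


def call_wrapped(service, target, runner=_default_runner):
    """Call a qrexec service; collapse every failure into one opaque error."""
    argv = ["qrexec-client-vm", target, service]
    try:
        result = runner(argv)
    except Exception:
        # Transport failure, timeout, missing binary: all look the same.
        # "from None" drops the chained exception so it cannot leak upward.
        raise OpaqueQrexecError() from None
    if result.returncode != 0:
        # Policy deny and no-such-VM are not distinguished; stderr is dropped.
        raise OpaqueQrexecError()
    return result.stdout
```

The runner parameter exists only so the helper can be exercised without a Qubes system. This sketch does nothing about timing differences between failure modes, which is part of what the first review question asks about.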
Status
Stage A (tested) — tag-scoped lifecycle, spawn (atomic tag-on-create
with rollback + post-condition check), wrapped property read/write,
existence hiding.
Stage B (tested) — root command execution and inter-qube file
transfer inside ai-managed qubes via custom qrexec services installed
in ai-managed templates.
Stages C–H (designed) — network sandbox + tag-validated netvm cascade,
template cloning + DispVMs, device attach between ai-managed qubes, feature.Set wrapper + filtered event stream, mcp-control hardening +
Tor hidden service for sshd, FastMCP HTTP/SSE transport bound to a
second .onion for mobile-app reach.
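The Stage A spawn path (tag on create, roll back on failure, verify afterwards) can be sketched as follows. This is an illustration under stated assumptions, not the real qmcp.SpawnAIManagedQube: FakeQubesd is a hypothetical in-memory stand-in for qubesd, and spawn_ai_managed and SpawnFailed are made-up names.

```python
from dataclasses import dataclass, field

TAG = "ai-managed"


class SpawnFailed(Exception):
    """Opaque failure; deliberately says nothing about the cause."""


@dataclass
class FakeQubesd:
    """Hypothetical in-memory stand-in for qubesd, for illustration only."""
    tags: dict = field(default_factory=dict)  # qube name -> set of tags
    fail_tagging: bool = False                # simulate a tagging failure

    def create(self, name):
        if name in self.tags:
            raise ValueError("already exists")
        self.tags[name] = set()

    def add_tag(self, name, tag):
        if self.fail_tagging:
            raise RuntimeError("tagging failed")
        self.tags[name].add(tag)

    def remove(self, name):
        self.tags.pop(name, None)


def spawn_ai_managed(backend, name):
    """Create a qube and tag it, or leave no trace of the attempt."""
    try:
        backend.create(name)
    except Exception:
        # Creation failed: there is nothing of ours to roll back, and a
        # pre-existing qube with this name must not be removed.
        raise SpawnFailed("spawn failed") from None
    try:
        backend.add_tag(name, TAG)
        # Post-condition check: the tag must be present before returning.
        if TAG not in backend.tags.get(name, set()):
            raise RuntimeError("post-condition failed")
    except Exception:
        backend.remove(name)  # rollback: never leave an untagged qube behind
        raise SpawnFailed("spawn failed") from None
    return name
```

Keeping creation outside the rollback block matters: if the name already exists, the rollback would otherwise delete a qube the wrapper never created.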
Three review questions
For anyone familiar with the Admin API and qrexec policy R4.2+:
Wrapped-reads existence-hiding. Is returning a uniform "not found"
from a dom0 wrapper a robust primitive against existence oracles, or
are there qrexec-layer leaks (timing, error chains, side effects)
I’m missing?
qubes.Filecopy between @tag:ai-managed qubes. Stage B adds a
policy line bypassing the default ask dialog for transfers between
ai-managed qubes. Are there assumptions in qubes.Filecopy’s
implementation that depend on the dialog being present?
target=@adminvm documentation gap. Without that clause on
tag-scoped admin allow rules, qrexec attempts to start the target VM during
read-only operations. Subtle, easy to miss, not surfaced in current
docs. Worth a docs PR? Happy to write it.
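Because the missing clause is easy to overlook, a small lint over policy text is one way to catch it. The sketch below assumes the five-column rule layout of the R4.1+ policy format (service, argument, source, destination, action, then parameters) and flags tag-scoped admin.* allow rules that lack target=@adminvm. The function name and the sample rules in the usage note are hypothetical, not taken from the project's policy files.

```python
def missing_adminvm_target(policy_text):
    """Return tag-scoped admin.* allow rules that lack target=@adminvm."""
    flagged = []
    for raw in policy_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        fields = line.split()
        if len(fields) < 5:
            continue  # not a full rule (for example an include directive)
        service, _argument, _source, destination, action, *params = fields
        if not service.startswith("admin."):
            continue  # only Admin API calls are of interest here
        if not destination.startswith("@tag:"):
            continue  # only tag-scoped destinations
        if action != "allow":
            continue  # deny and ask rules are out of scope
        if "target=@adminvm" not in params:
            flagged.append(line)
    return flagged
```

Run against a policy containing `admin.vm.List * mcp-control @tag:ai-managed allow` and the same rule with `target=@adminvm` appended, only the first is reported.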
Disclosure
AI-assisted implementation, human-designed boundaries. Not classical-engineer
credentialed but seriously trying to get the threat model right. Review
will find things I missed — that’s why this post is here.
Update — stages C through E2 landed since the original post.
The post above described stages A and B as tested and C–H as designed. That’s now out of date. Status as of 2026-05-25:
| Stage | What it adds | Status |
|---|---|---|
| C | Single-egress network sandbox via ai-net-router: all ai-managed qubes route through one egress qube whose upstream (sys-firewall, sys-whonix, a VPN qube, or "" for offline) is operator-chosen from dom0 and not reachable from AI. | Tested |
| D | Cloning, DispVM klass support, dom0-side lifecycle wrapper covering start/shutdown/kill/pause/unpause/remove uniformly across all qube classes. | Tested |
| E1 | Device attach/detach between ai-managed qubes only: block, USB, and microphone. PCI passthrough stays denied (different trust model, possible future stage). | Tested |
| E2 | Ephemeral DispVMs with a qubes_run_disposable one-shot: the agent gets a fresh disposable, runs the task, and the qube is destroyed. | Tested |
F–H still designed only.
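The Stage C invariant can be stated as a check: every ai-managed qube other than the router has its netvm set to ai-net-router. A rough sketch over a plain data model follows. The Qube dataclass and egress_violations are hypothetical names, and treating a qube with no netvm as compliant is an assumption, since an offline qube has no egress to bound.

```python
from dataclasses import dataclass
from typing import Optional

TAG = "ai-managed"
ROUTER = "ai-net-router"


@dataclass(frozen=True)
class Qube:
    """Hypothetical snapshot of the properties this check needs."""
    name: str
    tags: frozenset
    netvm: Optional[str]  # None means no netvm (offline)


def egress_violations(qubes):
    """Names of ai-managed qubes whose netvm bypasses the router."""
    bad = []
    for q in qubes:
        if TAG not in q.tags:
            continue  # untagged qubes are outside the cohort
        if q.name == ROUTER:
            continue  # the router's upstream is operator-chosen in dom0
        if q.netvm not in (None, ROUTER):
            bad.append(q.name)
    return bad
```

A check like this would sit on the operator side in dom0 as an audit. It says nothing about firewall rules inside the router.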
Dom0 surface is nine RPC services now: qmcp.LifecycleAIManaged, qmcp.SpawnAIManagedQube, qmcp.CloneAIManagedQube, qmcp.AttachDeviceAIManaged, qmcp.DetachDeviceAIManaged, qmcp.GetPropertyAIManaged, qmcp.SetPropertyAIManaged, qmcp.ListAIManagedQubes, qmcp.SpawnDisposableAIManaged. All routed through invariant-checking scripts in /etc/qubes-rpc/.
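As an illustration of the kind of invariant such a script might enforce, here is a minimal sketch of cross-reference validation for template, netvm and default_dispvm. It is an assumption about the shape of the check, not the project's code: validate_references and the tags_by_qube mapping are hypothetical, and treating an empty value as "no reference" is also assumed.

```python
TAG = "ai-managed"
REFERENCE_PROPERTIES = ("template", "netvm", "default_dispvm")


def validate_references(requested, tags_by_qube):
    """True only if every referenced qube carries the ai-managed tag.

    requested: mapping of property name -> referenced qube name.
    tags_by_qube: mapping of qube name -> set of tags (dom0's view).
    """
    for prop in REFERENCE_PROPERTIES:
        target = requested.get(prop)
        if not target:
            continue  # assumption: None or "" means no reference is set
        # A missing qube and an untagged qube fail the same way, so the
        # caller cannot use this check to probe for existence.
        if TAG not in tags_by_qube.get(target, set()):
            return False
    return True
```

Returning a bare boolean keeps the decision separate from the response, so the wrapper can map every rejection onto the same opaque error.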
Wrote up the broader threat model and what this means for auditing MCP-using products in "MCP trust boundaries belong below the protocol" (Alex Schose). The forum post above focused on three specific design questions; the writeup zooms out to the line-jumping threat class and what bounding it at the hypervisor layer actually forecloses.
Two new things I’d value pushback on after the C–E2 work:
Single-egress idiom. The ai-net-router design — one ai-managed qube with provides_network=True, every other ai-managed qube pointed at it via netvm, the router’s upstream operator-locked from dom0 — is the obvious answer to bounding outbound network for the cohort. Is it the Qubes-idiomatic answer, or is there a more established pattern using netvm chains and per-template firewalls that I’m reinventing? I deliberately don’t install firewall allow-lists on ai-net-router itself (AI can still set rules there) — the bound is purely “operator-chosen upstream”; comments welcome on whether that’s enough or whether a dom0-enforced firewall floor would close real attack surface.
Force-tagging window for disposables. Stage E2 lets the agent invoke qmcp.SpawnDisposableAIManaged, which spawns a Qubes DispVM (auto-named dispXXXX) and applies the ai-managed tag in dom0 before the name is returned to the agent. My read is that there’s no TOCTOU window — the wrapper runs as a single dom0 RPC and Admin API listing is gated by tag — but I’d value confirmation from anyone who’s hardened similar paths against a concurrent listing observer.
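The ordering in the second item (create, tag, only then hand the name to the agent, destroy afterwards) can be modelled as below. This is a toy model under stated assumptions: FakeDom0 is a hypothetical in-memory stand-in, run_disposable is a made-up name, and the model is single-threaded, so it only illustrates the intended ordering and says nothing about races inside real qubesd.

```python
import itertools

TAG = "ai-managed"


class FakeDom0:
    """Hypothetical in-memory stand-in for the dom0 side."""

    def __init__(self):
        self.tags = {}  # qube name -> set of tags
        self._ids = itertools.count(1000)

    def create_disposable(self):
        name = f"disp{next(self._ids)}"
        self.tags[name] = set()  # exists, but not yet tagged
        return name

    def add_tag(self, name, tag):
        self.tags[name].add(tag)

    def destroy(self, name):
        self.tags.pop(name, None)

    def list_ai_managed(self):
        # Tag-gated listing: untagged qubes never appear here.
        return sorted(n for n, t in self.tags.items() if TAG in t)


def run_disposable(dom0, task):
    """Spawn a disposable, tag it, run the task, always destroy it."""
    name = dom0.create_disposable()
    try:
        dom0.add_tag(name, TAG)  # tag first; the agent has not seen the name
        return task(name)        # the name reaches the agent only after tagging
    finally:
        dom0.destroy(name)       # destroyed even if tagging or the task fails
```

In this model a listing observer sees either no entry or a tagged one, which matches the reasoning in the post. Whether the same holds against a concurrent observer on a real system is exactly the open question.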
That’s the better story in this thread — zero Linux to a working mihomo NetVM that gets people in censored regions back online. Nice work.
One thought, since you wrote a real “Security — Let’s Be Real” section: that list — default-deny, least-privilege, disposables, audit — is exactly right, and right now it holds because you hold it. You read every line, you keep the agent to test qubes. That works until the day a prompt-injection or a confident hallucination doesn’t ask first. That’s the whole reason I built qubes-mcp: push that same list down into dom0, so the agent runs in one untrusted qube and can only touch qubes you’ve tagged ai-managed — your real system stays invisible to it, and it can’t widen its own scope. Same things you do now over SSH, minus discipline being the only thing holding the line.
If you ever want to point Hermes at it instead of raw SSH, happy to walk you through setup. Curious, too: when your agent works on the gateway, does it get root in sys-proxy directly?
Makes sense — root in sys-proxy is the usual setup. That’s the one piece I’d want bounded eventually, but if it’s working for you on test qubes there’s no urgency. Standing offer on the walkthrough whenever you want it.