Getting dom0 Out of Both the Network and IPC Data Paths on Bare Xen

Continuing from previous posts about our custom bare-bones Xen build:

We’ve been running a bare Xen setup on Alpine Linux (no Qubes, no libvirt, just xl and shell scripts), and I wanted to share what it took to fully isolate dom0 from both the network forwarding path and the IPC data path. This is something Qubes partially achieves with sys-net as a driver domain, but to my knowledge even Qubes still routes all qrexec traffic through dom0. We went further: dom0 is completely out of both paths, parked in an isolated corner where it does nothing after boot and never sees the light of day.

Hopefully this is useful to others building Xen-based appliances or exploring what’s actually possible with the Xen primitives.

Starting point

Five VMs on Xen 4.19: dom0 (Alpine), a network driver domain (net-vm, PCI passthrough of the physical NIC), and three application VMs. The application VMs needed to talk to each other over IPC (Xen vchan) and needed network connectivity through the driver domain.

The problems:

  1. All VM network traffic was transiting dom0 because dom0 was the default vif backend for every VM
  2. All IPC traffic was transiting dom0 because the IPC daemon ran in dom0 and relayed every message between domains

Dom0 was seeing every packet and every IPC message. For a security-sensitive workload, that’s exactly what we don’t want.

Part 1: Getting dom0 out of the network path

The mechanism

This is straight out of the Qubes playbook. Xen supports specifying which domain acts as the netback for a vif using backend=<domain-name> in the vif spec; Qubes has used this for years with sys-net. On bare Xen with xl, you just add it to the xl config:

vif = ["ip=10.0.5.103,backend=net-vm"]

That’s it. The vif appears inside net-vm instead of dom0. Dom0 doesn’t forward anything.
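
If you want to sanity-check where a vif’s backend actually landed, the frontend’s xenstore node records the backend path. A quick check from dom0 (the VM name, domid 3, and devid 0 below are placeholders):

# Backend domid shows up in the BE column.
xl network-list workload-vm

# Or read the frontend's "backend" key directly.
xenstore-read /local/domain/3/device/vif/0/backend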

What actually broke

The simple part was the config change. The hard parts were everything around it.

vif-watcher xenstore paths: We had a polling daemon in net-vm (similar to Qubes’ vif-route-qubes) that watched for new vifs and configured routes. It read the IP from xenstore using the frontend path (/local/domain/<domid>/device/vif/<devid>/ip). When net-vm is the backend instead of dom0, the xenstore permissions are different: net-vm couldn’t read the frontend domain’s xenstore entries. We had to add a fallback to the backend path (/local/domain/<net-vm-domid>/backend/vif/<domid>/<devid>/ip).
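
Roughly, the fallback looks like this (a sketch of the watcher’s read, run inside net-vm; variable names are illustrative):

# $domid/$devid come from the xenstore watch event that announced the vif.
ip=$(xenstore-read "/local/domain/$domid/device/vif/$devid/ip" 2>/dev/null)
if [ -z "$ip" ]; then
    # As the backend, net-vm can always read its own backend subtree.
    # (The toolstack writes a domain's own domid under its xenstore home.)
    me=$(xenstore-read domid)
    ip=$(xenstore-read "/local/domain/$me/backend/vif/$domid/$devid/ip")
fi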

Hardcoded IP in vif-watcher: The watcher was assigning dom0’s IP to every vif it configured (it was originally written when only dom0’s vif existed). When a new VM’s vif appeared, net-vm would accidentally claim dom0’s IP address, making dom0 unreachable.

Firewall rule cleanup: The old FORWARD chain had blanket accept rules between eth0 (dom0’s vif) and eth1 (physical NIC). These carried all the forwarded traffic. After moving vif backends to net-vm, those rules became obsolete and needed to be replaced with targeted rules for the direct vifs.
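
Something like the following replaced them (iptables syntax; eth1 is the physical NIC as above, and vif3.0 is a placeholder for whatever net-vm sees for each frontend, i.e. vif<frontend-domid>.<devid>):

# Inside net-vm, one pair of targeted rules per application VM vif.
iptables -A FORWARD -i vif3.0 -o eth1 -j ACCEPT
iptables -A FORWARD -i eth1 -o vif3.0 -m state --state ESTABLISHED,RELATED -j ACCEPT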

Boot ordering: Net-vm must be fully running before any VM that uses it as a backend is created. On our system, VMs boot sequentially from a shell script in dom0, so this was already the case. But if you’re doing parallel VM creation, you need to be careful about this.
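
For reference, the ordering in our dom0 boot script is essentially this (config paths are illustrative):

xl create /etc/xen/net-vm.cfg

# Don't create anything that names net-vm in backend= until it exists.
# In practice you may also want to wait for net-vm's own network setup.
until xl domid net-vm >/dev/null 2>&1; do
    sleep 0.5
done

xl create /etc/xen/sa-vm.cfg
xl create /etc/xen/workload-vm.cfg
xl create /etc/xen/mgmt-vm.cfg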

The result

After the changes, dom0’s routing table had no entries for any application VM. Net-vm’s routing table showed direct routes to each VM via their vif interfaces. The only traffic on dom0’s vif to net-vm was dom0’s own management traffic (which we later eliminated too by removing dom0’s IP entirely).

Part 2: Getting dom0 out of the IPC data path

This was the harder and more interesting part. Qubes’ qrexec runs a daemon in dom0 that mediates every inter-VM service call. Even though the actual data transport is vchan (shared memory between two domains), the routing goes through dom0: source agent sends to dom0 daemon, daemon evaluates policy and forwards to target agent. Dom0 sees every byte of IPC payload in the daemon’s process memory.

We built a system that starts with dom0 as the relay (like qrexec) and then migrates to direct peer-to-peer vchan connections.

How vchan works under the hood

The Xen vchan API (libxenvchan) is surprisingly flexible:

  • libxenvchan_server_init(remote_domain, xenstore_path) creates a server endpoint. It writes rendezvous data to xenstore and waits for a client. The server domain must have write access to the xenstore path.
  • libxenvchan_client_init(remote_domain, xenstore_path) connects to an existing server by reading the xenstore path.
  • Each vchan handle has its own fd from libxenvchan_fd_for_select(), so you can poll multiple vchans in one process.
  • There is NO global state. A single process can hold arbitrarily many server and client vchan handles.

The big insight: vchan is symmetric between any two domains. Nothing in the libxenvchan API requires dom0 to be involved. Any domU can be a vchan server, and any other domU can be a client. The constraint is xenstore permissions, not the vchan mechanism itself. As we will see later, however, this only works on static systems where every domU/appVM is known at boot time; we are not aware of a way to dynamically provision inter-guest IPC without involving dom0 in some way.

The xenstore permission problem

When domU A wants to create a VchanServer for domU B, it writes the rendezvous under its own xenstore subtree (/local/domain/<A>/ipc/direct/<B>). Domain A owns this path and can write to it. But domain B needs read access to connect. By default, B can’t read A’s xenstore entries.

The solution: dom0 pre-provisions the xenstore permissions at boot. After creating all VMs, dom0’s autostart script writes the rendezvous paths and grants the appropriate read permissions:

# sa-vm (domid 2) serves workload-vm (domid 3)
xenstore-write /local/domain/2/ipc/direct/3 ""
xenstore-chmod /local/domain/2/ipc/direct/3 b2 r3

Dom0 has write access to everything in xenstore, so it can set up the permissions for any pair. After this, the two domains can establish a vchan directly without any further dom0 involvement.
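
The same three lines generalize to every server/client pair, so the autostart script can just loop over them. A sketch (the pair list here is illustrative, not our full topology):

provision_pair() {  # $1 = serving domain, $2 = client domain
    server=$(xl domid "$1")
    client=$(xl domid "$2")
    path="/local/domain/$server/ipc/direct/$client"
    xenstore-write "$path" ""
    xenstore-chmod "$path" "b$server" "r$client"
}

provision_pair sa-vm workload-vm
provision_pair sa-vm mgmt-vm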

Domain ID resolution

We hit an interesting problem with domain IDs. Xen assigns domain IDs at creation time and they increment monotonically, so if you hardcode IDs in kernel cmdline parameters, they break as soon as a VM is recreated or the boot order changes.

Our solution: dom0 writes a name-to-domid mapping in xenstore after all VMs are created:

for vm in sa-vm workload-vm mgmt-vm; do
    domid=$(xl list | awk "/$vm/{print \$2}")
    xenstore-write /ipc/domain-map/$vm "$domid"
done

Each agent reads this mapping at startup to resolve peer names to domain IDs. The kernel cmdline only contains peer names:

extra = "console=hvc0 rdinit=/sbin/init ipc_serve=workload-vm,mgmt-vm ipc_connect=sa-vm"
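
The agent-side lookup is then just a couple of xenstore reads. The real agents do this in-process, but a shell sketch of the startup resolution looks like this:

# Pull the peer list out of the cmdline and resolve each name to a domid.
peers=$(sed -n 's/.*ipc_connect=\([^ ]*\).*/\1/p' /proc/cmdline | tr ',' ' ')
for name in $peers; do
    domid=$(xenstore-read "/ipc/domain-map/$name")
    echo "peer $name -> domid $domid"
done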

Policy evaluation

With dom0 out of the data path, someone still needs to enforce access control. We moved policy evaluation to the target domain’s agent. Each serving domain has a baked-in policy file (subset of the central policy scoped to its own services). When a ServiceRequest arrives on a direct vchan, the agent checks the source domain name (Xen guarantees the remote domain ID in the vchan setup) against the local policy.
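
The policy format below is hypothetical (ours is a scoped-down copy of the central policy file), but the check reduces to an allow-list lookup keyed on the service and the source domain name resolved from the vchan’s remote domid:

# Hypothetical policy file, one rule per line: "<service> <source-domain> allow|deny"
# e.g.  backup.store workload-vm allow
check_policy() {  # $1 = service, $2 = source domain
    verdict=$(awk -v s="$1" -v d="$2" '$1 == s && $2 == d { print $3; exit }' /etc/ipc/policy)
    [ "$verdict" = "allow" ]
}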

This is actually a smaller attack surface than centralized policy in dom0. Each domain only knows about its own authorized callers, not the entire system’s policy. However, we still have open questions about whether an untrusted domU can securely manage its own policy. This is a sharp edge where dom0 actually provides some level of security.

The result

After the full migration:

  • Dom0 runs zero IPC processes at runtime. No daemon, no agent relay.
  • Four direct vchan pairs carry all IPC traffic.
  • Dom0’s only boot-time roles: create VMs, write xenstore permissions, start agents. Then it’s done.
  • A dom0 compromise would require actively exploiting Xen grant table mechanisms to interfere with vchan. It can’t just read the daemon’s process memory because there is no daemon.

What the Xen primitives actually support vs. what people think

Most Xen documentation and tutorials assume dom0 is a permanent, active participant in everything. It manages networking, it relays communications, it serves as the backend for all virtual devices. But the actual Xen API doesn’t require any of this:

  • vif backends can be any domain (the backend= directive)
  • vchan works between any two domains (no dom0 mediation needed)
  • xenstore permissions can be pre-provisioned (dom0 sets them at boot and leaves)
  • grant tables are peer-to-peer (dom0 doesn’t intermediate shared memory)

The Qubes project figured out the network piece years ago with sys-net as a driver domain.

Practical considerations

Boot ordering matters. The server domain for a vchan must be running and have its VchanServer created before the client domain tries to connect. We solved this with a retry loop in the client agent (10 seconds of retries at 500ms intervals for xenstore resolution).
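
A shell equivalent of that retry (the real loop lives in the agent binary; the peer name is illustrative):

tries=0
until domid=$(xenstore-read /ipc/domain-map/sa-vm 2>/dev/null); do
    tries=$((tries + 1))
    [ "$tries" -ge 20 ] && { echo "sa-vm never showed up in domain-map" >&2; exit 1; }
    sleep 0.5
done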

The agent’s event loop needs to handle multiple fds. Each vchan has its own event channel fd. Use poll(2) with all peer fds plus the local Unix socket for client connections. The existing Xen IPC daemon already does this (it polls N vchan server fds), so the pattern is proven.

Test with throwaway VMs first. We created two 64MB test VMs to prove the direct vchan path before touching any production VM. Found and fixed several issues (missing xenstore tools in the rootfs, domain ID hardcoding, missing --no-daemon flag) that would have been painful to debug on production VMs.

Silent failures will hurt you. The biggest time sink was things failing silently: curl fetches without -f that wrote HTTP error pages into binary files, OpenRC reporting [ ok ] for processes that crash 100ms after fork, missing shared libraries that only surface in logs you can’t easily access inside a VM. Add build-time assertions (does the binary exist? can ldd resolve all libraries?) and loud runtime failures (check for xenstore-read on PATH before trying to use it).
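
The checks we settled on are simple; a sketch (binary name and paths are illustrative):

# Build time, against the VM rootfs before it gets packed into the image.
set -e
rootfs=build/rootfs
test -x "$rootfs/usr/bin/ipc-agent" || { echo "agent binary missing" >&2; exit 1; }
if chroot "$rootfs" ldd /usr/bin/ipc-agent | grep -q "not found"; then
    echo "agent has unresolved shared libraries" >&2
    exit 1
fi

# Runtime, inside the VM, before anything tries to touch xenstore.
command -v xenstore-read >/dev/null 2>&1 || { echo "xenstore tools missing" >&2; exit 1; }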

Summary

Dom0 on Xen is conventionally the center of everything: networking, IPC, device management, storage. But the Xen primitives don’t actually require this. With vif backend reassignment and direct peer-to-peer vchan, you can reduce dom0 to a boot-time orchestrator that creates VMs, sets up xenstore permissions, and then does nothing. Every packet and every IPC message flows directly between the domains that need to communicate, with dom0 completely out of the path.

The Qubes architecture documents were our starting point, and the Xen wiki Driver Domain page documents the vif mechanism. The direct vchan piece doesn’t seem to be documented anywhere, so hopefully this writeup fills that gap.
