The config name is too long; rename it to something shorter.
Yikes. I never would have thought of that. Thanks so much apparatus. It’s up and running now.
Can you give more details on how to configure the connection for an arbitrary VPN?
I followed your entire guide but it didn’t work in my case.
thank you for your work
Hello, unfortunately this guide does not work for me. I’m running a fc39 “sys-vpn” AppVM on Qubes 4.2.
“sys-vpn” is configured to use wireguard, and wireguard-tools is correctly installed.
One thing I noticed: if I do not configure ANY appVM to use sys-vpn as firewall VM, wireguard works like a charm. I can bring the connection up and down with no problem whatsoever. As soon as I configure any other AppVM to use sys-vpn as firewall VM, all hell breaks loose. The moment I bring the wireguard connection up:
- I cannot ping anything from sys-vpn;
- I have no connection whatsoever on the AppVM using sys-vpn as firewall VM.
One thing I notice is that ifconfig in sys-vpn shows a vif interface for every appVM using it as firewall VM. I think this is normal behavior, but, long story short:
- if ifconfig shows no vifxx.y interfaces in sys-vpn, Wireguard works flawlessly;
- if ifconfig shows one or more vifxx.y interfaces in sys-vpn, Wireguard doesn’t work: as soon as I bring it up everything stops working, and the only way to re-establish the connection is to reboot sys-vpn.
I literally have no clue of even how to approach this problem, so any help is welcome.
Let’s start from here: nothing is going to work if sys-vpn does not have a working VPN tunnel.
Does “sudo wg” show a tunnel with Rx and Tx values different from 0? That would indicate the VPN was established; if Rx is 0, it means the remote peer does not recognize your configuration.
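If it helps, wg can also print just the handshake and transfer counters (assuming the interface is named wg0):
sudo wg show wg0 latest-handshakes   # UNIX timestamp of the last handshake per peer
sudo wg show wg0 transfer            # bytes received / sent per peer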
So, if nothing uses sys-vpn as firewall I see:
interface: wg0
public key: XXX
private key: (hidden)
listening port: 36964
fwmark: 0xcbeb
peer: YYY/ZZZ
endpoint: xx.xx.xx.xx:51820
allowed ips: 0.0.0.0/0
latest handshake: 14 seconds ago
transfer: 3.53 KiB received, 32.27 KiB sent
persistent keepalive: every 25 seconds
If something uses sys-vpn as firewall I see:
interface: wg0
public key: XXX
private key: (hidden)
listening port: 36964
fwmark: 0xcbeb
peer: YYY/ZZZ
endpoint: xx.xx.xx.xx:51820
allowed ips: 0.0.0.0/0
latest handshake: 17 seconds ago
transfer: 5.21 KiB received, 45.45 KiB sent
persistent keepalive: every 25 seconds
As soon as I bring the VM up, ping becomes terrible:
ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=4 ttl=115 time=683 ms
64 bytes from 8.8.8.8: icmp_seq=8 ttl=115 time=2075 ms
64 bytes from 8.8.8.8: icmp_seq=9 ttl=115 time=2000 ms
^C
--- 8.8.8.8 ping statistics ---
14 packets transmitted, 3 received, 78.5714% packet loss, time 13359ms
rtt min/avg/max/mdev = 682.834/1585.778/2074.661/639.208 ms, pipe 3
As you can see, latency becomes monstrous, and shortly after everything blocks completely. Importantly, this does not recover after I bring wireguard down, even though I have not added any firewall rule to restrict traffic.
You said you cannot ping anything from sys-vpn just above?
This is quite weird, especially since it does not recover. Is the attached qube opening a lot of connections (the kind of traffic generated by P2P / I2P / IPFS software)? That may create too many network sessions in the sys-vpn qube.
Try to increase the allocated memory and add a vcpu to the sys-vpn qube and see if it helps.
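For reference, from dom0 that would look something like this (the values are only examples):
qvm-shutdown --wait sys-vpn
qvm-prefs sys-vpn memory 600      # initial memory
qvm-prefs sys-vpn maxmem 2000     # memory-balancing ceiling
qvm-prefs sys-vpn vcpus 2         # add a vcpu
qvm-start sys-vpn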
Sorry, you are right lol, I’m making a mess. Let me be more specific:
My setup is the following: I run wireguard on my pfsense router for two main reasons:
- Get a hold of my local network while I am away (e.g. cameras)
- Redirecting all traffic through my router, encrypted (if I don’t trust my connection)
I would use two different wireguard profiles for these scenarios, so let’s focus on the first case. This setup is tested and works beautifully from the mobile wireguard client on my phone. The setup is the following, from the wg.conf file I’m using:
[Interface]
PrivateKey=<My private key>
Address=10.xx.yy.zz/32
DNS=10.xx.yy.01
[Peer]
PublicKey=<My public key>
AllowedIPs=10.vv.0.0/16,10.ww.0.0/16, ... (A bunch of house subnets I care about)
Endpoint=<My hostname>:<My port>
PersistentKeepalive=25
So, all in all, a pretty simple setup. Now let’s consider the simplest case. I fire up a new AppVM, call it wgVM. The firewall is not altered in any way, and wireguard-tools is installed on the corresponding template. I run nmcli connection import type wireguard file wg.conf, the VPN is created correctly in NetworkManager, it connects automatically, and everything works. I can bring it up or shut it down, and everything works as intended.
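For anyone following along, the steps inside wgVM were roughly these (just a sketch; the connection name wg comes from the file name and may differ):
# import the profile into NetworkManager
nmcli connection import type wireguard file wg.conf
# bring the tunnel up / down manually
nmcli connection up wg
nmcli connection down wg
# optionally disable autoconnect
nmcli connection modify wg connection.autoconnect no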
Now, I take the same wgVM, deselect ‘connect automatically’ on the wireguard VPN via network manager, shut it down, select ‘provides network’ in the qubes settings, and attach some other AppVM to it, call it clientVM. When I fire up wgVM again, I see a new vifpp.q interface appearing in the output of ifconfig.
Everything works merrily. I can browse the internet from both clientVM and wgVM, no problem.
If I now bring up the wireguard VPN from network manager, the VPN seems to work ok for just a few seconds, sometimes just enough to load google on clientVM. Then nothing really works anymore. Pages are stuck loading forever, and ping becomes super slow. If I bring the wireguard VPN down again in network manager, ping stops working completely, both from wgVM and clientVM. It doesn’t change if I bring the VPN up again. At this point, to fix things:
- on wgVM, I am forced to kill the VM and boot it up again;
- on clientVM, either I reboot wgVM as above or I select some other VM to act as firewall (see the dom0 commands below).
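In dom0 that workaround looks roughly like this (just for reference, using the VM names above):
qvm-kill wgVM && qvm-start wgVM        # force-restart the VPN qube
qvm-prefs clientVM netvm sys-firewall  # or point the client at a different net qube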
I can try to increase memory, but it really feels to me that somehow routing gets modified by wireguard, and nothing works from that point on unless I reboot!
Do you have 10.137.0.0/16,10.138.0.0/16,10.139.0.0/16 subnets there?
Check the output of these commands before starting wg, after starting wg and after stopping wg:
ip a
ip route show table all
sudo nft list ruleset
And compare them to see what changed between “before starting wg” and “after stopping wg”.
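Something like this (just a rough sketch) can capture the three snapshots so they are easy to diff afterwards:
# run in the VPN qube; press Enter at each stage (before wg, after wg up, after wg down)
for stage in before up down; do
    read -p "Capture stage '$stage'? Press Enter... "
    ip a > /tmp/${stage}_ip_a.txt
    ip route show table all > /tmp/${stage}_routes.txt
    sudo nft list ruleset > /tmp/${stage}_nft.txt
done
diff /tmp/before_routes.txt /tmp/down_routes.txt   # repeat for the other files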
Are your LAN or VPN subnets by any chance colliding with the Qubes OS internal network?
Ok, I was able to gather all the data requested.
First of all, no, I do not have 10.137.0.0/16,10.138.0.0/16,10.139.0.0/16 in my router subnets. I was very careful with this since all my qubesVM internal routing happens on these subnets.
Some ping statistics:
Before turning wireguard up (don’t get fooled by the look of the terminal, I am on Fedora 39):
┌──(user㉿clientVM)-[~]
└─$ ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=111 time=22.9 ms
64 bytes from 8.8.8.8: icmp_seq=2 ttl=111 time=20.3 ms
64 bytes from 8.8.8.8: icmp_seq=3 ttl=111 time=20.3 ms
^C
--- 8.8.8.8 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2003ms
rtt min/avg/max/mdev = 20.261/21.148/22.923/1.254 ms
┌──(user㉿sys-wgVM)-[~]
└─$ ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=112 time=22.2 ms
64 bytes from 8.8.8.8: icmp_seq=2 ttl=112 time=20.5 ms
64 bytes from 8.8.8.8: icmp_seq=3 ttl=112 time=21.8 ms
^C
--- 8.8.8.8 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2001ms
rtt min/avg/max/mdev = 20.463/21.486/22.169/0.736 ms
After turning wireguard up:
┌──(user㉿appvm-clientVM)-[~]
└─$ ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=113 time=4195 ms
64 bytes from 8.8.8.8: icmp_seq=3 ttl=113 time=3140 ms
64 bytes from 8.8.8.8: icmp_seq=5 ttl=113 time=2153 ms
64 bytes from 8.8.8.8: icmp_seq=6 ttl=113 time=2150 ms
^C
--- 8.8.8.8 ping statistics ---
8 packets transmitted, 4 received, 50% packet loss, time 7105ms
rtt min/avg/max/mdev = 2149.638/2909.302/4194.952/844.853 ms, pipe 5
┌──(user㉿sys-wgVM)-[~]
└─$ ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=3 ttl=114 time=3093 ms
64 bytes from 8.8.8.8: icmp_seq=4 ttl=114 time=5024 ms
64 bytes from 8.8.8.8: icmp_seq=5 ttl=114 time=5027 ms
64 bytes from 8.8.8.8: icmp_seq=7 ttl=114 time=3998 ms
64 bytes from 8.8.8.8: icmp_seq=8 ttl=114 time=3335 ms
64 bytes from 8.8.8.8: icmp_seq=9 ttl=114 time=2953 ms
64 bytes from 8.8.8.8: icmp_seq=14 ttl=114 time=1268 ms
^C
--- 8.8.8.8 ping statistics ---
17 packets transmitted, 7 received, 58.8235% packet loss, time 16297ms
rtt min/avg/max/mdev = 1267.841/3528.177/5027.217/1217.289 ms, pipe 5
After turning wireguard down:
┌──(user㉿appvm-clientVM)-[~]
└─$ ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
From 10.137.0.37 icmp_seq=3 Destination Host Unreachable
From 10.137.0.37 icmp_seq=4 Destination Host Unreachable
From 10.137.0.37 icmp_seq=5 Destination Host Unreachable
^C
--- 8.8.8.8 ping statistics ---
7 packets transmitted, 0 received, +3 errors, 100% packet loss, time 6177ms
pipe 3
┌──(user㉿wgVM)-[~]
└─$ ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
^C
--- 8.8.8.8 ping statistics ---
12 packets transmitted, 0 received, 100% packet loss, time 11254ms
Now, as for ip a: I have the interfaces lo, eth0 and vif10.0. When I turn wg on, I also have:
4: wg0: <POINTOPOINT,NOARP,UP,LOWER_UP> mtu 1420 qdisc noqueue state UNKNOWN group default qlen 1000
link/none
inet 10.21.10.2/32 scope global wg0
valid_lft forever preferred_lft forever
This interface just disappears when I turn wg down. All the rest stays the same.
As for ip route show table all, the only change is:
default dev wg0 table 52083 proto static scope link metric 50
local 10.21.10.2 dev wg0 table local proto kernel scope host src 10.21.10.2
Again, this just disappears after I turn wireguard down.
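(For reference, the policy routing that sends traffic into that table could be inspected with the commands below; the table number is the one from the output above, and wg0 and the fwmark come from the wg output earlier.)
ip rule show                # lists the policy-routing rules NetworkManager added
ip route show table 52083   # the WireGuard routing table seen above
sudo wg show wg0 fwmark     # the firewall mark used to avoid routing loops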
Finally, there are no changes in sudo nft list ruleset, whether before turning wg up, after turning it up, or after turning it down. In all three instances I see:
table ip qubes-firewall {
chain forward {
type filter hook forward priority filter; policy drop;
ct state established,related accept
iifname != "vif*" accept
ip saddr 10.137.0.xx jump qbs-10-137-0-xx
}
chain prerouting {
type filter hook prerouting priority raw; policy accept;
iifname != "vif*" ip saddr 10.137.0.xx drop
}
chain postrouting {
type filter hook postrouting priority raw; policy accept;
oifname != "vif*" ip daddr 10.137.0.xx drop
}
chain qbs-10-137-0-13 {
accept
reject with icmp admin-prohibited
}
}
table ip6 qubes-firewall {
chain forward {
type filter hook forward priority filter; policy drop;
ct state established,related accept
iifname != "vif*" accept
}
chain prerouting {
type filter hook prerouting priority raw; policy accept;
}
chain postrouting {
type filter hook postrouting priority raw; policy accept;
}
}
Do you see only the ip qubes-firewall and ip6 qubes-firewall tables?
You don’t have the ip qubes, ip6 qubes, or ip wg-quick-wg0 tables?
Do you have Qubes OS 4.1 or Qubes OS 4.2?
Or maybe your wgVM template was restored from a Qubes OS 4.1 backup?
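A quick way to list just the table names inside wgVM is:
sudo nft list tables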
I see only what is shown there. And you are correct: I seem to recall I just upgraded from 4.1 → 4.2, without changing the templates much.
Upgrade your templates to Qubes OS 4.2. You can change the repositories manually or use this script:
I ran the script in my template VM, but unfortunately had no joy. The script ran correctly, and after rebooting wgVM I can see the ip qubes and ip6 qubes tables, but not ip wg-quick-wg0. The behavior is the same as described above.
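(A possible sanity check that the template really ships 4.2 packages, assuming the agent version tracks the Qubes release:)
rpm -q qubes-core-agent   # should report a 4.2.x version on an upgraded template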
In that case I guess you need to run tcpdump in wgVM and in its net qube to check where the packets are going.
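A possible starting point for those captures (interface names as they appear in ip a; replace 51820 with your actual endpoint port):
# inside wgVM: the encrypted tunnel traffic leaving towards the endpoint
sudo tcpdump -ni eth0 udp port 51820
# inside wgVM: decrypted traffic that actually made it through the tunnel
sudo tcpdump -ni wg0 icmp
# inside the net qube (e.g. sys-net): does the UDP traffic get forwarded?
sudo tcpdump -ni any udp port 51820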
Ok, thanks for the suggestion, I’m making progress! I run a sys-firewall between sys-net and wgVM; sys-firewall is a MirageOS standalone VM. If I remove this link and connect wgVM to sys-net directly, everything works perfectly. I haven’t made any edits to sys-firewall so far. I guess maybe I should update it too…
Ok, after much fiddling I can confirm that the Mirage firewall was the culprit. I updated to the latest version of the firewall and now all problems are gone. Thank you very much for your patience!
Just to satisfy my curiosity, what was the previous version you used?
I have no idea, but I have a backup of the previous vmlinuz kernel. How do I check?
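(One possible way, if the qubes-mirage-firewall releases publish checksums for their builds, is to hash the backed-up kernel and compare it against those:)
sha256sum vmlinuz   # compare the hash with the ones listed for each release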