How to run an HTTP filtering proxy

Introduction

By default, Qubes uses a special firewall VM that sits between the networking VM and each AppVM. This VM controls the traffic for AppVMs and can be used to restrict what AppVMs can send or receive. The traffic rules can be set up using the filtering rules GUI in Qubes VM manager. The manager translates the user-defined setup into iptables rules for the firewall VM’s kernel.

The primary goal of the filtering rules in the firewall VM is to let users protect themselves either from their own mistakes (like accessing an arbitrary website from a browser running in a banking VM) or from the mistakes of websites (like a banking website that loads JS code from a social network operator when the user logs into the bank).

Because the firewall rules are IP-based, they have drawbacks. First, the rules cannot be used if one has to use an HTTP proxy to connect to websites (a common setup on corporate networks). Second, Qubes resolves DNS names from the firewall rules when the AppVM loads. This prevents websites that use DNS-based load balancers from working unless the user reloads the firewall rules (which re-resolves the DNS names) whenever the balancer transfers the session to another IP. Third, the initial setup of the rules is complicated because the firewall drops connections silently. As a workaround, one can use a browser’s network console to see what is blocked, but this is time-consuming, and one can easily miss important cases, such as including the sites used for OCSP SSL certificate verification in the firewall white-list.

These drawbacks can be mitigated by replacing the iptables-based rules with a filtering HTTP proxy. The following describes how to set up a tinyproxy-based proxy in either the firewall VM or a custom proxy VM to achieve such filtering.

Note: This guide only describes the setup of an HTTP proxy. This will handle web browsing using HTTP and HTTPS, but this type of proxy does not support other protocols such as IMAP used in Thunderbird. For that, you need a more general proxy, such as a SOCKS proxy or a fully featured proxy like Squid, which is beyond the scope of this article.

Warning

Running an HTTP proxy in your firewall VM increases the attack surface against that VM from a compromised AppVM. Tinyproxy has relatively simple code and a reasonable track record, which allows a certain level of trust, but one cannot exclude bugs, especially in the case of hostile proxy clients, as this is a less tested scenario. It is not advisable to use the proxy in a shared firewall VM against untrusted AppVMs to black-list some unwanted connections such as advertisement sites.

A less problematic setup is to white-list possible connections for several trusted and semi-trusted AppVMs within one firewall VM. Still, for maximum safety, one should consider running a separate ProxyVM for each important AppVM.

In Qubes R4.0, one no longer creates ProxyVMs as such. Instead, the same result is achieved by selecting the “provides network” checkbox when creating the AppVM that will be used as a proxy.

As a counterpoint to this warning, it is important to note that an HTTP proxy decreases the attack surface of AppVMs. For example, with a proxy the AppVM does not need to make direct DNS connections, so a bug in the kernel or in the browser in that area would not affect the AppVM. Also, browsers typically avoid many of the latest and greatest HTTP features when connecting through proxies, minimizing exposure of new and unproven networking code.

Setup

  1. After reading through the Warning section above, determine if you want to proceed with the following steps in either your default sys-firewall VM or in a newly created proxy VM. If you decide to create a separate proxy VM,

    • In R4.0, create a new AppVM with the provides network checkbox set.
    • In R3.2, create a new ProxyVM.

    Then, proceed with the following.

  2. Copy this archive (Note: not reviewed, use at own risk!) with the proxy control script, default tinyproxy config, and a sample filtering file into the proxy VM and unpack it in the /rw/config folder there as root:

     cd /rw/config
     sudo tar xf .../proxy.tar.xz
    
  3. If necessary, adjust /rw/config/tinyproxy/config according to the man page for tinyproxy.conf. The included config file refuses the connection unless the host is white-listed in the filtering file; this can be altered if one prefers to black-list connections. One may also specify upstream proxies here. The file is a template, and the control script replaces the {name} constructs in it with actual parameters. In general, lines containing {} should be preserved as is.

  4. For each AppVM that one wants to run through the proxy, create a corresponding filtering file in the /rw/config/tinyproxy directory. With the default config, the filtering file should contain regular expressions to match white-listed hosts with one regular expression per line. See the man page for tinyproxy.conf for details. The file should be named:

     name.ip-address-of-app-vm
    

    The name before the dot is arbitrary. For convenience, one can use an AppVM name here, but this is not required.

    It is important to get the IP address part right, as this is what the control script uses to determine which AppVM the proxy rules apply to. If you have created a separate proxy VM, change the NetVM of each AppVM that will be using it to the proxy VM. That can be done in Qubes VM manager in the VM settings dialog under the Basic tab. Next, see the Networking settings on the same tab to check the IP address of an AppVM.

    The attached archive includes a social.10.137.2.13 file with rules for an AppVM allowing connections to Google, Facebook, LinkedIn, LiveJournal, YouTube, and a few other sites. One can use it as an example after changing the IP address accordingly.

    When editing the rules, remember to include a $ at the end of the host name, and to prefix each dot in the host name with a backslash (like \.). This way, the pattern matches the whole host and not just a prefix, and the dot is not interpreted as an instruction to match an arbitrary character according to regular expression syntax.

  5. Check that the proxyctl.py script can properly recognize the rule files. For that, run:

     sudo /rw/config/tinyproxy/proxyctl.py show
    

    For each rule file it should print the name, IP address, and network interface of the running AppVMs. It will also display the process ID (pid) of the tinyproxy process that proxies that AppVM. Each pid will be -- because no proxies are running yet.

  6. Now, start the AppVM for which you created a rule file, and then run:

     sudo /rw/config/tinyproxy/proxyctl.py update
    

    The update command starts proxy processes and adjusts the iptables rules to allow proxy traffic for each running AppVM from the filtering files list. For each stopped AppVM, the proxy is killed.

    Check that the proxy has started and that the pid field of the show command is a number:

     sudo /rw/config/tinyproxy/proxyctl.py show
    
  7. Run the browser in the active AppVM and configure it to use the proxy on port 8100 of the proxy VM interface’s IP address. In Qubes VM manager, the IP address is displayed in the Gateway field in the Settings dialog for the AppVM.

    In Firefox, go to the Preferences dialog, select Advanced->Network, and click Settings for the Connection section. In the Connection Settings dialog, select Manual proxy configuration. For the HTTP Proxy field use the IP address of the AppVM’s gateway. Enter 8100 as the port, and select the checkbox “Use this proxy server for all protocols”.

    Go to a test web site. The browser should either load it (if it was white-listed in the filtering file), or show a page generated by tinyproxy that the page was filtered out.

    In the proxy VM, see the /run/tinyproxy/<name>/log file. For each filtered out website it contains an entry, and one can adjust the filtering file to include the corresponding host. After changing the file, run either:

     sudo /rw/config/tinyproxy/proxyctl.py restart <name>
    

    to restart the proxy with an updated rules file only for the given VM, or

     sudo /rw/config/tinyproxy/proxyctl.py kill-all-and-restart
    

    to restart all proxy processes.

  8. To make sure that the proxy is started automatically when the AppVM starts, change /rw/config/qubes-firewall-user-script to include the following line:

     /rw/config/tinyproxy/proxyctl.py update
    

    If the file does not exist, create it so it looks like this:

     #!/bin/sh
     /rw/config/tinyproxy/proxyctl.py update
    

    Make sure that the script is owned by root and executable:

     sudo chown root:root /rw/config/qubes-firewall-user-script
     sudo chmod 755 /rw/config/qubes-firewall-user-script
    
  9. In Qubes VM manager, adjust the Firewall rules for each AppVM using a proxy. In a typical case, when only the HTTP proxy should be permitted for outside connections:

    • In R4.0, select “Limit outgoing Internet connections to…” and make sure the address list is empty.
    • In R3.2, select “Deny network access except…”, make sure the address list is empty, and then unselect the “Allow ICMP,” “DNS”, and “Update proxy” checkboxes.

    There is no need to add any special entries for the proxy in the GUI as proxyctl.py adds rules for the proxy traffic itself.
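
The advice in step 4 about the trailing $ and the escaped dots can be tried out with a short sketch. It uses Python’s re.search as a stand-in for tinyproxy’s matcher, which is an assumption: tinyproxy applies POSIX regular expressions, and details may differ. The host names and the is_allowed helper are hypothetical and are not part of the archive.

```python
import re

def is_allowed(host, patterns):
    """Return True if any pattern matches somewhere in host (search semantics)."""
    return any(re.search(pattern, host) for pattern in patterns)

# Hypothetical rule sets for one site, from loosest to strictest.
loose = [r"example.com"]                # dot matches any character, no end anchor
suffix = [r"example\.com$"]             # escaped dot, anchored at the end
exact = [r"^(www\.)?example\.com$"]     # anchored at both ends

hosts = [
    "example.com",
    "www.example.com",
    "example.com.evil.test",   # allowed-looking prefix on a foreign domain
    "examplexcom.test",        # only matches when the dot is unescaped
    "notexample.com",          # shares the suffix only
]

# Print one row per host: loose, suffix-anchored, fully anchored.
for host in hosts:
    print(host.ljust(24),
          is_allowed(host, loose),
          is_allowed(host, suffix),
          is_allowed(host, exact))
```

Note that a $ on its own still accepts any longer host that ends with the same text, such as notexample.com in the sketch. Adding ^ at the start of the pattern closes that gap when only specific host names should pass.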


This guide was initially written by Igor Bukanov in a message to the qubes-devel mailing list.


This document was migrated from the qubes-community project
  • Page archive
  • First commit: 01 May 2018. Last commit: 08 Dec 2020.
  • Applicable Qubes OS releases based on commit dates and supported releases: 3.2, 4.0
  • Original author(s) (GitHub usernames): awokd
  • Original author(s) (forum usernames):
  • Document license: CC BY 4.0

According to this GitHub issue this guide should probably be reworked / updated.

Specifically:

  1. the supplied config file may need updating as it has outdated entries at lines 16-19.
  2. The Python script should probably be ported to Python 3 to not require installation of Python 2 in “modern” templates.
  3. A workaround should be found for the lacking update functionality (see linked issue), if possible (or tested on Qubes 4.2 to see if the issue exists there).
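
Regarding item 2, here is a minimal sketch of two things that commonly go wrong when a script like this is moved from Python 2 to Python 3: octal file modes and the bytes returned by subprocess.check_output. The sample route line and the variable names are made up for illustration and are not taken from the archive.

```python
import stat
import subprocess
import sys

# Python 2 accepted 0755 as an octal literal; Python 3 requires 0o755.
# Simply dropping the leading zero gives the decimal number 755 instead.
intended_mode = 0o755
accidental_mode = 755

print(oct(intended_mode), stat.filemode(stat.S_IFDIR | intended_mode))
print(oct(accidental_mode), stat.filemode(stat.S_IFDIR | accidental_mode))

# check_output returns bytes in Python 3, so the output has to be decoded
# before it is matched against a str regular expression.
# A child Python process stands in for "ip route show" here (hypothetical line).
raw = subprocess.check_output(
    [sys.executable, "-c", "print('10.137.2.13 dev vif3.0 scope link')"])
route_line = raw.decode().strip()
print(type(raw).__name__, repr(route_line))
```

The first two lines of output show that the two modes describe very different permissions, which matters for calls such as os.mkdir that take a numeric mode.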

I don’t really have expertise with any of this, but maybe someone has a more up-to-date implementation of this guide to share?

I’ve found this, which works on Qubes 4.1, but it requires a separate proxyVM for each AppVM that wants to use this kind of filtering, while the solution in this guide is supposed to need only one proxyVM for multiple AppVMs. Another limitation of the linked alternative is that the proxyVM only starts once the AppVM has fully started, which leads to a longer total startup time: I have to wait for the AppVM to boot and then for the proxyVM to boot. The method in this guide should make both boot simultaneously, because the proxy is designated as the netvm of the AppVM (or the firewall qube is used, in which case it’s probably already running).

This guide also doesn’t address the recommendation in the official documentation about having a second firewallVM between the “network service qube” (proxyVM) and the AppVM, though honestly I don’t quite understand why this is necessary. The three points listed there seem to not apply IMO, as “sys-firewall-1” still protects the firewall rules IIUC.

I’ll post my updated Python script, but be aware that I don’t actually know Python, so this is really just a best-guess effort to update the script to Python 3 and I also fixed a bug it had:

#!/usr/bin/python3

from __future__ import print_function

import argparse
import os
import pwd
import re
import signal
import subprocess
import sys
import time

# proxy port as seen by VM
proxy_vm_port = 8100

# Directory for logs and proxy configs created at run-time.
# This is in memory so shutting down the VM wipes it out.
tinyproxy_root_dir = "/run/tinyproxy"

# Directory with VM proxy rules. Each rule file should be named
#   <vm-name>.<ip-address>
# where vm-name is a symbolic name for the AppVM and ip-address is its address.
tinyproxy_config_dir = "/rw/config/tinyproxy"

tinyproxy_config_template_file = tinyproxy_config_dir+"/config"

# Pattern to match the rule files in tinyproxy_config_dir
rule_file_pattern = re.compile(r'^(.+)((?:\.[0-9]{1,3}){4})$')

# map of symbolic AppVM names to their IP addresses
vm_name_ip_map = None

# pattern to match AppVM lines with vif interfaces in "ip route" output
vm_route_pattern = re.compile(r'^([^ \t]+)[ \t]+dev[ \t]+(vif[0-9]+\.[0-9]+)')

# map of running AppVM IP addresses to their network interface names
vm_ip_interface_map = None

# Wait time in seconds to assume that the process is really terminated
# after sending the kill signal
termination_wait_time = 0.1

def error_and_exit(message):
    print(message, file=sys.stderr)
    sys.exit(1)

def read_proxy_state():
    read_vm_names()
    read_vm_routes()

def read_vm_names():
    global vm_name_ip_map
    vm_name_ip_map = {}
    ip_map = {}
    last_ip_octet_set = {}
    for file_name in os.listdir(tinyproxy_config_dir):
        m = rule_file_pattern.match(file_name)
        if not m: continue
        vm_name = m.group(1)
        vm_ip = m.group(2)[1:]
        if vm_name in vm_name_ip_map:
            error_and_exit("Duplicate VM name '{0}' in the directory with "
                           "proxy rules {1}".
                           format(vm_name, tinyproxy_config_dir))
        vm_name_ip_map[vm_name] = vm_ip
        if vm_ip in ip_map:
            error_and_exit("Duplicate IP address {0} in the directory with proxy rules {1}".
                           format(vm_ip, tinyproxy_config_dir))
        ip_map[vm_ip] = vm_name
        last_ip_octet = get_last_ip_octet(vm_ip)
        if last_ip_octet in last_ip_octet_set:
            error_and_exit("Unsupported config - two AppVMs have the same last octet {0} in their IP address".
                           format(last_ip_octet))
        last_ip_octet_set[last_ip_octet] = True

def read_vm_routes():
    global vm_ip_interface_map
    route_text = subprocess.check_output(["ip", "route", "show"])
    vm_ip_interface_map = {}
    for route_line in route_text.splitlines():
        m = vm_route_pattern.match(route_line.decode())
        if not m: continue
        vm_ip_interface_map[m.group(1)] = m.group(2)

def get_last_ip_octet(ipv4_string):
    return int(ipv4_string[ipv4_string.rfind(".")+1:])


class ProxyInfo: pass

def create_proxy_info(vm_name):
    if not len(vm_name):
        error_and_exit("Empty VM name")
    # Use .get() so an unknown name reaches the error message below
    # instead of raising KeyError.
    vm_ip = vm_name_ip_map.get(vm_name)
    if not vm_ip:
        error_and_exit("Unknown VM name - '{0}'".
                       format(vm_name))

    info = ProxyInfo()
    info.vm_name = vm_name
    info.vm_ip = vm_ip
    info.port = proxy_vm_port + get_last_ip_octet(vm_ip)
    info.dir = tinyproxy_root_dir+"/"+vm_name
    info.pid_file = info.dir+"/pid"
    info.config_file = info.dir+"/config"
    info.log_file = info.dir+"/log"
    info.rules_file = tinyproxy_config_dir+"/"+vm_name+"."+vm_ip

    return info

# In Qubes all AppVM interfaces share the same IP, so it needs to be looked up
# only once, on the first call to the function
bind_ip = None

# Pattern to match "ip address show..." output
bind_address_pattern = re.compile(r'^[ \t]*inet[ \t]+([0-9.]+)')

def get_proxy_bind_ip(vm_info):
    global bind_ip
    if bind_ip: return bind_ip

    interface = vm_ip_interface_map[vm_info.vm_ip]
    assert interface

    address_text = subprocess.check_output(
        ["ip", "address", "show", "dev", interface])
    for line in address_text.splitlines():
        m = bind_address_pattern.match(line.decode())
        if m:
            bind_ip = m.group(1)
            break

    return bind_ip

# List of (source, destination, port) tuples with opened tcp ports
# according to iptables output
iptables_opened_ports = None

iptabled_openned_port_pattern = re.compile(
    r'^ACCEPT[ \t]+tcp[ \t]+[^ \t]+[ \t]+([0-9.]+)[ \t]+([0-9.]+)[ \t]+tcp[ \t]+dpt:([0-9]*)')

def ensure_opened_firewall(vm_info):
    global iptables_opened_ports
    if not iptables_opened_ports:
        iptables_opened_ports = []
        filter_text = subprocess.check_output(
            ["iptables", "-t", "filter", "-n", "-L", "INPUT"])
        for line in filter_text.splitlines():
            m = iptabled_openned_port_pattern.match(line.decode())
            if not m: continue
            entry = (m.group(1), m.group(2), int(m.group(3)))
            iptables_opened_ports.append(entry)

    bind_ip = get_proxy_bind_ip(vm_info)

    if (vm_info.vm_ip, bind_ip, vm_info.port) in iptables_opened_ports:
        return

    # Add missing rules
    subprocess.call([
            "iptables", "-t", "filter", "-I", "INPUT",
            "-s",  vm_info.vm_ip, "-d", bind_ip,
            "-p", "tcp", "--dport", str(vm_info.port),
            "-j", "ACCEPT"])

    # Add extra rule to redirect from the standard proxy port to
    # AppVM-specific port
    subprocess.call([
            "iptables", "-t", "nat", "-A", "PREROUTING",
            "-s",  vm_info.vm_ip, "-d", bind_ip,
            "-p", "tcp", "--dport", str(proxy_vm_port),
            "-j", "REDIRECT", "--to-ports", str(vm_info.port)])


def ensure_started_proxy(vm_info):
    ensure_opened_firewall(vm_info)

    # If pid exists, assume the proxy is running with the necessary config
    if os.path.isfile(vm_info.pid_file):
        return

    # File modes must be octal literals (0o755), not decimal 755.
    if not os.path.isdir(tinyproxy_root_dir):
        os.mkdir(tinyproxy_root_dir, 0o755)

    if not os.path.isdir(vm_info.dir):
        os.mkdir(vm_info.dir, 0o770)
        pwd_entry = pwd.getpwnam("tinyproxy")
        os.chown(vm_info.dir, pwd_entry.pw_uid, pwd_entry.pw_gid)

    with open(tinyproxy_config_template_file, "r") as f:
        config_template = f.read()

    config = config_template.format(
            port=vm_info.port,
            vm_ip=vm_info.vm_ip,
            bind_ip=get_proxy_bind_ip(vm_info),
            log_file=vm_info.log_file,
            pid_file=vm_info.pid_file,
            rules_file=vm_info.rules_file)

    with open(vm_info.config_file, "w") as f:
        f.write(config)

    if 0 != subprocess.call(["tinyproxy", "-c", vm_info.config_file]):
        print("Failed to run tinyproxy for {0}".format(vm_info.vm_name),
              file=sys.stderr)

def read_proxy_pid(vm_info):
    if os.path.isfile(vm_info.pid_file):
        try:
            with open(vm_info.pid_file, "r") as f:
                return int(f.read(20).strip())
        # ValueError covers an empty or corrupted pid file
        except (OSError, ValueError): pass
    return 0

def ensure_stopped_proxy(vm_info):
    pid = read_proxy_pid(vm_info)
    if not pid:
        return False

    try:
        os.kill(pid, signal.SIGTERM)
    except OSError as ex:
        try: os.remove(vm_info.pid_file)
        except OSError: pass
    return True


def kill_all_proxies():
    should_wait_termination = False
    pid_files_to_remove = []
    for vm_name in vm_name_ip_map.keys():
        info = create_proxy_info(vm_name)
        pid_files_to_remove.append(info.pid_file)
        if ensure_stopped_proxy(info): should_wait_termination = True

    if should_wait_termination: time.sleep(termination_wait_time)

    if 0 == subprocess.call(['killall', '-q', '-SIGKILL', 'tinyproxy']):
        time.sleep(termination_wait_time)

    for pid_file in pid_files_to_remove:
        try: os.remove(pid_file)
        except OSError: pass

def update_command(args):
    for vm_name in vm_name_ip_map.keys():
        info = create_proxy_info(vm_name)
        if info.vm_ip in vm_ip_interface_map:
            ensure_started_proxy(info)
        else:
            ensure_stopped_proxy(info)

def start_command(args):
    for vm_name in args.vm_name_list:
        info = create_proxy_info(vm_name)
        if info.vm_ip in vm_ip_interface_map:
            ensure_started_proxy(info)
        else:
            print("Cannot start proxy for {0} - no IP route information is "
                  "available".format(vm_name), file=sys.stderr)

def stop_command(args):
    for vm_name in args.vm_name_list:
        info = create_proxy_info(vm_name)
        ensure_stopped_proxy(info)

def restart_command(args):
    should_wait_termination = False
    start_list = []
    for vm_name in args.vm_name_list:
        info = create_proxy_info(vm_name)

        # Sleep when the proxy was running to wait for proxy to terminate
        terminated = ensure_stopped_proxy(info)
        if info.vm_ip in vm_ip_interface_map:
            if terminated: should_wait_termination = True
            start_list.append(info)

    if len(start_list):
        if should_wait_termination: time.sleep(termination_wait_time)
        for info in start_list:
            ensure_started_proxy(info)

def kill_all_command(args):
    kill_all_proxies()

def kill_all_and_restart_command(args):
    kill_all_proxies()
    for vm_name in vm_name_ip_map.keys():
        info = create_proxy_info(vm_name)
        if info.vm_ip in vm_ip_interface_map:
            ensure_started_proxy(info)

def show_command(args):
    output_cells = [('Name', 'IP', 'Interface', 'Proxy_Pid')]
    for vm_name in vm_name_ip_map.keys():
        info = create_proxy_info(vm_name)
        interface = vm_ip_interface_map.get(info.vm_ip)
        pid = read_proxy_pid(info)
        output_cells.append((info.vm_name, info.vm_ip,
                             interface or '--', str(pid) if pid else '--'))

    last_column = len(output_cells[0]) - 1
    max_cell_widths = [0 for x in range(last_column)]
    for line in output_cells:
        for i in range(last_column):
            l = len(line[i])
            if l > max_cell_widths[i]: max_cell_widths[i] = l
    for j in range(len(output_cells)):
        line = output_cells[j]
        s = ""
        for i in range(last_column):
            s += line[i].ljust(max_cell_widths[i] + 1)
        print(s+line[last_column])

parser = argparse.ArgumentParser(description='Control HTTP proxy for Qubes App VMs.')
subparsers = parser.add_subparsers(title='subcommands', description='Pass -h after the subcommand name for its detailed help')

sub_parser = subparsers.add_parser('update', help='for each running AppVM ensure that its proxy runs. All other proxies are stopped.')
sub_parser.set_defaults(func=update_command)

sub_parser = subparsers.add_parser('start', help='start proxy for AppVM')
sub_parser.add_argument('vm_name_list', metavar='AppVM', nargs='+', help='ensure that proxy for AppVM is started. Does nothing if AppVM is not started')
sub_parser.set_defaults(func=start_command)

sub_parser = subparsers.add_parser('stop', help='stop proxy for AppVM')
sub_parser.add_argument('vm_name_list', metavar='AppVM', nargs='+', help='ensure that proxy for AppVM is stopped')
sub_parser.set_defaults(func=stop_command)

sub_parser = subparsers.add_parser('restart', help='restart proxy for AppVM')
sub_parser.add_argument('vm_name_list', metavar='AppVM', nargs='+', help='stop the given AppVM proxy and then start it again unless AppVM is not running')
sub_parser.set_defaults(func=restart_command)

sub_parser = subparsers.add_parser('kill-all', help='forcefully kill all running proxy instances')
sub_parser.set_defaults(func=kill_all_command)

sub_parser = subparsers.add_parser('kill-all-and-restart', help='forcefully kill all running proxy instances and then restart proxies for running AppVMs')
sub_parser.set_defaults(func=kill_all_and_restart_command)

sub_parser = subparsers.add_parser('show', help='show the status of VMs with proxies')
sub_parser.set_defaults(func=show_command)

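args = parser.parse_args()

# In Python 3 a missing subcommand is not an error by default,
# so args.func may be absent; print the help instead of crashing.
if not hasattr(args, 'func'):
    parser.print_help(sys.stderr)
    sys.exit(2)

read_proxy_state()

args.func(args)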
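
To make the naming scheme easier to follow, here is a condensed sketch of how the script turns a rule file name into a per-VM port and a rendered config. The regular expression and the port arithmetic mirror the script above. The TEMPLATE string, the bind address 10.137.2.1, and the describe_rule_file helper are hypothetical, and the real template in /rw/config/tinyproxy/config may look different.

```python
import re

PROXY_VM_PORT = 8100  # port the AppVM's browser is configured to use
RULE_FILE_PATTERN = re.compile(r'^(.+)((?:\.[0-9]{1,3}){4})$')

# Hypothetical stand-in for the template file; {name} fields are filled
# in by str.format, which is why such lines must be kept intact.
TEMPLATE = 'Port {port}\nListen {bind_ip}\nAllow {vm_ip}\nFilter "{rules_file}"\n'

def describe_rule_file(file_name, bind_ip):
    """Return (vm_name, vm_ip, port, config) or None if the name is not a rule file."""
    m = RULE_FILE_PATTERN.match(file_name)
    if not m:
        return None
    vm_name = m.group(1)
    vm_ip = m.group(2)[1:]  # drop the dot that separates name and address
    # Each AppVM gets its own tinyproxy on 8100 + last octet of its IP.
    port = PROXY_VM_PORT + int(vm_ip.rsplit(".", 1)[1])
    config = TEMPLATE.format(
        port=port,
        bind_ip=bind_ip,
        vm_ip=vm_ip,
        rules_file="/rw/config/tinyproxy/" + file_name)
    return vm_name, vm_ip, port, config

result = describe_rule_file("social.10.137.2.13", "10.137.2.1")
print(result[:3])
print(result[3])
print(describe_rule_file("config", "10.137.2.1"))  # not a rule file
```

Because the port is derived from the last octet alone, two AppVMs whose addresses end in the same octet would collide, which is why the script rejects that configuration. The browser still talks to port 8100; the PREROUTING rule added by the script redirects it to the per-VM port.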

It would be nice if the bug preventing the proxy from automatically updating when a new AppVM connects to it were fixed, though; then again, with ~1800 open issues on GitHub, the devs are quite busy already.

Edit: since this is iptables based it won’t work on Qubes 4.2.