What do you find confusing about Salt on Qubes OS?

I want to write a guide (probably as a blog post, but also shared here) to using Salt on Qubes OS. I know that most of the guides out there didn’t really help me when I started learning.

To that end, I want to know what’s been confusing for folks. I’ll try to answer everything in the guide, and will probably also give responses here in the meantime.

I have wanted to use Salt, but each time I go to look into it, I get confused about the file locations, the file layout, what goes in which file, how to find out what it can do, and how to run the whole thing or just a part of it.

I would like to set up a way to build out my current config in case I need to rebuild:
- download all of the templates I use
- set up the named templates I have
- set up the VMs I have and include the additional software I have installed on them

There could probably be more included in that, but since I have not used Salt yet, I don’t know what to include.

Thank you for posting about this. I look forward to seeing what you come up with.

A LOT of things. Unfortunately I’m confused enough that I’m not sure how to articulate a list.

I use Salt a lot, but I’ve had to bastardize it; I have many bash scripts running Salt. Even more insane: I have Salt write some of the bash scripts that are used to run Salt. I think I can get away from this now and use it more directly…except I’m using the user environment /srv/user_salt and I can’t get something there to behave as a top.sls file. No matter how I try to enable it, something seems to go wrong. Our resident guru works in /srv/salt/some-subdirectory rather than with the user environment /srv/user_salt, so his examples are (uncharacteristically) not very helpful with this one detail [but have otherwise shown me a way out of the bastardization I just mentioned]. (Of course all of this means you have to include discussion of setting up a user Salt environment in the first place!)

And you’re right: the Salt documentation talks about typical use cases, and Qubes is not typical; Salt has been heavily adapted for Qubes.

I don’t know why I didn’t link it earlier, but for reference, here’s my own Salt repo:

Yeah, that’s pretty much how I’ve organized my own repo (which I just posted).

What I wanted was to configure my system such that backups never needed to hold templates, only AppVMs that held persistent data. Everything else can be restored by cloning the repo.

Yeah, the Qubes OS setup is different from conventional Salt projects and it took me a long time to realize that. I think it’d be worth doing a blog post that covers what’s unique about using Salt on Qubes OS first.

Just to gloss over it, the *.top file stuff is specific to Qubes, as is top.enable etc. The minion config tells Salt where to look for formulas, pillar files, and salt states.
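
As a rough illustration of how those pieces fit together, the sketch below writes a dom0 state file, an in-qube state file, and a Qubes-style top file. Everything named demo* is hypothetical, as are the source template fedora-41-xfce and the package tmux. The user: environment key assumes the /srv/user_salt setup that qubes.user-dirs creates. The files go to a scratch directory (OUT) rather than straight into /srv/user_salt, so nothing changes until you review and copy them yourself.

```shell
#!/bin/sh
set -eu

# Scratch directory standing in for /srv/user_salt (assumption: you review
# the files and copy them into place yourself).
OUT="${OUT:-/tmp/user_salt_demo}"
mkdir -p "$OUT"

# State applied in dom0: clone a template, then create an AppVM based on it.
cat > "$OUT/demo.sls" <<'EOF'
demo-template:
  qvm.clone:
    - source: fedora-41-xfce    # hypothetical: use a template you have installed

demo-app:
  qvm.present:
    - template: demo-template
    - label: blue
    - require:
      - qvm: demo-template
EOF

# State applied inside a qube: install a package in the cloned template.
cat > "$OUT/demo-packages.sls" <<'EOF'
demo-packages:
  pkg.installed:
    - pkgs:
      - tmux                    # hypothetical package choice
EOF

# Top file: maps each target (dom0 or a qube name) to the states it receives.
cat > "$OUT/demo.top" <<'EOF'
user:
  dom0:
    - demo
  demo-template:
    - demo-packages
EOF

echo "Wrote demo.sls, demo-packages.sls and demo.top to $OUT"
```

The dom0 target receives the states that create and configure qubes, while a qube-name target receives states that run inside that qube. The keys under the environment name are what the Qubes docs call the target matching clause.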

I’ll go into more detail later and post about it!

This is exactly what I have…but it’s incredibly complex the way I have it set up. I am trying to simplify it and rely on “top” sls files, but as I said, I can’t seem to enable them when they’re in /srv/user_salt. (I back up dom0 and an assortment of regular AppVMs: my vault, my music player, and my email client. Even for the music player I only back up things like playlists, since the music files are on a NAS.)

In theory I can go into an sls file (/srv/user_pillar/enable_regen.sls, which actually loads into a pillar), put ‘1’ next to any (or many) template/appvm/named disposable grouping (I call them “stacks” because the named disposable depends on the dvm template, which depends on the template), fire it up, walk away, and it gets rebuilt from nothing. The AppVM is left alone if it’s one of those with continuing, persistent data [but in order to do that I have to temporarily change its template].

In practice it seems like there’s always some glitch that stops the process early, sometimes as simple as having one of the VMs running at the time (preventing it from being deleted).

The end result is I can actually copy those config files (there are a LOT of them) onto a fresh install [I actually store them on a “repository” VM, and keep THAT backed up], and then transform that computer into a mini-me clone of my regular computer…with SOME work on my part–I haven’t been able to fully automate it especially with new setups. I can then restore those few AppVMs from backups.

Things like this really hurt… I spent a lot of time trying to understand the disconnects between Qubes salt and the upstream docs. “top” does seem to exist in plain salt here but doesn’t match somehow.

I feel like there are multiple terms for every concept…

I got some good help in this thread, but I still have problems with the different docs (although I did make a little progress).

Here’s an example… In the qubes salt docs there is this:

“In Qubes, we don’t have a master. Instead we have one minion which resides in dom0 and manages qubes from there.”

I can get that - but then later there is this:

“The target_matching_clause will be used to select your minions (Templates or qubes)”

…and there I am, lost and fumbling around in the dark. What is it: one minion, multiple ones, one which can create its own sub-minions, one which creates its peer minions?

As a concrete suggestion, instead of yet more explanation with embedded fragments of sls/top files, maybe it would be useful to have a single, giant, and verbosely self-documented setup that creates and maintains a small set of demo qubes.

If you mean you’re trying to run top.enable on a top.sls file, I think the enabling/disabling stuff (i.e. qubesctl top.enable etc) only works with *.top files.
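
For anyone wondering how that looks end to end, here is a hedged sketch of enabling a top file and then applying either everything or a single state. The names demo, demo-packages, demo-template, and demo-app are hypothetical, and the sketch assumes the files live in /srv/user_salt under the user environment. The run helper is only there so the commands are printed rather than executed when qubesctl is not installed.

```shell
#!/bin/sh
# Run a command via sudo when qubesctl is installed (i.e. in dom0);
# otherwise only print it, so the sketch is safe to try anywhere.
run() {
    if command -v qubesctl >/dev/null 2>&1; then
        sudo "$@"
    else
        echo "would run: sudo $*"
    fi
}

# Enable /srv/user_salt/demo.top (hypothetical file; name given without .top).
run qubesctl top.enable demo

# List the enabled top files to confirm that it registered.
run qubesctl top.enabled

# Apply everything that is enabled, in dom0 and in the two demo qubes.
run qubesctl --show-output --targets=demo-template,demo-app state.apply

# Or apply just one state file to one qube, skipping dom0.
run qubesctl --skip-dom0 --targets=demo-template state.sls demo-packages saltenv=user
```

The last command is the "just a part of it" case: state.sls runs a single named state file instead of whatever the enabled top files select.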

That’s my plan. My own repo will be pretty fleshed out, and I’ll use that as an example to compare against.

I don’t think I’ll be able to offer something like @ben-grande’s qusal (where it’s flexible enough that anybody could use it with some configuration), but I think I can get it to a state where it’s a great reference for Qubes users to get familiar with Salt.

Since you asked for feedback, I’ll use my specific issue with Salt on Qubes as an example.

What I find infuriating is the lack of introspection and out-of-the-box debugging to help me figure out what breaks, short of pasting error strings into Google or internalizing the entire SaltStack codebase. I don’t recall any other tool that was so opaque to debug.

I went three (!!!) steps into the Qubes Salt Beginner Guide before getting stuck. Specifically, I activated qubes.user-dirs with sudo qubesctl state.sls qubes.user-dirs, then created a state file to clone a Fedora template, and a top file to call that state file. Then I ran qubesctl top.enable qubes-test1 (my topfile /srv/user_salt/quest-test1.top), and…

applying state breaks spectacularly:

[ERROR   ] Unable to render top file: while parsing a block mapping
  in "<unicode string>", line 17, column 1
did not find expected key
  in "<unicode string>", line 19, column 5

local:
----------
    ID: states
    Function: no.None
    Result: False
    Comment: No Top file or master_tops data matches found. Please see master log for details.
     Changes:
Summary for local
------------
Succeeded: 
Failed:    
------------
Total states run:     1
Total run time:   0.000 ms
DOM0 configuration failed, not continuing

Moreover, I suspect something is broken outside of my config, based on the first few lines; all my config files are way shorter than 17 lines, and the error appears even when I disable my topfile.

So, how do I find out where it breaks? Seriously, how?

strace is not available in dom0, and even then I’d be slogging through python startup every time.

Manpage for qubesctl says it should be run instead of salt-call --local, and accepts the same arguments as that tool. Well, that’s pork pies. It doesn’t accept --log-level=debug, for instance.

Except… checking the contents of /usr/bin/qubesctl, it does salt_call() and seems to forward the options, when argv[1] is --dom0-only. Oh man. Except except, fat lot of good it does since the error doesn’t appear in that invocation.

I expect to make mistakes, and I don’t expect tools to be turn-key solutions, but not having meaningful debug output is just hateful.

Did you run sudo qubesctl saltutil.sync_all?

This is because Salt states are rendered twice before being run: once through the Jinja renderer, then once more through the YAML renderer. You can debug stuff like this by copy/pasting your .sls files into a Jinja rendering tool. I use https://j2live.ttl255.com/ because it has a Salt option.

Yes, you can use the salt-call options, but only after the module, so for example:

sudo qubesctl state.apply -l debug # works
sudo qubesctl -l debug state.apply # doesn't work

You’re totally right that the Salt workflow - especially on Qubes - is obtuse as hell; but I have some more tips that can help:

  • qubesctl --show-output state.apply is a nice way to see the output from states run on guests
  • journalctl in dom0 will show you qrexec calls which can give you a bit of insight into the stages of the states being run
  • tail -f /var/log/qubes/disp-mgmt-$qube can also give you some information about the currently running states
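
Some render-only checks can also narrow down where a failure comes from before anything is applied. The functions used here (state.show_top, state.show_sls, slsutil.renderer, and the test=True argument) are standard Salt; that qubesctl passes them through unchanged in dom0 is an assumption, and demo is a hypothetical state name in /srv/user_salt. The run helper prints the commands instead of executing them when qubesctl is not installed.

```shell
#!/bin/sh
# Run a command via sudo when qubesctl is installed (i.e. in dom0);
# otherwise only print it, so the sketch is safe to try anywhere.
run() {
    if command -v qubesctl >/dev/null 2>&1; then
        sudo "$@"
    else
        echo "would run: sudo $*"
    fi
}

# Render only the merged top data; a YAML error in any enabled top file
# should show up here without touching any state.
run qubesctl state.show_top

# Render one state file (hypothetical name) through Jinja and YAML
# and print the resulting data structure, without applying it.
run qubesctl state.show_sls demo saltenv=user

# Render a file by path with the default renderer pipeline.
run qubesctl slsutil.renderer /srv/user_salt/demo.sls

# Dry run with debug logging; note the options come after the module.
run qubesctl state.apply test=True -l debug
```

If state.show_top fails while state.show_sls succeeds, the problem is more likely in a top file than in the state itself, which helps when the reported line numbers do not match any file you wrote.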

For those who are on Matrix, I put together a space for the user community, and in it there’s a room dedicated to Salt. You can join here:

I discovered last night that if I run a qvm.create state to create a VM off of a non-existent template–it returns OK even though it didn’t create any VM. [Yes I was testing a file by making sure it actually couldn’t do anything but at least this way I know the jinja rendering happens properly.]

So if you have other states that “require” this create (you know, like ones to set prefs, features and tags)…they will try to run anyway even though the VM didn’t actually get created, and fail.
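
One possible workaround, sketched here and not tested against the behavior described above, is to make the template’s existence its own state and require it, so that a missing template fails early instead of surfacing later in the prefs, features, or tags states. All names are hypothetical, and the sketch assumes that qvm.exists reports failure for a qube that does not exist. The file is written to a scratch directory (OUT) for review rather than into /srv/user_salt.

```shell
#!/bin/sh
set -eu

# Scratch directory standing in for /srv/user_salt (assumption).
OUT="${OUT:-/tmp/user_salt_demo}"
mkdir -p "$OUT"

cat > "$OUT/demo-guarded.sls" <<'EOF'
# Guard: intended to fail if the template is missing (assumed behavior).
demo-template-exists:
  qvm.exists:
    - name: demo-template

# Distinct state IDs keep each require pointing at exactly one state.
demo-app-present:
  qvm.present:
    - name: demo-app
    - template: demo-template
    - label: blue
    - require:
      - qvm: demo-template-exists

demo-app-prefs:
  qvm.prefs:
    - name: demo-app
    - memory: 400
    - require:
      - qvm: demo-app-present
EOF

echo "Wrote $OUT/demo-guarded.sls"
```

With this layout, a failed guard marks the states that require it as failed requisites, so the chain stops at the first missing piece.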

On the subject of useless error messages, one favorite of mine is where it refuses to run your state because a mapping value is not allowed in this context.

It’d be nice if it provided a file and line number (I do a lot of Jinja includes) or at least wrote out the line in question.

Sounds like a bug. You should open an issue on GitHub! (If you don’t use GitHub, I can open it.)

That almost always means a syntax error or malformed SLS (but not malformed YAML). If you know the file that’s causing the error, run it through a Jinja renderer (I linked one above) and then try a Salt linter!