Why is `salt` used in Qubes OS for automation while being SO BAD?

I am certain that using salt stack for automation in Qubes OS is not an optimal solution, because salt is too unreliable and breaks things too often.

Here are some of the reasons:

Bad error handling, mostly not handling errors at all

Salt has no proper error handling. If any step of the declarative commands fails, the user sees a wall of text with a call stack and an unhandled exception, even in simple cases.

E.g., run this on Qubes OS R4.3:

sudo qubesctl top.enable AAA

It will show you 3 screens of text with a call stack.

  • What the hell kind of error handling is that?
  • Why are there no proper error messages?
  • Why was such a basic exception left unhandled?

Terrible software design.

No reverts, not preserving system consistency but breaking OS

Salt is not able to actually revert changes if something goes wrong. I mean, it may be a Turing-complete language, so in theory it can have proper error processing, but I’ve never seen that in Qubes OS or anywhere else.

Salt almost always just leaves the user’s system in a half-way, semi-broken state.

  • No free space? Terminate half-way!
  • You have a different qube name for the firewall? No checks, terminate half-way!
  • The syntax of some external command has changed? Terminate half-way!

In all cases it shows a ridiculous debug-level error message and leaves a lot of un-reverted changes in the OS.
There is no information about what should be done, no human-readable error messages, and no proper revert/undo feature in case of errors. So every call of salt I make is a gamble, and in many cases I regret using it at all.

Declarative language without any validation nor compilation

The whole syntax of salt files is terrible. You can never know in advance which fields are possible and which values are allowed. Make a typo in a text string in any field and you will have a half-ruined OS after running the script (because execution stops in the middle).

It would be good to have some kind of script validation or compilation at the language level.
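
As a rough illustration of the kind of pre-run validation meant here, the Python sketch below checks a step description against known fields and values before anything is executed. The step format, `ALLOWED_FIELDS` and `validate` are made up for this example and are not part of Salt or Qubes OS; the label list assumes the eight default Qubes label names.

```python
from typing import Dict, List, Set

# Hypothetical schema for illustration only; these are not Salt's real field lists.
ALLOWED_FIELDS: Dict[str, Set[str]] = {
    "create_qube": {"name", "template", "label"},
}
# Assumption: the default Qubes label names.
ALLOWED_LABELS = {"red", "orange", "yellow", "green", "gray", "blue", "purple", "black"}


def validate(step: Dict[str, str]) -> List[str]:
    """Return the problems found in one step; an empty list means it looks valid."""
    kind = step.get("kind", "")
    if kind not in ALLOWED_FIELDS:
        return [f"unknown step kind: {kind!r}"]
    allowed = ALLOWED_FIELDS[kind]
    problems: List[str] = []
    # Catch typos in field names before anything touches the system.
    for field in step:
        if field != "kind" and field not in allowed:
            problems.append(f"unknown field: {field!r}")
    for field in sorted(allowed - set(step)):
        problems.append(f"missing field: {field!r}")
    label = step.get("label")
    if label is not None and label not in ALLOWED_LABELS:
        problems.append(f"invalid label: {label!r}")
    return problems
```

The point is only that every problem is reported up front, as a short list, and nothing runs until the list is empty.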

Salt is not for Qubes OS

Even when salt does not fail (and it often does for me), it is still designed for a different kind of system: big networks with multiple computers, master-minion relations, etc. Qubes OS uses only part of it. The whole technology looks alien and over-engineered, does not suit Qubes OS, and does not work reliably at all.

Proper alternatives?

Maybe there are good alternatives to salt that work reliably, are well designed, and hopefully allow interactive runs like the port installation scripts in FreeBSD.

Even good python or bash automation scripts would be better in most cases.

Let’s consider one task that needs automation: creating a sys-audio qube. A proper automation solution should:

  • guide the user, telling them what is happening at each step,
  • ask questions, explaining the possible options,
  • revert changes if a step fails,
  • tell the user exactly what failed and what should be checked to solve the issue.
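
To make the list above concrete, here is a minimal Python sketch of the "report each step, revert on failure" part. It is not how Qubes OS or Salt does anything; `run_with_rollback` and the step format are hypothetical names for this example, and each step is just a pair of Python callables (do, undo).

```python
from typing import Callable, List, Tuple

# (description, do, undo); all three are supplied by the script author.
Step = Tuple[str, Callable[[], None], Callable[[], None]]


def run_with_rollback(steps: List[Step], log: Callable[[str], None] = print) -> bool:
    """Run steps in order; on failure, undo the completed steps in reverse order."""
    done: List[Step] = []
    for description, do, undo in steps:
        log(f"[step] {description}")
        try:
            do()
        except Exception as exc:
            # Report a short, human-readable message instead of a traceback.
            log(f"[failed] {description}: {exc}")
            for undo_description, _, undo_step in reversed(done):
                log(f"[revert] {undo_description}")
                undo_step()
            return False
        done.append((description, do, undo))
    log("[ok] all steps completed")
    return True
```

Asking the user questions is left out, and an undo callable that itself fails is not handled; a real script would need to decide what to do in that case.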

And salt is terrible at all of that, showing walls of stack traces for every basic problem, as in the current issue.

I decided to install sys-audio on R4.3 via salt commands: qusal/salt/sys-audio at main · ben-grande/qusal · GitHub. But running the first command, sudo qubesctl top.enable sys-audio, in dom0 produced a huge 3-screen output of python exceptions, probably saying that the sys-audio.top file was not found or something like that. I hope this command did not break anything already. I am afraid to run such commands again.

Why is salt so bad?

I do not know, you tell me.

Each time I touch it as a user, I suffer and risk getting a broken system. EACH TIME.

Can you imagine that running

sudo dnf install AAAA

would also show 100 lines of internal call stack with several unhandled exceptions instead of a proper ERROR text?
And would also leave the system in an unknown state (the user does not know which part of the automation was done before the unhandled exceptions were thrown)?

Can salt be a dead-end approach for Qubes OS?

I use Salt without living in constant fear :slight_smile: but I understand your experience.

I run the commands first on qubes I can afford to delete, to be sure everything works.

Salt has many flaws, and it is not as advanced as you wish, unfortunately. I wonder if Ansible is better?

Well, I don’t even know if salt allows writing advanced scripts that handle errors properly. I’ve never seen it done with salt.

Also, I do not understand the advantages of the declarative form for writing commands that get executed, especially in the case of Qubes OS. It violates the KISS principle by a lot.

That I do not know.

I’m not sure I understand, but I sometimes see scripts here that could be much simpler with salt (e.g. install/update software).

Can you provide examples please?

The simplest case I see is creating a qube with the required settings (memory, color, etc.). Actually, Qubes OS already has a lot of salt scripts like that.

  • With a bash script it’s trivial - a single line starting with qvm-create.
  • With salt it means creating 2 files, each with many lines, and running several commands to execute this simple task.
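
For reference, the "single line" in question would look roughly like what the Python sketch below builds. The helper names are hypothetical, and the flags (`--class`, `--template`, `--label`) are the qvm-create options as I understand them, so check `qvm-create --help` in dom0 before relying on this. The sketch only builds and prints the command and never runs it.

```python
import shlex
from typing import List


def qvm_create_argv(name: str, template: str, label: str) -> List[str]:
    """Build the argument list for creating an AppVM with qvm-create."""
    return [
        "qvm-create",
        "--class", "AppVM",
        "--template", template,
        "--label", label,
        name,
    ]


def as_shell_line(argv: List[str]) -> str:
    """Render the argument list as a shell-quoted line for display."""
    return shlex.join(argv)
```

Calling `as_shell_line(qvm_create_argv("test-vm", "debian-13-xfce", "red"))` gives the one-liner as text; passing the list to `subprocess.run` in dom0 would actually create the qube.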

Unfortunately, replacing salt with any alternative that works with states, like ansible or chef, will come with the same issues.

Ideally we need a declarative operating system that always produces only what we put in the file and removes what is not in that declaration (salt, ansible and the like can’t do that because they know absolutely nothing about the system; they dumbly apply changes from a recipe).

This leaves you with either:

  • NixOS everywhere. I think someone is working on a NixOS dom0 and someone else on a NixOS guest, but this Linux distribution comes with issues such as not integrating well with non-NixOS stuff
  • rebuilding qubes from scratch using salt/ansible, so you always start from a clean state and build on top of it instead of adding changes to a living system. This could work well, except that it would be super slow to recreate templates for every change. This is a bit like what containers do, but they work with base layers so you don’t always have to redo all the steps
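
The "produce only what is declared, remove what is not" idea boils down to comparing the declaration with the actual system. A minimal Python sketch, using plain sets of qube names (the `plan` function and the example names are hypothetical, and a real tool would compare far more than names):

```python
from typing import Dict, List, Set


def plan(declared: Set[str], existing: Set[str]) -> Dict[str, List[str]]:
    """Compute what to create and what to remove so that existing matches declared."""
    return {
        "create": sorted(declared - existing),
        "remove": sorted(existing - declared),
    }
```

A recipe-style tool only ever performs the "create" half; the "remove" half requires knowing everything that exists on the system, which is the part described above as missing.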

Salt was leading software a while ago, and we didn’t have a better solution back then. Nowadays it’s still good software for the right job, but managing qubes like this isn’t a good fit. Still, I can’t imagine any reliable alternative unless Qubes switched to NixOS or something container-based (but then the isolation is not there).

@balko I’m a new-ish user, and had the same-ish experience as you using Salt.

(nb: I’m speaking purely from a personal perspective. I don’t really have an opinion on Salt “overall” - different tools work for different people and use cases.)

I ended up making my own python framework YASA - Yet Another Salt Alternative (incremental searchable backups & enforceable declarative architecture).

It’s by no means perfect - tbh it’s tailored to my own use cases for now.

In case it can help, feel free to fork it and adapt it to your use case.

Note

As it stands, it does NOT do everything you wish Salt did - but since it’s in python, it’s flexible and you can easily add whatever you want.

I use Salt every day, and see it implemented on many Qubes deployments.
It is not perfect by any means, but it’s usable for simple and
complicated use cases.

Salt is just a framework. It already provides a range of exception
handlers that can be implemented, and the Qubes implementation does not,
I think, make good use of them, but they are there to be used.

Salt is written in python, and I fear that YASA is going to reinvent the
wheel, although implementing a small subset of what is available in
Salt will suit some people.

@balko - salt is a framework, and the issues you raise can be addressed
by adopting some good practices. The first is to use requirements -
you should not be running a series of states without a requirements
chain. Adopt defensive programming - make the states conditional and
implement whatever checks you need in the code, using pillar items as
appropriate.

In almost every case where a state implementation fails, even where the
configuration is quite complex, you can simply fix the error and rerun
the state. I invariably do this by adding some additional state as a
requirement, so that future use will not hit the same problem.

Add a conditional to check for space, and extend space as appropriate.
Make this a requirement of the install state.

Use the pillar, or check for the netvm, before implementing a state that
depends on that value.
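
For comparison, the same "check before you act" idea written as a plain Python sketch instead of a Salt state. `failed_checks` and `has_free_space` are hypothetical helper names; only `shutil.disk_usage` is a real standard-library call.

```python
import shutil
from typing import Callable, List, Tuple

# (message shown if the check fails, test function)
Check = Tuple[str, Callable[[], bool]]


def has_free_space(path: str, required_bytes: int) -> bool:
    """Return True if the filesystem holding path has at least required_bytes free."""
    return shutil.disk_usage(path).free >= required_bytes


def failed_checks(checks: List[Check]) -> List[str]:
    """Evaluate every check and return the messages of those that failed."""
    return [message for message, test in checks if not test()]
```

Either way the checks have to be written by whoever writes the automation; the framework only decides where they live.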

Fix that error - it’s your error.

You could write a bash script to configure your qubes and hit
exactly the same issues. You would not, I think, blame bash. Salt
does not do these things automatically but it provides a framework that
allows you to simply build checks and balances into your code.

Yes, it can. You can leverage the requirements framework, particularly
Watches and OnChanges, to roll back changes in case of error. It’s there
if you want to use it, but it’s up to you to implement it. I do this in
complicated deployments, although in most cases in Qubes it’s sufficient
to fix the error or add a conditional, and rerun the state.

I never presume to speak for the Qubes team. When I comment in the Forum I speak for myself.

The amount of free space can change during the process. Preconditions do not solve the issue 100%, so the question is: what happens with salt automation in Qubes OS when something goes wrong? Answer: nothing good, salt just terminates and does not handle errors at all, right?

And what about error handling? If a bash script calls a command with a wrong (outdated) argument, the command returns an error and I am able to process it properly. What happens in the salt scripts of Qubes OS if some flag of some qvm-* command is changed?

Bash has a lot to be blamed for (mostly pitfalls), but with bash proper handling of error situations is possible. I am not sure if it is possible in salt. If it is, then why does Qubes OS have almost none of it?
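
What "processing the error properly" could look like in a script, as a minimal Python sketch: run the external command, check the exit code, and turn a failure into one short line. `run_step` is a hypothetical helper, and taking the last line of stderr as the summary is an assumption about how the called tool reports errors.

```python
import subprocess
from typing import List, Tuple


def run_step(argv: List[str]) -> Tuple[bool, str]:
    """Run a command; return (ok, message) with a one-line message on failure."""
    try:
        result = subprocess.run(argv, capture_output=True, text=True)
    except FileNotFoundError:
        return False, f"command not found: {argv[0]}"
    if result.returncode != 0:
        # Assumption: the last stderr line is the most useful summary.
        lines = result.stderr.strip().splitlines()
        detail = lines[-1] if lines else "no error output"
        return False, f"{argv[0]} exited with code {result.returncode}: {detail}"
    return True, result.stdout.strip()
```

A changed or removed flag in some qvm-* command would then surface as a single message with the exit code, and the caller can decide whether to stop, retry, or revert.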

Is it implemented in Qubes OS? If it’s still mostly NOT used by Qubes OS, then maybe salt was not the right framework for Qubes OS in the first place.

Thanks for the link. Interesting, but I am not ready to commit.

Ansible isn’t much different. It won’t do any of these:

It is marginally better at telling you what is going on at the moment - unlike salt, it reports progress upon completing each task and block.

It also returns better errors - but they still aren’t short and simple. Hey, at least they always seem to point at the actual problem, unlike salt and jinja.

On the other hand IMO salt is better ordered thanks to tops, pillars, and highstate.

I can only bump solene’s reply, this is just the state of configuration management tech at the moment. Things like Nix and Guix might be the future but I’m willing to bet that they have their own set of problems and limitations.

Regarding other languages that can be used for scripting, like bash and python - they are great, but don’t have a built-in mechanism for skipping already configured steps. Both salt and ansible do that (even with scripts and commands, using markers configured by you, like creates). This makes a bash script equivalent to a salt state comparatively long, and probably buggy unless you are very good.
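
To show what has to be hand-rolled in a script, here is a minimal Python sketch of a marker-based "skip if already done" step, loosely in the spirit of a creates marker. `run_once` and the marker layout are hypothetical; this is not how salt or ansible implement it.

```python
from pathlib import Path
from typing import Callable


def run_once(marker: Path, action: Callable[[], None]) -> bool:
    """Run action only if marker does not exist yet; return True if it ran."""
    if marker.exists():
        return False  # already configured, skip this step
    action()
    # The marker is written only after the action succeeded;
    # an exception in action() propagates and leaves no marker behind.
    marker.parent.mkdir(parents=True, exist_ok=True)
    marker.touch()
    return True
```

Every step of a script needs something like this (plus a decision about what counts as "done"), which is where the extra length and the bugs tend to come from.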

It’s up to the user to handle errors as appropriate.

For example:

wget https://nl.mirror.flokinet.net/qubes/iso/Qubes-R4.3.0-x86_64.iso:
  cmd.run:
    - runas: user

will generate a salt error:
Cannot write to 'Qubes-R4.3.0-x86_64.iso' (No space left on device).

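Whereas:

wget https://nl.mirror.flokinet.net/qubes/iso/Qubes-R4.3.0-x86_64.iso:
  cmd.run:
    - runas: user

/home/user/Qubes-R4.3.0-x86_64.iso:
  file.absent:
    - onfail:
      - cmd: wget https://nl.mirror.flokinet.net/qubes/iso/Qubes-R4.3.0-x86_64.iso

Will automatically remove the partial download in the event of a failed
download, because the file.absent state runs only if the wget state fails.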

But you probably don’t want to start the download unless there’s enough
space, so check that free space is available:

check_space:
  disk.status:
    - name: /home/user
    # with absolute: True the limit is in KB, so this is about 8.6 GB
    - minimum: 8600000 KB
    - absolute: True
    - free: True

wget https://nl.mirror.flokinet.net/qubes/iso/Qubes-R4.3.0-x86_64.iso:
  cmd.run:
    - runas: user
    - require:
      - disk: check_space

/home/user/Qubes-R4.3.0-x86_64.iso:
  file.absent:
    - onfail:
      - cmd: wget https://nl.mirror.flokinet.net/qubes/iso/Qubes-R4.3.0-x86_64.iso

If you have a series of states and have used requirements, then you can
use require and onfail to control flow. It is documented here.

Generally you get an error message, as you would at the command line -
output of --help and “unrecognised argument”

I don’t understand - it’s implemented and available in salt. I’d
need to see examples of why you think Qubes has none of it. For
complicated configurations I generally use such a framework. For simple
setup AND for beginner training, I don’t. (If you look at my shaker,
which is used as a training resource, most of the states are deliberately
very simple. You have to learn to walk before you can run.)

install-passwordless-root:
  pkg.installed:
    - pkgs:
      - qubes-core-agent-passwordless-root

With bash, you should check if you should use dnf or apt. (And yes, my original statement was excessive :slight_smile: )

You only have to create one file like the following. Setting more things makes salt more useful:

test-vm:
  qvm.vm:
    - present: 
      - template: debian-13-xfce
      - label: red
    - prefs:
      - template: debian-13-xfce
      - label: red
    - tags:
      - add:
        - my-tag
    - features:
      - enable:
        - expert-mode 
    - service:
      - enable:
        - shutdown-idle
    - firewall:
      - set:
        - 'action=drop'

And with bash you still need to check the exit code, etc.

Maybe I misjudge salt, but the way it is used in Qubes OS is frustrating.
Mostly it looks to me like commands written in a declarative way, spreading 10 commands over 7 files, with no error handling.

Simple bash scripts would be more reliable, not flexible, and simpler to fix after a failure.

If you want to give salt another chance, maybe you can create a topic about something you want to achieve with Salt, how you would do that in bash and see if we suggest something better.

Salt is a great tool for me because I’m not good at bash (and not very good at Python).

Last time I was hoping to create a semi-official version of sys-audio with salt, based on the awesome guide from @neowutran.

But salt failed me again, so I decided to follow the guide manually, step by step, controlling the situation much better, and succeeded.

I’m certain it’s possible to write a great automation script for this task that asks questions, offers options to the user, guides them and properly reports if something failed. I have never seen any of that from salt in Qubes OS.

Maybe someday, but probably not soon.

It is up to you to choose how to structure your config; separating it is just a convention. A minimal setup that uses all the unique salt features is 4 files - state top, state, pillar top, and pillar - but this is more than most people need. It is possible to put all your salt configuration in one file.

Even if you absolutely despise the way salt behaves (which is completely understandable), it can still be useful as a glorified script delivery mechanism. That avoids the abysmal salt errors (a single cmd.script is very unlikely to fail), you reap the benefits of the scripting language you like, you can easily orchestrate which machines run your script from the top file, and you can template said script using jinja.

I do not despise it, it’s more like:

  • I am a bit frightened to run it (especially if alternatives are possible), because with the additional complexity of the approach (which I am not sure is necessary) it would take me quite some time to find out whether the system was affected after a failed salt run.
  • I genuinely do not understand the reason to adopt it in Qubes OS nowadays. I see flaws that I consider significant, but do not see major advantages, considering that the rest of the system is written in python and bash.

All is perfectly understandable.
I see huge advantages in using salt, particularly given the range of
available modules, and the ease with which they can be applied. I do not
think that there is great complexity - I’ve run a number of training
sessions and almost everyone is able to write and use state files. In
many cases there need be no complexity - the states are structured, and
relations between them can be clearly stated.

Salt itself is written in python, so it’s a natural fit with Qubes.

Have you looked at the existing salt states for sys-audio? We could
start there, and use them as base for further discussion. Perhaps we
might be able to improve them also.

Besides disagreeing with all your arguments that Salt is bad (I know of Salt’s defects, but your arguments are just misconceptions), which people have already pointed out in this thread multiple times, I won’t waste my time debating it.

You don’t get DevOps frameworks and tools, and you propose shell scripts because you are accustomed to them; that is your comfort zone. It doesn’t make the DevOps tools bad, though.

I will, however, point out two important things:

  • The installation section explicitly says that R4.2 is supported. I haven’t had time to update my formulas to R4.3. I won’t promise when an update will be done; I have been focusing on other things, still related to Qubes.
  • I’ve seen you post this 3 times, twice on the forum and once on github. You know what I haven’t seen? The exception. Yes, the one thing that would allow anyone to understand your issue. Sharing it won’t oblige me to fix your issue, it is open source after all, but it would allow others to help if they have the time and will.

Instead of focusing on “things are broken, replace salt” (Ansible is a contender, but not for most of the reasons you listed; it is mostly because packaging Salt is a pain, especially for Fedora, and the initial plan is to support both rather than replace Salt), focus on getting things solved.