Salt: file.recurse works for dom0 but not for another vm

The state file.recurse works for dom0 but not for another vm.

Here are the steps to reproduce the problem:

  • Execute the following commands as root in dom0:
# cd /srv/salt
# mkdir -p test/files/a_dir
# cd test
# touch files/a_dir/a.txt
# touch files/a_dir/b.txt
# touch init.sls
  • Write the following code in /srv/salt/test/init.sls:
recurse_dir:
  file.recurse:
    - name: /tmp/a_dir
    - source: salt://test/files/a_dir
  • Execute:
# qubesctl --show-output state.sls test
local:
----------
          ID: recurse_dir
    Function: file.recurse
        Name: /tmp/a_dir
      Result: True
     Comment: Recursively updated /tmp/a_dir
     Started: 06:05:34.635857
    Duration: 312.937 ms
     Changes:   
              ----------
              /tmp/a_dir/a.txt:
                  ----------
                  diff:
                      New file
                  mode:
                      0644
              /tmp/a_dir/b.txt:
                  ----------
                  diff:
                      New file
                  mode:
                      0644

Summary for local
------------
Succeeded: 1 (changed=1)
Failed:    0
------------
Total states run:     1
Total run time: 312.937 ms
  • Result: On dom0, it works.

Now let’s try on an appvm:

  • Create an appvm named salty
  • Execute:
# qubesctl --show-output --skip-dom0 --targets salty state.sls test
salty:
  ----------
            ID: recurse_dir
      Function: file.recurse
          Name: /tmp/a_dir
        Result: False
       Comment: Recurse failed: none of the specified sources were found
       Started: 06:08:49.387395
      Duration: 32.777 ms
       Changes:   
  
  Summary for salty
  ------------
  Succeeded: 0
  Failed:    1
  ------------
  Total states run:     1
  Total run time:  32.777 ms
  • Result: On salty, it does not work.

Questions:

  • Why does file.recurse work in dom0 but not in another vm?
  • How can I get file.recurse to work in a vm other than dom0?

You simply haven’t given enough information for anyone to help.

First, file.recurse works properly in both 4.1 and 4.2, in dom0, and in
templates/qubes.

So, to help with your issue you should say, at a minimum, what version
of Qubes you are using, what template your management qube is using,
and what template is used by the qube where it doesn’t work.
If you can test against other templates that would be useful.

Here is more information:

  • Qubes OS version: 4.2
  • fedora-38-xfce:
    • Is a template vm
    • Is a fresh install
  • default-mgmt-dvm:
    • template: fedora-38-xfce
  • salty:
    • Is an appvm
    • Newly created qube with the GUI: Qubes Manager > New qube
    • template: fedora-38-xfce

I tried to debug Salt. The problem seems to occur in roughly 19 out of 20 runs. states/file.py: recurse() relies on modules/file.py: source_list() to check the source URL. When source_list() tries to get the list of directories using modules/cp.py: list_master_dirs(), it gets back an empty list most of the time when called from recurse().
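
The lookup described above can be modelled in a few lines. This is a simplified sketch, not Salt's actual code: find_source and its arguments are hypothetical names, the two lists stand in for what cp.list_master and cp.list_master_dirs return, and the "base" saltenv is an assumption. It is only meant to show why empty listings lead to the "none of the specified sources were found" failure even though the files exist in dom0.

```python
def find_source(sources, master_files, master_dirs, saltenv="base"):
    """Return the first salt:// URL that appears in the file server listings.

    Hypothetical, simplified stand-in for the check done by Salt's
    file.source_list; returns None when nothing matches.
    """
    prefix = "salt://"
    for url in sources:
        if not url.startswith(prefix):
            continue  # this sketch only models salt:// sources
        path = url[len(prefix):].rstrip("/")
        if (path, saltenv) in master_files or (path, saltenv) in master_dirs:
            return url
    return None


sources = ["salt://test/files/a_dir"]

# Listings as they might look when the file server answers correctly
# (saltenv "base" is an assumption).
files = [("test/init.sls", "base"),
         ("test/files/a_dir/a.txt", "base"),
         ("test/files/a_dir/b.txt", "base")]
dirs = [("test", "base"), ("test/files", "base"), ("test/files/a_dir", "base")]

found_ok = find_source(sources, files, dirs)

# Empty listings, as seen in the failing runs: nothing can match.
found_empty = find_source(sources, [], [])

print(found_ok)
print(found_empty)
```

With populated listings the directory is found; with two empty lists no source can ever match, whatever is actually in /srv/salt.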

Here is how I debugged/tested:

  • Recreate the setup from the first message
  • Start fedora-38-xfce
  • Patch modules/file.py: source_list() in fedora-38-xfce:
    (Salt version: 3006.5-1.fc38)
--- file.py_	2023-12-13 19:00:00.000000000 -0500
+++ file.py	2024-01-29 00:25:39.397320812 -0500
@@ -4428,6 +4428,9 @@
     if isinstance(source, list):
         mfiles = [(f, saltenv) for f in __salt__["cp.list_master"](saltenv)]
         mdirs = [(d, saltenv) for d in __salt__["cp.list_master_dirs"](saltenv)]
+        with open('/tmp/files.txt', 'w') as ff:
+            ff.write(repr(mfiles) + '\n')
+            ff.write(repr(mdirs) + '\n')
         for single in source:
             if isinstance(single, dict):
                 single = next(iter(single))
  • Shutdown fedora-38-xfce
  • Start salty
  • Execute in dom0: qubesctl --show-output --skip-dom0 --targets salty state.sls test
  • Execute in salty: cat /tmp/files.txt
  • Result: /tmp/files.txt when it does not work:
[]
[]
  • Result: /tmp/files.txt when it works:
[(<path>, <env>),...]
[(<path>, <env>),...]
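
Because the patch writes each list with repr(), the dump can be read back programmatically instead of by eye. The following is a minimal sketch, assuming the two-line format produced by the patch above and that every entry is a plain (path, saltenv) tuple of strings; load_dump and has_dir are hypothetical helper names, and the sample data (including the "base" saltenv) is made up for the demo.

```python
import ast
import os
import tempfile


def load_dump(path):
    """Parse the two repr() lines written by the patch into (mfiles, mdirs)."""
    with open(path) as fh:
        lines = [line.strip() for line in fh if line.strip()]
    mfiles = ast.literal_eval(lines[0])
    mdirs = ast.literal_eval(lines[1])
    return mfiles, mdirs


def has_dir(mdirs, wanted):
    """True if the directory path appears in the listing, in any saltenv."""
    return any(path == wanted for path, _env in mdirs)


# Demo: write a sample dump in the same format, then read it back.
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "files.txt")
    with open(sample, "w") as fh:
        fh.write(repr([("test/init.sls", "base")]) + "\n")
        fh.write(repr([("test/files/a_dir", "base")]) + "\n")
    mfiles, mdirs = load_dump(sample)

print(has_dir(mdirs, "test/files/a_dir"))
```

In the qube itself, the same helpers could be pointed at /tmp/files.txt after each run to count how often the listing comes back empty.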

This problem is over my head and I don’t know what to do.

It seems the problem occurs specifically in salt 3006.5. I downgraded to 3006.4 for the template vm used by the management qube and it works.

There is also a bug report here.
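
To check several templates for the affected version, a small helper can pull the Salt version out of a package string. This is only a sketch: salt_version and is_affected are hypothetical names, the input is assumed to look like the package strings quoted in this thread, and only 3006.5 is treated as affected because that is the only version reported as broken here.

```python
import re

# Assumption: only 3006.5 is treated as affected, because it is the only
# version this thread reports as broken (3006.4 is reported to work).
AFFECTED = {"3006.5"}


def salt_version(package_string):
    """Extract the upstream version from a string like 'salt-3006.5-1.fc38'."""
    match = re.search(r"(\d+\.\d+)-\d+", package_string)
    return match.group(1) if match else None


def is_affected(package_string):
    """True if the package string carries a version listed in AFFECTED."""
    return salt_version(package_string) in AFFECTED


print(is_affected("salt-3006.5-1.fc38"))
print(is_affected("salt-ssh-3006.4-2.fc38"))
```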


Nice catch - so this will affect anyone who picks up the update to
3006.5. As it’s already fixed upstream we just have to wait for the
reverted version to be packaged in the fedora archives.
It isn’t yet in the testing archive, so if anyone needs this
functionality, then the reversion to 3006.4 is best.
In Fedora based systems you can use:

dnf install salt-ssh-3006.4-2.fc38

This specifies the version you want to install.

In Debian based systems, the update isn’t yet available, so this isn’t an
issue - it could be blocked in testing until 3006.6 becomes available. In
any case you can stop the package from being automatically updated
by issuing:
apt-mark hold salt-ssh
Don’t forget to remove the hold (apt-mark unhold salt-ssh) once the issue
is resolved, or the package will never be updated.
