Udev rules not working as before

Hi,

This is the procedure I have been using to enable TRIM on USB flash drives that pretend they don’t support it (while they actually do):

https://www.jeffgeerling.com/blog/2020/enabling-trim-on-external-ssd-on-raspberry-pi

This has worked fine for years with zero issues on bare-metal Linux as well as on Qubes OS, even when I attach only the individual partition (sda1) to the particular qube.

However, recently, the following sequence of events happened.

  1. Attaching the USB drive partition (sda1) to a disposable qube (which has the relevant udev rules in the upstream template), then mounting the partition, results in:
user@ddd3226:~ > sudo fstrim -v ~/usb
fstrim: /home/user/usb: the discard operation is not supported

After unmounting, re-attaching and mounting again, TRIM worked fine. Not as convenient as before but still working.

  2. Recently (last week or so), it stopped working completely. The only way to TRIM now is to do it:
  • either directly in sys-usb; or
  • if I attach the whole USB device to the disposable qube

then follow the manual steps from the linked guide, because the device is not set to “unmap” mode automatically (as it used to be, and as the udev rule dictates).

I assume the udev rule simply does not activate on attaching the partition, as it used to.

What has changed in Qubes OS that prevents the udev rule triggering correctly? How can I have this work again as before?

If you’re attaching across VMs as a block device (not as a USB device), the udev rule has to be active in the source VM you’re attaching from (e.g. sys-usb), not in the destination VM you’re attaching to.

Both sys-usb and the target qube are based on the same template. Even in sys-usb it is not triggered correctly, so the manual procedure has to be followed.

FWIW, the udev rule is copied at boot like this by /rw/config/rc.local:

cp /usr/local/etc/udev/rules.d/10-trim.rules /etc/udev/rules.d/

It has always been like that for years and has worked fine. I have not modified any config regarding that.

Is there anything else that might have changed in Qubes or that I need to do?

You could just keep it in /usr/local/lib/udev/rules.d/

Either way though that’s on the ‘private’ volume of sys-usb, so maybe it’s loaded too late if your USB device is already plugged during sys-usb startup? You could put it in /usr/lib/udev/rules.d/ (without local/) of sys-usb’s (disposable template’s) TemplateVM to load it earlier. Or try udevadm control --reload-rules && udevadm trigger in rc.local after the .rules file has become available in one of the udev configuration directories.
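
A minimal sketch of how the copy and the reload could be combined in rc.local, along the lines suggested above. The function names install_rule and reload_udev are made up for illustration; the paths in the usage comment are the ones mentioned in this thread. It has to run as root in sys-usb, and it assumes the rule also matches "change" events, because that is the event type udevadm trigger sends by default.

```shell
# Copy a rules file into a udev rules directory.
# $1: source .rules file, $2: destination directory.
install_rule() {
    cp -- "$1" "$2/"
}

# Ask the running udev daemon to re-read its rules, then replay events so
# that devices plugged in before the rule existed are processed again.
reload_udev() {
    udevadm control --reload-rules && udevadm trigger
}

# Possible use in /rw/config/rc.local of sys-usb (as root):
#   install_rule /usr/local/etc/udev/rules.d/10-trim.rules /etc/udev/rules.d \
#       && reload_udev
```

Keeping the two steps separate makes it easy to call reload_udev on its own after editing the rule by hand.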

Either way though that’s on the ‘private’ volume of sys-usb, so maybe it’s loaded too late if your USB device is already plugged during sys-usb startup?

How can I check that?

BTW, I was just about to try the different locations you suggested. Here is what happened:

  1. I attached sda1 to the disposable and mounted it
  2. Ran fstrim:
user@ddd1532:~ > sudo fstrim -v ~/usb
/home/user/usb: 1.9 GiB (1987969024 bytes) trimmed
  3. umount ~/usb
  4. Detach sda1 from the qube
  5. Start new disposable, repeat step 1
  6. Try to repeat step 2:
root@ddd7136:~ # udevadm control --reload-rules && udevadm trigger
root@ddd7136:~ # fstrim -v /home/user/usb/
fstrim: /home/user/usb/: the discard operation is not supported

So, obviously some inconsistency although sys-net was not restarted.

How should I approach this at all?

Everything udev related including udevadm control --reload-rules && udevadm trigger must be done in sys-usb, not in the (Disposable)VM that you’re later attaching the block device to.

There’s probably some way to make udev log what rules it’s applying to what events.

But also:

Although it is interesting that detaching and reattaching the block device has this effect. udevadm monitor in sys-usb during this might show something relevant. And you can look inside the provisioning_mode file somewhere in /sys to see how it changes.

So… I have spent quite some time looking into this and it seems to me file location is not the issue.

Procedure (in sys-usb):

Set udev_log=debug in /etc/udev/udev.conf.
Manually copy the rules to /etc/udev/rules.d/10-trim.rules, as per the guide.
Restart udev and systemd-udevd services.
In terminal 1 run journalctl -f
In terminal 2 run watch 'find /sys/ -name provisioning_mode -exec cat "{}" \;'
Keep plugging and unplugging the same USB device and watch what is happening in both terminals.

Result:

Sometimes terminal 2 shows ‘unmap’, sometimes ‘full’.

Terminal 1 always shows a journal line that ‘unmap’ is set to the relevant device file, even when terminal 2 reports ‘full’.

What do you think?
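
One way to narrow down when the value flips is to log changes with a timestamp, so they can be lined up against the journal lines from terminal 1. The following is only a sketch: log_mode_changes is a made-up name, and the sysfs root is a parameter mainly so the function can be tried on a scratch directory first.

```shell
# Poll every provisioning_mode file below a sysfs root and print a
# timestamped block whenever the set of path:value pairs changes.
# $1: sysfs root (default /sys)
# $2: seconds between polls (default 1)
# $3: number of polls (default 0 = run until interrupted)
log_mode_changes() {
    root=${1:-/sys}
    interval=${2:-1}
    polls=${3:-0}
    previous=
    n=0
    while [ "$polls" -eq 0 ] || [ "$n" -lt "$polls" ]; do
        # grep -H prints each value prefixed with the path of its file
        current=$(find "$root" -name provisioning_mode \
            -exec grep -H . {} + 2>/dev/null) || true
        if [ "$current" != "$previous" ]; then
            date '+%H:%M:%S'
            printf '%s\n' "${current:-none}"
            previous=$current
        fi
        n=$((n + 1))
        sleep "$interval"
    done
}

# Possible use in sys-usb (as root), while plugging and unplugging the drive:
#   log_mode_changes /sys 1
```

If the journal reports ‘unmap’ at one time and this log shows ‘full’ a moment later, something rewrote the attribute after the rule ran; if the log never shows ‘unmap’ at all, the write did not reach this file.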

I tried to reproduce this, but unfortunately neither one of my two ASMedia SATA-to-USB adapters seems to really support TRIM: After writing unmap into provisioning_mode, blkdiscard “succeeded” but it just reset the device and hexdump -C showed that the data was actually still present on the SSD.

What is the right way to approach such a problem?

I have never seen it on my Linux systems, only on Qubes. I wonder if it might be some kernel-specific Qubes thing or something else.

Have you checked if your drive really supports TRIM, by writing unmap to provisioning_mode in sys-usb, using blkdiscard -f on a nonempty partition whose data you don’t mind throwing away, and comparing the raw data in the partition before and after blkdiscard? (If e.g. hexdump -C shows all zeros afterwards, TRIM definitely worked, but it could also be random-ish garbage instead of zeros on some drives.)

If it doesn’t really work, configuring udev is moot.

If it does work: Maybe making the TRIM request like this using blkdiscard will already cause the udev configuration problem? If that doesn’t cause it, try mounting and fstrim in sys-usb, without ever attaching the device to another VM. Or conversely, attaching/detaching/reattaching without ever mounting/fstrim. Then maybe without reattaching too.

(After each test, I would unplug the drive, restart sys-usb, and plug it in again.)

Have you checked if your drive really supports TRIM, using (in sys-usb) blkdiscard -f and hexdump -C on a partition whose data you don’t mind throwing away?

It says it will destroy the data and I don’t want that:

root@ddd8794:~ # blkdiscard -v /dev/xvdi
blkdiscard: /dev/xvdi contains existing file system (vfat).
blkdiscard: This is destructive operation, data will be lost! Use the -f option to override.

As I said, following the manual procedure from the guide allows successful fstrim. Isn’t that a confirmation?

If it does work: Maybe making the TRIM request like this using blkdiscard will already cause the udev configuration problem? If that doesn’t cause it, try mounting and fstrim in sys-usb, without ever attaching the device to another VM. Or conversely, attaching/detaching/reattaching without ever mounting/fstrim. Then maybe without reattaching too.

Well, that is pretty much what I currently do - replug/reattach etc. Not quite convenient but there it is.

Of course, I could also write a short script that repeats the manual steps, which won’t need udev rules at all.
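
Such a script could be as small as the sketch below. It covers only the provisioning_mode write discussed in this thread; any further steps from the linked guide are not reproduced here. The name set_unmap and the optional sysfs root argument are made up for illustration.

```shell
# Write "unmap" into the provisioning_mode file(s) of one disk and print
# what each file contains afterwards.
# $1: kernel name of the whole disk, e.g. sda (not a partition)
# $2: sysfs root (default /sys)
# Returns non-zero if no provisioning_mode file was found or a write failed.
set_unmap() {
    disk=$1
    root=${2:-/sys}
    found=1
    for f in "$root/block/$disk/device/scsi_disk"/*/provisioning_mode; do
        # An unmatched glob stays literal, so skip paths that do not exist
        [ -e "$f" ] || continue
        echo unmap > "$f" || return 1
        printf '%s: %s\n' "$f" "$(cat "$f")"
        found=0
    done
    return "$found"
}

# Possible use in sys-usb (as root):
#   set_unmap sda
```

Printing the value read back from the file makes it visible right away when the kernel did not keep the new setting.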

However, none of that solves the mystery of inconsistency which I am really after. My speculation: Considering the ‘unmap’ always shows up in the debug log, as mentioned, there is obviously something that irregularly cancels that somehow after that and that seems Qubes-specific, as it doesn’t happen on a regular Linux system. I wonder what that might be. Any ideas?

Eh, that could be down to different versions (including patches) of the kernel or systemd (providing udev). Or something else I don’t have on my radar. So my approach would be to first narrow down what parts really, truly work at the moment, and then to zoom into where it fails.

This should happen in sys-usb, without attaching to another VM.

Can you shrink the filesystem and the partition, then create a new one with some dummy data?

Not if the “success” is fake:

In which case it might all just be a change in how errors are reported.

So my approach would be to first narrow down what parts really, truly work at the moment, and then to zoom into where it fails.

Alright.

This should happen in sys-usb, without attaching to another VM

My concern is that blkdiscard operates on the whole device, not on a partition. I really don’t want to damage any of the stored data.

Can you shrink the filesystem and the partition, then create a new one with some dummy data?

I have deliberately left some unpartitioned space for overprovisioning, so I can experiment inside that area. Would that be OK?

Please let me know how to proceed and I will report back.

You can (and should!) pass it the dummy partition device, e.g. /dev/sda2 instead of /dev/sda.

Probably still a good idea to have a backup :wink:

Sounds perfect.

Sounds perfect.

Great.
So, I created a 2GB ext4 partition in the empty space.
Please let me know how to repeat your test.

Without the udev rules file in sys-usb, and assuming the dummy partition is /dev/sda2, something like

# echo unmap >/sys/block/sda/device/scsi_disk/*/provisioning_mode
# cat /dev/random >/dev/sda2
# sha256sum        /dev/sda2 >checksum.txt
# blkdiscard -f    /dev/sda2
# hexdump -C       /dev/sda2

should show all zeros in the hex output. (If not, the checksum can be compared to the one before TRIM to see if it changed at all.) Also, provisioning_mode should still contain unmap, so that another TRIM will work again.
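
Reading through a hexdump of a whole partition by eye is tedious, so the two checks described above can be wrapped in small helpers. This is a sketch only: digest and is_all_zero are made-up names, and /dev/sda2 in the usage comment is the dummy partition assumed in this post.

```shell
# Print only the SHA-256 digest of a file or block device.
digest() {
    sha256sum < "$1" | cut -d ' ' -f 1
}

# Succeed if the target contains nothing but zero bytes.
# tr drops all NUL bytes; if even one byte is left over, the data is not
# all zeros. head stops the pipeline at the first such byte.
is_all_zero() {
    [ "$(tr -d '\000' < "$1" | head -c 1 | wc -c)" -eq 0 ]
}

# Possible use in sys-usb (as root), on a partition with throwaway data:
#   before=$(digest /dev/sda2)
#   blkdiscard -f /dev/sda2
#   after=$(digest /dev/sda2)
#   is_all_zero /dev/sda2 && echo "reads back as zeros"
#   [ "$before" = "$after" ] && echo "data unchanged"
```

A changed digest without all zeros would match the “random-ish garbage” case mentioned earlier, while an unchanged digest means the discard had no visible effect.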

Here is my test (based on discussion so far):

root@sys-usb:~ # find /sys/ -name provisioning_mode -exec cat "{}" \;
unmap
root@sys-usb:~ # mount /dev/sda3 /mnt
root@sys-usb:~ # echo XXXYYYZZZ > /mnt/abc.txt
root@sys-usb:~ # hexdump -C /dev/sda3 > /tmp/before
root@sys-usb:~ # rm /mnt/abc.txt 
rm: remove regular file '/mnt/abc.txt'? y
root@sys-usb:~ # fstrim -v /mnt
/mnt: 7.3 MiB (7620608 bytes) trimmed
root@sys-usb:~ # hexdump -C /dev/sda3 > /tmp/after
root@sys-usb:~ # grep XXXYYYZZZ /tmp/before 
00880400  58 58 58 59 59 59 5a 5a  5a 0a 00 00 00 00 00 00  |XXXYYYZZZ.......|
root@sys-usb:~ # grep XXXYYYZZZ /tmp/after
00880400  58 58 58 59 59 59 5a 5a  5a 0a 00 00 00 00 00 00  |XXXYYYZZZ.......|
root@sys-usb:~ # umount /mnt
root@sys-usb:~ # 
root@sys-usb:~ # hexdump -C /dev/sda3 > /tmp/after2
root@sys-usb:~ # grep XXXYYYZZZ /tmp/after2
00880400  58 58 58 59 59 59 5a 5a  5a 0a 00 00 00 00 00 00  |XXXYYYZZZ.......|
root@sys-usb:~ # blkdiscard -f /dev/sda3
blkdiscard: Operation forced, data will be lost!
root@sys-usb:~ # hexdump -C /dev/sda3 > /tmp/after_blkdiscard
root@sys-usb:~ # grep XXXYYYZZZ /tmp/after_blkdiscard 
00880400  58 58 58 59 59 59 5a 5a  5a 0a 00 00 00 00 00 00  |XXXYYYZZZ.......|
root@sys-usb:~ #

IIUC, this means the data is still exposed by the device firmware, i.e. it is not truly trimmed, so overprovisioning is the only way to get some wear reduction. I will try another device and write back.

That still doesn’t demystify the udev part though.

should show all zeros in the hex output.

It doesn’t.

(If not, the checksum can be compared to the one before TRIM to see if it changed at all.)

No change in checksum either.

Since you remember it previously at least giving the impression that it worked, you could now retry the blkdiscard stuff with an old enough template and VM kernel version for sys-usb.

Maybe not, but it’s hard to debug automation when it doesn’t even work manually.