Heavy disk IO freezes Qubes completely

When I start a btrfs scrub on multiple HDDs (SATA (dom0) + LUKS unlocked in an AppVM), Qubes freezes completely. Maybe the mouse moves with extreme delay, but that's it.

I am on AMD and have 13 drives; they are all btrfs, but separate filesystems (no RAID or anything).

I manage the disks manually with bash scripts (btrbk etc., which works really well).
But when I try to scrub more than one drive at a time, Qubes freezes completely.

This really bothers me; other disk operations seem to work fine.

Does anybody have any tips on how to improve this? Qubes sometimes freezes in regular use as well, but this is unfortunately the only operation where I can reliably replicate it :frowning:

Same issue on kernel-latest and 6.12.

Giving every VM where the block devices are attached more CPU + RAM does not help.

The scrub operation is really I/O intensive; I'm not sure it can be done while keeping a working system.

However, btrfs allows you to pause the scrub, so you could scrub at night and pause during the day :slight_smile:
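
Roughly like this (a sketch; the mount point is a placeholder):

btrfs scrub start /mnt/disk1     # kick off in the evening
btrfs scrub cancel /mnt/disk1    # "pause" in the morning; progress is checkpointed
btrfs scrub resume /mnt/disk1    # continue from the checkpoint the next night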

Does it recover once the operations complete?

What kind of disks are these? Spinning disks? NVME?

Some half-baked ideas about things you can look into:

  • If NVME, check modinfo nvme and modinfo nvme_core and review the optional parameters controlling things like the number of queues as well as the operating mode (e.g. poll mode)
  • If NVME, tweak the parameters mentioned in the previous item at runtime via sysfs, if you find some that help but don't want them active all the time
  • Look into I/O priority for these scrub tasks. I'm guessing scrub I/O operations take place in kernel threads; I'm not knowledgeable about how exactly you can tune them, probably some LLM knows
  • Look into whether the most appropriate interrupt type is being used. I'm not an expert in low-level OS / interrupts, but as I understand it there are "old" style interrupts, MSI, and MSI-X. Most NVME support MSI; you could make sure that's being used. Try cat /proc/interrupts
  • Look through xl dmesg and dmesg (journalctl -b I think?) to see if there are any messages at boot about failing to allocate / set up anything related to these drives
  • Do the same during or after the scrub operation looking for anything obviously wrong
  • Try using the perf read workqueue and/or perf write workqueue LUKS2 flags (see the sketch after this list). Check here and here for some info (control-f + perf on the pages)
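
For that last item, a minimal sketch, assuming cryptsetup >= 2.3; the mapping name ("luks-data") and device are placeholders:

# Apply to an already-open LUKS2 mapping:
sudo cryptsetup refresh luks-data --perf-no_read_workqueue --perf-no_write_workqueue

# Or store the flags persistently in the LUKS2 header when opening:
sudo cryptsetup open /dev/sdx1 luks-data --perf-no_read_workqueue --perf-no_write_workqueue --persistent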

I know these aren't solutions, but at least you can investigate a little. Happy to help diagnose anything you paste here; I spent a good deal of time working on I/O performance on my system, maybe I learned something helpful

I should acknowledge @solene's comment: these operations are inherently going to hammer the drives with I/O. Maybe reducing the priority of the I/O threads would soften that (again, I don't know how to do this for btrfs kernel threads). But rather than degrading the system less for longer, her idea (doing them off hours), or staggering them, is the better solution

this stuff is a game changer with regard to performance

Thank you, I will definitely test your suggestions!

What kind of disks are these? Spinning disks? NVME?

They are all HDDs (they should all be CMR?).
My motherboard only has 4 SATA ports, so I have an HBA with 8 SATA ports and a cheap Chinese x1 card with an ASMedia chip.

Does it recover once the operations complete?

I always do a hard shutdown because I can no longer hear the disks working properly (some heads seem to make weird movements, though)

Is setting the target VM to HVM or PVH better? (luks-dvm is only for unlocking; the unlocked block device is then attached to another VM)

Also, are there any BIOS settings that could help? I am on AM4 and I think I have everything virtualization-related enabled.

btrfsmaintenance/btrfs-scrub.sh at master · kdave/btrfsmaintenance · GitHub has an ioprio function. Not sure if this could help or how to apply it to my case (I don't use these scripts…)

see the doc for the -c and -n flags

https://btrfs.readthedocs.io/en/latest/btrfs-scrub.html

I have no idea how to use them, sorry.

Okay, these are not that difficult, but even with -c 3 -n 7 (lowest priority possible) the system still freezes with 10 simultaneous scrubs. (Maybe I have to use a different scheduler?)
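
For reference, per filesystem the invocation looks roughly like this (the mount point is a placeholder):

btrfs scrub start -c 3 -n 7 /mnt/disk1    # -c 3 = idle I/O class
btrfs scrub status /mnt/disk1             # check progress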

cat /sys/block/xvdi/queue/scheduler gives me:
[none] mq-deadline kyber bfq

They all start, but after a while (mostly when I type or open a program) everything freezes

Using limit -l 100M definitely helps (the system stays up for much longer before it freezes).

Hah, TBH I am not too happy about all these hard shutdowns + emergency head retracts

In which VM can I apply a limit to the scheduler? dom0 is currently the only one that applies a scheduler to the device (bfq); should I change that?

Okay, yeah: setting the target VM to HVM with the Qubes kernel, setting the scheduler for all the devices to bfq, using -c 3 -n 7, and limiting the scrub to 100M definitely helps.
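
Setting the scheduler inside the VM looks roughly like this (a sketch; the device names are placeholders, check lsblk for yours):

for dev in xvdi xvdj xvdk; do
    echo bfq | sudo tee /sys/block/$dev/queue/scheduler
done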

I am not sure, though, if I did this correctly.
And the system still freezes :frowning:

Will have to try applying this now:

  • Try using the perf read workqueue and/or perf write workqueue LUKS2 flags. Check here and here for some info (control-f + perf on the pages)

Ah, is this not something I should set for HDDs…

I think the best solution is probably to scrub them when the system is idle, and probably not all of them at once, as @solene suggested

I'll be interested to hear if you find any other solutions

Last note: I noticed this on the link @solene sent. I think you would want a value lower than 100MB, though

Since linux 5.14 it’s possible to set the per-device bandwidth limits in a BTRFS-specific way using files /sys/fs/btrfs/FSID/devinfo/DEVID/scrub_speed_max. This setting is not persistent, lasts until the filesystem is unmounted. Currently set limits can be displayed by command btrfs scrub limit.
$ echo 100m > /sys/fs/btrfs/9b5fd16e-1b64-4f9b-904a-74e74c0bbadc/devinfo/1/scrub_speed_max
$ btrfs scrub limit /
UUID: 9b5fd16e-1b64-4f9b-904a-74e74c0bbadc
Id     Limit Path
-- --------- --------
 1 100.00MiB /dev/sdx
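
If it helps, a sketch that applies a cap to every btrfs filesystem visible in the VM via that sysfs knob (the 50m value is just an example, lower than 100M per the note above):

for f in /sys/fs/btrfs/*/devinfo/*/scrub_speed_max; do
    echo 50m > "$f"
done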

And I’m still pretty confident just doing them serially will be better. The issue seems like it might be related to the HBAs more than anything else
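
Serially would be something like this sketch (mount points are placeholders; -B makes each scrub block until it finishes):

for mnt in /mnt/disk1 /mnt/disk2 /mnt/disk3; do
    btrfs scrub start -B -c 3 -n 7 "$mnt"
done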

I can't tell whether you're getting a full lockup or crash, or just excessive contention for the mouse/keyboard. For me, a freeze can unfreeze…

Two friends for working this out, to run in separate terminal windows before launching the scrub:

while sleep 5 ; do date ; done

sleep 120 ; btrfs scrub cancel <the devices>

…then see if life returns after 2 minutes.

If not, is there a hardware or PSU problem? I normally trust Qubes more than I trust my hardware!

Related: I see you mentioned in your first message that Qubes sometimes freezes outside of scrubs as well.

Are these full lockups in regular use?

Everywhere I look, I see everyone say that you should do btrfs scrubs one by one.
Parallel scrubbing brings down even the most powerful machines.

edit: try disabling quotas on every disk (it's enabled by default)
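
Roughly (a sketch; the mount point is a placeholder):

btrfs qgroup show /mnt/disk1     # shows qgroups (fails if quotas are off)
btrfs quota disable /mnt/disk1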

Pretty sure it's a full system lockup. It sometimes happens during regular use as well (most of the time when there is load and I launch something, or use something heavy like a game, but sometimes also seemingly at random; I should have more than enough CPU time, though).

Will test with the auto cancel after 2m.

5 simultaneous scrubs have been running for the last 10 hours with the 100M limit

I have modified my script so that 5 run at a time and they start/stop when I sleep.
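
The gist of the script is something like this sketch (mount points are placeholders; xargs keeps 5 scrubs going at a time):

printf '%s\n' /mnt/disk{1..13} | xargs -P5 -I{} btrfs scrub start -B -c 3 -n 7 {}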

Let’s see how stable this is.

EDIT:
As soon as I opened Qube Manager, it froze :frowning:
I will play around with the number of drives (maybe just 1, plus the data limit)

quotas are already disabled

Just to be clear, it’s only the parallel operation that brings the system to its knees?

Please tell us the exact CPU/motherboard combo and HBA type.

Without knowing anything else, I'd guess that your system is suffering from I/O starvation. Did you count your PCIe lanes?

For a start, I'd check the drive cables and ditch that x1 controller in favour of a used LSI 2008 flashed to IT/strict passthrough mode (some IBM M1015s are around 40 bucks over here via eBay). I would do this anyway, as long as your motherboard and case support a PCIe 2 card of that length. I'd double-check that every single drive is CMR as well and that SMART status is OK.
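
For the SMART part, a quick per-drive check (a sketch; the device name is a placeholder, needs smartmontools):

sudo smartctl -H /dev/sdx    # overall health verdict
sudo smartctl -a /dev/sdx    # full attributes; watch reallocated/pending sectors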

Even with that: don't expect miracles. I don't know much about btrfs. But I know ZFS. Scrubbing large (i.e. wide) pools of spinners attached to some consumer hardware could take not only hours but days.

I had a ZFS pool of HDDs that took A MONTH to scrub.