Very slow shutdown of large-storage VMs and system stress after restore

I restored my old setup on a new, faster laptop, but the shutdown time of my large-storage qubes is really long now. To be fair, I don’t know whether this has anything to do with the restore, but it never happened on my older laptop.

It takes 2-5 minutes to shut down both AppVMs and templates, and even worse, during that time the rest of the OS is basically locked up. Anything more than dragging windows around is impossible. Sometimes the Qubes Manager goes blank during this time as well, especially if I’m on another desktop space.

I have tried rebooting, but I don’t know what else to try. It’s quite annoying, since the system becomes mostly unusable during VM shutdown, and the Qubes Manager is not only unresponsive but also shows no status “dot” at all (not even yellow), so it can be hard to tell what’s going on. My laptop’s hard drive light flashes the whole time and goes out when the shutdown completes, at which point everything returns to normal.

The expected messaging is normal: I get the message that the qube is attempting to shut down, and then it takes several minutes for the next message to appear saying that it has finished shutting down.

Sounds like a hardware compatibility issue. Could it be this: Qubes on slow motion on new install

Try creating a new AppVM and starting it. Then you’ll know if it’s some issue with the restore or not.
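A minimal way to run that test from dom0 (a sketch only; the template name and label are placeholders, so substitute ones you actually have installed):

```
qvm-create --class AppVM --template fedora-39 --label red test-shutdown
qvm-start test-shutdown
qvm-shutdown --wait test-shutdown
qvm-remove test-shutdown
```

If `test-shutdown` stops within a few seconds, the problem is specific to the restored qubes.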

> Sounds like a hardware compatibility issue. Could it be this: Qubes on slow motion on new install

Nope, performance is slow ONLY during VM shutdown; afterwards it returns to normal. Also, the mouse functions normally.

> Try creating a new AppVM and starting it. Then you’ll know if it’s some issue with the restore or not.

Thank you, good idea. The new AppVM shuts down quickly/normally.

Now we know it’s an issue with one of the backed-up ones.

Next, my advice would be to start every VM from the backup to see whether the issue is consistent across all restored VMs (a quick way to do this is sketched below).

And if it doesn’t happen to all of them, try finding any commonality among them. Any extremely large storage, for example?
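As a sketch of that loop, run from dom0 (the qube names below are placeholders for your restored qubes):

```
# Start each restored qube, then time how long its shutdown takes.
for vm in work personal vault; do
    qvm-start "$vm"
    time qvm-shutdown --wait "$vm"
done
```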

Have you increased the system storage size or the private storage size of your templates or AppVMs by a large amount in the past? Say, 50 GiB or more.
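You can check the current allocations from dom0, for example (VMNAME is a placeholder):

```
# List a qube's volumes and their sizes.
qvm-volume list VMNAME
# Or inspect the private volume in detail:
qvm-volume info VMNAME:private
```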

> Have you increased the system storage size or the private storage size of your templates or AppVMs by a large amount in the past? Say, 50 GiB or more.

Yes. Do you have advice about large VMs? They were the same size and never had this issue before.

> Next, my advice would be to start every VM from the backup to see whether the issue is consistent across all restored VMs.

I will try this next.

I think it’s related to this issue. It can be triggered when the VM storage is large.

Haha. I was about to link to that 😛

Even if it’s not related to the above-linked issue, I would suggest @qube follows the debugging advice on said issue, particularly the first few comments.

> Even if it’s not related to the above-linked issue, I would suggest @qube follows the debugging advice on said issue, particularly the first few comments.

Thanks for that. However, they discuss quite a number of things in that thread. Can you point me to something specific you think I ought to try?

As a side note, the workaround is to install Qubes OS with Btrfs instead of ext4 on LVM.
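To see which layout you’re currently on, you can list the storage pools and their drivers from dom0 (a sketch; pool names vary by install, and on older releases the command is `qvm-pool -l`):

```
# The default ext4-on-LVM install uses the "lvm_thin" driver;
# a Btrfs install uses "file-reflink" instead.
qvm-pool list
```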

The first suggestion by Marek: check the log /var/log/xen/console/guest-VMNAME.log.
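For reference, a simple way to watch that log from dom0 while reproducing the slow shutdown (VMNAME is a placeholder):

```
# Follow the guest console log to see where the shutdown stalls.
sudo tail -f /var/log/xen/console/guest-VMNAME.log
```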

> The first suggestion by Marek: check the log /var/log/xen/console/guest-VMNAME.log.

My log looks pretty similar to Heinrich’s. If the numbers at the start of the log entries are timestamps, then the system halt entry happened very quickly, and some other process is freezing up the system afterwards.

Regarding the testing, I have confirmed that the other restored VMs do NOT have this issue: only the large-storage template and the AppVMs based on it do. So it’s unlikely to be caused by the restore specifically, although the restore may have triggered this issue, which does appear to depend on the amount of storage allocated.

Changing the thread title to reflect this now.

Then I guess you are in the same circumstances as the user who reported the above-mentioned issue. So I guess the best course of action is to move some of that storage elsewhere or to follow the evolution of the reported issue on GitHub.
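If you do decide to move storage out of the affected qubes, one cautious approach (a sketch; every name and size below is a placeholder, and shrinking an existing volume in place is risky) is to create a fresh AppVM with a smaller private volume and copy the data over:

```
# In dom0: create a replacement AppVM with a modest private volume.
qvm-create --class AppVM --template my-template --label orange big-vm-new
qvm-volume resize big-vm-new:private 20GiB

# Then, inside the old (large) qube, send the data across:
#   qvm-copy /home/user/big-data   # choose big-vm-new as the destination
```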