Everything was working and this issue did not exist; I didn’t change anything new, and it just arose.
When booting and logging in, everything is normal. Starting VMs is normal. But then, occasionally (and I couldn’t figure out exactly when), dom0 starts eating RAM: whenever a VM is shut down, dom0 immediately eats the freed RAM. Still, its maximum is 4080 MB as usual.
When booting and logging in and observing RAM usage: dom0 isn’t using much and gives e.g. sys-whonix a good amount of RAM. But occasionally dom0 starts this strange behavior of eating RAM: even after restoring everything to the post-boot state (shutting down all VMs that don’t start on boot, so the set of running VMs is the same as when I first observed RAM usage), the distribution isn’t like before. dom0 no longer gives sys-whonix a good amount of RAM; it becomes greedy and takes the RAM for itself, even though everything is the same (same VMs running, nothing changed in dom0 or those VMs).
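For reference, the per-domain memory shown in the qui-domains tray can also be checked from a dom0 terminal with standard Xen tools (a sketch; exact output columns may vary by Xen version):

```shell
# In a dom0 terminal: list all domains with their current memory allocation.
# The "Mem" column (MiB) shows how much each domain, including dom0, holds.
xl list

# Memory still held by Xen itself, not assigned to any domain:
xl info | grep free_memory
```

Comparing these numbers right before and right after shutting down a VM should show exactly where the freed RAM goes.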
Logging out fixes this. But it happens again, so a new log-out is needed, and so on.
As I understand it, the physical memory is distributed according to need. However, my understanding is very limited, coming from reading things like this qmemman document.
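For context, qmemman’s balancing is driven by a small config file in dom0. A sketch of what to look at (the values in the comments are typical defaults as I recall them, so verify against your own copy):

```shell
# In dom0: qmemman's tunables live here
cat /etc/qubes/qmemman.conf
# vm-min-mem          # floor below which a qube's memory is never squeezed
# dom0-mem-boost      # extra memory reserved for dom0 on top of its measured need
# cache-margin-factor # multiplier over measured usage when distributing RAM
```

If dom0-mem-boost or cache-margin-factor were somehow changed, that could shift the balance toward dom0.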
Do you see qubes which give OOM errors, or show other signs of memory pressure, even while Dom0 has memory it is not using?
I think you have this right.
If dom0 does not give up the memory, that would be a bug, but I do not see this. (I have in the past.) @geoffrey, what Qubes version are you running?
You can restrict the amount taken by dom0 by changing the dom0_mem=max: boot parameter:
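For example (a sketch; the grub.cfg path differs between legacy-BIOS and UEFI installs, so adjust accordingly):

```shell
# In dom0: add the limit to Xen's command line in /etc/default/grub, e.g.
#   GRUB_CMDLINE_XEN_DEFAULT="... dom0_mem=max:4096M"
sudo nano /etc/default/grub

# Then regenerate the GRUB config and reboot.
# Legacy BIOS:
sudo grub2-mkconfig -o /boot/grub2/grub.cfg
# UEFI (path may vary by install):
# sudo grub2-mkconfig -o /boot/efi/EFI/qubes/grub.cfg
```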
I never presume to speak for the Qubes team.
When I comment in the Forum I speak for myself.
Another explanation:
On boot, say there are 4 VMs running along with dom0: sys-net, sys-firewall, sys-whonix, sys-usb.
You start some VMs and use your computer normally; they start without any issues.
Then, at some point (occasionally, and I could not tell when or what triggers it), while everything is smooth with no lagging, you simply shut down a VM and then try to start another, and it won’t start: a “Not enough memory” notification appears. What happens is that dom0 immediately ate the RAM the VM was using (this can be watched live in the qui-domains tray), even though it was working absolutely fine without it and does not seem to need it for anything.
The latest, 4.2.4.
I know about this, but I was looking to solve the problem itself and restore the previous behaviour. I have been using Qubes for a long time, and this behaviour never happened before, even without setting a max mem for dom0.
I am not sure this is relevant, but there was a problem with memory a few weeks ago. I did not really understand the cause, but it made new qubes get stuck at their initial memory, which is sometimes not enough:
I think the solution to this problem requires updates, but it may be necessary to increase the initial memory so that the templates and AppVMs can start and then update.
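If it helps, a qube’s initial memory can be raised from dom0 like this (the VM name and value are just placeholders for illustration):

```shell
# In dom0: show the current initial allocation for a (hypothetical) qube
qvm-prefs my-qube memory

# Raise the initial allocation, e.g. to 600 MiB, so it can boot under pressure
qvm-prefs my-qube memory 600
```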
If that doesn’t help, then maybe someone more expert can help more.
I should have asked before suggesting anything: how are you measuring/observing dom0 eating the RAM? This may help others explain the issue.
Based on the answer, it should be easier to determine whether the issue is a memory leak in the hypervisor, some behavior of the dom0 kernel, or even a userland dom0 process, with the last one being the easiest to trace to the exact culprit.
The tools I’m aware of for this are xentop and top/htop. Another subsystem (qmemman) was mentioned, but I’m not familiar with it.
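For a quick look from dom0, both tools can be run non-interactively (a sketch; see the xentop and top man pages for the flags):

```shell
# One batch-mode snapshot of per-domain memory and CPU from Xen's point of view
xentop -b -i 1

# Within dom0 itself, sort processes by resident memory to spot a userland leak
top -b -n 1 -o %MEM | head -20
```

If the xentop numbers for dom0 grow while no dom0 process in top grows with them, that would point at the hypervisor/kernel side rather than userland.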
I don’t know much about the runtime debugging features of Xen, aside from knowing they exist and are available in Qubes from dom0. Ditto for the kernel interfaces in sysfs, procfs, debugfs, etc. in the dom0 kernel.