[qubes-users] Systemd terminating qubesd during backup?

I seem to have an intermittent problem when my backup scripts are running late at night.

My qubesd is apparently being shut down (sent a SIGTERM signal) by systemd during my long-running backup sessions. This causes an EOF/pipe-close exception; qvm-backup then receives a socket exception and immediately hits a second exception while still handling the first, so the qvm-backup process gets forcibly terminated mid-stream. Just prior to the qubesd shutdown I can clearly see that systemd had also shut down/restarted the Qubes memory manager (qubes-qmemman).

Q: What kind of background maintenance processing would actually require qubesd or the memory manager to be restarted?

Q: Could this processing be put on hold during backups?

Q: Or, how could I at least know when this maintenance is scheduled to happen so I could defer my own processing?

My scripts can certainly trap this error, erase the incomplete backup file, then loop and check for qubesd to complete its restart, and then finally restart my own backup processing, but why should this even be necessary?
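For reference, the trap-and-retry workaround described above could look roughly like the sketch below. Everything in it is illustrative: the function names, the retry delay, and the assumption that the daemon's systemd unit is called `qubesd` are not taken from any Qubes documentation, and the backup command is whatever the script already runs.

```shell
# Sketch of a retry wrapper; all names here are hypothetical.
SYSTEMCTL="${SYSTEMCTL:-systemctl}"
RETRY_DELAY="${RETRY_DELAY:-10}"   # seconds between checks (arbitrary choice)

# Block until the qubesd unit reports active again.
# Assumption: the unit is named "qubesd".
wait_for_qubesd() {
    until $SYSTEMCTL is-active --quiet qubesd; do
        sleep "$RETRY_DELAY"
    done
}

# Usage: backup_with_retry MAX_TRIES BACKUP_FILE COMMAND [ARGS...]
backup_with_retry() {
    local tries_left="$1" backup_file="$2"
    shift 2
    while [ "$tries_left" -gt 0 ]; do
        "$@" && return 0             # backup finished cleanly
        rm -f -- "$backup_file"      # discard the incomplete archive
        wait_for_qubesd              # let the restart settle before retrying
        tries_left=$((tries_left - 1))
    done
    return 1                         # gave up after MAX_TRIES attempts
}
```

It would be called with the existing backup command as the trailing arguments, e.g. `backup_with_retry 3 "$archive" qvm-backup ...` with the usual options filled in.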

When this happens, it's almost always during the backup of my largest VM, which can take well over 90 minutes to complete. If I can somehow block/defer this kind of system maintenance until after my backups are complete, that would be better than having to deal with trapping random restarts.

thanks,

Steve

(Attachment backup_error.txt is missing)

(Attachment journalctl.txt is missing)

I seem to have an intermittent problem when my backup scripts are running
late at night.

My qubesd is apparently being shut down (sent a SIGTERM signal) by systemd
during my long-running backup sessions. This causes an EOF/pipe-close
exception; qvm-backup then receives a socket exception and immediately
hits a second exception while still handling the first, so the qvm-backup
process gets forcibly terminated mid-stream. Just prior to the qubesd
shutdown I can clearly see that systemd had also shut down/restarted the
Qubes memory manager (qubes-qmemman).

Q: What kind of background maintenance processing would actually require
qubesd or the memory manager to be restarted?

I guess that's logrotate (but it isn't clear to me why qubesd too, not
just the qubes-qmemman service...).

Q: Could this processing be put on hold during backups?

Q: Or, how could I at least know when this maintenance is scheduled to
happen so I could defer my own processing?

If that's indeed logrotate, see `systemctl status logrotate.timer`
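To find out when the timer is next due, rather than only whether it is active, something like the following may help. It is a sketch that assumes the rotation really is driven by `logrotate.timer`; the helper name `next_timer_run` is made up for illustration.

```shell
# Hypothetical helper: print when a systemd timer is next scheduled to fire.
# Assumption: log rotation is triggered by logrotate.timer (not confirmed here).
SYSTEMCTL="${SYSTEMCTL:-systemctl}"

next_timer_run() {
    # NextElapseUSecRealtime is the timer property holding the next
    # wall-clock trigger time; --value prints it without the "Name=" prefix.
    $SYSTEMCTL show "${1:-logrotate.timer}" --property=NextElapseUSecRealtime --value
}

# A human-readable overview is also available with:
#   systemctl list-timers logrotate.timer
```

A backup script could call `next_timer_run` before starting and postpone itself if the reported time falls inside the expected backup window.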

My scripts can certainly trap this error, erase the incomplete backup file,
then loop and check for qubesd to complete its restart, and then finally
restart my own backup processing, but why should this even be necessary?

When this happens, it's almost always during the backup of my largest VM,
which can take well over 90 minutes to complete. If I can somehow
block/defer this kind of system maintenance until after my backups are
complete, that would be better than having to deal with trapping random
restarts.

Again, if that's logrotate, you can stop the timer before, and restart it
afterwards.
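A minimal sketch of that stop/restart wrapping, again assuming `logrotate.timer` is the culprit. The function name `with_logrotate_paused` is hypothetical, and the backup command is passed in unchanged.

```shell
# Sketch: pause logrotate.timer for the duration of one command.
# Assumption: logrotate.timer is what triggers the qubesd restart.
SYSTEMCTL="${SYSTEMCTL:-sudo systemctl}"

# Usage: with_logrotate_paused COMMAND [ARGS...]
with_logrotate_paused() {
    local rc
    $SYSTEMCTL stop logrotate.timer || return 1
    "$@"                               # run the backup command as given
    rc=$?
    $SYSTEMCTL start logrotate.timer   # re-enable rotation even if the backup failed
    return "$rc"                       # propagate the backup's exit status
}
```

For example, `with_logrotate_paused qvm-backup ...` with the usual arguments. Note that if the script itself is killed while the backup is running, the timer stays stopped until it is started again by hand, and stopping the timer does not interrupt a rotation that has already begun.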

--
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab

This sounds like:

I agree that it's a serious bug. It makes no sense for logrotate to interrupt backups. Backups completing successfully and reliably is infinitely more important than rotating log files.

It looks like the issue has been fixed in 4.1, but I'm still experiencing it on 4.0 as well. I've just gotten in the habit of trying not to let my backups run between ~1-6am. :\