[qubes-users] BIG instability problems of qubes

Dear Qubes community, I have used Qubes since version 3, with many ups
and downs (more ups, happily). From version 4 on it was quite stable,
but this changed a few months ago. I now have to hard-reboot my machine
5-10 times per day, versus a scheduled reboot every two to three weeks
before.

- Somehow the 5.4.x kernels (for Xen) are unstable on my machine. They
run my Debian AppVMs smoothly. I have no clue whether the kernel itself
crashes, but after 2-15 min the system becomes unusable: the screen
"hangs" and there is no way out other than a hard reboot. I should
mention that I have a fairly standard i7. HCL attached. My problems:

- The last upgrade removed my last 4.9 Xen kernel, which worked fine
(how can I get that one back??), so I switched to 5.10 directly. The
latest one brought in by the update does not work: under the 5.10.11
kernel there is NO WAY to boot a debian-vm. Journalctl says:

Jan 29 21:39:55 dom0 qubesd[2087]: Start failed: internal error:
libxenlight failed to create new domain 'sys-net'
Jan 29 21:39:55 dom0 qmemman.daemon.algo[2095]:
balance_when_enough_memory(xen_free_memory=12370411092,
total_mem_pref=779203379.2, total_available_memory=15886175008.8)
Jan 29 21:39:55 dom0 qmemman.systemstate[2095]: stat: dom '0'
act=4294967296 pref=779203379.2 last_target=4294967296
Jan 29 21:39:55 dom0 qmemman.systemstate[2095]: stat:
xenfree=12422839892 memset_reqs=[('0', 4294967296)]
Jan 29 21:39:55 dom0 qmemman.systemstate[2095]: mem-set domain 0 to
4294967296

- When running Zoom under the 5.10.5 Xen kernel in a dedicated zoom-vm
(debian-10), inside Firefox (no custom app), the system "hangs": the
screen freezes, the sound loops over the last second, and that's it. I
do not see anything special before the problem occurs (see below), but
there is something strange during boot. It is displayed for each CPU
separately.

Feb 02 16:14:43 dom0 kernel: ------------[ cut here ]------------
Feb 02 16:14:43 dom0 kernel: WARNING: CPU: 1 PID: 0 at
/home/user/rpmbuild/BUILD/kernel-latest-5.10.5/linux-5.10.5/arch/x86/xen/enlighten_pv.c:660
get_trap_addr+0x81/0x90
Feb 02 16:14:43 dom0 kernel: Modules linked in: loop ebtable_filter
ebtables ip6table_filter ip6_tables iptable_filter vfat fat
snd_hda_codec_hdmi snd_soc_skl snd_soc_sst_ipc snd_soc_sst_dsp
Feb 02 16:14:43 dom0 kernel: xen_acpi_processor xenfs ip_tables
dm_thin_pool dm_persistent_data dm_bio_prison dm_crypt hid_multitouch
nvme rtsx_pci_sdmmc mmc_core crct10dif_pclmul crc32_pcl
Feb 02 16:14:43 dom0 kernel: CPU: 1 PID: 0 Comm: swapper/1 Tainted: G
     W 5.10.5-1.qubes.x86_64 #1
Feb 02 16:14:43 dom0 kernel: Hardware name: Dell Inc. Latitude
7390/09386V, BIOS 1.5.1 07/12/2018
Feb 02 16:14:43 dom0 kernel: RIP: e030:get_trap_addr+0x81/0x90
Feb 02 16:14:43 dom0 kernel: Code: b0 c4 e1 82 48 89 07 b8 01 00 00 00
85 f6 74 04 84 c0 75 16 b8 01 00 00 00 c3 48 8b 42 08 48 89 07 0f b6 42
10 83 f0 01 eb e2 <0f> 0b 31 c0 c3 cc cc cc cc
Feb 02 16:14:43 dom0 kernel: RSP: e02b:ffffc90000abfe08 EFLAGS: 00010002
Feb 02 16:14:43 dom0 kernel: RAX: 0000000000000001 RBX: ffffffff830d41d0
RCX: ffffffff82625558
Feb 02 16:14:43 dom0 kernel: RDX: ffffffff82625558 RSI: 0000000000000005
RDI: ffffc90000abfe10
Feb 02 16:14:43 dom0 kernel: RBP: ffffffff830da0f0 R08: 0000000000000001
R09: 0000000000000000
Feb 02 16:14:43 dom0 kernel: R10: ffffffff8249f900 R11: ffffffff82744648
R12: ffffffff830d9f20
Feb 02 16:14:43 dom0 kernel: R13: 000000000000001d R14: ffffffff8249f440
R15: 000000000000001d
Feb 02 16:14:43 dom0 kernel: FS: 0000000000000000(0000)
GS:ffff888135c40000(0000) knlGS:0000000000000000
Feb 02 16:14:43 dom0 kernel: CS: 10000e030 DS: 002b ES: 002b CR0:
0000000080050033
Feb 02 16:14:43 dom0 kernel: CR2: 0000720f340010c6 CR3: 0000000002610000
CR4: 0000000000050660
Feb 02 16:14:43 dom0 kernel: Call Trace:
Feb 02 16:14:43 dom0 kernel: cvt_gate_to_trap+0x50/0xa0
Feb 02 16:14:43 dom0 kernel: ? asm_exc_double_fault+0x30/0x30
Feb 02 16:14:43 dom0 kernel: xen_convert_trap_info+0x60/0xa0
Feb 02 16:14:43 dom0 kernel: xen_load_idt+0x46/0xa0
Feb 02 16:14:43 dom0 kernel: load_current_idt+0x11/0x20
Feb 02 16:14:43 dom0 kernel: cpu_init+0x148/0x410
Feb 02 16:14:43 dom0 kernel: cpu_bringup+0x10/0x90
Feb 02 16:14:43 dom0 kernel: xen_pv_play_dead+0x38/0x60
Feb 02 16:14:43 dom0 kernel: do_idle+0x1c9/0x2b0
Feb 02 16:14:43 dom0 kernel: cpu_startup_entry+0x19/0x20
Feb 02 16:14:43 dom0 kernel: asm_cpu_bringup_and_idle+0x5/0x1000
Feb 02 16:14:43 dom0 kernel: ---[ end trace 011f03ca1c0f295f ]---
Feb 02 16:14:43 dom0 kernel: cpu 1 spinlock event irq 131
Feb 02 16:14:43 dom0 kernel: ACPI: \_PR_.PR01: Found 3 idle states
Feb 02 16:14:43 dom0 kernel: CPU1 is up
Feb 02 16:14:43 dom0 kernel: installing Xen timer for CPU 2
Feb 02 16:14:43 dom0 kernel: ------------[ cut here ]------------

[IN RED COLOUR]
Feb 02 16:15:22 dom0 qmemman.systemstate[2401]: Xen free = 142013308 too
small for satisfy assignments! assigned_but_unused=117851537,
domdict={'6': {'no_progress': False, 'id': '6', 'mem_us

Feb 02 16:19:58 dom0 qmemman.daemon.algo[2401]:
balance_when_enough_memory(xen_free_memory=15972114,
total_mem_pref=8106054041.599998, total_available_memory=7690845634.400002)
Feb 02 16:19:58 dom0 qmemman.daemon.algo[2401]: left_memory=176709601
acceptors_count=8
Feb 02 16:20:15 dom0 qmemman.daemon.algo[2401]:
balance_when_enough_memory(xen_free_memory=15972114,
total_mem_pref=8106054041.599998, total_available_memory=7690845634.400002)
Feb 02 16:20:15 dom0 qmemman.daemon.algo[2401]: left_memory=176709601
acceptors_count=8
Feb 02 16:20:28 dom0 qmemman.daemon.algo[2401]:
balance_when_enough_memory(xen_free_memory=15972114,
total_mem_pref=8149318041.599998, total_available_memory=7647581634.400002)
Feb 02 16:20:28 dom0 qmemman.daemon.algo[2401]: left_memory=175593328
acceptors_count=8
Feb 02 16:20:28 dom0 qmemman.systemstate[2401]: stat: dom '6'
act=1464576447 pref=746505011.2 last_target=1464576447
Feb 02 16:20:28 dom0 qmemman.systemstate[2401]: stat: dom '3'
act=33554432 pref=108003328 last_target=33554432
Feb 02 16:20:28 dom0 qmemman.systemstate[2401]: stat: dom '19'
act=912364423 pref=460770918.40000004 last_target=912364423
Feb 02 16:20:28 dom0 qmemman.systemstate[2401]: stat: dom '0'
act=3300310385 pref=1696378880.0 last_target=3300310385
Feb 02 16:20:28 dom0 qmemman.systemstate[2401]: stat: dom '24'
act=3337194814 pref=1739761254.4 last_target=3337194814
Feb 02 16:20:28 dom0 qmemman.systemstate[2401]: stat: dom '5'
act=1499811981 pref=764737126.4 last_target=1499811981
Feb 02 16:20:28 dom0 qmemman.systemstate[2401]: stat: dom '10'
act=1237603629 pref=629061222.4 last_target=1237603629
Feb 02 16:20:28 dom0 qmemman.systemstate[2401]: stat: dom '4'
act=662710788 pref=331591270.40000004 last_target=662710788
Feb 02 16:20:28 dom0 qmemman.systemstate[2401]: stat: dom '7'
act=3332800663 pref=1672509030.4 last_target=3332800663
Feb 02 16:20:28 dom0 qmemman.systemstate[2401]: stat: xenfree=68400914
memset_reqs=[('7', 3260730361), ('0', 3306954173), ('24', 3390963866),
('5', 1502835818), ('10', 1240100289), ('6', 146
Feb 02 16:20:28 dom0 qmemman.systemstate[2401]: mem-set domain 7 to
3260730361
Feb 02 16:20:28 dom0 qmemman.systemstate[2401]: mem-set domain 0 to
3306954173
Feb 02 16:20:28 dom0 qmemman.systemstate[2401]: mem-set domain 24 to
3390963866
Feb 02 16:20:28 dom0 qmemman.systemstate[2401]: mem-set domain 5 to
1502835818
Feb 02 16:20:28 dom0 qmemman.systemstate[2401]: mem-set domain 10 to
1240100289
Feb 02 16:20:28 dom0 qmemman.systemstate[2401]: mem-set domain 6 to
1467529442
Feb 02 16:20:28 dom0 qmemman.systemstate[2401]: mem-set domain 3 to 33554432
Feb 02 16:20:28 dom0 qmemman.systemstate[2401]: mem-set domain 4 to
664051611
Feb 02 16:20:28 dom0 qmemman.systemstate[2401]: mem-set domain 19 to
914207181
-- Reboot --
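
For anyone who wants to compare the per-domain figures in a log like the
one above, here is a minimal Python sketch that extracts them from the
"stat: dom" records. It assumes that act and pref are byte counts (the
log does not say so explicitly) and that long records were wrapped by the
mail client, as they are here. The names parse_domain_stats, STAT_RE and
SAMPLE are illustrative only and are not part of qmemman.

```python
import re

# Matches "stat: dom 'N' act=... pref=..." once a wrapped log record
# has been joined back into a single line.
STAT_RE = re.compile(r"stat: dom '(\d+)'\s+act=(\d+)\s+pref=([\d.]+)")


def parse_domain_stats(log_text):
    """Return {domain_id: (act, pref)} from qmemman 'stat: dom' records.

    If a domain appears more than once, the last record wins.
    """
    # Long records are wrapped across lines, so collapse all whitespace.
    flat = " ".join(log_text.split())
    stats = {}
    for dom, act, pref in STAT_RE.findall(flat):
        stats[dom] = (int(act), float(pref))
    return stats


# Two records copied from the log above, including the line wrapping.
SAMPLE = """\
Feb 02 16:20:28 dom0 qmemman.systemstate[2401]: stat: dom '6'
act=1464576447 pref=746505011.2 last_target=1464576447
Feb 02 16:20:28 dom0 qmemman.systemstate[2401]: stat: dom '3'
act=33554432 pref=108003328 last_target=33554432
"""

if __name__ == "__main__":
    MIB = 1024 * 1024  # assumption: the values are bytes
    stats = parse_domain_stats(SAMPLE)
    for dom in sorted(stats, key=int):
        act, pref = stats[dom]
        print(f"dom {dom:>3}: act={act / MIB:8.1f} MiB  pref={pref / MIB:8.1f} MiB")
```

Feeding it the full journalctl excerpt instead of SAMPLE would list every
domain, which makes it easier to see which ones hold the most memory
right before the hang.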

Good news: I could send this mail using Qubes :)) Thanks for your
helpful comments. Bernhard

Qubes-HCL-Dell_Inc_-Latitude_7390-20201129-212036.yml (793 Bytes)

haaber:

- The last upgrade removed my last 4.9 Xen kernel, which worked fine
(how can I get that one back??), so I switched to 5.10 directly. The
latest one brought in by the update does not work: under the 5.10.11
kernel there is NO WAY to boot a debian-vm. Journalctl says:

You may be able to get the old kernel back by specifying its version number with something like "dnf install kernel-4.19.155-1.pvops.qubes". I'm not entirely sure about the syntax, or whether it will work or break your system further.

Just FYI: kernel-latest is now at 5.10.13, which has a fix for the specific issue you encountered.

Otherwise, you can find the version numbers for earlier kernels here: Index of /r4.0/current/dom0/fc25/rpm/ for manual installation.