Re: [qubes-users] win7 HVM will not start with RAM > 3GB with "Failed to load VM config"

Because of this bug I had given up on Windows HVMs for a while, but now I have more reasons to want to run Win7 HVMs, and there have been many updates to Xen etc., so I thought I'd give it a shot. Sadly, I see no marked improvement:

Xen Minimal OS!
start_info: 0x570000(VA)
nr_pages: 0x2000
shared_inf: 0x10fc1000(MA)
pt_base: 0x573000(VA)
nr_pt_frames: 0x7
mfn_list: 0x560000(VA)
mod_start: 0x0(VA)
mod_len: 0
flags: 0x0
cmd_line: -d 44
stack: 0x51ef20-0x53ef20
MM: Init
_text: 0x0(VA)
_etext: 0x10ab51(VA)
_erodata: 0x15d000(VA)
_edata: 0x1662a8(VA)
stack start: 0x51ef20(VA)
_end: 0x55f828(VA)
start_pfn: 57d
max_pfn: 2000
Mapping memory range 0x800000 - 0x2000000
setting 0x0-0x15d000 readonly
skipped 0x1000
MM: Initialise page allocator for 589000(589000)-2000000(2000000)
MM: done
Demand map pfns at 2001000-2002001000.
Heap resides at 2002002000-4002002000.
Initialising timer interface
Initialising console ... done.
gnttab_table mapped at 0x2001000.
Initialising scheduler
Thread "Idle": pointer: 0x2002002050, stack: 0x5a0000
Thread "xenstore": pointer: 0x2002002800, stack: 0x5b0000
xenbus initialised on irq 1 mfn 0x144db
Dummy main: start_info=0x53f020
Thread "main": pointer: 0x2002002fb0, stack: 0x5d0000
Thread "pcifront": pointer: 0x2002003760, stack: 0x5e0000
pcifront_watches: waiting for backend path to appear device/pci/0/backend
dom vm is at /vm/58c49001-4ae8-4768-8c3f-16c3fa24476b
"main" "-d" "44" "-d" "44" "-domain-name" "win7" "-vnc" "127.0.0.1:0" "-vncunused" "-videoram" "8" "-std-vga" "-boot" "dca" "-usb" "-usbdevice" "tablet" "-acpi" "-vcpus" "4" "-vcpu_avail" "0xf" "-net" "nic,vlan=0,macaddr=00:16:3e:5e:6c:0f,model=rtl8139" "-net" "tap,vlan=0,ifname=tap44.0,bridge=xenbr0,script=no" "-net" "lwip,client_ip=10.137.2.17,server_ip=10.137.2.254,dns=10.137.2.1,gw=10.137.2.1,netmask=255.255.255.0"
domid: 44
domid: 44
************************ NETFRONT for device/vif/0 **********

net TX ring size 256
net RX ring size 256
backend at /local/domain/2/backend/vif/45/0
mac is 00:16:3e:5e:6c:0f

        > Assigning my Win7 HVM more than 3GB of RAM makes it unable to start:
        >
        > [alex@dom0 ~]$ qvm-start --debug win7
        > --> Loading the VM (type = HVM)...
        > xc: error: panic: xc_dom_bzimageloader.c:634: xc_dom_probe_bzimage_kernel:
        > kernel is not a bzImage: Invalid kernel

        Ignore this message.

        > libxl: error: libxl_device.c:479:libxl__wait_for_device_model Device Model
        > not ready

        This one, however, can be a real problem. It may be the result of
        a zombie win7-dm left over from a previous failed boot, although
        that first boot must have failed for some other reason.

    Here is what happens after a fresh boot:

    [alex@dom0 ~]$ qvm-start --debug win7
    --> Loading the VM (type = HVM)...
    xc: error: panic: xc_dom_bzimageloader.c:634:
    xc_dom_probe_bzimage_kernel: kernel is not a bzImage: Invalid kernel
    xc: error: panic: xc_dom_boot.c:159: xc_dom_boot_mem_init: can't
    allocate low memory for domain: Out of memory
    libxl: error: libxl_dom.c:207:libxl__build_pv xc_dom_boot_mem_init
    failed: Device or resource busy
    xl: fatal error: libxl_create.c:499, rc=-3:
    libxl__create_device_model

    libxl: error: libxl_dm.c:763:libxl__destroy_device_model Couldn't
    find device model's pid: No such file or directory
    libxl: error: libxl.c:740:libxl_domain_destroy
    libxl__destroy_device_model failed for 3

    ERROR: Failed to load VM config
    [alex@dom0 ~]$ xl list

    Name          ID   Mem VCPUs  State   Time(s)
    dom0           0  6166     4  r-----     88.5
    netvm          1   200     4  -b----     17.4
    firewallvm     2  1512     4  -b----     11.2
    win7-dm        4    17     0  --p---      0.0
    [alex@dom0 ~]$

        > xl: fatal error: libxl_create.c:536, rc=-1:
        > libxl__confirm_device_model_startup
        > ERROR: Failed to load VM config
        > [alex@dom0 ~]

        Check these logs:
        /var/log/xen/console/guest-win7*
        /var/log/xen/console/hypervisor.log
        /var/log/xen/xl-win7.log*

(...)

This is the same problem that I reported in the thread "intermittent
error starting HVM", to which nobody replied. Basically, if you assign a
lot of memory to an HVM, it will not always start. If it fails, type
"sync", wait about 10 seconds, and try again. It will often succeed on
the second or third try. It has something to do with memory
fragmentation in Xen (so I am told). I guess maybe Xen needs to find
contiguous memory for the HVM?
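
The retry workaround described above (run "sync", wait about 10 seconds, try again) could be scripted roughly as follows. This is only a sketch: the function name start_with_retries, the limit of three attempts, and the injectable run/sleep parameters are assumptions made for illustration. Only qvm-start, sync and the 10-second pause come from this thread, and it assumes qvm-start exits with status 0 on success.

```python
import subprocess
import time


def start_with_retries(vm_name, attempts=3, delay=10,
                       run=subprocess.call, sleep=time.sleep):
    """Try to start an HVM; between failed tries run `sync` and wait.

    Returns the 1-based number of the attempt that succeeded, or 0 if
    every attempt failed. `run` and `sleep` are injectable so the logic
    can be exercised without a real dom0.
    """
    for attempt in range(1, attempts + 1):
        # Assumption: qvm-start returns exit status 0 when the VM started.
        if run(["qvm-start", vm_name]) == 0:
            return attempt
        if attempt < attempts:
            run(["sync"])   # the "type sync" step from the workaround
            sleep(delay)    # wait before the next try
    return 0
```

In dom0 this would be called as start_with_retries("win7"). It does not address the underlying fragmentation, it only automates the manual retry.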

Not sure it's the same, because my Win7 HVM has simply *not started* for
days now, across reboots etc.

Hope the above provides some hint as to what's going on, as this is turning
out to be a functionality killer :(

Alex

(...)

populating video RAM at ff000000
Failed to populate video ram
close(0)
GPF rip: 0xff114, error_code=0
Thread: main

(...)

It seems the video RAM bug is still there, even with this new HVM, which
has only 2GB of RAM. Please note there are no zombie VMs -
they've all been killed with xl destroy.
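
For anyone who wants to double-check for leftover device-model domains before retrying, here is a rough sketch that scans the text printed by "xl list". It assumes the stub domain is named after its VM with a "-dm" suffix (as with win7-dm earlier in this thread) and treats a "-dm" domain as orphaned when its parent VM is not listed. The function name orphaned_device_models is made up for this example.

```python
def orphaned_device_models(xl_list_output, suffix="-dm"):
    """Return sorted (name, domid) pairs for '<vm>-dm' domains whose
    parent '<vm>' does not appear in the same `xl list` output."""
    domains = {}
    for line in xl_list_output.splitlines():
        fields = line.split()
        # Data rows have a numeric ID in the second column; the header
        # row ("Name ID ...") and blank lines are skipped by this check.
        if len(fields) >= 2 and fields[1].isdigit():
            domains[fields[0]] = int(fields[1])
    return sorted(
        (name, domid)
        for name, domid in domains.items()
        if name.endswith(suffix) and name[:-len(suffix)] not in domains
    )
```

Any domain it reports is a candidate for "xl destroy", but that decision is left to the operator.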

Anything new we can try?

I'm trying to reproduce this. An HVM with 4GB works perfectly for me. It
fails with 6GB, but that may be a genuine out-of-memory condition.

But I found another problem: stubdom output isn't logged to
/var/log/xen/console/*.log. So check the timestamps of these log files: are
those entries fresh?
If not, edit /etc/sysconfig/xencommons, add "XENCONSOLED_TRACE=all" and
reboot the system.
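
The timestamp check could be done along these lines. This is a sketch only: the function name stale_logs and the 300-second threshold are arbitrary choices for illustration, and the default glob is simply the path mentioned above.

```python
import glob
import os
import time


def stale_logs(pattern="/var/log/xen/console/*.log", max_age=300, now=None):
    """Return (path, age_in_seconds) for each matching log file whose
    last modification is older than `max_age` seconds."""
    now = time.time() if now is None else now
    stale = []
    for path in sorted(glob.glob(pattern)):
        age = now - os.path.getmtime(path)
        if age > max_age:
            stale.append((path, age))
    return stale
```

If a log shows up as stale right after a failed start attempt, its entries are probably left over from an earlier boot.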

I think I found something. When trying to start an HVM with 6GB, this
appears in xl dmesg:
(XEN) memory.c:134:d0 Could not allocate order=9 extent: id=43 memflags=0 (2 of 4)
(XEN) memory.c:134:d0 Could not allocate order=9 extent: id=43 memflags=0 (0 of 4)
(XEN) memory.c:134:d0 Could not allocate order=9 extent: id=43 memflags=0 (0 of 4)
(XEN) memory.c:134:d0 Could not allocate order=9 extent: id=43 memflags=0 (0 of 4)
(XEN) memory.c:134:d0 Could not allocate order=9 extent: id=43 memflags=0 (0 of 4)

id=43 is the XID of the just-created HVM, and order=9 means 2MB (2^9 * 4kB).
So this basically means that there aren't enough free 2MB superpages in the
system. Perhaps memory is too fragmented...
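
To make the arithmetic concrete, and to pull these failures out of a longer "xl dmesg" dump, something like the following could be used. It is a sketch under assumptions: the regular expression is written only against the message format shown above, the names extent_kib and parse_failed_allocs are invented for this example, and the two numbers in parentheses are returned as-is without interpreting them.

```python
import re

PAGE_KIB = 4  # base page size in KiB, as in the 2^9 * 4kB calculation

# Written against lines such as:
# (XEN) memory.c:134:d0 Could not allocate order=9 extent: id=43 memflags=0 (2 of 4)
FAILED_ALLOC = re.compile(
    r"Could not allocate order=(\d+) extent: id=(\d+) memflags=\S+ "
    r"\((\d+) of (\d+)\)"
)


def extent_kib(order):
    """Size in KiB of one extent of the given order (2**order pages)."""
    return (1 << order) * PAGE_KIB


def parse_failed_allocs(dmesg_text):
    """Return one (order, domid, first, second) tuple per failure line,
    where first/second are the two numbers printed in parentheses."""
    return [tuple(int(g) for g in m.groups())
            for m in FAILED_ALLOC.finditer(dmesg_text)]
```

For order=9, extent_kib gives 2048 KiB, i.e. the 2MB superpage size mentioned above.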

Some time ago I tried to find a way to force the use of 2MB memory chunks
only (so that they are not split into individual 4kB pages), but without
success.

Actually, after editing /etc/sysconfig/xencommons, "service xencommons
restart" is enough; a full reboot is not needed.

Marek Marczykowski-Górecki: