Hardware Error - mce - What does it mean?

I just put a new CPU in my PC, which runs Heads and Qubes OS. It appears
to work, but this is the first thing I see when turning it on:

mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank: 7: ee200000003110a
mce: [Hardware Error]: TSC 0 ADDR fefe78c0 MISC 3880000086
mce: [Hardware Error]: PROCESSOR 0:306a9 TIME 1622589409 SOCKET 0
mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank: 8: ee200000003110a
mce: [Hardware Error]: TSC 0 ADDR fefe7880 MISC 3880000086
mce: [Hardware Error]: PROCESSOR 0:306a9 TIME 1622589409 SOCKET 0 APIC 0 microcode 1f

Does this mean the hardware is damaged or do I have to reconfigure
something?

(I know this might be on the edge of on-topic even for this category. If
you feel this shouldn’t be here, let me know and I’ll remove it.)

Why? To me it seems like a standard topic for the “Hardware support” category.

@Sven is that with the 3740?

I get an MCE with Heads on the T430 when running the 3740 CPU. I will check if it's the same MCE as yours. I have a second T430 and a second 3740 CPU here, so if they both show the exact same error, it's likely something to do with coreboot.

Normally an MCE is down to damage to the CPU (previous heat damage, etc.), but it would be weird if three different T430s with three different CPUs had the exact same MCE.

@Sven is that with the 3740?

Yes

I get an MCE with Heads on the T430 when running the 3740 CPU. I will
check if it's the same MCE as yours.

That would confirm my suspicions. Initial online research shows that
this is a somewhat common error folks see when switching to Linux. They
also all report their CPU working just fine.

It might have to do with the CONFIG_INTEL_PMC_CORE kernel config option
and/or the OPTIONAL_TABLE stuff in Heads.

Thank you! I moved it there.

MCE means Machine Check Exception.

The MCE-related Linux kernel boot options are documented in linux/boot-options.rst at master · torvalds/linux · GitHub.
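
For anyone who wants to read the log lines rather than just stare at them, here is a rough Python sketch that decodes the fields I am sure about: the PROCESSOR value is the CPUID signature (family/model/stepping), TIME is a Unix timestamp, and the status word after "Bank: N:" follows the IA32_MCi_STATUS layout from the Intel SDM. The function names are my own, not from any tool. I deliberately do not decode the status value from the first post, since it was wrapped across two lines there and I can't be sure all digits survived.

```python
import re
from datetime import datetime, timezone

# Architectural flag bits of an IA32_MCi_STATUS value (Intel SDM layout).
STATUS_FLAGS = {
    63: "VAL",    # register contains a valid error
    62: "OVER",   # an earlier error was overwritten
    61: "UC",     # error was not corrected by hardware
    60: "EN",     # reporting for this error was enabled
    59: "MISCV",  # MISC register is valid
    58: "ADDRV",  # ADDR register is valid
    57: "PCC",    # processor context may be corrupt
}


def decode_status(status: int) -> dict:
    """Split a 64-bit bank status word into flags and error codes."""
    return {
        "flags": [name for bit, name in STATUS_FLAGS.items() if (status >> bit) & 1],
        "mca_error_code": status & 0xFFFF,               # bits 15:0
        "model_specific_code": (status >> 16) & 0xFFFF,  # bits 31:16
    }


def decode_cpuid(signature: int) -> dict:
    """Decode a CPUID signature such as 0x306a9 into family/model/stepping."""
    family = (signature >> 8) & 0xF
    model = (signature >> 4) & 0xF
    if family in (6, 15):
        # The extended model field supplies the high nibble of the model.
        model |= ((signature >> 16) & 0xF) << 4
    if family == 15:
        family += (signature >> 20) & 0xFF
    return {"family": family, "model": model, "stepping": signature & 0xF}


def decode_processor_line(line: str) -> dict:
    """Pull the CPUID signature and timestamp out of an mce PROCESSOR line."""
    m = re.search(r"PROCESSOR (\d+):([0-9a-f]+) TIME (\d+)", line)
    if m is None:
        raise ValueError("not an mce PROCESSOR line")
    vendor, sig, ts = m.groups()
    info = decode_cpuid(int(sig, 16))
    info["vendor_id"] = int(vendor)
    info["time_utc"] = datetime.fromtimestamp(int(ts), tz=timezone.utc).isoformat()
    return info


line = "mce: [Hardware Error]: PROCESSOR 0:306a9 TIME 1622589409 SOCKET 0"
print(decode_processor_line(line))
```

For the line from the first post this gives family 6, model 58 (0x3a), stepping 9, and a time of 2021-06-01T23:16:49 UTC. To decode a status word, pass the full 64-bit value to decode_status(), e.g. decode_status(int("...", 16)) with the hex string copied from an unwrapped log.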

Solved here with new RAM.

Thanks for the pointers. A bit of background:

[i7-3520M] [2x8GB Silicon Power Hynix] … no error
[i7-3632QM] [2x8GB Silicon Power Hynix] … mce errors
[i7-3520M] [2x8GB Crucial] … no error
[i7-3740QM] [2x8GB Crucial] … mce errors

This makes me pretty sure it’s not the RAM.
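
The reasoning behind that can be written down as a small Python sketch: group the four test results by CPU and by RAM kit, and see which grouping gives a consistent outcome. The data is just the list above; the helper names are made up for illustration.

```python
# (cpu, ram kit, mce errors seen?) for the four combinations tested
results = [
    ("i7-3520M", "Silicon Power Hynix", False),
    ("i7-3632QM", "Silicon Power Hynix", True),
    ("i7-3520M", "Crucial", False),
    ("i7-3740QM", "Crucial", True),
]


def outcomes_by(index: int) -> dict:
    """Collect the set of outcomes seen for each value of one column."""
    grouped = {}
    for row in results:
        grouped.setdefault(row[index], set()).add(row[2])
    return grouped


def explains(grouped: dict) -> bool:
    """A factor explains the errors if each of its values has one outcome."""
    return all(len(outcomes) == 1 for outcomes in grouped.values())


print("CPU explains errors:", explains(outcomes_by(0)))  # True
print("RAM explains errors:", explains(outcomes_by(1)))  # False
```

Each CPU always gives the same result regardless of RAM, while each RAM kit shows both outcomes, so the errors follow the CPU. With only four data points this is a hint, not proof.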

I had general freezes in all combinations, which I traced to the 5.x
kernel. With 4.19 there are no freezes.

I had a very weird issue with qvm-backup-restore, but that followed the
i7-3520M and is gone since switching to the i7-3740QM.

The mce errors are reported by Heads.
