On Tue 10-Jan-2017 3:52 PM, Corey Richardson wrote:
This kernel doesn't even seem to boot at all with -cpu host -enable-kvm
(or -cpu IvyBridge), regardless of the configuration. Curiously, with
-cpu SandyBridge -enable-kvm, it does boot. It seems to be a qemu bug.

It seems to be stuck in some sort of loop handling int_0e (page fault).
When I attach gdb, I get this sort of behavior:

<int_0e>     mov    0xffffffff80745008,%rsp
<int_0e+8>   add    $0x3e8,%rsp
<int_0e+15>  push   %rcx

At the start, rsp is x64KSIRQStack, but after an si, $rsp is 0? Another
si and it's 0x3e8, and trying to execute the push causes gdb to "wait"
until I press Ctrl-C, at which point it's back at the top of int_0e and
the stack as reported by bt is unchanged:

#0  0xffffffff8071537c in int_0e ()
#1  0x0000000000000002 in ?? ()
#2  0xffffffff8071538b in int_0e ()

followed by some garbage. I expect the "waiting" is gdb not knowing
what's happening when qemu injects an interrupt.

If it's at all useful, here's the kernel under consideration:

https://files.octayn.net/sel4-4.0.0-x86_64-pc99

It's 4.0.0 with this config: http://ix.io/1PDr

Kofi was able to reproduce this in Qemu, but not on Haswell hardware.
Any idea what the issue might be?

I've tracked this down to a combination of two problems.

1. There is actually a single GP fault that triggers everything, but the early boot code is not set up to handle and print kernel faults correctly, and hence you end up in this infinite page fault loop.
2. KVM fails to emulate PLATFORM_INFO_MSR properly, despite reporting through CPUID a CPU model where it is architecturally defined to exist. The workaround seems to be to treat a GP fault on the read as an indication that the MSR is unavailable (see the sketch below).
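
For reference, here is a minimal user-space sketch of the detection idea for problem 2. It is my own illustration, not the actual seL4 workaround (which catches the GP fault on the raw rdmsr inside the kernel). It assumes a Linux guest with the msr module loaded; the file name msrprobe.c is made up, and MSR_PLATFORM_INFO is architectural MSR 0xCE:

/* msrprobe.c -- hypothetical probe, not part of seL4.
 * Reads MSR_PLATFORM_INFO (0xCE) through Linux's msr driver to see
 * whether the hypervisor actually implements it.  The msr driver uses
 * rdmsr_safe() internally, so a #GP from an unimplemented MSR shows up
 * here as a failed read instead of a fault, which is the same idea as
 * the in-kernel workaround described above.
 *
 * Build: cc -o msrprobe msrprobe.c
 * Run:   modprobe msr && sudo ./msrprobe
 */
#include <fcntl.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

#define MSR_PLATFORM_INFO 0xCEu

int main(void)
{
    int fd = open("/dev/cpu/0/msr", O_RDONLY);
    if (fd < 0) {
        perror("open /dev/cpu/0/msr");
        return 1;
    }

    uint64_t value;
    if (pread(fd, &value, sizeof(value), MSR_PLATFORM_INFO)
            != (ssize_t)sizeof(value)) {
        /* KVM (or the CPU) refused the read: treat the MSR as absent. */
        perror("rdmsr 0xCE");
        printf("MSR_PLATFORM_INFO not readable on this CPU/hypervisor\n");
    } else {
        printf("MSR_PLATFORM_INFO = 0x%016" PRIx64 "\n", value);
    }

    close(fd);
    return 0;
}

If the read fails inside the VM but succeeds on real hardware, that matches the KVM behaviour described above.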

I cannot say when we will have a chance to fix these.

Adrian