As the original author of the x86 model I can provide the trip down memory
lane.
The original x86 implementation had the model that ARM has now.
Unfortunately, because there were two different threads, there was always a
very difficult-to-resolve race in performing interrupt injection into the guest.
There is a race, since injecting an interrupt requires:
1. Read the guest state
2. Validate that the guest is interruptible
3. Modify the guest state, or request notification for when the guest becomes interruptible
If the guest state changes between steps 1 and 3 then there is a problem, so
there needs to be a way to synchronize the guest. No such method existed,
since the thread APIs back then (the MCS kernel has a fix) only let you
suspend a thread, which had the effect of cancelling all its IPCs,
including any faults. So if you suspended a VCPU thread that had faulted,
its fault message would get thrown away, which is a problem.
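To make the window concrete, here is a rough sketch of those three steps as
they looked with a separate VMM thread. Every name in it is hypothetical (it
is not the seL4 API of the time); it only marks where the race sits:

    #include <stdbool.h>

    /* Hypothetical view of the guest as seen by a separate VMM thread. */
    typedef struct {
        bool interrupt_window_open; /* can the guest take an interrupt right now? */
        bool fault_pending;         /* has the VCPU thread faulted to the VMM?    */
    } guest_state_t;

    typedef struct {
        guest_state_t state;
        int pending_vector;
    } vcpu_t;

    /* Step 1: snapshot the guest state (stale the moment it is taken). */
    static guest_state_t read_guest_state(vcpu_t *vcpu) { return vcpu->state; }

    /* Step 3 variants: queue the vector, or ask to be told when injection is possible. */
    static void write_pending_interrupt(vcpu_t *vcpu, int vector) { vcpu->pending_vector = vector; }
    static void request_interrupt_window_exit(vcpu_t *vcpu) { (void)vcpu; }

    void vmm_inject(vcpu_t *vcpu, int vector)
    {
        guest_state_t snap = read_guest_state(vcpu);             /* 1. read     */
        if (snap.interrupt_window_open && !snap.fault_pending) { /* 2. validate */
            /* Race window: the VCPU thread keeps running concurrently and may
             * fault or close the interrupt window here; the VMM cannot query
             * its state or suspend it without discarding a fault, so step 3
             * acts on information that may already be stale. */
            write_pending_interrupt(vcpu, vector);               /* 3. modify   */
        } else {
            request_interrupt_window_exit(vcpu);                 /* 3'. notify  */
        }
    }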
Running the VCPU and VMM at different priorities has its own problems and
leads to a priority-management nightmare that can still have races, where
some other event causes us to want to inject an interrupt at the same time
as the VCPU picks up a pending fault. Since thread states aren't queryable,
the VMM, whilst injecting the interrupt, does not know that the VCPU is
actually sitting on a pending fault.
All this led to the very deliberate decision to use a single thread for
both VMX and native execution scheduling, as it allows the VMM to *know*
that when it is running natively the guest will not run, and to know
precisely whether the guest is faulted or not. The other option discussed
at the time was a "suspend thread if running and return its current state".
This would have removed the problem of discarding a fault by blindly
calling 'suspend', and would also have made it possible to know there is a
fault and handle it prior to the interrupt. Ultimately this option was not
chosen as it was deemed more invasive, with unknown proof consequences, and
required more system calls, thus hurting performance. A third option would
have been to have the kernel inject interrupts, but this was never really
considered, as moving something into the kernel just to avoid giving the
user appropriate tools is a bit nasty and would have put policy in the
kernel on how interrupt injection should happen.
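For reference, this is roughly what the single-thread model looks like from
the VMM's side in the x86 API that resulted. It is a sketch from memory:
check the exact names and the message-register setup against libsel4, and
the handler functions here are placeholders:

    #include <sel4/sel4.h>

    /* Placeholder handlers supplied by the VMM. */
    void handle_vm_fault(void);
    void handle_notification(seL4_Word badge);

    void vmm_run_loop(void)
    {
        seL4_Word badge;
        for (;;) {
            /* Message registers describing the guest entry state are set up
             * before each call; omitted here for brevity. */
            seL4_Word result = seL4_VMEnter(&badge);
            if (result == SEL4_VMENTER_RESULT_FAULT) {
                /* The guest has exited and cannot run again until we re-enter,
                 * so the VMM knows exactly what state it is in; injecting an
                 * interrupt from here is race free. */
                handle_vm_fault();
            } else { /* SEL4_VMENTER_RESULT_NOTIF */
                /* A notification (e.g. from a hardware IRQ) arrived while the
                 * guest was running; the guest is again stopped while we act. */
                handle_notification(badge);
            }
        }
    }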
ARM does not have this problem, since interrupt injection goes via the vGIC,
and since the kernel cannot give the user direct access to the vGIC
registers there was no choice but to move it into the kernel. The ARM
virtualization model of extra exception levels naturally lends itself to
just letting threads run at higher privilege levels (with the VCPU just
being the 'extra register storage space'), and the notion of a 'guest' and
a 'VMM' becomes a user-level concept. In fact, for the kernel, aside from
the extra register saving and restoring, there really is no difference
between a thread with and without a VCPU: they both have CSpaces, can make
system calls, and can generally do whatever they want. An EL1 thread with a
VCPU could even act as the fault handler (i.e. the VMM) for another thread.
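To illustrate, injecting a virtual interrupt on ARM is a single invocation
on the VCPU object, with the kernel programming a vGIC list register on the
VMM's behalf. A minimal sketch; verify the exact signature against your
libsel4 version, and the priority/group/index values below are just
placeholder choices:

    #include <sel4/sel4.h>

    /* vcpu_cap: capability to the VCPU object bound to the guest's TCB.
     * virq:     virtual IRQ number to present to the guest. */
    seL4_Error inject_virq(seL4_CPtr vcpu_cap, seL4_Uint16 virq)
    {
        /* priority 0, group 0, list-register index 0: placeholder values. */
        return seL4_ARM_VCPU_InjectIRQ(vcpu_cap, virq, 0, 0, 0);
    }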
Is one of the models inherently more desirable? Perhaps. Does it matter if
they are different? Perhaps not. I shall withhold any such opinions, as I
have no say in these matters now; I'm just here to provide the stories.
Adrian
On Fri, Nov 9, 2018 at 8:43 AM Jeff Waugh wrote:
> On Fri, Nov 9, 2018 at 6:10 AM wrote:
> > seL4's x86 and ARM virtualisation architectures are indeed different. On
> > x86, we use one kernel thread for both guest and host, and use the thread
> > state to differentiate what is currently running. On ARM, guest and host
> > run in different kernel threads so this isn't required.
>
> Out of interest, what's the reason for the different approaches?