As the original author of the x86 model I can provide a trip down memory lane.

The original x86 implementation had the model that ARM has now. Unfortunately, with two separate threads there was always a very difficult-to-resolve race when injecting an interrupt into the guest. The race exists because injecting an interrupt requires:
1. Read guest state
2. Validate the guest is interruptible
3. Modify guest state or request notification when guest is interruptible
If the guest state changes between steps 1 and 3 then there is a problem, so there needs to be a way to synchronize with the guest. No such method existed, since the thread APIs back then (the MCS kernel has a fix) only let you suspend a thread, which had the effect of cancelling all its IPCs, including any faults. So if you suspended a VCPU thread that had faulted, its fault message would get thrown away, which is a problem.
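To make the window concrete, here is a rough sketch of the two-thread injection path. Every name in it is hypothetical, just a stand-in for whatever guest-state access a VMM happens to have:

    /* Illustrative sketch only -- none of these names are real seL4 API. */
    typedef struct {
        int interrupt_window_open;   /* guest can accept an interrupt now */
        int pending_vector;          /* vector to deliver on next VM entry */
    } vcpu_state_t;

    /* Hypothetical helpers standing in for the VMM's guest-state access. */
    vcpu_state_t read_guest_state(void *vcpu);
    void         write_guest_state(void *vcpu, vcpu_state_t s);
    void         request_interrupt_window_exit(void *vcpu);

    void inject_interrupt(void *vcpu, int vector)
    {
        vcpu_state_t s = read_guest_state(vcpu);          /* 1. read     */

        if (s.interrupt_window_open) {                    /* 2. validate */
            /* RACE: the VCPU thread is still runnable between the read
             * above and the write below. It may close the interrupt
             * window, or take a fault whose IPC is now in flight to the
             * VMM. Suspending it here would cancel (and lose) that
             * fault message. */
            s.pending_vector = vector;
            write_guest_state(vcpu, s);                   /* 3. modify   */
        } else {
            request_interrupt_window_exit(vcpu);  /* 3'. ask to be notified */
        }
    }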
Running the VCPU and VMM at different priorities has its own problems and leads to a priority-management nightmare, which can still have races where some other event causes us to want to inject an interrupt at the same time as the VCPU takes a fault. Since thread states aren't queryable, the VMM, whilst injecting the interrupt, does not know that the VCPU is actually sitting on a pending fault.

All this led to the very deliberate decision to use a single thread for both VMX and native execution, as it allows the VMM to *know* that when its native thread is running the guest will not run, and to know precisely whether the guest is faulted or not. The other option discussed at the time was a 'suspend thread if running and return its current state' operation. This would have removed the problem of discarding a fault by blindly calling 'suspend', and would also have let the VMM know there was a fault to handle before injecting the interrupt. Ultimately this option was not chosen as it was deemed more invasive, with unknown proof consequences, and required more system calls, thus hurting performance. A third option would be to have the kernel inject interrupts, but this was never really considered, as moving something into the kernel just to avoid giving the user appropriate tools is a bit nasty and would have put policy in the kernel on how interrupt injection should happen.
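In pseudo-C the resulting VMM structure looks something like the following. The names are purely illustrative (not the actual libsel4 calls); the point is only that the guest can execute solely inside the 'enter guest' call the VMM makes on its own thread:

    /* Illustrative only: with a single kernel thread for guest and VMM,
     * the guest runs only inside enter_guest(); everywhere else in this
     * loop the VMM *knows* the guest is stopped, so injecting an
     * interrupt cannot race with a new guest fault. */
    typedef struct { int reason; } exit_info_t;

    exit_info_t enter_guest(void *vcpu);            /* resumes guest, returns on exit */
    void        queue_interrupt(void *vcpu, int v); /* safe: guest is not running     */
    void        handle_exit(void *vcpu, exit_info_t e);
    int         next_pending_interrupt(void);       /* -1 if none pending             */

    void vmm_loop(void *vcpu)
    {
        for (;;) {
            int irq = next_pending_interrupt();
            if (irq >= 0) {
                queue_interrupt(vcpu, irq);     /* no TOCTOU window: guest is idle  */
            }
            exit_info_t e = enter_guest(vcpu);  /* guest runs only during this call */
            handle_exit(vcpu, e);               /* guest is stopped again here      */
        }
    }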

ARM does not have this problem since interrupt injection goes via the vGIC, and since the kernel cannot give the user direct access to the vGIC registers there was no choice but to move it into the kernel. The ARM virtualization model of extra EL levels naturally lends itself to just letting threads run at higher privelege levels (with the VCPU just being the 'extra register storage space') and the notion of a 'guest' and 'vmm' become a user concept. In fact for the kernel aside from extra register saving and restoring there really is no difference between a thread with and without a VCPU, they can both a cspaces, make systems calls, and generally do whatever they want. In fact an EL1 thread with a VCPU could act as the fault handler (i.e. VMM) for another thread.
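For completeness, on the ARM side the injection path ends up as a single kernel invocation on the VCPU object. From memory it looks roughly like this (check the libsel4 headers for the exact signature; the priority/group/list-register values are just placeholders):

    #include <sel4/sel4.h>

    /* Rough sketch: the VMM never touches guest interrupt state directly,
     * it asks the kernel to program a vGIC list register via the VCPU
     * object. 'vcpu_cap' is a capability to the VCPU bound to the guest
     * TCB and 'virq' the virtual IRQ number, both from the VMM's own
     * bookkeeping. */
    static void vmm_inject_virq(seL4_CPtr vcpu_cap, int virq)
    {
        seL4_Error err = seL4_ARM_VCPU_InjectIRQ(vcpu_cap,
                                                 virq, /* virtual IRQ         */
                                                 0,    /* priority            */
                                                 0,    /* group               */
                                                 0);   /* list register index */
        if (err != seL4_NoError) {
            /* e.g. all list registers in use; try again later */
        }
    }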

Is one of the models inherently more desirable? Perhaps. Does it matter if they are different? Perhaps not. I shall withhold any such opinions as I have no say in these matters now, I'm just here to provide the stories.

Adrian

On Fri, Nov 9, 2018 at 8:43 AM Jeff Waugh <jdub@bethesignal.org> wrote:
On Fri, Nov 9, 2018 at 6:10 AM <Anna.Lyons@data61.csiro.au> wrote:
seL4's x86 and ARM virtualisation architectures are indeed different. On x86, we use one kernel thread for both guest and host, and use the thread state to differentiate what is currently running. On ARM, guest and host run in different kernel threads so this isn't required.

Out of interest, what's the reason for the different approaches?
_______________________________________________
Devel mailing list
Devel@sel4.systems
https://sel4.systems/lists/listinfo/devel