On 12/2/20 9:51 PM, Andrew Warkentin wrote:
On 12/2/20, Demi M. Obenour <demiobenour@gmail.com> wrote:
That said, a set of userland utilities to run multiple seL4-based systems on the same kernel could arguably be considered a hypervisor, and I consider it quite practical. There may also be cases where the extra layer of page tables used in a VM is necessary for some reason. The only one I can think of is to prevent physical addresses from being leaked, since obtaining them is a prerequisite to Rowhammer attacks.
I don't really want to add any kind of multi-personality infrastructure to UX/RT beyond VFS-level containerization, since it would complicate the architecture a fair bit.
That’s understandable. From my perspective, it appears that the only change required would be to swap out the VMM, since the same kernel capabilities would be required either way. The only difference would be that a nested instance of UX/RT would need to get untyped memory objects somehow, which seems simple. Of course, I could very well be missing something here; if this would meaningfully increase the complexity of the system, it probably isn’t worth it.

That said, from the way you phrased your message, I thought you were referring to a type-1 hypervisor that would run below UX/RT. IMO, that is where such a tool really belongs: it can provide strong isolation guarantees between multiple seL4-based systems, and still allow each of those systems to use hardware virtualization if they so desire. For instance, an embedded system might have high-assurance components running on the seL4 Core Platform, while UX/RT is used as a replacement for VMs running Linux. Similarly, a hypothetical seL4-based QubesOS might use this type-1 hypervisor to isolate qubes from each other.

FYI, since you plan on a Linux compatibility layer, you might want to contact the illumos developers. illumos is, of course, a completely different OS design, but they do have a full Linux compatibility layer and might be able to give some implementation advice.
There is already going to be support for hosting device backends since I want the option of virtualizing other systems alongside UX/RT. I also want to be able to run complete UX/RT VMs including their own kernels for testing/development purposes (and possibly running older applications, although UX/RT will generally try to maintain backwards compatibility unless there is a good reason not to). It's very likely that there could be issues that only show up when running as a VM and not as an alternate personality (or issues that only appear when running as an alternate personality), and also the lack of a stable kernel API would make mixing versions much more difficult.
An officially supported, API- and ABI-stable C library is planned, so this may not be a roadblock for much longer.
I don't really think there would be any serious issues with nesting microkernels in VMs. Nesting instances of the same monolithic kernel is already common with type 2 VMMs, and I would imagine there would be less overhead with nesting lightweight microkernels than there would be with monolithic kernels.
I can’t think of any blockers, but I can think of three serious (IMO) disadvantages:

1. The need for an emulator removes many of the assurance guarantees provided by seL4, since one must rely on the correctness of the emulator to prevent in-VM privilege escalation vulnerabilities. Such vulnerabilities are not uncommon in existing hypervisors.

2. Nested hardware virtualization is quite difficult to implement, and has significant overhead. On the other hand, nested virtualization based on seL4 capabilities is free.

3. I doubt seL4 supports issuing any specialized hypercall instructions, so you might need to fall back to emulation.

Again, UX/RT is your project; I merely hope my statements will be of use. Take or leave them as you wish.

Sincerely,

Demi