
On Sat, Mar 8, 2025 at 3:12 AM Gernot Heiser via Devel <devel@sel4.systems> wrote:
I don’t see a reason for providing bulk data copying as a kernel service (despite Liedtke doing this in the original L4, in violation of his own minimality principle). It can (and therefore should) be provided as a user-level service, which is why we dropped it in seL4. And, done properly, it’ll easily outperform QNX’s in-kernel implementation.
I can't come up with a way for such a service to determine the server's address space short of tying endpoints to server address spaces, which is something I want to avoid, since I want the option of handing off connections to threads/processes with different address spaces. The other issue is with mapping the source and destination address spaces; the most "straightforward" ways I can think of to do this would either be to map both source and destination buffers on every IPC, which would be extremely inefficient (especially with the requirement to have a separate capability for each VSpace a page or table is mapped into, meaning one system call would become potentially hundreds or more depending on the buffer sizes), or have an identity mapping of all of physical memory with user-level shadow page tables to look up each page before copying it, which could also be somewhat slower than what could be done in the kernel, where you could have table-sized address space windows reserved for source and destination address spaces, with only two table updates required if both buffers lie within a single table's address space. I did actually end up using the kernel's physical address space window and software walking of page tables in my initial in-kernel implementation since it might have ended up being slightly easier to do initially, but at some point I want to replace it with a table-window-based setup.
Our design (details of which are still very much in flux and thus too early to publish) reduces the TCB to something that should be verifiable.
Presumably this has a novel API designed specifically for verification and relegates Unix programs to a "penalty box"? I'm specifically trying to write a better QNX than QNX and a better Linux than Linux that allows incremental porting and provides all of its advanced features to Unix software, rather than a completely new system that has a separate limited compatibility environment for existing software, so being able to verify a dynamic system with a native API that's incompatible with everything else doesn't help me at all. I suspect that verifying a Unix-like OS (even something like what I'm writing, with read and write being its only real primitives) could end up being rather difficult, although I don't have a background in formal verification so I'm not completely sure on that.