On 7/31/21, Gernot Heiser
Paravirtualising the (dynamically-linked) Linux libraries to IPC to a Linux server rather than trying to do a direct Linux syscall would be another one, and it shouldn’t be hard to do this form of paravirtualisation automatically. (We did it way back with the actual Linux kernel, which is a much harder thing to do.)
But even if for some reason that isn’t clear to me you insist on no changes at all to the Linux program (not even dynamic libraries? why?) then that would be quite solvable as well: Any seL4 syscall must be authorised by a cap, and the Linux syscalls won’t present a cap, so trigger an exception, and get forwarded to an exception handler. That receives the fault address and can emulate the Linux functionality.
I don't want to require Linux applications to use a specific patched libc since that would create all kinds of compatibility issues. I'm writing a general-purpose OS that is just as likely to be used on desktops and servers as on embedded systems, and I want to support as many Linux binaries as possible (ideally everything that doesn't depend on certain types of kernel modules and doesn't manage sessions/logins). Requiring a specific libc would significantly limit compatibility. Wouldn't there be a risk that a Linux system call would present an argument that happens to look like a capability and not get intercepted if system calls were implemented by just catching the existing invalid-syscall exceptions?
seL4 IPC isn’t like Linux IPC – it’s a user-controlled context switch, a mechanism for implementing protected procedure calls (https://microkerneldude.wordpress.com/2019/03/07/how-to-and-how-not-to-use-s...). Hence doing seL4 IPC across cores makes no sense. Layering a Linux-like IPC model on top (probably using Notifications rather than IPC) doesn’t require kernel changes.
QNX has IPC that is similar in concept to seL4 (i.e. functioning as a fast context switch whenever possible) but has no caveats or restrictions on cross-core use that I could find. And UX/RT's Unix-style filesystem API won't be one RPC service among many like on most academic and/or "Mach-like" microkernel OSes. It will be the sole form of user-visible IPC. This is sort of like on QNX but will go a bit further (QNX still provides user access to raw kernel IPC but it uses filesystem-based naming for basically all servers, at least in Neutrino). Like I said, there will be additional "rare mode" register- and shared-buffer-based versions of read/write (interoperable with the traditional versions) to allow for zero-copy IPC and eliminate the need for user-visible raw IPC. No support for cross-core endpoints would mean that UX/RT wouldn't be able to use endpoints for IPC at all and would have to only use notifications, which would significantly impact performance since the fastpath would never be used.