On Apr 13, 2020, at 04:17, Andrew Warkentin wrote:
On 4/12/20, Heiser, Gernot (Data61, Kensington NSW) wrote:
Sure, OS structure matters a lot, and I’m certainly known for consistently telling people that IPC payloads of more than a few words are a strong indicator of a poor design. Microkernel IPC should be considered a protected function call mechanism, and you shouldn’t pass more by-value data than you would to a C function (see https://microkerneldude.wordpress.com/2019/03/07/how-to-and-how-not-to-use-s...).
I would think that an RPC layer with run-time marshaling of arguments, as is used for the IPC transport layer on most microkernel OSes, would add some overhead even if it uses the underlying IPC layer properly: it has to iterate over the list of arguments, determine the type of each, and copy it to a buffer, with the reverse happening on the receiving end. Passing around bulk unstructured/opaque data is quite common (e.g. for disk and network transfers), and an RPC-based transport layer adds unnecessary overhead and complexity to such use cases.
I think a better transport layer design (for an L4-like kernel at least) would be one that maintains RPC-like call-return semantics but, rather than marshaling, exposes the message registers and a shared memory buffer almost directly. The only additions would be a message length, a file offset, and a type code (to indicate whether a message is a short register-only message, a long message in the buffer, or an error). This is what I plan to do on UX/RT, which will have a Unix-like IPC transport layer API that provides new read()/write()-like functions that operate on message registers or the shared buffer rather than copying as the traditional versions do (the traditional versions will of course still be present, implemented on top of the "raw" versions).
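To make the idea concrete, here is a rough sketch of what such a header and the "raw" calls might look like. It is only an illustration under my own assumptions: every name in it (msg_header, channel, chan_write_short, chan_commit_long, chan_payload) is hypothetical, the register count and buffer size are arbitrary, and a plain struct stands in for the real message registers and shared buffer. It is not UX/RT's or seL4's actual API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names throughout: this is not UX/RT's or seL4's real API. */
enum { MSG_REGS = 4, SHARED_BUF_SIZE = 4096 };

typedef enum { MSG_SHORT, MSG_LONG, MSG_ERROR } msg_type;

/* The only metadata added on top of the raw registers and buffer. */
typedef struct {
    msg_type type;    /* short (registers), long (buffer), or error */
    size_t   length;  /* payload length in bytes */
    int64_t  offset;  /* file offset the payload refers to */
} msg_header;

/* Stand-in for one endpoint's message registers plus shared buffer. */
typedef struct {
    msg_header hdr;
    uintptr_t  regs[MSG_REGS];
    uint8_t    buf[SHARED_BUF_SIZE];
} channel;

/* Expose the shared buffer so the caller can fill it in place. */
static uint8_t *chan_buffer(channel *ch) { return ch->buf; }

/* Send a payload small enough to fit in the registers; returns 0 or -1. */
static int chan_write_short(channel *ch, const void *data, size_t len, int64_t off)
{
    if (len > sizeof ch->regs) return -1;
    memcpy(ch->regs, data, len);
    ch->hdr = (msg_header){ MSG_SHORT, len, off };
    return 0;
}

/* Announce len bytes already placed in the shared buffer; nothing is copied. */
static int chan_commit_long(channel *ch, size_t len, int64_t off)
{
    if (len > SHARED_BUF_SIZE) return -1;
    ch->hdr = (msg_header){ MSG_LONG, len, off };
    return 0;
}

/* Receiver side: return a pointer to the payload wherever it lives. */
static const void *chan_payload(const channel *ch, size_t *len)
{
    assert(ch->hdr.type != MSG_ERROR);
    *len = ch->hdr.length;
    return ch->hdr.type == MSG_SHORT ? (const void *)ch->regs
                                     : (const void *)ch->buf;
}
```

The point of the sketch is that the long-message path never walks an argument list or inspects types: the sender fills the buffer in place and only the three header fields change hands. A traditional copying write() could then be a thin wrapper that does a memcpy into chan_buffer() followed by chan_commit_long().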
RPC with marshaling could easily still be implemented on top of such a transport layer (for complex APIs that need marshaling) with basically no overhead compared to an RPC-based transport layer.
This is effectively the approach CAmkES takes: it generates specialised RPC entry points that use seL4’s IPC mechanisms. With enough information provided at compile time, you can generate type-safe and performant marshaling code. With some care, you can generate code the compiler can easily see through and optimise. For passing bulk data, you can either use shared memory or unmap/remap pages during RPC. When we did some optimisation work on CAmkES for a paper (~2014?) we were able to comfortably hit the limit of what one could have achieved with hand-optimised IPC.
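As a rough illustration of what "specialised" means here, the sketch below shows the general shape such a stub pair could take for a single method taking two 32-bit arguments. It is not actual CAmkES output and does not use the seL4 API: ipc_msg, ipc_call, calc_add and the method number are hypothetical stand-ins, and the "kernel" call simply runs the server dispatch inline so the example is self-contained.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins: not CAmkES output and not the seL4 API. */
enum { MSG_REGS = 4 };
typedef struct { uintptr_t mr[MSG_REGS]; } ipc_msg;

enum { METHOD_ADD = 1 };

/* Server-side implementation of the interface method. */
static uint32_t calc_add(uint32_t a, uint32_t b) { return a + b; }

/* Specialised server dispatch: argument positions and types are fixed
 * when the stub is generated, so there is no run-time type inspection. */
static void calc_dispatch(ipc_msg *m)
{
    switch (m->mr[0]) {
    case METHOD_ADD:
        m->mr[0] = calc_add((uint32_t)m->mr[1], (uint32_t)m->mr[2]);
        break;
    default:
        m->mr[0] = 0;  /* unknown method: reply with zero in this sketch */
        break;
    }
}

/* Stand-in for the kernel's call/reply IPC; here it runs the server inline. */
static void ipc_call(ipc_msg *m) { calc_dispatch(m); }

/* Specialised client stub: each argument goes straight into a register slot. */
static inline uint32_t calc_add_stub(uint32_t a, uint32_t b)
{
    ipc_msg m;
    m.mr[0] = METHOD_ADD;
    m.mr[1] = a;
    m.mr[2] = b;
    ipc_call(&m);
    return (uint32_t)m.mr[0];
}
```

Because the stub is a handful of fixed stores and loads with no loops or type tags, a compiler can inline it into the caller, which is the "see through and optimise" property mentioned above.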