Demi Marie Obenour wrote:
While the basic framework is in place and performs well (it outperforms Linux without even trying too hard…), there are a number of questions that still need further research and are unlikely to be resolved by the time of the initial release. One of them is whether drivers should be active PDs (synchronising with notifications only) or passive PDs (using PPCs). There are a bunch of tradeoffs to consider, and we need a fair amount of experimental work to settle the question. The good news is that the effect of this choice on the driver as well as the client implementation is minimal (a few lines changed, possibly zero on the client side). I **strongly** recommend active drivers, for the following reasons:
1. While seL4 can perform context switches with incredible speed, lots of hardware requires very expensive flushing to prevent Spectre attacks. On x86, for instance, one must issue an Indirect Branch Predictor Barrier (IBPB), which costs thousands of cycles. Other processors, such as ARM, also require expensive flushing. In the general case, it is not possible to safely avoid these flushes on current hardware.

The issue of managing the number and timing of context switches is indeed critically important to any multi-process system. However, insofar as my original question was about comparing the mechanisms of "shared memory + IPC" and "shared memory + notifications" (which may or may not be captured by the terms "active driver" vs "passive driver", I'm not sure), I'm seeking to understand how these styles of communication operate independently of any policy that may be built on top of them.
Therefore, context switches must be kept to a minimum, no matter how good the kernel is at doing them.

Let us speak precisely here. The management of the cost of context switches, being a limited resource, is a policy-based determination which must be left up to the application to decide, not dictated by the system.

To put it another way, the act of making another, higher-priority thread runnable is what induces a context switch, whether that action is part of an IPC, a notification, a fault, or something else. If a piece of code decides to perform one of those actions, it is choosing to incur the cost of a context switch. It decides if and when to do so based on its own policy and the compromises that policy necessarily entails, and those decisions are made independently of the mechanisms provided by the kernel. What the system should provide is mechanisms that allow the correct trade-offs to be made, which is why I'm especially curious to see how we can support things like scatter-gather and segmentation offloading, which are critical on other platforms and, I expect, on this one too.
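For concreteness, the two mechanisms I'm comparing look roughly like this at the Microkit level. This is only a sketch: the channel number and `process_requests()` are placeholders of mine, and I'm assuming the standard `microkit.h` entry points (`notified()` for the notification-driven case, `protected()` for the PPC case).

```c
#include <microkit.h>

#define CLIENT_CH 0   /* hypothetical channel to the client PD */

/* Illustrative stand-in for draining a shared-memory request queue. */
static void process_requests(void)
{
    /* ... pop requests from the shared ring, program the device ... */
}

void init(void)
{
    /* device setup would go here */
}

/* Active driver: the client enqueues requests in shared memory and signals
 * this channel; the driver drains the queue and notifies back once
 * completions have been enqueued. */
void notified(microkit_channel ch)
{
    if (ch == CLIENT_CH) {
        process_requests();
        microkit_notify(CLIENT_CH);
    }
}

/* Passive driver: the client enqueues requests and then makes a protected
 * procedure call; the kernel switches straight into this handler and
 * returns to the client when it replies. */
microkit_msginfo protected(microkit_channel ch, microkit_msginfo msginfo)
{
    process_requests();
    return microkit_msginfo_new(0, 0);
}
```

The two entry points are shown side by side only for comparison; a real driver would use one style or the other, and the interesting part is what the choice costs in context switches and policy flexibility.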
2. Passive drivers only make sense if the drivers are trusted. In the case of e.g. USB or network drivers, this is a bad assumption: both are major attack vectors, and drivers ported from e.g. Linux are unlikely to be hardened against malicious devices.
Hmm, this I am quite surprised by. Is this an expected outcome of the seL4 security model? It implies that a rather large swath of kernel functionality (the IPC fastpath, cap transfer, the single-threaded event model) is simply not available to mutually suspicious PDs. I'm very concerned about the expansion of the trusted computing base (TCB) into userspace, for both performance and assurance reasons.
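Either way, the practical upshot of treating the driver as untrusted is that the client has to validate everything it reads back over shared memory. A minimal sketch of that kind of defensive check, with a descriptor layout and constants I have invented for illustration (not sDDF's actual format):

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical completion descriptor written into shared memory by the
 * (untrusted) driver. */
struct completion {
    uint32_t buf_index;   /* which buffer in the client's pool was filled */
    uint32_t len;         /* how many bytes the driver claims to have written */
};

#define POOL_SLOTS 512u
#define SLOT_SIZE  2048u

/* The caller should first copy the descriptor out of shared memory into a
 * local struct (avoiding double-fetch races), then range-check every field
 * against values the client itself established before touching the buffer
 * the descriptor names. */
static bool completion_is_sane(const struct completion *c)
{
    if (c->buf_index >= POOL_SLOTS) {
        return false;     /* driver named a buffer the client never handed out */
    }
    if (c->len > SLOT_SIZE) {
        return false;     /* driver claims more data than the slot can hold */
    }
    return true;
}
```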
3. A passive driver must be on the same core as its caller, since cross-core PPC is deprecated. An active driver does not have this restriction. Active drivers can therefore run concurrently with the programs they serve, which is the same technique used by Linux's io_uring with SQPOLL.

For a high-performance driver, I would expect to have at least one TX and one RX queue per CPU (sketched below); a local call should therefore always be possible. The deprecation of cross-CPU IPC reflects the basic premise that spreading work across CPUs is generally not a good idea.
But yes, if the user wants to distribute the workload in that way, passing data between CPUs, obviously the IPC fastpath is off the table, and notifications seem like a pretty clear choice in that case. (However, the existence of this use case is _not_ a reason to sacrifice the performance of the same-core run-to-completion (RTC) execution model.)
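To spell out the per-core layout I have in mind (names, sizes, and the ring structure are invented for illustration): each core gets its own TX/RX pair, so submission and completion stay core-local on the fast path.

```c
#include <stdint.h>

#define NUM_CORES  4      /* illustrative */
#define RING_SLOTS 256

/* Minimal stand-in for a single-producer/single-consumer ring. */
struct ring {
    uint32_t head;
    uint32_t tail;
    uint64_t slots[RING_SLOTS];
};

/* One TX/RX ring pair per core: a client running on core N only ever
 * touches pairs[N], so no cross-core call (and no cross-core cache
 * traffic) is needed to submit work or reap completions. */
struct queue_pair {
    struct ring tx;   /* client -> driver */
    struct ring rx;   /* driver -> client */
};

static struct queue_pair pairs[NUM_CORES];

static struct queue_pair *local_pair(unsigned core)
{
    return &pairs[core % NUM_CORES];
}
```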
4. Linux's io_uring has the application submit a bunch of requests into a ring buffer and then tell the kernel to process them. The kernel processes the requests asynchronously and writes completions into another ring buffer. Userspace can use poll() or epoll() to wait for a completion to arrive. This is a fantastic fit for active drivers: the submission syscall is replaced by a notification, and another notification is used for wakeup on completion.
Agreed, it's a very attractive model. Indeed, that is basically how I got started on this line of thinking; it is quite apparent that these command-ring/pipe-like structures are very flexible and could be used as the building blocks of entire systems. So the question I wanted to answer was: what are we leaving on the table if we go with this approach, particularly given the emphasis on IPC, its optimizations, and the contention that fast IPC is the foundational element of a successful, performant microkernel system?

-Eric
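P.S. For concreteness, the kind of command-ring structure I have in mind is roughly the following single-producer/single-consumer ring in shared memory. This is only a sketch under my own assumptions (slot format, sizes, and memory-barrier choices are illustrative, not sDDF's actual queue implementation); the "doorbell" on submission and the wakeup on completion would be notifications, as described above.

```c
#include <stdint.h>
#include <stdbool.h>

#define RING_SLOTS 256u               /* power of two, so masking handles wrap-around */
#define RING_MASK  (RING_SLOTS - 1u)

/* Single-producer/single-consumer ring in shared memory.  head and tail are
 * free-running counters: the producer owns head, the consumer owns tail.
 * One ring carries submissions (client -> driver); a second carries
 * completions (driver -> client). */
struct ring {
    volatile uint32_t head;
    volatile uint32_t tail;
    uint64_t          entries[RING_SLOTS];   /* opaque request/completion descriptors */
};

static bool ring_push(struct ring *r, uint64_t entry)
{
    uint32_t head = r->head;
    if (head - r->tail == RING_SLOTS) {
        return false;                          /* ring is full */
    }
    r->entries[head & RING_MASK] = entry;
    __atomic_thread_fence(__ATOMIC_RELEASE);   /* publish the entry before the index */
    r->head = head + 1;
    return true;
}

static bool ring_pop(struct ring *r, uint64_t *entry)
{
    uint32_t tail = r->tail;
    if (tail == r->head) {
        return false;                          /* ring is empty */
    }
    __atomic_thread_fence(__ATOMIC_ACQUIRE);   /* read the index before the entry */
    *entry = r->entries[tail & RING_MASK];
    r->tail = tail + 1;
    return true;
}
```

A client would push a batch of requests, signal the driver once (the notification playing the role of io_uring's submission syscall), and then block until the driver signals that completions are waiting in the other ring.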