On 10/16/22 13:56, Gernot Heiser wrote:
On 16 Oct 2022, at 23:25, Harry Butterworth wrote:
The "single producer, single consumer, lockless bounded queues implemented as ring buffers" also caught my attention.
In the past, to minimize jitter, I found it useful to have a pool of threads consuming work from a single queue. This reduces the probability that tasks will be significantly delayed behind other tasks that are taking an unusually long time, because additional threads in the pool become free and allow subsequent tasks to bypass the ones taking too long. This is well known and often implemented in post offices where there is a single queue for multiple counters. This implies MPMC (or perhaps SPMC and MPSC in pairs).
Hi Harry, Stewart,
The whole point here is that we don’t think we need this, and as a result can keep implementation complexity as well as overheads low. SPMC/MPSC is inherently more complex than SPSC, and, judging by the papers I recently read that use it, I’m reasonably confident we’ll at least match the performance of such approaches (and I'll offer a grovelling retraction if proven wrong ;-).
How do you plan to handle multi-queue devices? Modern devices often have multiple queues so that they can be used from multiple cores without any CPU-side synchronization.
In our design there’s only one place where there’s a 1:n mapping: the multiplexer. All it does is move pointers from a driver-side ring to a set of per-client rings (input) or from a set of per-client rings to a driver-side ring (output). On output it will apply a simple policy when deciding which client to serve (priority-based, round-robin, or bandwidth limiting). A particular multiplexer will just implement one particular policy, and you can pick the one you want. Basically we’re looking at building a Lego set, where every Lego block is single-threaded and can run on a different core.
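To make the description above concrete, here is a minimal sketch of an output multiplexer with a round-robin policy, written against C11 atomics. It is an illustration under stated assumptions, not the actual implementation: the names ring_t, mux_t, ring_enqueue, ring_dequeue and mux_output_round_robin, and the constants RING_SIZE and NUM_CLIENTS, are all hypothetical. The point it tries to show is that every ring has exactly one producer and one consumer, so the multiplexer itself needs no locks.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE   8u   /* hypothetical capacity; must be a power of two */
#define NUM_CLIENTS 3u   /* hypothetical number of clients */

/* SPSC ring of buffer pointers: only the producer writes `tail`,
 * only the consumer writes `head`. */
typedef struct {
    _Atomic uint32_t head;   /* next slot to dequeue (consumer-owned) */
    _Atomic uint32_t tail;   /* next slot to enqueue (producer-owned) */
    void *slots[RING_SIZE];
} ring_t;

static bool ring_full(ring_t *r)
{
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    uint32_t head = atomic_load_explicit(&r->head, memory_order_acquire);
    return tail - head == RING_SIZE;
}

static bool ring_enqueue(ring_t *r, void *buf)
{
    if (ring_full(r)) {
        return false;
    }
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    r->slots[tail % RING_SIZE] = buf;
    /* Release: the slot contents become visible before the new tail. */
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return true;
}

static bool ring_dequeue(ring_t *r, void **buf)
{
    uint32_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);
    if (head == tail) {
        return false;        /* empty */
    }
    *buf = r->slots[head % RING_SIZE];
    /* Release: the slot has been read before it is handed back. */
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return true;
}

/* Output multiplexer: sole consumer of each client ring,
 * sole producer of the driver ring. */
typedef struct {
    ring_t *clients[NUM_CLIENTS];
    ring_t *driver;
    unsigned next;           /* client to serve next (round-robin state) */
} mux_t;

/* Move at most one pointer per client per round until all client rings
 * are empty or the driver ring is full. Returns the number moved. */
static unsigned mux_output_round_robin(mux_t *m)
{
    unsigned moved = 0;
    unsigned idle = 0;       /* consecutive clients found empty */

    while (idle < NUM_CLIENTS && !ring_full(m->driver)) {
        void *buf;
        if (ring_dequeue(m->clients[m->next], &buf)) {
            /* Cannot fail: we are the only producer and just saw space. */
            bool ok = ring_enqueue(m->driver, buf);
            assert(ok);
            (void)ok;
            moved++;
            idle = 0;
        } else {
            idle++;
        }
        m->next = (m->next + 1) % NUM_CLIENTS;
    }
    return moved;
}
```

A priority-based or bandwidth-limiting multiplexer would differ only in how it picks the next client ring; the rings themselves would stay the same.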
How do you plan on handling access control? Using a block device as an example, a client must not be able to perform any requests to regions of the block storage that it is not authorized to access. This could either be handled in the multiplexer itself or by having the multiplexer include an unforgeable client ID with each request sent to the driver. Also, what are the consequences of a compromised driver? Will drivers be able to escalate privileges directly, or will the multiplexer and client libraries enforce some invariants even in this case?
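As a sketch of the first option (enforcing the check in the multiplexer itself), the range test could look roughly like the following. Everything here is an assumption for illustration: the types block_grant_t and block_req_t and the function request_allowed are hypothetical, each client is assumed to be granted one contiguous block range, and the multiplexer is assumed to know which client a request came from by which per-client ring it arrived on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-client grant: one contiguous range of blocks. */
typedef struct {
    uint64_t first_block;
    uint64_t num_blocks;
} block_grant_t;

/* Hypothetical request as seen by the multiplexer. */
typedef struct {
    uint64_t block;   /* first block of the request */
    uint64_t count;   /* number of blocks */
} block_req_t;

/* Returns true only if the whole request lies inside the client's grant. */
static bool request_allowed(const block_grant_t *g, const block_req_t *req)
{
    if (req->count == 0 || req->block < g->first_block) {
        return false;
    }
    uint64_t offset = req->block - g->first_block;
    /* Written without addition so that huge values cannot wrap around. */
    return offset < g->num_blocks && req->count <= g->num_blocks - offset;
}
```

The multiplexer would run this check before moving a request pointer to the driver-side ring and reject anything that fails it.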
This keeps all rings SPSC, and every single piece very simple (and likely verifiable), and should be able to maximise concurrency of the whole system.
I agree, with the above caveat about multi-queue devices.
We haven’t implemented and evaluated the multiplexers yet, but that’ll be one of the first things we’ll do when Lucy returns from internship/vacation in early Jan (and I’m very much looking forward to analysing results).
Will it be possible for clients to pre-register buffers with the multiplexer, and for the multiplexer to in turn register them with the driver? That would allow for devices to DMA directly to client buffers while still having the IOMMU restricting what the driver can do.

-- 
Sincerely,
Demi Marie Obenour (she/her/hers)