On 10/16/22 09:09, Lucy Parker via Devel wrote:
Hi Stewart,

The ring buffers were previously implemented as a library on CAmkES. You can find the PR here: https://github.com/seL4/projects_libs/pull/15

The move to the Core Platform was made because of performance overheads in the CAmkES framework. There is no reason the sDDF can’t be used on other platforms, though the sample system would need porting.
Hi Harry,

The use of single-producer, single-consumer queues was an intentional simplification for the driver framework. It makes the framework much easier to reason about and verify, and it eliminates many possible concurrency bugs. The framework does not require single-threaded address spaces, though. We could have multiple threads acting as a single component, but they would be servicing different queues (and thus the queues would remain single-producer, single-consumer). Ideally, for larger systems with multiple clients, we would use a multiplexing component to service multiple queues between client applications and instead grow the stack laterally. This design aims to provide a strong separation of concerns, as each job would be a separate component, and the simplicity of the queues means there is little performance overhead. On a multicore system, each of these components would run on a separate core.
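To make the single-producer, single-consumer point concrete, here is a minimal sketch of such a queue in C11. It is not the sDDF ring buffer: the names (spsc_ring_t, buf_desc_t, ring_enqueue, ring_dequeue), the descriptor layout, and the capacity are all assumptions chosen for illustration. The point it shows is that when exactly one side writes each index, no locks or compare-and-swap loops are needed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical capacity, chosen for illustration only. */
#define RING_SIZE 8u
static_assert((RING_SIZE & (RING_SIZE - 1u)) == 0,
              "RING_SIZE must be a power of two so index wraparound stays consistent");

/* Hypothetical buffer descriptor; not the sDDF layout. */
typedef struct {
    uintptr_t addr; /* address (or offset) of the data buffer */
    uint32_t len;   /* length of the data in bytes */
} buf_desc_t;

typedef struct {
    _Atomic uint32_t head; /* next slot to read; written only by the consumer */
    _Atomic uint32_t tail; /* next slot to write; written only by the producer */
    buf_desc_t slots[RING_SIZE];
} spsc_ring_t;

static void ring_init(spsc_ring_t *r)
{
    atomic_init(&r->head, 0);
    atomic_init(&r->tail, 0);
}

/* Producer side. Returns false if the ring is full. */
static bool ring_enqueue(spsc_ring_t *r, buf_desc_t d)
{
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    uint32_t head = atomic_load_explicit(&r->head, memory_order_acquire);
    if (tail - head == RING_SIZE) {
        return false;
    }
    r->slots[tail % RING_SIZE] = d;
    /* Release: the slot contents become visible before the new tail does. */
    atomic_store_explicit(&r->tail, tail + 1u, memory_order_release);
    return true;
}

/* Consumer side. Returns false if the ring is empty. */
static bool ring_dequeue(spsc_ring_t *r, buf_desc_t *out)
{
    uint32_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);
    if (head == tail) {
        return false;
    }
    *out = r->slots[head % RING_SIZE];
    /* Release: the slot is handed back to the producer only after it is read. */
    atomic_store_explicit(&r->head, head + 1u, memory_order_release);
    return true;
}
```

Because only the producer ever stores to tail and only the consumer ever stores to head, a pair of acquire/release operations per call is enough. Adding a second producer or consumer to the same ring would break that property, which is why extra threads or clients get their own queues instead.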
You can probably hide most of this for devices where a single hardware thread is sufficient to handle the device and the devices are independent, but if you are doing anything involving the CPU in conjunction with a fast modern network device, then one hardware thread won’t be enough.
We propose adding a second hardware thread, so that one thread services each direction and each queue remains single-producer, single-consumer. We still have some work to do to expand the sDDF to other device classes (and this will involve benchmarking the framework on high-throughput networking systems), so more on this to come! :)
For network devices, have you considered using hardware receive-side scaling to shard the workload among multiple independent cores? My understanding is that this is the best solution to this problem, as the fast paths operate with no cross-core synchronization.

--
Sincerely,
Demi Marie Obenour (she/her/hers)
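For readers unfamiliar with receive-side scaling, the following C sketch shows the general idea in software: hash a packet's flow identifier, look the hash up in an indirection table, and deliver the packet to the receive queue (and thus the core) that the table names. Real NICs do this in hardware and typically use a Toeplitz hash with a configurable key; the FNV-1a hash here is a simple stand-in, and the names (flow_key_t, indir_init, select_queue) and the table and queue sizes are assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sizes, chosen for illustration only. */
#define NUM_QUEUES 4u   /* one receive queue per core */
#define INDIR_SIZE 128u /* entries in the indirection table */
static_assert(NUM_QUEUES <= 256u, "queue ids must fit in a uint8_t table entry");

/* Flow identifier: the usual IPv4 4-tuple. */
typedef struct {
    uint32_t src_ip;
    uint32_t dst_ip;
    uint16_t src_port;
    uint16_t dst_port;
} flow_key_t;

/* Mix the low nbytes bytes of value into an FNV-1a hash state. */
static uint32_t fnv1a_mix(uint32_t hash, uint32_t value, unsigned nbytes)
{
    for (unsigned i = 0; i < nbytes; i++) {
        hash ^= (value >> (8u * i)) & 0xffu;
        hash *= 16777619u; /* FNV-1a 32-bit prime */
    }
    return hash;
}

/* Stand-in for the NIC's hash; hardware RSS typically uses Toeplitz instead. */
static uint32_t flow_hash(const flow_key_t *k)
{
    uint32_t h = 2166136261u; /* FNV-1a 32-bit offset basis */
    h = fnv1a_mix(h, k->src_ip, 4);
    h = fnv1a_mix(h, k->dst_ip, 4);
    h = fnv1a_mix(h, k->src_port, 2);
    h = fnv1a_mix(h, k->dst_port, 2);
    return h;
}

/* Fill the indirection table round-robin across the queues. */
static void indir_init(uint8_t table[INDIR_SIZE])
{
    for (uint32_t i = 0; i < INDIR_SIZE; i++) {
        table[i] = (uint8_t)(i % NUM_QUEUES);
    }
}

/* Pick the receive queue for a flow. */
static unsigned select_queue(const uint8_t table[INDIR_SIZE], const flow_key_t *k)
{
    return table[flow_hash(k) % INDIR_SIZE];
}
```

Since the queue is a pure function of the flow key, every packet of a given flow lands on the same queue, so each core can run its own independent driver and protocol path with no shared state on the fast path. Each of those per-core receive queues still has exactly one producer (the device) and one consumer.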