Hello Andrew,

On 2024-04-19 12:40, Andrew Warkentin wrote:
> On Fri, Apr 19, 2024 at 4:22 AM Indan Zupancic <indan@nul.nu> wrote:
>> Because it is asynchronous: You do 10k system calls which queue some data, QNX can do one context switch to the server to handle all 10k of them in one go.
>> The client needs to read a reply from the server for each message sent to make it equivalent to seL4's code.
>
> QNX's read() appears to be a very simple wrapper around MsgSend(), which is synchronous and has similar RPC-like semantics to seL4_Call except that it operates on a user-provided arbitrary-length buffer instead of a fixed one.

If what you think is true, QNX can do a context switch in 100 ns, or the whole send and reply with two context switches in 600 ns for the custom server version. I have my doubts; I thought context switch overhead was higher on x86.

Do you have access to the source code of the QNX version you are running?

Just write a low-level test using MsgSend() and MsgReply() directly and add a way to verify that all 10k calls are handled correctly. E.g. pass an integer, let the server increment it, and use the reply for the next call. The final value should be 10k.

There's probably a mistake in the seL4 code or kernel configuration. Even when using the slow path I wouldn't expect such high numbers, at least not on ARM (I have no experience with seL4 on x86). The 11 us per iteration you get is extremely slow, even for a debug build (unless you get debug prints now and then).

Greetings,

Indan
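
A minimal sketch of the counter check suggested above, for illustration only. It does not use the QNX or seL4 APIs: as a portable stand-in it uses two POSIX pipes and fork(), so the absolute timings say nothing about either kernel. The function name run_pingpong and its parameters are hypothetical. On QNX the client's write/read pair would presumably become a single MsgSend(), and the server's read/write pair a MsgReceive() followed by MsgReply().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Read exactly one int from fd; 0 on success, -1 on error or EOF. */
static int read_int(int fd, int *v) {
    size_t got = 0;
    while (got < sizeof *v) {
        ssize_t n = read(fd, (char *)v + got, sizeof *v - got);
        if (n <= 0) return -1;
        got += (size_t)n;
    }
    return 0;
}

/* Write one int to fd; 0 on success, -1 on error. */
static int write_int(int fd, int v) {
    return write(fd, &v, sizeof v) == (ssize_t)sizeof v ? 0 : -1;
}

/*
 * Hypothetical helper: do `iterations` synchronous round trips.
 * The server increments each value it receives; the client feeds every
 * reply into the next request. Returns the final counter (which should
 * equal `iterations` if every call was really handled), or -1 on error.
 * If ns_per_iter is non-NULL it receives the mean round-trip time.
 */
int run_pingpong(int iterations, double *ns_per_iter) {
    assert(iterations >= 0);
    int req[2], rep[2];
    if (pipe(req) != 0) return -1;
    if (pipe(rep) != 0) { close(req[0]); close(req[1]); return -1; }

    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        /* Server: receive, increment, reply, until the client closes. */
        close(req[1]);
        close(rep[0]);
        int v;
        while (read_int(req[0], &v) == 0)
            if (write_int(rep[1], v + 1) != 0) _exit(1);
        _exit(0);
    }

    /* Client: one request and one reply per iteration, no batching. */
    close(req[0]);
    close(rep[1]);
    struct timespec t0, t1;
    int counter = 0, ok = 1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < iterations; i++) {
        if (write_int(req[1], counter) != 0 ||
            read_int(rep[0], &counter) != 0) { ok = 0; break; }
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    close(req[1]);            /* EOF makes the server loop exit */
    close(rep[0]);
    waitpid(pid, NULL, 0);
    if (!ok) return -1;

    if (ns_per_iter && iterations > 0)
        *ns_per_iter = ((t1.tv_sec - t0.tv_sec) * 1e9 +
                        (t1.tv_nsec - t0.tv_nsec)) / iterations;
    return counter;
}
```

Calling run_pingpong(10000, &ns) and checking that the result is exactly 10000 confirms that each request got its own reply. A smaller final value, or a per-iteration time far below what two context switches can cost, would suggest that calls are being batched or skipped.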