[seL4] Re: seL4 vs QNX and Linux benchmarks

20 Apr 2024

      The topic of sel4 vs QNX performance came up in the developer hangout and I
found the entire discussion perplexing for the following reasons:

 - not inconsequential, Gernot, who I presume is reasonably expert, was
strongly of the opinion it was a non-issue.
 - marginal performance gains, even if possible, are not even remotely
warranted at the cost of any security whatsoever.
 - some timing issues may be security critical and altering them could
compromise the security of the kernel.
 - hardware capacity is vastly improved in speed as of 2024 by the time
sel4 is fully productionized hardware capacity will most likely erase any
differential in speed.
 - I do not find it hard to believe that sel4 is perfectly adequate as RTOS
within its current architecture.
 - There does not seem to me to be any advantage in modifying access to the
kernel and there are plenty of disadvantages. It's all downside. The OS is
open source, so people are at liberty to modify as they please. They just
can't call it sel4.
 - To me, the moniker 'sel4' is a promise and a guarantee and that promise
cannot reasonably be kept by making special use-case exceptions.
 - QNX minimum requirements call for a greater than 1GHz processor here:
https://www.qnx.com/developers/docs/6.5.0SP1.update/com.qnx.doc.momentics_in...
 - Say, for the sake of argument, there is a system call scenario whereby
the kernel architecture is causing an unwarranted delay. That might be a
defect and could be addressed as such.

Finally, there is an old development rule of thumb that first you make it
work before you look to optimize.

I have been researching security since the late 1980s. I designed and wrote
the secure network access system for a Canadian bank in the early 1990s. As
it happens, I designed a security protocol twenty years ago that is in use
by the Canadian mint to secure money. Since 1992, security has been an
integral part of my ongoing research project into data packaging. Years ago
I got permission to release code using Gutmann's cryptlib and I spent a
great deal of time reviewing and learning from that code.

I don't really hold myself out as an expert WRT security. However, I have
much to say about it, and I do have some knowledge. I would not presume to
question Gernot's judgment with respect to sel4 either security or
otherwise. Gernot has expressed confidence in sel4 and that is good enough
for me.

On Thu, Apr 18, 2024 at 10:31 PM Gernot Heiser via Devel <devel@sel4.systems>
wrote:
...
On 18 Apr 2024, at 23:55, Andrew Warkentin <andreww591@gmail.com> wrote:
...
Somebody in the Zoom call the other day (IIRC it was Gernot) said that
seL4 IPC is considerably faster than QNX IPC, but it seems that QNX is
somehow faster in a simple benchmarking experiment in which a client
repeatedly sends messages to a bit-bucket server, using the TSC to
determine timing.
I'm running both QNX and seL4 under identical VirtualBox
configurations with a single core. I'm using a QNX 6.3.1 x86-32 image
i have in my collection (x86-64 support was only added in 7.0; this is
the most recent image I have with dev tools installed). The seL4
version I'm using is from February 2022 and is built for x86-64 with
MCS enabled (the priority of the threads is 255, and both the
timeslice and budget are the initial thread default of 5 ms).
My benchmark times a loop of 10000 RPCs; for QNX it's just using the
regular read() function on /dev/null or a custom server that also acts
like /dev/null but forces a copy of the message into a server-provided
buffer, and for seL4 I'm directly calling Send()/Recv()/Reply(). I've
also tested Linux's /dev/null, and QNX's implementation actually seems
somewhat faster than it (for the system /dev/null; the custom server's
best case seems only slightly worse than under Linux).
Under QNX and Linux I'm running a C implementation of the benchmark
and under seL4 I tried both C and Rust implementations (which have
very similar performance). The messages on QNX and Linux are 100
bytes, and on seL4 they are 2 words, which should be hitting the fast
path.
Here are the best and worst cases I got:
seL4: 0.114-0.136s
QNX, custom null server: 0.006-0.063s
QNX, system null server: 0.002-0.022s
Linux: 0.004-0.005s
Has anyone else benchmarked seL4 IPC against that of QNX and gotten
better (or comparable) results for seL4, unlike what I got? Is
synchronous IPC just less optimized under seL4 than QNX?
I’m not completely sure what you’re actually measuring here (actually I’m
totally unsure
what your setup really is, I can’t make sense of the above description in
terms of
what the seL4 system is doing).
But measuring performance of a kernel inside a virtual machine is somewhat
pointless at best.
In any case it seems clear that you’re *not* measuring raw seL4 IPC.
Your quoted cost of a round-trip on seL4 is say 120ms for 10,000 round
trips,
or 12µs for one round trip. You don’t specify the clock rate, but that
would be
somewhere between 12,000 and 50,000 cycles, when a round-trip IPC on x86
with MCS is <900 cycles
Note that using Send()/Recv()/Reply() is the wrong way to use seL4 IPC, see
https://microkerneldude.org/2019/03/07/how-to-and-how-not-to-use-sel4-ipc/
However, this will only be about a factor two slower than using the
correct Call()/ReplyWait() incantation.
In fact, while it is in principle possible to outperform seL4 in IPC, it
is impossible to do so
by more than a few percent, as the seL4 IPC cost is only about 20% higher
than the
pure hardware cost of kernel trap, minimal state save+restore and
switching page tables.
This has been independently established in a peer-reviewed paper, see
https://dl.acm.org/doi/pdf/10.1145/3302424.3303946
Everything I’ve ever seen from QNX is that their IPC costs is many
thousands of cycles
(probably >10,000), and at least an order of magnitude slower than seL4’s.
Gernot
_______________________________________________
Devel mailing list -- devel@sel4.systems
To unsubscribe send an email to devel-leave@sel4.systems
-- 
Bob Trower
--- From Gmail webmail account. ---