On Fri, Apr 19, 2024 at 12:47 AM Gernot Heiser via Devel <devel@sel4.systems> wrote:
the correct syscall to use is ReplyWait():
Recv(…); while (1) { ReplyWait(…); }
As per my blog: Send() and Recv() should only ever be used in initialisation and exception handling.
Rewriting the server function to:

    void c_test_server(seL4_CPtr endpoint, seL4_CPtr reply)
    {
        struct seL4_MessageInfo info = seL4_MessageInfo_new(0, 0, 0, 2);

        seL4_Recv(endpoint, (void *)0, reply);
        while (1) {
            seL4_ReplyRecv(endpoint, info, (void *)0, reply);
        }
    }

made little difference.
so you’re using standard I/O to /dev/null
My POSIX is a bit rusty, but this should be buffered in the library (i.e. most calls will *not* result in a system call). And, given that the output goes to /dev/null, the data may be thrown away completely.
Basically you’re measuring the cost of a function call.
As Demi said, read()/write() aren't usually buffered. I haven't actually checked whether QNX is buffering it, but I'd be a little bit surprised if it were.

On Fri, Apr 19, 2024 at 3:12 AM Indan Zupancic <indan@nul.nu> wrote:
Hello Andrew,
I'm not quite sure what would be slowing it down. I would have thought this would be measuring only the cost of a round trip IPC and nothing more. Is there possibly some kind of scheduling issue adding overhead?
Yes, you are running it in a virtual machine, remember? Any timing is suspect.
Wouldn't that affect timing relative to bare metal more than timing of kernels relative to each other? I guess seL4 might just somehow not get along with virtualization (the timings in QEMU and VirtualBox are fairly similar), although I'd be a little surprised if it were affected that much relative to QNX.
The QNX/Linux equivalent is:
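    #include <fcntl.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <x86intrin.h>  /* __rdtsc() with GCC/Clang on x86 */

    int main()
    {
        char buf[100];
        int f = open("/dev/null", O_WRONLY);
        int i;

        /* 10 timed runs of 10000 100-byte writes each */
        for (i = 0; i < 10; i++) {
            uint64_t start = __rdtsc();
            uint64_t end;
            int j;

            for (j = 0; j < 10000; j++) {
                if (write(f, buf, 100) != 100) {
                    printf("cannot write\n");
                    exit(1);
                }
            }
            end = __rdtsc();
            /* the difference is 64-bit, so %u would be the wrong format */
            printf("cycles: %llu\n", (unsigned long long)(end - start));
        }
    }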
This doesn't do any IPC and measures the time of 10k dummy write syscalls.
The POSIX equivalent would be to have two processes using pipes or fifos with the server sending a reply and the sender waiting for the reply before sending the next message.
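
A minimal sketch of that pipe-based round trip, for illustration only. The function name pingpong_roundtrips() and its parameters are hypothetical, and it times with clock_gettime() rather than __rdtsc() so that it is not x86-specific. A forked child plays the server: it reads each message and sends a one-byte reply, and the client waits for that reply before sending the next message.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Read exactly len bytes (pipe reads may be short). Returns 0 on success. */
static int read_full(int fd, char *buf, size_t len)
{
    size_t done = 0;
    while (done < len) {
        ssize_t n = read(fd, buf + done, len - done);
        if (n <= 0)
            return -1;              /* error or EOF */
        done += (size_t)n;
    }
    return 0;
}

/* Write exactly len bytes. Returns 0 on success. */
static int write_full(int fd, const char *buf, size_t len)
{
    size_t done = 0;
    while (done < len) {
        ssize_t n = write(fd, buf + done, len - done);
        if (n < 0)
            return -1;
        done += (size_t)n;
    }
    return 0;
}

/*
 * Hypothetical helper: run `count` request/reply round trips of msg_len
 * bytes (1..256) between this process and a forked server over two pipes.
 * Returns the number of completed round trips, or -1 on setup failure;
 * the elapsed wall-clock time is stored in *elapsed_ns if non-NULL.
 */
long pingpong_roundtrips(long count, size_t msg_len, uint64_t *elapsed_ns)
{
    char msg[256] = {0};
    char reply = 0;
    int req[2], rsp[2];
    struct timespec t0, t1;
    long done = 0;
    pid_t pid;

    if (msg_len == 0 || msg_len > sizeof msg)
        return -1;
    if (pipe(req) != 0)
        return -1;
    if (pipe(rsp) != 0) {
        close(req[0]); close(req[1]);
        return -1;
    }

    pid = fork();
    if (pid < 0) {
        close(req[0]); close(req[1]); close(rsp[0]); close(rsp[1]);
        return -1;
    }
    if (pid == 0) {
        /* Server: receive one message, send a one-byte reply, repeat. */
        char buf[256];
        close(req[1]); close(rsp[0]);
        while (read_full(req[0], buf, msg_len) == 0) {
            if (write_full(rsp[1], "k", 1) != 0)
                break;
        }
        _exit(0);                   /* EOF on the request pipe ends the server */
    }

    /* Client: send a message, then block until the reply arrives. */
    close(req[0]); close(rsp[1]);
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (; done < count; done++) {
        if (write_full(req[1], msg, msg_len) != 0)
            break;
        if (read_full(rsp[0], &reply, 1) != 0)
            break;
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    close(req[1]);                  /* signals EOF to the server */
    close(rsp[0]);
    waitpid(pid, NULL, 0);

    if (elapsed_ns) {
        int64_t ns = (int64_t)(t1.tv_sec - t0.tv_sec) * 1000000000LL
                   + (t1.tv_nsec - t0.tv_nsec);
        *elapsed_ns = (uint64_t)ns;
    }
    return done;
}
```

Calling pingpong_roundtrips(10000, 100, &ns) from main() and dividing ns by the return value gives an average round-trip time. Each round trip here costs two writes, two reads and two context switches, so it is not directly comparable with the write() loop above.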
AFAIK writing to /dev/null does some IPC under QNX (/dev/null is exported by the process server) although I suspect it's not copying anything. In any case I wrote my own null server that specifically copies the data into its own buffer (that's what the "custom null server" timings are) in order to avoid any optimizations like that, and it's slower than real /dev/null, but still way faster than what I'm getting out of seL4.