Hello Andrew,

On 2024-04-19 05:00, Andrew Warkentin wrote:
> Here's the exact code for the benchmark loop I'm using on seL4:
>
> void c_test_client(seL4_CPtr endpoint)
> {
>     seL4_MessageInfo_t info = seL4_MessageInfo_new(0, 0, 0, 2);
>     int j;
>     for (j = 0; j < 10000; j++) {
>         seL4_Call(endpoint, info);
>     }
> }
>
> void c_test_server(seL4_CPtr endpoint, seL4_CPtr reply)
> {
>     seL4_MessageInfo_t info = seL4_MessageInfo_new(0, 0, 0, 2);
>     while (1) {
>         seL4_Recv(endpoint, NULL, reply);
>         seL4_Send(reply, info);
>     }
> }
>
> This measures synchronous calls: 10k round trips, so 20k context
> switches back and forth between the client and the server.
>
> I'm not quite sure what would be slowing it down. I would have thought
> this would be measuring only the cost of a round-trip IPC and nothing
> more. Is there possibly some kind of scheduling issue adding overhead?
Yes, you are running it in a virtual machine, remember? Any timing is suspect.
> The QNX/Linux equivalent is:
>
> #include <fcntl.h>
> #include <inttypes.h>
> #include <stdio.h>
> #include <stdlib.h>
> #include <unistd.h>
> #include <x86intrin.h>
>
> int main(void)
> {
>     char buf[100];
>     int f = open("/dev/null", O_WRONLY);
>     int i;
>     for (i = 0; i < 10; i++) {
>         uint64_t start = __rdtsc();
>         uint64_t end;
>         int j;
>         for (j = 0; j < 10000; j++) {
>             if (write(f, buf, 100) != 100) {
>                 printf("cannot write\n");
>                 exit(1);
>             }
>         }
>         end = __rdtsc();
>         printf("cycles: %" PRIu64 "\n", end - start);
>     }
> }

(Note that printing a uint64_t with "%u" is undefined; use PRIu64 as above.)
This doesn't do any IPC; it measures the time of 10k dummy write() syscalls to /dev/null. The POSIX equivalent would be two processes connected by pipes or FIFOs, with the server sending a reply to each message and the client waiting for that reply before sending the next one.

Greetings,

Indan