MCS round-robin scheduling not behaving as expected
Hello,

I'm trying to understand MCS and how the scheduling works but I'm running into some issues.

My setup: RISCV on BeagleVFire in sel4test

I have configured 2 threads:

    seL4_SchedControl_ConfigureFlags(info->schedcontrol.start, sched_context_thread2, 1000000, 1000000, 0, 0, 0);
    seL4_SchedContext_Bind(sched_context_thread2, thread2_TCB);

    seL4_SchedControl_ConfigureFlags(info->schedcontrol.start, sched_context_thread3, 1000000, 1000000, 0, 0, 0);
    seL4_SchedContext_Bind(sched_context_thread3, thread3_TCB);

The threads have the same priority. From the documentation I was expecting them to behave like this:
- Thread1 does something
- Thread2 does something
- Thread1 does something
- ...

But it seems instead only the first thread gets scheduled, and it never allows the second thread to get any CPU time. Am I correct in assuming that 2 threads with period=budget and equal priority should be scheduled round-robin?

I found that if I call seL4_SchedContext_Consumed(sched_context_thread) periodically in the threads they will switch if the budget has expired. It doesn't make much sense to me why this is the case, I thought they would switch automatically once their budget is expired and that seL4_SchedContext_Consumed() was simply used to keep track of this. I'm hoping it can offer some clue as to what I'm doing wrong.

Any help or guidance would be greatly appreciated.

Kind Regards,
Liam
"liam" == liam vervecken writes:

liam> Hello, I'm trying to understand MCS and how the scheduling works
liam> but I'm running into some issues. My setup: RISCV on
liam> BeagleVFire in sel4test

liam> I have configured 2 threads:
liam> seL4_SchedControl_ConfigureFlags(info->schedcontrol.start,
liam> sched_context_thread2, 1000000, 1000000, 0, 0, 0);
liam> seL4_SchedContext_Bind(sched_context_thread2, thread2_TCB);

liam> seL4_SchedControl_ConfigureFlags(info->schedcontrol.start,
liam> sched_context_thread3, 1000000, 1000000, 0, 0, 0);
liam> seL4_SchedContext_Bind(sched_context_thread3, thread3_TCB);

The only times that the seL4 kernel will preempt a running thread are:
 -- if a higher priority thread becomes runnable, or
 -- if the running thread runs out of budget.

Your example sets budget==period so no thread will ever run out of budget and all threads will run until they wait on an event or call sched_yield().

To get the effect you want, set period to twice the budget, so each thread takes up half the available time.

--
Dr Peter Chubb https://trustworthy.systems/
Trustworthy Systems Group CSE, UNSW
Core hours: Mon 8am-3pm; Wed: 8am-5pm; Fri 8am-12pm.
On 24 Jul 2024, at 08:52, Peter Chubb via Devel wrote:
> The only times that the seL4 kernel will preempt a running thread are:
>  -- if a higher priority thread becomes runnable, or
>  -- if the running thread runs out of budget.
>
> Your example sets budget==period so no thread will ever run out of budget and all threads will run until they wait on an event or call sched_yield().
>
> To get the effect you want, set period to twice the budget, so each thread takes up half the available time.
Nope, that’s not how it’s supposed to work, and if that’s what the implementation does then that’s a bug.

A thread should always be preempted if its budget is depleted. A depleted budget is replenished once the new period begins. If budget=period, then that’s immediate. After budget replenishment, the thread should be inserted back at the end of the ready queue.

This means, a full budget acts like a normal time slice, with threads of equal priority scheduled round-robin.

Gernot
On 24 Jul 2024, at 10:20, Gernot Heiser via Devel
On Tue, Jul 23, 2024, 8:15 PM Gerwin Klein via Devel wrote:
> If the first thread is round-robin, that means it is always the highest priority thread as soon as it gets started, and the second thread never gets started at all, because the sel4test setup thread does not run any more (since the first thread is always ready).
>
> Keeping the setup thread priority high until after both round-robin threads are started (and then dropping it) shows the correct round robin behaviour for me.
I noticed a similar issue, but I have all the threads at the same priority and the scheduler seems to be behaving more like a FIFO scheduler than a round-robin one. I haven't actually looked at the scheduler code all that much yet so I'm not quite sure what's going on. I am using a fork rather than seL4 itself, but this was happening before I started making major changes.
On 10 Aug 2024, at 05:47, Andrew Warkentin wrote:
> I noticed a similar issue, but I have all the threads at the same priority and the scheduler seems to be behaving more like a FIFO scheduler than a round-robin one.
For period = budget, the behaviour of the MCS scheduler should definitely be round-robin (and does appear to be for all tests so far). If you can reproduce a FIFO behaviour, that would definitely be interesting to investigate further and should be fixed.

Cheers,
Gerwin
Hello Liam,

On 2024-07-22 14:56, liam.vervecken@gmail.com wrote:
> But it seems instead only the first thread gets scheduled, and it never allows the second thread to get any CPU time.
This could happen if the port is broken and you don't get any timer interrupts. To verify whether this is happening, you could add a debug print to getActiveIRQ() in src/arch/riscv/machine/hardware.c at line 117, where it sets irq = KERNEL_TIMER_IRQ;
> I found that if I call seL4_SchedContext_Consumed(sched_context_thread) periodically in the threads they will switch if the budget has expired. It doesn't make much sense to me why this is the case, I thought they would switch automatically once their budget is expired and that seL4_SchedContext_Consumed() was simply used to keep track of this. I'm hoping it can offer some clue as to what I'm doing wrong.

Calling most system calls should have the same effect, as the kernel does a time and budget check for all non-fastpath syscalls.

Greetings,
Indan
participants (6)
- Andrew Warkentin
- Gernot Heiser
- Gerwin Klein
- Indan Zupancic
- liam.vervecken@gmail.com
- Peter Chubb