On 12 Apr 2020, at 10:20, Zach Lym wrote:
Red Hat's lead real-time scheduling engineer (who has a solid background in formal methods) posted an interesting comment on an LWN article about seL4's scheduler research (https://lwn.net/Articles/745946/).
While I disagree with the tone of the comment, I rarely see informed critiques of seL4, and context-switching performance is core to microkernel viability. My apologies if this is an inappropriate venue for this discussion.
First off, I think that Jonathan’s report of my talk was really excellent (a few minor misconceptions notwithstanding). The following discussion was also largely informed, with the trolls quickly hosed down by others ;-) Re the comments by bristot:
"With the current scheduler, that driver must be given a high priority and trusted to not hog the CPU." So, they need to implement either RT throttling or the CBS, like Linux.
That’s the whole point of the MCS model: budgets prevent hogging, in a clean way that supports (formal) reasoning about what happens.
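Concretely, under MCS every thread runs on a scheduling context that carries a (budget, period) pair, set through the SchedControl capability. A minimal sketch of capping a driver (the capability names are illustrative, and the exact call spelling may differ between MCS kernel revisions):

    #include <sel4/sel4.h>

    /* Sketch, assuming an MCS-configured kernel; sched_ctrl and driver_sc
     * are illustrative capability names. The driver gets at most 1 ms of
     * CPU time in any 10 ms window; once the budget is exhausted, the
     * kernel will not run the thread again until it is replenished, so a
     * high priority alone no longer lets the driver hog the CPU. */
    void cap_driver(seL4_SchedControl sched_ctrl, seL4_SchedContext driver_sc)
    {
        seL4_SchedControl_Configure(sched_ctrl, driver_sc,
                                    1000,    /* budget: 1000 us          */
                                    10000,   /* period: 10000 us         */
                                    0,       /* extra refills            */
                                    0);      /* badge for timeout faults */
    }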
"Another problem is data structures that are shared between criticalities....se of the priority-ceiling protocol can help." Nice, then they will see that the priority ceiling will add release jitter to the highest priority task, and then will notice that they will have to use other protocol, like, mutexes with priority inheritance? Where did I hear about it already... ah! On Linux! It is widely known that if you use priority ceiling, you will have scheduling "latency." But it really helps to reduce the response time... so, what do you prefer Response Time or Scheduling latency (release jitter or any scheduling delay, scheduler overhead... or any other name people do for it)?
The quoted comment confuses timeliness with temporal determinism. The mechanisms I discussed in my talk are for ensuring timeliness, i.e. the ability to meet deadlines. That is the critical property of hard-RT systems. Temporal determinism (i.e. the absence of jitter) is something you worry about in soft-RT systems (which are generally not “critical” in the safety sense). Red Hat Linux will probably be used a lot in this kind of soft-RT system; no-one in their right mind would use it in a safety-critical hard-RT system.

If you want to prevent jitter, there are ways to do this. A simple (although by far not the most efficient) way is to defer whatever signal you need to produce to the last possible moment; a sketch of this follows below. There are other ways, supported by seL4, but the best solution will depend on the specifics of the application. Doing RT right generally requires a lot of analysis.

In any case, my talk was about guaranteeing timeliness in mixed-criticality systems, not temporal determinism for soft RT.
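To make the deferral idea concrete, here is a minimal POSIX sketch; compute() and emit() are placeholders I am assuming for the application-specific work, not anything from an actual system:

    #include <time.h>

    void compute(void);   /* placeholder: produce the result               */
    void emit(void);      /* placeholder: the externally observable output */

    /* Defer the output to the last possible moment: compute() may finish
     * at a variable time within each period, but the result is only
     * released at a fixed absolute instant, so the observable signal has
     * (almost) no jitter even though execution times vary. */
    void periodic_job(struct timespec next, long period_ns)
    {
        for (;;) {
            compute();
            /* Sleep until the fixed emission time for this period. */
            clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL);
            emit();
            /* Advance to the next emission instant. */
            next.tv_nsec += period_ns;
            while (next.tv_nsec >= 1000000000L) {
                next.tv_nsec -= 1000000000L;
                next.tv_sec += 1;
            }
        }
    }

Gernot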