New subject: On Common Infrastructure

26 Nov 2024

      Hi together,

during the proceedings of the last seL4 summit I participated in a discussion about Microkit usability. In particular we were talking about common patterns for IPC in Microkit, potential pitfalls and how the Microkit Tutorial introduces some of them implicitly (by requiring the user to apply them). At some point Kent suggested to put the topic on the mailing list, so here we go 😊.

When initially playing through the Microkit Tutorial, I assumed that the budget/period of a protection domain were providing uninterrupted time-slots. I was thus slightly surprised, when I found out that sending a notification to a higher priority PD would immediately preempt my current PD. Now, this on its own is not problematic, Gernot simply commented my mistake with "Yes, priorities are observed". However:

When playing the Microkit tutorial, I understood the way to solve Part 2 (Serial Server to Client Communication) to be as follows:

  *   [serial server] write received char to shared buffer.
  *   [serial server] notify client.
  *   [client] read char from shared buffer.

This is also the current approach found in the provided solution [1]. This works in some (most!) cases, but it is not reliable. It could be that chars get lost, due to the serial server overwriting the buffer before it was consumed form the client.
The specific details if this works depend upon the combination of:

  *   Data size (not-so-critical with one byte, but more data means more time is required to consume the buffer form the client)
  *   Budget/Period of the client (can the client read all data before it is preempted by exhaustion of its budget in the current period)
  *   Hardware Performance (same as above)
  *   Priorities (if the serial server is higher priority and gets triggered by another hardware event before the client can consume the char, it gets lost again)

This is problematic: the details of the scheduling are not very explicit in the manual (roughly one half of a page), but the first example of IPC taught via the tutorial relies on a lot of implicit circumstance on how PDs are scheduled to function correctly.

Now I have three suggestions to improve upon this.

  1.  In the tutorial, explicitly mention the possibility of using the same notification channel in the opposite direction for a read-ACK. If another serial interrupt comes, but no ACK was received, just log an error. The point is not, to introduce ring-buffers and a the machinery for robust communication -- that would like explode the scope for a simple tutorial. Rather, the point is to teach people that this IPC mechanism may silently fail (as in chars get dropped).
  2.  For certain scheduling + IPC scenarios, examples with a graphical representation would be good. Here are two examples: Sending data to a higher priority PD (via notifications + shared memory) [2] and sending data to a lower priority PD (via notification + shared memory + ACK notification) [3]. Please bear with me, my talent to intuitive graphical representation is of limited nature, this for sure could be made better looking. In the end, I'd like to have an engineer short on time quickly look up the current scenario (sharing data bi-/unidir?, other PD is strictly higher prio/lower prio?, silent loss of data tolerable or not?, ...) from a table, take a look at the timing diagram and pick the correct approach for the task at hand. Microkit does not provide very sophisticated IPC APIs, and that is good! It keeps Microkit small, and the API IMHO provides everything needed, for synchronous and asynchronous communication, for large and small data, ...

But composing the IPC correctly is not trivial, so anything we can to take sharp edges of the experience without bloating the API should be considered.
  3.  I think, there is a finite number of relevant IPC scenarios. Maybe one can build a tool to verify them (something something Promela?) and display them graphically (something something PlantUML Timing Diagrams?). This could either be used to fill a chapter in the Microkit Manual ("Common IPC Patterns, their Behavior and Pitfalls"?), or exposed as small (Web?) GUI Tool. Maybe this could become part of a student thesis even?

I hope the above might spark some thought on improving accessibility of Microkit.

Kind Regards, Wanja
[PlantUML diagram]
[PlantUML diagram]

[1] https://github.com/au-ts/microkit_tutorial/blob/6733b7864217831e4e77616647eb...
[2] https://www.plantuml.com/plantuml/uml/VL9DYzim4BthLppIGoxP49kwWHvADlHOIYc5FI...
[3] https://www.plantuml.com/plantuml/uml/XLDDYzim4BthLpnosKisnA4Ecq9BjlHOIkbbpr...

——————————————————————————
Deutsches Zentrum für Luft- und Raumfahrt (DLR)
Institut für Flugsystemtechnik | Sichere Systeme & Systems Engineering | Lilienthalplatz 7 | 38108 Braunschweig

Wanja Zaeske
Telefon 0531 295-1114 | Fax 0531 295-2931 | wanja.zaeske@dlr.de<mailto:wanja.zaeske@dlr.de>
DLR.de<http://www.dlr.de/>

On Common IPC Patterns in Microkit

Wanja.Zaeske＠dlr.de

Gernot Heiser

Ivan Velickovic

Anna Lyons

Ivan Velickovic

wanja.zaeske＠dlr.de

Wanja.Zaeske＠dlr.de

Indan Zupancic

tags

participants (6)