Is it (or would it be) possible to specify more than the single root server module and have seL4 create virtual address spaces, capability spaces, and threads (leaving them suspended), and pass the information about these modules to the root server? In this way it would be possible to have GRUB load a multi-server seL4 environment. Thanks, Dave
On Sat, Dec 15, 2018, 3:19 PM Dave Richards wrote:
Is it (or would it be) possible to specify more than the single root server module and have seL4 create virtual address spaces, capability spaces, and threads (leaving them suspended), and pass the information about these modules to the root server? In this way it would be possible to have GRUB load a multi-server seL4 environment.
For UX/RT (https://gitlab.com/uxrt/), the seL4-based OS I'm writing,
I've added support for extra modules besides the root server, but all
it does is make the Multiboot2 info structure and modules available to
the root server. The root server is expected to load any further
executables. Also, it currently requires the kernel and all modules to
be embedded within a single filesystem image (using a special
chain-loader) rather than loaded from separate disk files. UX/RT uses
a process-per-subsystem-instance architecture rather than
process-per-component, and its root server will contain everything
needed to load normal user programs, so it will have no need for
preloading executables besides itself. I did include support for
preloading executables in my chain-loader (so the ELF loading code
could be removed from the kernel), and it would be easy enough to
remove the check that only one module has been preloaded, and it would
also be relatively easy to change it to pass through modules that have
been loaded from separate disk files, but having the kernel actually
set up all the state for them would be a lot more difficult. It would
be easier to just have the root server set up the state for the other
executables if you want to split the core user-level components of
your OS across multiple processes, but I'd say it would probably be a
better idea to just do what I'm doing and put everything required to
load regular processes in the root server. I really don't get why most
microkernel OSes use process-per-component architectures. Security
domains more often correspond to subsystems than components, and
splitting subsystems into multiple processes usually just adds IPC
overhead with little benefit for security or reliability.
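To make the module-passing part above a bit more concrete, here is roughly what picking the module information out of the boot state could look like from the root server's side. This is only a sketch: the BOOTINFO_HEADER_BOOT_MODULES ID and the payload layout are assumptions about what a modified kernel or chain-loader could export (stock seL4 defines no such header), and the bootinfo field names should be checked against the libsel4 headers for the kernel version in use.

#include <sel4/sel4.h>

/* Hypothetical extra-bootinfo ID that a modified kernel or chain-loader could
 * use to describe additional boot modules; stock seL4 defines no such header. */
#define BOOTINFO_HEADER_BOOT_MODULES 0xb007

static void scan_extra_bootinfo(seL4_BootInfo *bi)
{
    /* Extra bootinfo chunks are laid out directly after the (one-page)
     * bootinfo frame as a sequence of seL4_BootInfoHeader {id, len} records,
     * where len is the total size of the chunk including the header. */
    char *cur = (char *)bi + (1 << seL4_PageBits);
    char *end = cur + bi->extraLen;

    while (cur < end) {
        seL4_BootInfoHeader *header = (seL4_BootInfoHeader *)cur;
        if (header->id == BOOTINFO_HEADER_BOOT_MODULES) {
            /* The payload format would be defined by the chain-loader, e.g.
             * a module count followed by (paddr, size, name) records that the
             * root server can then map and hand to its own ELF loader. */
        }
        cur += header->len;
    }
}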
I have not read the microkernel code, but I would imagine that most of the code needed to launch multiple servers already exists, since it is necessary to launch the root server. It may require refactoring, of course. Right?
It also solves another immediate problem for me: I can develop the various servers without working on device drivers, the I/O subsystem, an ELF loader, etc.
As for the OS architecture, I imagine many approaches are possible depending on the goals of the designer. The nice thing about seL4 is that it doesn't dictate any single approach. Thanks, Dave
On 12/15/18, Dave Richards wrote:
I have not read the microkernel code, but I would imagine that most of the code needed to launch multiple servers already exists, since it is necessary to launch the root server. It may require refactoring, of course. Right?
It also solves another immediate problem for me: I can develop the various servers without working on device drivers, the I/O subsystem, an ELF loader, etc.
Yes, there's code to launch the root server, but it's not particularly general, and is just enough to set up the state for the root server's initial thread. It would probably be non-trivial to make it general enough to load multiple servers. Personally, I don't think the kernel should be setting up state for multiple servers, though; I'd say it should leave that up to the root server. (My chain-loader's ELF preloading code would make that relatively easy to do from a root server without a built-in ELF loader, although UX/RT's root server will just use its own built-in loader, since it will treat modules as files rather than processes and will only start /sbin/init directly, with init starting all other processes; I think a microkernel OS should have as few "special" servers as possible.)
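For a rough idea of what "setting up the state" for an extra server involves on the root server side, here is a minimal sketch against the non-MCS kernel API. Everything passed in (the untyped, the destination slot, the new server's CNode, VSpace, IPC buffer, entry point and stack) is assumed to have been prepared already by whatever ELF loading and capability bookkeeping the root server does, and the exact seL4_TCB_Configure argument list varies between kernel versions, so treat it as illustrative rather than definitive:

#include <sel4/sel4.h>

/* Minimal sketch: start one additional, already-loaded server from the root
 * server. All arguments are assumed to have been prepared by the root
 * server's own ELF loading and capability bookkeeping; nothing here is
 * specific to any particular OS design. */
static seL4_Error start_server(seL4_CPtr untyped,        /* untyped to retype */
                               seL4_CPtr root_cnode,     /* our CSpace root */
                               seL4_Word new_tcb_slot,   /* free slot for the TCB */
                               seL4_CPtr server_cnode,   /* new server's CSpace */
                               seL4_CPtr server_vspace,  /* new server's VSpace */
                               seL4_CPtr ipc_buf_frame,  /* mapped IPC buffer */
                               seL4_Word ipc_buf_vaddr,
                               seL4_Word entry,          /* ELF entry point */
                               seL4_Word stack_top)
{
    /* Create a TCB for the new server in our own CSpace (node_depth 0 means
     * the retyped cap lands directly in root_cnode at new_tcb_slot). */
    seL4_Error err = seL4_Untyped_Retype(untyped, seL4_TCBObject, 0,
                                         root_cnode, 0, 0, new_tcb_slot, 1);
    if (err != seL4_NoError) {
        return err;
    }

    /* Attach the new server's CSpace, VSpace and IPC buffer; no fault
     * endpoint is installed in this sketch. */
    err = seL4_TCB_Configure(new_tcb_slot, seL4_CapNull,
                             server_cnode, 0, server_vspace, 0,
                             ipc_buf_vaddr, ipc_buf_frame);
    if (err != seL4_NoError) {
        return err;
    }

    /* Set the initial program counter and stack pointer (x86_64 register
     * names; other architectures use different seL4_UserContext fields). */
    seL4_UserContext regs = {0};
    regs.rip = entry;
    regs.rsp = stack_top;
    err = seL4_TCB_WriteRegisters(new_tcb_slot, 0, 0,
                                  sizeof(regs) / sizeof(seL4_Word), &regs);
    if (err != seL4_NoError) {
        return err;
    }

    return seL4_TCB_Resume(new_tcb_slot);
}

Most of the real work (creating and populating the new CNode, building the VSpace, mapping the ELF segments and the IPC buffer, setting scheduling parameters) happens before this point, which is exactly the part that is easier to keep in user space than in the kernel.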
As for the OS architecture, I imagine many approaches are possible depending on the goals of the designer. The nice thing about seL4 is that it doesn't dictate any single approach.
What kind of OS are you trying to write? UX/RT is going to be a QNX-like OS with some concepts from Plan 9 thrown in, as well as a high degree of Linux compatibility.
Hi all,
Just to confirm what has already been stated: while it would be possible for seL4 to load multiple images, this is incongruent with the L4 philosophy. We want to keep the kernel as small as possible to minimise the trusted computing base and maximise policy freedom.
The root task can act as a loader to load multiple images according to whatever policies suit the system. One example loader we have is the capdl loader (https://github.com/seL4/capdl) which can load capabilities and set up processes based on a specification.
CAmkES (https://docs.sel4.systems/CAmkES/) can be used to initialise such specifications.
We're working on improving our documentation and tools for setting up seL4 systems, so keep an eye out in the coming year.
Cheers,
Anna.
On 12/16/18, Anna.Lyons@data61.csiro.au wrote:
The root task can act as a loader to load multiple images according to whatever policies suit the system. One example loader we have is the capdl loader (https://github.com/seL4/capdl) which can load capabilities and set up processes based on a specification. CAmkES (https://docs.sel4.systems/CAmkES/) can be used to initialise such specifications.
CAmkES is good for building static embedded systems (excluding stuff running in VMs), but can't really do anything more AFAIK. If you're trying to build a general-purpose OS it's far from ideal. I guess it might be possible to build the lowest user-mode part of a general-purpose OS out of CAmkES components with one component acting as a dynamic loader, but that might over-complicate the design. For UX/RT I'm just going to build the lowest-level components (core VFS and process/memory management) into the root server since I can't see much benefit to splitting them across different processes (each component will be a separate Rust crate though).
On 17 Dec 2018, at 21:49, Andrew Warkentin wrote:
On 12/17/18, Gernot.Heiser@data61.csiro.au wrote:
If you want something dynamic, i.e. more resembling a full OS, then have a look at Genode. But then all assurance and ability to reason about security properties goes out of the window.
UX/RT is also going to be highly dynamic. It will be a relatively conventional QNX-like OS, not an "OS framework" like CAmkES or Robigalia, and although it won't have the verifiability of a static system like CAmkES, it will have fine-grained security that closely approximates a pure capability system in a Unix-like filesystem environment, making it theoretically more secure than any conventional Unix. It will be a replacement for general-purpose OSes like Linux, intended for workstations, servers, and large, complex embedded systems for which a static framework is too limiting, whereas CAmkES is more of a replacement for static unikernels like eCos and RTEMS than anything else (for security-critical deeply embedded systems it would be preferable to either general-purpose OSes or unikernels).
At 2018-12-17T11:49:58+0000, Gernot.Heiser@data61.csiro.au wrote:
CAmkES is good for building static embedded systems (excluding stuff running in VMs), but can't really do anything more AFAIK.
Yes, CAmkES is designed for static architectures, which is where we can give an assurance story.
If you want something dynamic, i.e. more resembling a full OS, then have a look at Genode. But then all assurance and ability to reason about security properties goes out of the window.
To invert a classic naïve programmer's question, isn't that true only in practice? In principle, once sufficient components are in place for seL4 to host its own build dependencies including the theorem prover, then it is a general enough development environment to compile and verify arbitrary extensions. Naturally, the theorem prover itself has to be added to the list of "assumed valid" components; we can't use it to verify itself without disproving Gödel (I think).

At the same time, my intuition is that a theorem prover should be harder to Trojan than a compiler. When Thompson famously subverted the C compiler[1], he apparently did simple (but effective) pattern-matching on strings (in the symbol table, I suppose) to recognize whether the program being compiled was "login", the disassembler, or the C compiler itself. With knowledge of that trick in mind it should be straightforward to produce a version of Isabelle/HOL with symbols arbitrarily renamed. The challenge for attackers then is to develop a backdoor that can recognize a theorem prover by data flow analysis or an unlabeled call graph. And that in turn seems like[2] it should be practically impossible to do without invalidating the theorem prover for other purposes, which exposes the fact of its compromise[3].

This in turn is an argument for having proofs for many more components of the system, not just the kernel; even small and simple ones can contribute to building confidence that the theorem prover has not been subverted, and it also will help us build a literature for teaching the next generation of software engineers how to prove their programs. Imagine an update of Kernighan & Plauger's _Software Tools_ (1976), wherein the reader is taught not only how to develop modular, reusable software components, but how to validate their correctness as well. I can't help thinking that a brighter future lies ahead--a future that is only eight to fourteen years away. ;-)

To return from my digression, what is the distance from the status quo to a trusted computing base sufficient to build and validate seL4?

Regards,
Branden

[1] https://www.archive.ece.cmu.edu/~ganger/712.fall02/papers/p761-thompson.pdf
[2] I apologize for repeated recourse to my own instinct and intuition to support my assertions; I raise the issues here not to flaunt my ignorance but because I hope the experts here can point me and other interested readers to the real literature on the subject raised.
[3] I note also that running coverage analysis of Thompson's subverted C compiler against the Unix sources themselves should also have revealed the fact of its compromise. To truly escape detection, a Thompson-style Trojan must not only recognize the source code of all the tools that might be brought to bear against the subverted executable, but those developed in the future, which is too large a space to tractably exhaust in anticipation--at least in a thriving software ecosystem like that of Unix and its descendants. The pessimism that Thompson expresses is therefore not warranted on technical grounds--but it may still be for cultural ones, in that we don't have enough people subjecting compiled objects to skeptical analysis. Hence formal verification.
At 2018-12-17T11:49:58+0000, Gernot.Heiser@data61.csiro.au wrote:
CAmkES is good for building static embedded systems (excluding stuff running in VMs), but can't really do anything more AFAIK.
Yes, CAmkES is designed for static architectures, which is where we can give an assurance story. If you want something dynamic, i.e. more resembling a full OS, then have a look at Genode. But then all assurance and ability to reason about security properties goes out of the window.
To invert a classic naïve programmer's question, isn't that true only in practice?

Yes, that is a problem only in practice, not in theory. I.e. it is perfectly possible, just hard and work intensive to have a verified dynamic system. We even have a PhD project looking to make first steps in that direction. The main difference is that for static systems, it is (relatively) easy to provide automation for verification, and the CAmkES toolchain will be able to produce proofs automatically for you that it has implemented the desired architecture, which then plugs into the seL4 security theorems. It’s basically free for the user of the toolchain. In dynamic systems, that kind of automation will be much harder to achieve, and the properties that you get will likely be either weaker or more complex to formulate. This means, to get verified properties of a CAmkES system, you only need to use the CAmkES tool chain, but to get verified properties of a dynamic system, you will need to invest manual work (at least in the foreseeable future — there are probably classes of things that can be automated there as well, but that’s vague ideas instead of production-ready tools in the next few years).

In principle, once sufficient components are in place for seL4 to host its own build dependencies including the theorem prover, then it is a general enough development environment to compile and verify arbitrary extensions.

Yes. It’s even easier than that: you only need to be able to host a proof checker, not a full theorem prover. You can then produce the proof on a different system, but check it on a verified system. Proof checking is vastly simpler (and faster) than proof search.

Naturally, the theorem prover itself has to be added to the list of "assumed valid" components; we can't use it to verify itself without disproving Gödel (I think).

Incompleteness is overrated :-). It is true, of course, but the incompleteness theorem only bites, if you use precisely the logic that the prover implements for the verification of the prover itself. If you change only one of the axioms to something less powerful and still manage to verify your prover, the incompleteness theorem does not apply, and you still have a very verified prover. In essence, you will not have used the prover to verify itself, but you will have used a different prover. There is a project that does exactly that, and we’re very much looking forward to using that prover as a proof checker for the seL4 proofs (the CakeML language already works on top of seL4 in CAmkES components): https://github.com/CakeML/cakeml/tree/master/candle

Given that you could even leave out the file system and most other infrastructure by baking a proof representation into the image that is loaded into memory at boot, a fully verified binary-level proof checking system is not that far off once the Candle prover is usable and we have a powerful-enough export tool from Isabelle (which we use) into Candle. Hopefully much fewer than 8 years.

Cheers,
Gerwin
At 2018-12-17T22:00:07+0000, Gerwin.Klein@data61.csiro.au wrote:
To invert a classic naïve programmer's question, isn't that true only in practice?
Yes, that is a problem only in practice, not in theory.
I.e. it is perfectly possible, just hard and work intensive to have a verified dynamic system. We even have a PhD project looking to make first steps in that direction. The main difference is that for static systems, it is (relatively) easy to provide automation for verification, and the CAmkES toolchain will be able to produce proofs automatically for you that it has implemented the desired architecture, which then plugs into the seL4 security theorems. It’s basically free for the user of the toolchain.
That's an exciting value proposition.
In principle, once sufficient components are in place for seL4 to host its own build dependencies including the theorem prover, then it is a general enough development environment to compile and verify arbitrary extensions.
Yes. It’s even easier than that: you only need to be able to host a proof checker, not a full theorem prover.
This makes it even more likely that we will eventually see a diversity of proof checkers. For many years I've been used to thinking in terms of security as being a matter of "reducing attack surface", which can be concretized in many ways from number of open network ports to lines of code; it is noteworthy to me that we stand to actually benefit by writing more code, when that code is part of a diverse polyculture (cf. monoculture) deployed for the purpose of mutual verification. Of course this idea is not original to me; it is a generalization of what I first encountered in David Wheeler's paper on "Diverse Double-Compiling"[1].
You can then produce the proof on a different system, but check it on a verified system. Proof checking is vastly simpler (and faster) than proof search.
I seem to remember having had a conversation like this before; it pays to remember that we don't always need the heavy lift of a prover.
Incompleteness is overrated :-). It is true, of course, but the incompleteness theorem only bites, if you use precisely the logic that the prover implements for the verification of the prover itself. If you change only one of the axioms to something less powerful and still manage to verify your prover, the incompleteness theorem does not apply, and you still have a very verified prover.
That's exciting but it gets me to wondering abstractly. Consider a pair of provers A and B which are duals in some sense. Each one has been weakened as you describe, but in different ways. Because I am not competent to contrive a believable example, I must settle for a dubious one[2]. Let us say that prover A cannot decide logical ORs, and prover B cannot decide logical ANDs. If--and it may be a big if--we can verify each of these two provers by using the other, then have we not solved every class of problem equivalent to Hilbert's Second Problem[3]?

Of course, if two provers don't suffice to make this demonstration, we need not stop at two. As long as we can construct a set of provers of finite cardinality, each appropriately weakened, then assuming unbounded computing resources we have achieved the same thing.

I assume there is some result in mathematical logic which has foreclosed this possibility. Perhaps that it is impossible to construct a finite set of mutually reinforcing provers. I would have guessed that your statement that we can use a weakened prover to verify a stronger one was already foreclosed by incompleteness. If not, you have persuaded me that generations of computer scientists have overestimated what, exactly, was foreclosed by Gödel. I should refresh myself with Nagel and Newman in the near future.
https://github.com/CakeML/cakeml/tree/master/candle
Given that you could even leave out file system and most other infrastructure by baking a proof representation into the image that is loaded into memory at boot, a fully verified binary-level proof checking system is not that far off once the Candle prover is usable and we have a powerful-enough export tool from Isabelle (which we use) into Candle.
I now imagine an object loader which invokes the proof-checking system, coupled with "fat binaries" that embed proofs about themselves inside a dedicated ELF section.
Hopefully much fewer than 8 years.
I admit that my time projection is due entirely to the Coen Brothers and not any sort of engineering estimate.

Regards,
Branden

[1] https://dwheeler.com/trusting-trust/dissertation/wheeler-trusting-trust-ddc....
[2] My example is unconvincing because NOR is a functionally complete operator for Boolean logic. That is, one can construct both AND and OR operators through appropriate combinations of NOR gates. https://en.wikipedia.org/wiki/Functional_completeness
[3] This is the one that Whitehead and Russell purported to have cracked, only for Gödel to have kicked the football away.
The existing code does not implement the policy below. If it did, it would not loop across all the modules provided, creating a region consisting of the concatenation of all module contents and mapping it into the root server (with no direct means of knowing how many modules were provided or where they are located). One could argue that the code is less secure because one can inject code into the root server at run time that has never been analyzed. Since the loader is doing most of the work, I think size isn't very relevant here. This said, it's your code. I hope dynamic systems are an important application of seL4. Dave

Just to confirm what has already been stated: while it would be possible for seL4 to load multiple images, this is incongruent with the L4 philosophy. We want to keep the kernel as small as possible to minimise the trusted computing base and maximise policy freedom. The root task can act as a loader to load multiple images according to whatever policies suit the system. One example loader we have is the capdl loader (https://github.com/seL4/capdl) which can load capabilities and set up processes based on a specification. CAmkES (https://docs.sel4.systems/CAmkES/) can be used to initialise such specifications. We're working on improving our documentation and tools for setting up seL4 systems, so keep an eye out in the coming year.
participants (6):
- Andrew Warkentin
- Anna.Lyons@data61.csiro.au
- Dave Richards
- G. Branden Robinson
- Gernot.Heiser@data61.csiro.au
- Gerwin.Klein@data61.csiro.au