
On 26 Mar 2025, at 04:51, Anton Burtsev via Devel <devel@sel4.systems> wrote:

> I have a couple of questions on how user-level proofs are constructed in
> the seL4 ecosystem. Just to start, can someone point me to an example of
> a user-level proof, just so I can take a look at how specifications are
> exported to user processes and how everything is stitched together.

There aren't any fully stitched-together user-level proofs yet, mostly because user-level verification requires a concurrency logic on top of the kernel. There is work underway in multiple groups to address that.

One exception to needing concurrency is the user-level system initialiser, because it is the only process that runs at that time. Proofs here:

https://github.com/seL4/l4v/tree/master/sys-init

Other current approaches to user-level verification ignore the concurrency aspect and use sequential verification via SMT or other techniques for user-level programs, with a manually crafted adaptation of the seL4 spec. The UNSW folks have had some success with that, and I'll let them comment on it.

> But also, I'm a bit confused about what kind of assumptions are needed to
> build verified user-level code in the face of revocable memory. I.e., it
> seems that one needs to assume (and trust) that the memory of any
> verified (or even rather mission-critical) process is never revoked.

Yes, of course. In static systems that is simple and guaranteed by correct system initialisation. In a normal CAmkES or Microkit system, nobody has the authority to revoke the memory of other components after initialisation, and seL4's integrity theorem gives you that it stays that way. In dynamic systems this is less easy, and it requires a proof about the components that do have the authority to revoke memory of other components.

> Maybe I'm wrong and missing something and you have a way in mind for how
> to bypass such trust. Or maybe one can argue that such trust is
> reasonable, which is probably true for a "flat" system in which the
> mission-critical domain is created right from the trusted boot process or
> some trusted TCB above it.

Those systems don't need to be flat; they could easily be hierarchical. The important part is that they are static, in the sense that the authority to revoke no longer exists in the system after (verified) user-level initialisation is done. In that kind of system, the problem is not a problem. Most current high-assurance systems are in that class.

Dynamic systems are more interesting, but that is because resource re-use inherently has this problem; it doesn't really come from anything specific to seL4. If you can destroy a resource, the chain that leads to that action needs to be trusted. If you cannot destroy resources but can create them, you immediately have the potential for denial of service. seL4 provides the mechanisms to do all of that, but it doesn't prescribe how you use them.
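To make those two mechanisms concrete, here is roughly what creating and destroying a resource look like at the kernel API level. This is only an illustrative sketch: the slot numbers are invented, and the frame object name is the AArch64 one.

    #include <sel4/sel4.h>

    /* Slot numbers are made up for illustration; in a real system the
     * (verified) initialiser or a capDL spec determines the layout. */
    #define UNTYPED_SLOT 10   /* assumed: holds a cap to untyped memory */
    #define FRAME_SLOT   11   /* assumed: a free slot for the new frame */

    void create_then_destroy(void)
    {
        /* Create: retype part of the untyped region into a 4 KiB frame
         * (AArch64 object type; other architectures differ). */
        seL4_Error err = seL4_Untyped_Retype(
            UNTYPED_SLOT,             /* untyped cap to carve from      */
            seL4_ARM_SmallPageObject, /* object type to create          */
            0,                        /* size_bits: 0 for fixed size    */
            seL4_CapInitThreadCNode,  /* CNode to put the new cap in    */
            0, 0,                     /* node_index, node_depth         */
            FRAME_SLOT,               /* destination slot offset       */
            1);                       /* number of objects              */

        /* ... hand the frame to a component and let it run ... */

        /* Destroy: whoever still holds the untyped cap can revoke every
         * cap derived from it, pulling the memory out from under that
         * component. */
        err = seL4_CNode_Revoke(seL4_CapInitThreadCNode, UNTYPED_SLOT,
                                seL4_WordBits);
        (void)err;
    }

The static case is exactly the one where, after initialisation, no component retains the untyped cap, so the revoke at the end can no longer be issued by anyone.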
> Yet it seems hard to argue that such trust is reasonable when the
> recursion is deeper, e.g., the boot process launches Linux, and then
> Linux tries to start some kind of mission-critical process like a
> microkernel-protected software TEE or something (out of its own memory,
> hence it can later revoke it). And one wants to build a proof about this
> TCB, yet such a proof needs to deal with memory that can be revoked at
> any time, or assume that revocation is impossible.

You would never give Linux the authority to directly create and revoke memory that others need to trust. In fact, something like Linux should never get much direct seL4 authority at all. It'll usually be managed by a virtual machine monitor, and even that would be constrained in authority as much as you can.

If you want an untrusted Linux process to start something trusted, you'd add a separate management component that is small and can be trusted, which Linux can then signal to start something (see the sketch in the PS below). In a dynamic system you'd usually want to do that anyway, just to keep dynamic seL4 resource and authority handling confined and verifiable (you could imagine such management components to be nested and hierarchical as well).

I would still question the wisdom of letting a large untrusted code base like Linux trigger anything trusted. How do you know it does so when you want it to? It could just pretend to any client that it triggered the trusted component, but fake a result instead. Etc. There are ways around most of these problems, but it'd usually be better to architect the system in a way that doesn't require that trust inversion in the first place. The exception is when you build multiple system versions to migrate between architectures; then an intermediate state like that makes sense.

> Just wondering what the take on this is. I apologize if this all is
> covered somewhere in the seL4 docs and I just didn't dig deep enough.

I don't think it is covered explicitly in the docs; it's too high-level for that. It would be a good idea to write up some of the higher-level concepts of how things can be organised to minimise trust.

Cheers,
Gerwin
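PS: purely as an illustrative sketch of the management-component idea above, here is roughly what its event loop could look like in Microkit. The channel numbers are invented (the real ones come from the .system description that wires the components together), and what "start something" actually means is left abstract, since that is exactly the part that would carry the proof obligations.

    #include <microkit.h>

    /* Channel ids are made up for illustration; the real ones come from
     * the .system description. */
    #define CH_FROM_LINUX_VMM 0   /* notification from the Linux VMM  */
    #define CH_TRUSTED_SVC    1   /* channel to the trusted component */

    void init(void)
    {
        /* Nothing to set up in this sketch. */
    }

    void notified(microkit_channel ch)
    {
        if (ch == CH_FROM_LINUX_VMM) {
            /* Linux asked for the trusted service to start. Linux holds
             * no capability to the trusted component itself, only this
             * one notification channel; all real authority stays with
             * this (small, trusted) component. */
            microkit_notify(CH_TRUSTED_SVC);
        }
    }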