[seL4] seL4 + TrustZone: Spanning both worlds

4 Jun 2020

      On TrustZone-enabled Arm systems, seL4 could span both the secure and normal worlds. To do so would require extending untyped and frame capabilities with a bit associating them with either the secure or non-secure physical address space. In this email, I’ll touch on what a software stack based on this idea might look like, why that’s interesting, and exactly what extensions to the kernel it would require.

I’m interested in your thoughts and feedback on this informal proposal. If there’s interest, I’ll draft a patch to illustrate it more precisely.

# TrustZone background

The security of a TrustZone-enabled system is achieved by partitioning all of the SoC’s hardware and software resources so that they exist in one of two worlds - the secure world for the security subsystem, and the normal world for everything else. From the point of view of an ARMv8-A processor, the system has two physical address spaces: one for secure transactions and another for non-secure transactions. Page table entries contain a field called the NS (non-secure) bit, which determines whether a page is mapped from the secure or non-secure physical address space. Furthermore, the processor state contains a global NS bit, which can only be modified by software executing at the highest exception level, EL3. When the NS processor state bit is set, the processor is said to be executing in the non-secure state, and can’t issue secure bus transactions, regardless of the values of the NS bits in its translation tables.

A typical TrustZone software stack hosts a TEE (Trusted Execution Environment) on secure-world resources, and a REE (Rich Execution Environment) on normal-world resources. A so-called “secure monitor” runs in EL3, and uses the NS processor state bit to implement a coarse context switch called a “world switch”. The simplicity of this coarse context switch minimizes the attack surface of the secure monitor.

# seL4 + TrustZone

A formally verified microkernel such as seL4 can be trusted to protect secure hardware resources, and the TEE they host, using translation tables alone. That is, seL4 allows for a design where the processor always runs in state NS=0 (“secure state”), and normal-world software is restricted to non-secure world hardware resources using just the NS bits in the translation tables controlled by seL4.

Such a software stack, with seL4 spanning both the secure and non-secure worlds, is attractive on multiple fronts. seL4 provides a higher level of assurance than typical EL3 monitor firmware, which we can now eliminate from the TCB. seL4 also provides a richer API than typical EL3 firmware, allowing for security critical secure-world components to be developed and verified at a higher level of abstraction. Furthermore, while the two-world model is organized around just one strong security boundary, seL4 allows for an unbounded number of equally strong security boundaries, as isolation is based on translation tables rather than the less flexible NS processor state bit.

This design allows a seL4-based system to utilize non-secure hardware resources, while protecting the kernel and sensitive components inside secure hardware resources. This permits a threat model similar to that of a typical TrustZone-based TEE (such as OP-TEE or Trusty), which includes some hardware attacks, but without limiting seL4 userland components to the secure world alone.

Such a software stack has been especially interesting since the release of ARMv8.4-A, which enables EL2 in the secure state (S-EL2). seL4 running in S-EL2 can host a rich operating system such as Linux to drive non-secure hardware resources. Compare this to a seL4-based software stack running a Linux VM without the help of TrustZone. In particular, consider a case where the Linux VM drives most of the hardware platform’s devices via direct passthru access. Reasoning about the interactions between hardware components on a SoC is difficult. Even after correctly programming the SMMU, a great deal of SoC-specific detail must be covered in order to convince yourself that a driver or VM can’t violate its isolation indirectly through a device within its reach. This sort of reasoning is an active area of research (see, for example, the Barrelfish OS project). However, for use cases where the trust model aligns with the SoC’s boundary between secure and non-secure hardware resources, TrustZone provides a straightforward means of ensuring the protection of critical software components from those with access to non-secure devices.

# Extending seL4

Currently, untyped objects are either general purpose memory or device memory. To support the design I’ve proposed, untyped objects would need to be extended with another dimension: secure memory or non-secure memory. Non-secure general purpose memory would be subject to the same restrictions as device memory, with the exception that frames derived from non-secure general purpose untyped may be used as IPC buffers. Unlike the general-purpose/device dimension, the secure/non-secure dimension affects the mapping of a frame. The kernel would set the NS bit in page table entries according to whether that frame is secure.

This extension would require modifying just a few lines of code in the kernel itself and would, of course, be optional and configurable.

- - - -
Nick Spinale (nickspinale.com)
Arm Research, Cambridge UK
- - - -

The ideas presented in this email constitute research activity at Arm rather than an official position or recommendation.

IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.