On Wed, Dec 1, 2021 at 9:59 AM Sam Leffler via Devel <devel(a)sel4.systems> wrote:
Some updates inline. Especially interested in answers for #5 and #8.
On Wed, Nov 24, 2021 at 9:12 PM Sam Leffler <sleffler(a)google.com> wrote:
> I've been investigating how capd-loader works and have some questions. My
> apologies if these were previously discussed.
> 1. CDL_Model is mutable. The capDL specification looks like a perfect
> candidate for being in .rodata but it's stored+treated as mutable. Why? I
> see capdl-loader uses the mutability to store mapped page frame caps and,
> for risc-v, VSpace roots, but both can be easily handled w/o modifying the
> spec. Having this immutable has many benefits.
It's currently mutable because of the updates to it as it is
processed. If these updates were removed, then it would be a good idea
to make it immutable. How would you handle the mapped_frame_cap
bookkeeping without storing it in the spec as a single frame could
have multiple shared mappings which each require a cap?
> 2. All page frames are treated as shared.
CAPDL_SHARED_FRAMES is #define'd
> in the code so every frame is treated as shared. This inflates the size of
> the orig_caps array and the kernel capabilities table but seems
> unnecessary. Why? I've verified that instead of doing the Copy op on each
> page you can just check the return of the seL4_Page_Map call and if
> necessary clone the cap. Better would be for camkes to identify shared page
> frames so this doesn't require 2x syscalls (which spam's the console w/
> CONFIG_PRINTING enabled).
What you suggest would be possible. For some object types (like an
endpoint) the original cap is special and can be used to revoke all
children caps that have been derived from it. All frame types don't
have this property which means that the original cap is no different
from any copies of it. Embedding whether the frame is shared into the
object as an attribute would make this easier to optimize as I agree
that it becomes pretty inefficient in typical systems with many more
private frames than shared.
3. Why does
copy_addr_with_pt exist? There are two memory regions created
for mapping page frames into the rootserver address space for doing fill
ops. One is 64KB and the other is 1 page (nominally 4KB). init_frame tries
the 4KB region first then falls back to the 64K region if mapping into the
4KB region fails. Why (i.e. why would the 1st mapping call fail)? On my
risc-v system I never use the 64K region and I'm not sure when it might be
used? Even if it were used, why is 64K always allocated; seems like the
region should be sized according to the target platform. Having this extra
region has a cost that can be significant on small systems.
I got this backwards; on risc-v at least the 64KB region is used but not
the page immediately after the ipc buffer. On my system I eliminated the
latter and shrank the bss region to 1 page. Looking for feedback on whether
this will fail on other platforms.
Short answer: For copy_addr_with_pt, yes we could make the 64kiB
definition aarch32 specific and 4kiB on other platforms.
Long answer: It looks like copy_addr_with_pt is defined to be an
address range that is guaranteed to have mapped page tables down to
the last level of the page table hierarchy and is therefore suitable
for mapping 4kiB pages into (or 64k pages only on aarch32 which is the
only supported arch which allows larger pages to also be mapped in at
the lowest level page table entries. copy_addr on the other hand is
guaranteed to not have a mapped page table at the lowest level of the
page table hierarchy which allows for 1MiB, 2MiB, 4MiB and 16MiB page
sizes to be mapped here and achieves this by picking a virtual address
with alignment of 16MiB above the end of the image but still expecting
this to be within the page table that is one level up from the lowest
level page table in the hierarchy.
Then init_frame tries either region when trying to initialize a frame
of varying size between 4kiB and 16MiB. Frames the next size up,
1GiB, would not be able to be mapped by the loader without adding a
new copy_addr variable that has larger alignment to guarantee it
doesn't collide with a page table at the next level up.
4. console output. Why is sel4_DebugPutChar (or similar) not a standard
part of the kernel? This forces capdl-loader to depend on
libsel4platsupport for console output which in turn affects cap space. I
can't imagine an seL4 system that cannot write to the console and I think
it's entirely reasonable to have a critical system service like the
rootserver use that.
The verified kernel configurations don't have a uart driver in the
kernel. This decision follows from an L4 design principle that the
kernel doesn't provide an implementation for anything that could be
provided with an implementation outside of the kernel. In practice,
if you know that the loader is only going to be used on a specific
platform, then you could write a simple specialized uart driver that
you link in instead of the entirety of libplatsupport.
> 5. duplicate_caps. The scheme for constructing a
running system from a
> spec is straightforward except for duplicating caps. Why are all TCB's and
> CNode caps dup'd? Why does init_tcbs call init_tcb on the original object
> but immediately configure_tcb on the dup? There are other places where the
> dup cap is used instead of the original cap (e.g. init_cnode_slot). None of
> this seems to matter on risc-v; maybe this matters on some other
I don't know the reason for this and it's older than the git history I
have access to. It looks like its unnecessary for TCBs, and is likely
still necessary for CNodes when loading specs that have CNodes that
are moved into slots in other CNodes when constructing a multi-cnode
6. On risc-v
all pages appear to be marked executable--including ipc
buffers! This seems just wrong and a potential security hole.
This looks to be a bug. Marking the NX bit on risc-v when the grant right
is not set gives me what I want. arm already did this.
It's a bug because it doesn't do it the same way as Arm, but the way
arm does it is a bit misleading. The kernel itself only uses the read
and write bits from cap rights for frame mappings, and a cap with
write-only rights gets downgraded to kernel-only mapping, so really
the kernel only lets you describe read-write, read-only or none. For
both read-write and read-only cap abilities, the frame could be mapped
exec or non-exec. The read right implies executable rights from the
kernel's perspective. The capdl loader does add a little bit of extra
security with using the grant right in the spec to imply whether the
frame has executable permissions and tries to create the mapping
accordingly (except on riscv which is a bug), but if the frame cap for
this mapping is also given to the component in order to do things like
cache maintenance invocations, then the component could use that
capability to convert it's non-exec mappings to executable mappings.
There's probably a good argument for overhauling how the kernel
handles frame rights so that you can express executeNever as a
capability right and associate it with the grant or grantreply bit in
the existing capRights. I think this wasn't done originally because
either there were no spare bits in the cap structure at the time, or
armv6 didn't support executeNever mappings (but you'd have to
fact-check me on that).
7. Why does seL4_TCB_WriteRegisters take a _mutable_ seL4_UserContext? I'm
guessing this is a mistake in the xml spec (but haven't looked).
This looks to be a shortcoming in the python script that generates the
> 8. init_cspaces mystifies me. There are two loops over all CNodes; one
> that does "COPY" operations and one that does "MOVE" operations.
> nothing explaining what's happening. I've read this code many times and
> still don't get it; please explain.
As I mentioned higher up, some caps are special as originals and allow
additional operations to be performed on them compared to their copies
even if the cap rights are the same for both. Initializing all of the
cnodes in the system involves moving capabilities (seL4_CNode_Move) or
deriving capabilities from source capabilities either with the same
rights (seL4_CNode_Copy) or lesser rights (seL4_CNode_Mint). Its
split into two passes because the derivation of capabilities need to
happen before the source capabilities are moved into the destination
> 9. According to comments, the use of a global
static to setup the
> UserContext for a new TCB is done because one cannot take the address of a
> local variable. On what architecture is this true? On risc-v the context is
> copied into "registers" before calling the kernel so this doesn't
> be an issue with crossing the user-kernel protection boundary.
This is true for the kernel with the proofs. The very first
implementation of this loader was as part of a project that wanted to
verify it using the same tooling as the kernel proofs. So that
comment is probably documenting a requirement that is no longer
reclamation. I'm unclear what happens to the memory used by
capdl-loader. The process does a TCB_Suspend call when it's done. Does this
cause the CSpace & VSpace to be reclaimed? For CAmkES this doesn't matter
as there's no support for allocating memory but I'm building a system where
this isn't true. How do I release memory (in particular) back to the
This was discussed on the developers zoom call so no need to belabor it.
Sounds like I've got my work cut out for me...
Yea, this one is a bit more involved.
I have a bunch of niggly code
complaints/questions but let's start with
Devel mailing list -- devel(a)sel4.systems
To unsubscribe send an email to devel-leave(a)sel4.systems