capdl-loader questions

25 Nov 2021

      I've been investigating how capd-loader works and have some questions. My
apologies if these were previously discussed.

1. CDL_Model is mutable. The capDL specification looks like a perfect
candidate for being in .rodata but it's stored+treated as mutable. Why? I
see capdl-loader uses the mutability to store mapped page frame caps and,
for risc-v, VSpace roots, but both can be easily handled w/o modifying the
spec. Having this immutable has many benefits.

2. All page frames are treated as shared. CAPDL_SHARED_FRAMES is #define'd
in the code so every frame is treated as shared. This inflates the size of
the orig_caps array and the kernel capabilities table but seems
unnecessary. Why? I've verified that instead of doing the Copy op on each
page you can just check the return of the seL4_Page_Map call and if
necessary clone the cap. Better would be for camkes to identify shared page
frames so this doesn't require 2x syscalls (which spam's the console w/
CONFIG_PRINTING enabled).

3. Why does copy_addr_with_pt exist? There are two memory regions created
for mapping page frames into the rootserver address space for doing fill
ops. One is 64KB and the other is 1 page (nominally 4KB). init_frame tries
the 4KB region first then falls back to the 64K region if mapping into the
4KB region fails. Why (i.e. why would the 1st mapping call fail)? On my
risc-v system I never use the 64K region and I'm not sure when it might be
used? Even if it were used, why is 64K always allocated; seems like the
region should be sized according to the target platform. Having this extra
region has a cost that can be significant on small systems.

4. console output. Why is sel4_DebugPutChar (or similar) not a standard
part of the kernel? This forces capdl-loader to depend on
libsel4platsupport for console output which in turn affects cap space. I
can't imagine an seL4 system that cannot write to the console and I think
it's entirely reasonable to have a critical system service like the
rootserver use that.

5. duplicate_caps.  The scheme for constructing a running system from a
spec is straightforward except for duplicating caps. Why are all TCB's and
CNode caps dup'd? Why does init_tcbs call init_tcb on the original object
but immediately configure_tcb on the dup? There are other places where the
dup cap is used instead of the original cap (e.g. init_cnode_slot). None of
this seems to matter on risc-v; maybe this matters on some other
architectures?

6. On risc-v all pages appear to be marked executable--including ipc
buffers! This seems just wrong and a potential security hole.

7. Why does seL4_TCB_WriteRegisters take a _mutable_ seL4_UserContext? I'm
guessing this is a mistake in the xml spec (but haven't looked).

8. init_cspaces mystifies me. There are two loops over all CNodes; one that
does "COPY" operations and one that does "MOVE" operations. I see nothing
explaining what's happening. I've read this code many times and still don't
get it; please explain.

9. According to comments, the use of a global static to setup the
UserContext for a new TCB is done because one cannot take the address of a
local variable. On what architecture is this true? On risc-v the context is
copied into "registers" before calling the kernel so this doesn't appear to
be an issue with crossing the user-kernel protection boundary.

10. Memory reclamation. I'm unclear what happens to the memory used by
capdl-loader. The process does a TCB_Suspend call when it's done. Does this
cause the CSpace & VSpace to be reclaimed? For CAmkES this doesn't matter
as there's no support for allocating memory but I'm building a system where
this isn't true. How do I release memory (in particular) back to the
untyped pool?

I have a bunch of niggly code complaints/questions but let's start with the
above.

-Sam

Sam Leffler

tags

participants (2)