Thanks Mark and Adrian for your elaborate and helpful replies!
On 05.02.2016 01:46, Adrian Danis wrote:
First an explanation of the kernel branches/versions.
The master branch
has never had the retype at offset functionality, this has always been
solely on the experimental branch. As the master branch is what is
verified, this is what we release and are versioning. You could continue
to use the experimental branch, and retain the use of retype at offset,
but you will have to leave behind the nice versioning.
I am currently trying to find the right balance between effort and being
future proof. I wonder, will the master version eventually follow the
experimental version? If so, I would certainly like to continue using
the retype-at-offset mechanism. If not, I would rather sooner than later
adapt Genode's allocation scheme to the seL4 way. What would you recommend?
Now for a slightly more detailed exploration of retype
strategies.
Mark had things mostly correct here. Originally the kernel required you
to retype an entire untyped at once. That is, you could take a single
untyped and split it into N objects of equal size. Thus resulting in the
common paradigm of splitting them in half as this provided an easy way
to book keep and implement a user level allocator. But this is also
inefficient, as you end up with many intermediate untyped capabilities.
This was reason I hesitated to follow this route initially. I was afraid
of the book keeping needed for all those capability selectors. Now,
after reading Mark's email, I may consider the buddy-allocator approach.
But right now, I still cannot clearly see where to store / whom to
account the meta data of this allocator - so I need some further
thinking. Thanks Mark for this direction!
Another (rather extreme) approach would be to split the initial untyped
memory regions into (let's say) 4K untyped memory regions where each 4K
region would be named after its corresponding physical page number. This
would be like a giant slab allocator with no further meta data needed.
The CNode containing the capability selectors for the 4K untyped memory
regions could be preallocated depending on the amount of physical
memory. For 4GiB of physical memory, I needed 16MiB for the CNode. This
way I would be able to rededicate untyped memory at the granularity of
4K and keep Genode's existing memory management. The obvious downside
would be that I could never create kernel objects larger than 4K, i.e.,
super pages. But the latter raises a bunch of other problems anyway.
To make efficient allocation practical untypeds were
changed for so
called 'incremental retype'. This is what you see currently in the
verified kernel, with the freeIndex. The reason the freeIndex is
monotonic increasing is because the kernel has no way to know if any
previous memory is now free or not. The kernel only knows two things
* Initially none of the memory of the untyped is used, so freeIndex is
set to 0
* After the untyped is revoked all retyped objects are destroyed, so we
know none of the memory is used, so freeIndex can be set to 0
But any other question of 'is there an object at this address?' cannot
be efficiently answered. I say 'efficiently' because it is not
impossible, rather the operation would be potentially very long running
and a malicious user could exploit this. In particular a user could
force an O(n) operation, where N is the amount of memory under control
of the user.
Thank you for providing this rationale.
But being able to retype many objects directly out of
a single untyped
does not help if you if you want to potentially free them and allocate
other objects. For example, if you want to create some object of size X,
you instead need to create an untyped of size X and then your object.
Now if you want to delete that object you can, and create a new object
of size X (or multiple objects of size < X) using that intermediate
untyped.
This is actually my use case.
On the experimental kernel there is a different in
kernel data structure
for managing capabilities (specifically a heap instead of a doubly
linked list). Now the same question of 'is there an object at this
address?' can be answered in O(log(n)) time, which is considered far
more reasonable, and so retype at offset can be written.
From my perspective, this represents a significant
difference between
both branches. Each approach calls for a different design of the
user
land. With Genode, I'd like to implement only one way. Which way would
be the best to go, master or experimental?
For writing actual user level allocators what Mark
said is basically
what we tend to do. Have a buddy allocator, manage different untyped
pool sizes etc. You can see examples of an untyped manager that we wrote
for using on the master branch here
https://github.com/seL4/seL4_libs/blob/master/libsel4allocman/src/utspace/s…,
note that it is part of a larger allocator but should still give some of
the logic.
Thank you very much for the pointer and for covering my questions so
thoroughly.
Cheers
Norman
--
Dr.-Ing. Norman Feske
Genode Labs
http://www.genode-labs.com ·
http://genode.org
Genode Labs GmbH · Amtsgericht Dresden · HRB 28424 · Sitz Dresden
Geschäftsführer: Dr.-Ing. Norman Feske, Christian Helmuth