Thanks Mark and Adrian for your elaborate and helpful replies! On 05.02.2016 01:46, Adrian Danis wrote:
First an explanation of the kernel branches/versions. The master branch has never had the retype at offset functionality, this has always been solely on the experimental branch. As the master branch is what is verified, this is what we release and are versioning. You could continue to use the experimental branch, and retain the use of retype at offset, but you will have to leave behind the nice versioning.
I am currently trying to find the right balance between effort and being future proof. I wonder, will the master version eventually follow the experimental version? If so, I would certainly like to continue using the retype-at-offset mechanism. If not, I would rather sooner than later adapt Genode's allocation scheme to the seL4 way. What would you recommend?
Now for a slightly more detailed exploration of retype strategies.
Mark had things mostly correct here. Originally the kernel required you to retype an entire untyped at once. That is, you could take a single untyped and split it into N objects of equal size. Thus resulting in the common paradigm of splitting them in half as this provided an easy way to book keep and implement a user level allocator. But this is also inefficient, as you end up with many intermediate untyped capabilities.
This was reason I hesitated to follow this route initially. I was afraid of the book keeping needed for all those capability selectors. Now, after reading Mark's email, I may consider the buddy-allocator approach. But right now, I still cannot clearly see where to store / whom to account the meta data of this allocator - so I need some further thinking. Thanks Mark for this direction! Another (rather extreme) approach would be to split the initial untyped memory regions into (let's say) 4K untyped memory regions where each 4K region would be named after its corresponding physical page number. This would be like a giant slab allocator with no further meta data needed. The CNode containing the capability selectors for the 4K untyped memory regions could be preallocated depending on the amount of physical memory. For 4GiB of physical memory, I needed 16MiB for the CNode. This way I would be able to rededicate untyped memory at the granularity of 4K and keep Genode's existing memory management. The obvious downside would be that I could never create kernel objects larger than 4K, i.e., super pages. But the latter raises a bunch of other problems anyway.
To make efficient allocation practical untypeds were changed for so called 'incremental retype'. This is what you see currently in the verified kernel, with the freeIndex. The reason the freeIndex is monotonic increasing is because the kernel has no way to know if any previous memory is now free or not. The kernel only knows two things * Initially none of the memory of the untyped is used, so freeIndex is set to 0 * After the untyped is revoked all retyped objects are destroyed, so we know none of the memory is used, so freeIndex can be set to 0 But any other question of 'is there an object at this address?' cannot be efficiently answered. I say 'efficiently' because it is not impossible, rather the operation would be potentially very long running and a malicious user could exploit this. In particular a user could force an O(n) operation, where N is the amount of memory under control of the user.
Thank you for providing this rationale.
But being able to retype many objects directly out of a single untyped does not help if you if you want to potentially free them and allocate other objects. For example, if you want to create some object of size X, you instead need to create an untyped of size X and then your object. Now if you want to delete that object you can, and create a new object of size X (or multiple objects of size < X) using that intermediate untyped.
This is actually my use case.
On the experimental kernel there is a different in kernel data structure for managing capabilities (specifically a heap instead of a doubly linked list). Now the same question of 'is there an object at this address?' can be answered in O(log(n)) time, which is considered far more reasonable, and so retype at offset can be written.
From my perspective, this represents a significant difference between both branches. Each approach calls for a different design of the user land. With Genode, I'd like to implement only one way. Which way would be the best to go, master or experimental?
For writing actual user level allocators what Mark said is basically what we tend to do. Have a buddy allocator, manage different untyped pool sizes etc. You can see examples of an untyped manager that we wrote for using on the master branch here https://github.com/seL4/seL4_libs/blob/master/libsel4allocman/src/utspace/sp...,
note that it is part of a larger allocator but should still give some of the logic.
Thank you very much for the pointer and for covering my questions so thoroughly. Cheers Norman -- Dr.-Ing. Norman Feske Genode Labs http://www.genode-labs.com · http://genode.org Genode Labs GmbH · Amtsgericht Dresden · HRB 28424 · Sitz Dresden Geschäftsführer: Dr.-Ing. Norman Feske, Christian Helmuth