Hello,

after a rather long break, I picked up my work on porting Genode to seL4. More specifically, I am trying to find a good way to realize Genode's RPC mechanism with seL4 IPC. But admittedly, I am struggling a bit.

Ideally, each Genode RPC should be realized as a seL4 IPC call. Unfortunately, however, I find the kernel interface too restrictive to do that. There are two issues.

1. Delegation of multiple capabilities at once

According to Chapter 4.2.2 of the manual, the kernel allows the delegation of merely a single capability per IPC, whereas the Genode API has no such restriction. This effectively renders my idea of working around the capability re-identification problem [1] by representing each Genode capability as a tuple (or triple) of seL4 endpoint capabilities moot.

[1] http://sel4.systems/pipermail/devel/2014-November/000112.html

Is there a fundamental reason for this restriction? If not, would you be open to making the kernel more flexible with regard to the maximum number of delegations per IPC?

2. Aliasing of unwrapped capabilities with delegated capabilities

I understand that the kernel will automatically "unwrap" capabilities if the receiver imprinted a badge into the received capability. In my experiments, the mechanism works as expected. However, how can the receiver determine which capability got unwrapped if multiple capabilities were transferred?

For example, if the sender specifies three capabilities, two of them "unwrappable" and one delegated, the receiver will see 3 capabilities. The delegated cap is written to the specified 'ReceivePath' whereas the badges of the two unwrapped caps can be read from the message buffer. But I see no way for the receiver to decide which of the three capability arguments got unwrapped and which got delegated. How could the receiver determine this information?

Best regards
Norman

--
Dr.-Ing. Norman Feske
Genode Labs

http://www.genode-labs.com · http://genode.org

Genode Labs GmbH · Amtsgericht Dresden · HRB 28424 · Sitz Dresden
Geschäftsführer: Dr.-Ing. Norman Feske, Christian Helmuth
Hello Norman.

I'm not really qualified to talk about why the kernel only allows copying of a single cap. The kernel designers certainly wanted to put a hard limit on the amount of work that can happen in a single IPC, but I don't know why they allowed three cap lookups on the sender side but only one cap save on the receiver.

I can talk about the cap-unwrapping mechanism. Let me clarify how it works somewhat. The kernel will unwrap capabilities if they are to the same endpoint as the receiver is receiving on. It does not matter whether or not the capabilities have badges. The idea is that there's little point copying to a server a collection of capabilities with which to talk to itself.

The receiver can identify which of the sender's capabilities were unwrapped by inspecting the "msgCapsUnwrapped" field. That's its kernel name - libsel4 calls it the capsUnwrapped field of seL4_MessageInfo. It's a bitfield of length 3, with one bit per possible cap transfer attempt. It tells you which caps were unwrapped.

For example, if the sender tried to send caps A, B, C to server X, and caps A and C are in fact caps to the server's endpoint anyway, then the server will receive a message with extraCaps=11 (binary, i.e. three caps used in the message) and capsUnwrapped=101 (binary, i.e. the first and third cap were unwrapped), and it will find a delegate of cap B wherever it specified its receive pointer.

If A, B, and C are sent and only A is a cap to the server's endpoint, the server will receive a delegate of B, C will be dropped, and capsUnwrapped=001 will indicate that A was unwrapped (note the first cap is in the bottom position). I think that the server will receive extraCaps=10 (binary) - that is, it will not be told that the sender was attempting to send three caps. I suppose this should be part of the client protocol, and the kernel mechanisms should focus on telling the server what has actually been modified.

I hope that's helpful,

Thomas.
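For concreteness, a minimal C sketch of how a server might act on these fields. It assumes the non-MCS libsel4 names in current use (the 2015-era API calls seL4_Recv seL4_Wait), that `ep` is the endpoint the server receives on, and that a receive slot was registered beforehand with seL4_SetCapReceivePath(); the handler name is made up for illustration.

#include <sel4/sel4.h>

void handle_one_request(seL4_CPtr ep)
{
    seL4_Word badge;
    seL4_MessageInfo_t info = seL4_Recv(ep, &badge);

    seL4_Word n_caps    = seL4_MessageInfo_get_extraCaps(info);     /* caps carried by the message */
    seL4_Word unwrapped = seL4_MessageInfo_get_capsUnwrapped(info); /* bit i set: cap i was unwrapped */

    for (seL4_Word i = 0; i < n_caps; i++) {
        if (unwrapped & ((seL4_Word)1 << i)) {
            /* cap i referred to our own endpoint: only its badge arrives */
            seL4_Word cap_badge = seL4_GetBadge(i);
            (void)cap_badge; /* look up the local object it identifies */
        } else {
            /* cap i was delegated into the single registered receive slot */
        }
    }
}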
Hi Thomas,
thank you very much for this thorough and helpful explanation. I misinterpreted capsUnwrapped as being a counter. But it makes perfect sense that it is a bitfield (and it is actually stated so in the manual). I could successfully reproduce your examples.

Best regards
Norman
Hi Norman,

we’ll have an internal discussion whether we can allow more cap transfers in one IPC. I don’t remember any fundamental limitation there, but we’ll need to dig into it a bit.

As Tom pointed out, there will have to be some limit on the number of caps transferred, just to get a bounded and not too large WCET, but that limit can probably be quite a bit larger than 1.

How many would your application need at most, and more generally speaking, what would you think is a good limit?

Cheers,
Gerwin
Hi Gerwin,
we’ll have an internal discussion whether we can allow more cap transfers in one IPC. I don’t remember any fundamental limitation there, but we’ll need to dig into it a bit.
great that you are considering this change!
As Tom pointed out, there will have to be some limit on the number of caps transferred, just to get a bounded and not too large WCET, but that limit can probably be quite a bit larger than 1.
How many would your application need at most, and more generally speaking, what would you think is a good limit?
I reviewed all of Genode's RPC interfaces. The maximum number of Genode-capability arguments per RPC call is 2. If I represent each Genode capability by a triple of seL4 endpoint capabilities, the kernel would need to support the delegation of up to 6 seL4 caps.

Regarding the interface, let me share my experience with other kernels. The receiver needs a way to declare the designated names for the received capabilities. Other kernels (such as NOVA and Fiasco.OC) allow the receiver to specify a single naturally-aligned "receive window" within the receiver's capability space. All delegated capabilities will end up within the window.

In my experience, this approach is far from perfect. Since the receive window must be void of any existing capability (i.e., NOVA does not allow "overmap"), the receiver has to allocate a completely free window dimensioned (and aligned) for the maximum number of capabilities to expect, e.g., 4. If a capability comes in, it will populate the first slot of the receive window but the other 3 slots will remain empty. Because the window is now polluted with a cap, the receiver needs to allocate an entirely new window for the next IPC. Consequently, this approach tends to populate the receiver's capability space quite sparsely and wastes capability slots.
From my perspective, it would be much better if the receiver was able to specify a 'ReceivePath' for each individual capability. The receiver could thereby allocate each index individually instead of allocating a window of consecutive indices.
Thanks again for discussing my request within your group. I look forward to the outcome.

Cheers
Norman
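For reference, this is roughly what the single receive slot discussed above looks like on the seL4 side today - a minimal sketch assuming a flat, single-level CSpace; `my_cspace_root` and `free_slot` are hypothetical names.

#include <sel4/sel4.h>

/* The receiver designates exactly one destination slot for the next IPC;
   only one delegated cap can land there per receive. */
void prepare_single_receive_slot(seL4_CNode my_cspace_root, seL4_CPtr free_slot)
{
    /* depth seL4_WordBits: resolve the full CPtr in a flat CSpace */
    seL4_SetCapReceivePath(my_cspace_root, free_slot, seL4_WordBits);
}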
Norman,

Thanks for the explanation. Yes, the receive window is very V2-ish, and has distasteful side effects. The receiver clearly needs more control. Which means that things get more complicated when more caps are supported.

I’d like to understand the specific use cases better. I accept that one cap isn’t enough: there are likely unsolvable races which require >1 caps to be transferred atomically. My intuition is that 3 should be enough to solve all fundamental problems, and anything beyond that boils down to efficiency arguments/trade-offs.

Gernot
Hi Gernot,
I’d like to understand the specific use cases better. I accept that one cap isn’t enough: there are likely unsolvable races which require >1 caps to be transferred atomically. My intuition is that 3 should be enough to solve all fundamental problems, and anything beyond that boils down to efficiency arguments/trade-offs.
I agree that more than 3 caps per IPC are unlikely. Actually, in Genode's current RPC interfaces, the maximum is 2. The value of 6 accounts for the fact that I cannot directly represent a Genode capability by an seL4 endpoint capability.

As discussed in the responses to [1], seL4 does not allow the receiver to re-identify a capability that it forwarded and then received back. However, Genode relies on a way to re-identify capabilities. The badge is not enough because it cannot be used by intermediate components. To solve this problem, I came up with the idea to represent each Genode capability by 3 seL4 endpoint capabilities:

* The first is the actual capability referring to the object at the server.

* The second is an endpoint capability locally created within the component when the Genode capability entered the component. It is badged using a component-local endpoint. Hence, it can be re-identified when passed to another component and handed back.

* The third is an endpoint capability created by the direct originator of the capability. It corresponds to the second capability at the originating component.

Under the hood, when passing a Genode capability as an argument to an RPC call, all three seL4 endpoint capabilities will be transferred. When such a Genode capability is handed back to the component, the third received seL4 capability can be used to re-identify the context associated with the Genode capability because its badge was imprinted locally by the component. Effectively, the second and third seL4 capabilities record the last two steps of the delegation history. They are never invoked or used otherwise.

The approach does not entirely solve the general re-identification problem but, as far as I can see, it covers the patterns found in Genode. I admit that it seems to be a bit wasteful. But I have not come up with a better solution. The alternative approaches (see [2]) are even less satisfying.

[1] http://sel4.systems/pipermail/devel/2014-November/000112.html
[2] http://sel4.systems/pipermail/devel/2014-November/000114.html

Cheers
Norman
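A minimal C sketch of this encoding; the struct and function names are hypothetical illustrations, not Genode code.

#include <sel4/sel4.h>

/* One Genode capability is carried as three seL4 endpoint caps. */
typedef struct {
    seL4_CPtr object_ep;    /* 1st: refers to the object at the server */
    seL4_CPtr local_id_ep;  /* 2nd: endpoint badged by this component when the cap entered it */
    seL4_CPtr origin_id_ep; /* 3rd: the previous hop's local_id_ep */
} genode_cap;

/* When a Genode capability is passed as an RPC argument, all three caps
   are attached to the message (hence up to 6 for two cap arguments). */
void marshal_genode_cap(genode_cap const *c)
{
    seL4_SetCap(0, c->object_ep);
    seL4_SetCap(1, c->local_id_ep);
    seL4_SetCap(2, c->origin_id_ep);
    /* the seL4_MessageInfo of the send would carry extraCaps = 3 */
}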
On Thu, Feb 12, 2015 at 2:27 AM, Norman Feske <norman.feske@genode-labs.com> wrote:
Under the hood, when passing a Genode capability as an argument to an RPC call, all three seL4 endpoint capabilities will be transferred. When such a Genode capability is handed back to the component, the third received seL4 capability can be used to re-identify the context associated with the Genode capability because its badge was imprinted locally by the component.
Doesn't the fact that these three capabilities are not bound together in any way lead to problems? What if a malicious server juggled a few capabilities, replacing the third capability in a response with a different third capability from an earlier request, for example?
-- Tim Newsham | www.thenewsh.com/~newsham | @newshtwit | thenewsh.blogspot.com
Hi Tim,
Doesn't the fact that these three capabilities are not bound together in any way lead to problems? What if a malicious server juggled a few capabilities, replacing the third capability in a response with a different third capability from an earlier request, for example?
that is indeed an important question. But I am confident that this is not a problem.

Please consider that the two supplemental capabilities are merely used by the receiver as a key to look up an existing Genode capability (triple of seL4 caps) at the receiver. The receiver will never use the endpoint capability (the first one of the triple) that came from the sender, but will keep using the looked-up (known-good) Genode capability.

In the worst case, the sender could replace the supplemental caps of a Genode capability A by the ones of another Genode capability B, and pass the forged version of capability A to the receiver. The lookup at the receiver would indeed wrongly find B. But what would be the benefit for the sender? It could have specified B instead of the forged version of A in the first place.

Cheers
Norman
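A sketch of that lookup in C, under the assumption of a hypothetical component-local registry: only the locally imprinted badge of the unwrapped identity cap is trusted, never the delegated object cap supplied by the sender.

#include <stddef.h>
#include <sel4/sel4.h>

struct genode_cap;                                        /* the triple from the earlier sketch */
extern struct genode_cap *registry_lookup(seL4_Word id);  /* component-local table, hypothetical */

/* Re-identify capability argument `cap_index` of a received message; the
   known-good entry found in the local registry is used from then on. */
struct genode_cap *reidentify_arg(seL4_MessageInfo_t info, unsigned cap_index)
{
    /* only meaningful if the kernel unwrapped the cap, i.e. its badge was
       imprinted by this component */
    if (!(seL4_MessageInfo_get_capsUnwrapped(info) & ((seL4_Word)1 << cap_index)))
        return NULL;

    return registry_lookup(seL4_GetBadge(cap_index));
}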
How many would your application need at most, and more generally speaking, what would you think is a good limit?
I woulda thunk the only reasonable software engineering answer to this is: "0, 1, max_int" (or whatever), no? Some other arbitrary value (like 6) sure seems a poor design to me. (Now, sure, there are times when it actually truly is not so easy - consider things like Scala's tupling only going up to 5 http://www.scala-lang.org/api/current/index.html#scala.Function$ - although it is still kinda sad in those cases too.)
On 13 Feb 2015, at 5:55 , Raoul Duke <raould@gmail.com> wrote:
How many would your application need at most, and more generally speaking, what would you think is a good limit?
I woulda thunk the only reasonable software engineering answer to this is: "0, 1, max_int" (or whatever), no? Some other arbitrary value (like 6) sure seems a poor design to me.
From the software-engineering PoV you’re right. But those principles don’t apply to such low-level things as a microkernel. E.g. the number of hardware registers is somewhere in the 1-infinity range, we care about worst-case execution time, and a fundamental microkernel-design principle is to provide the minimum needed, and not more.

There are cases when functionally 2 ≠ 1+1, e.g. when there is a need to do things atomically. But there will be some number >1 which is sufficient for a universal mechanism. This is what we’d like to find.

Gernot
In principle the reasonable limit is 1. If a sender can transfer 1 cap per message, it can transfer any number of caps by sending a lot of messages. Some protocol can be built to reassemble the intended message from the flurry of messages actually being sent. This is all necessary if the user code needs to send an arbitrary number of capabilities.

However if in practice the number is always 1, 2 or 3, it makes sense for the kernel to support these cases directly, to simplify the user implementations and to avoid wasting as much time on context switching.

Cheers,
Thomas.
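A rough client-side sketch of such a reassembly protocol - one cap and one round trip per part; the message labels are hypothetical, not an existing protocol.

#include <sel4/sel4.h>

enum { LABEL_CAP_PART = 1, LABEL_CAP_LAST = 2 };  /* hypothetical protocol labels */

/* Transfer n caps to a server that accepts only one cap per message, letting
   the label mark the final part so the server knows when to reassemble. */
void send_caps_one_by_one(seL4_CPtr server_ep, seL4_CPtr const *caps, unsigned n)
{
    for (unsigned i = 0; i < n; i++) {
        seL4_SetCap(0, caps[i]);
        seL4_Word label = (i + 1 < n) ? LABEL_CAP_PART : LABEL_CAP_LAST;
        seL4_MessageInfo_t info = seL4_MessageInfo_new(label, 0, 1, 0);
        seL4_Call(server_ep, info);   /* each part costs a full round trip */
    }
}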
However if in practice the number is always 1, 2 or 3
I understand that there is a pragmatic choice to be made. And obviously I don't have a leg to stand on since (a) I do not contribute to the project, (b) nor do I even use the project! So this is one of those, "somebody is wrong on the Internet" kind of bike-shedding true-scotsman type nit picking threads on my part.

Still... so far I haven't liked the wording used. I mean, how can anybody know what the "in practice" number "always" is? It seemed like /one person/ mentioned their one use case, and the magic value that they personally like/need. That doesn't seem to be anywhere near "is always" to me. What if I decided tomorrow *I* need 7? Or even what if the person who said 6 comes back in a month and says, "oh, golly, did I say 6? I really meant to say 12." Yes, there is some amount of reductio ad absurdum involved ;-). But the way I have read the messages here it is as if God told us 6 is the right number. :-)

I'll do my best to shut up now :-) :-)
Is it possible to send 1 cap to transfer ownership of a cnode full of more caps?
Yes, that would be possible. It may imply a similar amount of book-keeping to sending caps sequentially, though, because you may need to allocate this cap storage every time. Depends a lot on the application.

Cheers,
Gerwin

On 13 Feb 2015, at 11:39, Harry Butterworth <heb1001@gmail.com> wrote:

Is it possible to send 1 cap to transfer ownership of a cnode full of more caps?
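A sender-side sketch of that idea in C: pack the caps into a small CNode, then delegate the single cap to the CNode itself. The CNode is assumed to have been created beforehand, and all names and sizes are illustrative assumptions.

#include <sel4/sel4.h>

#define PACK_BITS 2   /* a 4-slot packing CNode */

int pack_and_send(seL4_CPtr server_ep, seL4_CNode my_cspace_root,
                  seL4_CNode pack_cnode, seL4_CPtr const *caps, unsigned n)
{
    /* copy each cap from our CSpace into slot i of the packing CNode */
    for (unsigned i = 0; i < n && i < (1u << PACK_BITS); i++) {
        int err = seL4_CNode_Copy(pack_cnode, i, PACK_BITS,
                                  my_cspace_root, caps[i], seL4_WordBits,
                                  seL4_AllRights);
        if (err != seL4_NoError)
            return err;
    }

    /* transfer just one cap: the cap to the CNode holding the others */
    seL4_SetCap(0, pack_cnode);
    seL4_Call(server_ep, seL4_MessageInfo_new(0, 0, 1, 0));
    return seL4_NoError;
}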
There may well be some benefit in allowing the transfer of multiple capabilities in a single IPC. But let’s remember that this discussion actually started with Norman's request for a different feature, which he characterized as “capability re-identification” [1]. We only began a transition to the current conversation when Norman proposed a cunning encoding to represent a Genode capability by a triple of seL4 capabilities [2]. Are there alternative ways to solve Norman’s original problem, perhaps even without requiring a change to the current seL4 design?

I proposed one possible approach during the original conversation (the second half of [3]). Like Harry’s suggestion, it involves the use of a CNode object (although I wasn’t thinking about using it for the purpose of transferring multiple capabilities). In addition, with my proposal, that CNode is created as a side effect of a one-time registration operation and is shared between multiple CSpaces. Among other things, this should eliminate the overhead that Gerwin mentioned of having to allocate a new CNode for every transfer. Back in November, it wasn’t entirely clear whether my proposal could be adapted to Genode to solve the problem that Norman had described, and we didn’t explore it further.

However, even if that specific approach won’t work, perhaps there is a variation, or a different strategy altogether, that might do the trick instead? In short, I still think it might be worth exploring other options to make sure that we’re treating the problem rather than the symptom.

All the best,
Mark

[1] http://sel4.systems/pipermail/devel/2014-November/000112.html
[2] http://sel4.systems/pipermail/devel/2014-November/000114.html
[3] http://sel4.systems/pipermail/devel/2014-November/000118.html
Hi Mark,

On 02/13/2015 07:25 AM, Mark Jones wrote:
There may well be some benefit in allowing the transfer of multiple capabilities in a single IPC. But let’s remember that this discussion actually started with Norman's request for a /different feature/, which he characterized as “capability re-identification” [1]. We only began a transition to the current conversation when Norman proposed a cunning encoding to represent a Genode capability by a triple of seL4 capabilities [2]. Are there alternative ways to solve Norman’s original problem, perhaps even without requiring a change to the current seL4 design?
that is a very good point! I stated earlier that no Genode RPC interface requires more than 2 capability arguments. In fact, in the very few instances where two capability arguments are used, the passed capabilities actually originate from the server. So those capabilities could be unwrapped instead of delegated. So you are spot on: My immediate requirement for delegating multiple caps would not exist without my approach to solving the re-identification problem.
I proposed one possible approach during the original conversation (the second half of [3]). Like Harry’s suggestion, it involves the use of a CNode object (although I wasn’t thinking about using it for the purpose of transferring multiple capabilities). In addition, with my proposal, that CNode is created as a side effect of a one-time registration operation and is shared between multiple CSpaces. Among other things, this should eliminate the overhead that Gerwin mentioned of having to allocate a new CNode for every transfer. Back in November, it wasn’t entirely clear whether my proposal could be adapted to Genode to solve the problem that Norman had described, and we didn’t explore it further.
I liked the idea of the shared CNodes but could not see how to bring it together with Genode without introducing difficult new problems. In particular, the approach would ultimately need a shared CNode for each communication relationship. The CNode is a physical resource. This raises the question of who should allocate the CNodes and how to organize the namespaces of all the CNodes a component shares with others.

In my perception, it comes down to similar problems as I described in my other posting [1] of today (the second option): The server would need to keep state per client but the number of clients is unbounded. In Genode, we have the notion of "sessions", which enable a server to maintain client-specific state using a memory budget provided by the client. But this session concept is built on top of the basic RPC mechanism. The mechanism you proposed also requires a notion of sessions (in the sense of state to be kept at both communication partners), but at a lower level. It thereby raises the same problems that we have solved at the higher level.

Btw, if you are interested in learning more about Genode's session concept, let me recommend Chapter 3 of the forthcoming Genode manual [2].
However, even if that specific approach won’t work, perhaps there is a variation, or a different strategy altogether, that might do the trick instead? In short, I still think it might be worth exploring other options to make sure that we’re treating the problem rather than the symptom …
Your posting has actually provoked me to reconsider the problem from another angle: In Genode, each component has a relationship with the root task (called "core"). Instead of letting each component manage its CSpace locally, core could manage their CSpaces. At the creation time of a component, core would allocate a CNode as the CSpace for the component and would keep it shared between core and the component (quite similar to your idea).

Now, if a component wants to perform an RPC call with capabilities as arguments, it would not issue an seL4 IPC call directly to the server, but an seL4 IPC call to core, supplying the local names of the capability arguments along with the local name of the invoked capability. Because core has a global view of all CSpaces, it can copy the capabilities from the sender's CSpace to the server's CSpace and forward the IPC call to the server by translating the local names of the capability arguments to the CSpace of the server.

I have not fully wrapped my head around all the minor details of the forwarding mechanism, but the approach would in principle use core as a proxy for "heavy weight" RPC calls. The semantics I need for Genode (like the re-identification of capabilities) could be provided by core. Still, RPCs without capability arguments (as is the case for all performance-critical RPCs anyway) would go straight to the server.

In contrast to your original idea, we would not need one CNode per communication relationship but only one per component. The memory resources of the CNode can be trivially accounted to the respective component. I have to think it through but it seems like a promising alternative approach to the original re-identification problem. It actually lowers the requirements with regard to the kernel as the delegation of capabilities via IPC would remain unused.

Thanks Mark, for pushing me in this direction! :-)

[1] http://sel4.systems/pipermail/devel/2015-February/000222.html
[2] http://genode.org/files/e01096b9ffe3f416157f6ec46c467725/manual-2015-01-23.p...

Cheers
Norman
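A very rough sketch of what such a proxied request could carry; the struct, its fixed-size layout, and all names are assumptions for illustration, not an existing Genode or seL4 interface.

#include <sel4/sel4.h>

/* Message a component would send to core instead of invoking the server
   directly; capability arguments travel as local names, not as seL4 caps. */
struct proxied_rpc {
    seL4_Word invoked_cap;   /* local name of the invoked Genode capability */
    seL4_Word num_cap_args;  /* number of capability arguments that follow  */
    seL4_Word cap_args[2];   /* local names, valid in the sender's CSpace   */
    /* ... ordinary RPC payload follows ... */
};

/* Core, which shares every component's CSpace CNode, would copy the named
   caps into the server's CSpace, rewrite cap_args to server-local names,
   and forward the call to the server's endpoint. */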
On 13 Feb 2015, at 10:30 , Thomas Sewell <Thomas.Sewell@nicta.com.au> wrote:
In principle the reasonable limit is 1. If a sender can transfer 1 cap per message, it can transfer any number of caps by sending a lot of messages.
that is true. But I’m not sure that sending 3 caps in 3 messages (plus appropriate protocols) is in all circumstances semantically equivalent to sending 3 caps in one message.

An example of 1+1<2 is round-trip IPC. One might argue that providing two IPC operations (send+receive) in a single system call is unnecessary, as you could simply make two system calls. Turns out, the functionality of the combined IPC cannot be implemented with two individual IPCs, for two reasons:

1) a call-type IPC creates the one-shot reply cap which allows to provide the server with a reply channel without trusting it. This could probably be sort-of modelled without the combined IPC, but would require many system calls (creating an endpoint, send with EP transfer, wait, destroy EP) and at best would be really expensive

2) the combined IPC is atomic, guaranteeing that the sender is ready to receive as soon as the send part is done. With two calls, the sender could be preempted between send and receive phase. With the forthcoming new real-time scheduling model, this is even more important, as it allows the server to run exclusively on client-provided time, which it couldn’t do if it had to do an explicit wait after the reply (it has no budget to run on after the reply completes).

I’m not convinced that you don’t run into similar problems with cap transfer if there’s only one cap allowed per message. I’m hoping that Norman will have some insight here.

Gernot
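For illustration, the combined operations in question, sketched with the non-MCS libsel4 names in current use (the 2015-era API calls them seL4_Call and seL4_ReplyWait); endpoint setup is assumed.

#include <sel4/sel4.h>

/* Client: send and receive in one atomic system call; the kernel hands the
   server a one-shot reply cap on the client's behalf. */
seL4_MessageInfo_t client_call(seL4_CPtr server_ep)
{
    return seL4_Call(server_ep, seL4_MessageInfo_new(0, 0, 0, 0));
}

/* Server: reply to the previous client and wait for the next request in one
   atomic step, so it cannot be preempted between reply and wait. */
void server_loop(seL4_CPtr ep)
{
    seL4_Word badge;
    seL4_MessageInfo_t req = seL4_Recv(ep, &badge);
    for (;;) {
        /* ... handle req ... */
        seL4_MessageInfo_t reply = seL4_MessageInfo_new(0, 0, 0, 0);
        req = seL4_ReplyRecv(ep, reply, &badge);
    }
}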
Hello,
In principle the reasonable limit is 1. If a sender can transfer 1 cap per message, it can transfer any number of caps by sending a lot of messages.
that is true. But I’m not sure that sending 3 caps in 3 messages (plus appropriate protocols) is in all circumstances semantically equivalent to sending 3 caps in one message.
An example of 1+1<2 is round-trip IPC. One might argue that providing two IPC operations (send+receive) in a single system call is unnecessary, as you could simply make two system calls. Turns out, the functionality of the combined IPC cannot be implemented with two individual IPCs, for two reasons:
1) a call-type IPC creates the one-shot reply cap which allows to provide the server with a reply channel without trusting it. This could probably be sort-of modelled without the combined IPC, but would require many system calls (creating an endpoint, send with EP transfer, wait, destroy EP) and at best would be really expensive
2) the combined IPC is atomic, guaranteeing that the sender is ready to receive as soon as the send part is done. With two calls, the sender could be preempted between send and receive phase. With the forthcoming new real-time scheduling model, this is even more important, as it allows the server to run exclusively on client-provided time, which it couldn’t do if it had to do an explicit wait after the reply (it has no budget to run on after the reply completes).
There is another fundamental problem with issuing RPC calls in a non-atomic way: Servers do not want to trust their clients. Let me elaborate this a bit by merely looking at the send phase of an RPC call (which we call "RPC request" in Genode).

If a client split an RPC request into multiple seL4 IPCs, the first IPC would possibly contain the information about the number of subsequent IPCs that belong to the same RPC request. The server would need to wait for the arrival of all parts before processing the RPC function. While the server is waiting for one of the subsequent IPCs, another client could issue an RPC request. What would the server do? There are two principal options: (1) blocking RPC requests of other clients until the completion of all parts of the current RPC request, or (2) keeping track of the state of an RPC request per client. Both options are futile.

(1) The server could stall RPC requests by other clients by performing a closed wait for IPCs coming from the initiator of the current RPC request. (Side note: as far as I know, such a closed wait is not possible on seL4.) The kernel would block all other clients that try to issue an IPC to the endpoint until the server performs an open wait the next time. Unfortunately, this approach puts the availability of the server at the whim of each single client. A misbehaving client could issue an RPC request that normally consists of two parts but deliver only the first part. The server would stay in the closed wait indefinitely and all other clients would block forever. This is unacceptable.

(2) The server could accept incoming IPCs from multiple clients but would keep a state machine for each client. The state machine would track the completion of an individual RPC request. E.g., after receiving the initial part of a three-part RPC request, it would record that two parts are still missing before the server-side RPC function can be invoked. Also, the state machine would need to keep the accumulated message content. Unfortunately, there is no way to pre-allocate the memory needed for those state machines at the server. The number of state machines needed ultimately depends on the number of concurrent RPC requests. E.g., if a capability for a server-side object was delegated to a number of different components, each component could issue RPC requests at any time. For each request, the server would require an individual state machine. It would eventually allocate the backing store for a state machine on the arrival of the initial part of an RPC request. What would happen when a misbehaving client issued the first part of a two-part RPC request in an infinite loop? Right: The server would try to allocate an unbounded number of state machines. Hence, any client could drive such a simple denial-of-service attack against the server.

Even without going into detail about the performance overhead of server-side dynamic memory allocations per RPC request, or the added complexity of the options described above, I hope it becomes clear that splitting RPCs into multiple parts is not a sensible approach.

Cheers
Norman
participants (8)

- Gernot Heiser
- Gerwin Klein
- Harry Butterworth
- Mark Jones
- Norman Feske
- Raoul Duke
- Thomas Sewell
- Tim Newsham