Why send-only IPC doesn't and shouldn't return a success indicator

newer
Possible seL4 developer day Dec...

older
quesion about TK1 Linux irq

Gernot.Heiser＠data61.csiro.au

24 Sep 2016 24 Sep '16

4:47 p.m.

We have just received this comment on github, which I think merits some explanation:

...

There are times when IPC needs to happen, but a sending process cannot wait around for another process to be receiving. seL4_NBSend is basically the system call for this, but it never provides any feedback whether or not IPC took place.

Summary: I accept that this is a valid use case, but the suggested solution is not the right one. There are two reasons why I’m saying this: 1) the current behaviour of send-only IPC is correct and cannot be changed, 2) send-only IPC is the wrong way to address this use case. Re (1): Anyone who tried to use seL4_Send() or seL4_NBSend() will likely have noticed that they are of very little use, because you don’t know whether the message was transferred. So, why can't we return at least a boolean value, indicating whether a message has been delivered or it was dropped on the floor? It’s simple: this would violate the dataflow guarantees provided by seL4. Send-only IPC is there to enable one-way information flow, and consequently can be invoked using a send-only cap. But a send-only cap must not allow any information flow from the receiver to the sender (or else all our isolation proofs would break down). Basically, a return status that can be influenced by the receiver constitutes a back channel that is not authorised by a cap, and as such is verboten. This means that send-only IPC is of very little practical use, except where you really want one-way information transfer, or you have a higher-level protocol that lets you get away with not knowing whether the message was lost. (Eg if the receiver being not ready is a protocol violation by the receiver, and it has only itself to blame for losing the message.) So, could we make send-only IPC a bit more useful by changing the semantics slightly? For example, we could have a return value that indicates successful message delivery if the send was invoked with a send+receive cap, but will always return false if the invocation used a send-only cap? In principle that would be doable without violating security requirements. The cost would be extra checking. It would also be somewhat unsystematic if the semantics of a send-only operation would depend on whether your cap has receive rights as well. It’d be arguably misleading, as you’re doing more than sending. This gets me to (2): send-only IPC is the wrong way to solve the problem. The above use case is clearly one where info flow is inherently two-way: you send something and need an acknowledgment of receipt. This means call-type IPC is the right mechanism, it does exactly that. And on top of that it is more efficient than separate send-receiver, and avoids (thanks to reply caps) the need to set up a separate reply channel. The problem, of course, is that seL4_Call() is always blocking, there is no polling variant of this syscall. Should there be an seL4_NBCall()? Arguably yes, just as the RT branch already provides a seL4 SignalRecv() syscall, which is the non-blocking equivalent of seL4_ReplyRecv(). Will it happen? Possibly, but not in mainline the near future. The reason is that to add this (fairly simple) feature to the mainline kernel will require verifying it, and our verification engineers are presently very busy delivering on contracted milestones. We’ll discuss this internally. In the meantime I’ve added some explanation of the semantics of seL4_Send()/seL4NBSend() to the FAQ on the seL4 wiki. Gernot

Show replies by date

Richard Habeeb

28 Sep 28 Sep

8:05 a.m.

So perhaps I am misunderstanding the design differences between the different IPC system calls. Let me back up and explain what I was working on. I was basically designing a kerberos-style IPC policy system. The IPC policy server is basically a trusted middle man which sets up communication between other processes. I originally thought that an EP would need to be allocated per pairwise communication, client to server. In this line of thinking, each child process had RW rights and a badged cap to communicate with the server. I am pretty new to this, and I didn't realize that seL4_Call wasn't just a function that combined seL4_Send and seL4_Recv. I see now that it grants the receiving process it's own cap to the EP, and then immediately Recvs from that EP. In this case, it seems like there only needs to one EP for the server. Clients have write and grant access, and the server gets read. Does the kernel ensure that the calling process gets the Reply? Or is a MITM possible in this scenario? Additionally, if a client does a normal send, instead of a call, the kernel just throws an exception when the server tries to reply. This sounds like it shouldn't be acceptable behavior. Finally, I am confused how a client with WG cap access is able to receive a reply, simply by using the call function. It should have no read access to the EP right? TL;DR I am testing other designs that don't use NBSend. I didn't realize that seL4_Call wasn't just a function that combined seL4_Send and seL4_Recv. Does the kernel ensure that the calling process gets the Reply? Thanks Richard Habeeb On Sat, Sep 24, 2016 at 2:48 AM <Gernot.Heiser@data61.csiro.au> wrote:

...

We have just received this comment on github, which I think merits some explanation:

...
There are times when IPC needs to happen, but a sending process cannot wait around for another process to be receiving. seL4_NBSend is basically the system call for this, but it never provides any feedback whether or not IPC took place.

Summary: I accept that this is a valid use case, but the suggested solution is not the right one.

There are two reasons why I’m saying this: 1) the current behaviour of send-only IPC is correct and cannot be changed, 2) send-only IPC is the wrong way to address this use case.

Re (1): Anyone who tried to use seL4_Send() or seL4_NBSend() will likely have noticed that they are of very little use, because you don’t know whether the message was transferred. So, why can't we return at least a boolean value, indicating whether a message has been delivered or it was dropped on the floor?

It’s simple: this would violate the dataflow guarantees provided by seL4. Send-only IPC is there to enable one-way information flow, and consequently can be invoked using a send-only cap. But a send-only cap must not allow any information flow from the receiver to the sender (or else all our isolation proofs would break down). Basically, a return status that can be influenced by the receiver constitutes a back channel that is not authorised by a cap, and as such is verboten. This means that send-only IPC is of very little practical use, except where you really want one-way information transfer, or you have a higher-level protocol that lets you get away with not knowing whether the message was lost. (Eg if the receiver being not ready is a protocol violation by the receiver, and it has only itself to blame for losing the message.)

So, could we make send-only IPC a bit more useful by changing the semantics slightly? For example, we could have a return value that indicates successful message delivery if the send was invoked with a send+receive cap, but will always return false if the invocation used a send-only cap?

In principle that would be doable without violating security requirements. The cost would be extra checking. It would also be somewhat unsystematic if the semantics of a send-only operation would depend on whether your cap has receive rights as well. It’d be arguably misleading, as you’re doing more than sending.

This gets me to (2): send-only IPC is the wrong way to solve the problem.

The above use case is clearly one where info flow is inherently two-way: you send something and need an acknowledgment of receipt. This means call-type IPC is the right mechanism, it does exactly that. And on top of that it is more efficient than separate send-receiver, and avoids (thanks to reply caps) the need to set up a separate reply channel.

The problem, of course, is that seL4_Call() is always blocking, there is no polling variant of this syscall.

Should there be an seL4_NBCall()? Arguably yes, just as the RT branch already provides a seL4 SignalRecv() syscall, which is the non-blocking equivalent of seL4_ReplyRecv(). Will it happen? Possibly, but not in mainline the near future. The reason is that to add this (fairly simple) feature to the mainline kernel will require verifying it, and our verification engineers are presently very busy delivering on contracted milestones.

We’ll discuss this internally. In the meantime I’ve added some explanation of the semantics of seL4_Send()/seL4NBSend() to the FAQ on the seL4 wiki.

Gernot _______________________________________________ Devel mailing list Devel@sel4.systems https://sel4.systems/lists/listinfo/devel

-- Richard Habeeb Research Assistant, Computer Science, USF http://habeebr.bitbucket.org/

Anna.Lyons＠data61.csiro.au

8:34 a.m.

New subject: Why send-only IPC doesn't and shouldn't return a success indicator

Hi Richard, This is a common misunderstanding: seL4_Call doesn’t generate a cap to the ep, but a cap to the thread that the seL4_Call went through (called the ‘reply cap’). seL4_Reply(Wait) sends directly to this thread via that capability, so the IPC is guaranteed to go to the caller. If you are dealing with multiple threads with interleaved messages, you can use seL4_CNode_SaveCaller to save the current reply cap, and invoking seL4_Send on the saved slot will do the reply. Cheers, Anna. From: Devel [mailto:devel-bounces@sel4.systems] On Behalf Of Richard Habeeb Sent: Wednesday, 28 September 2016 8:06 AM To: Heiser, Gernot (Data61, Kensington NSW) <Gernot.Heiser@data61.csiro.au>; devel@sel4.systems Subject: Re: [seL4] Why send-only IPC doesn't and shouldn't return a success indicator So perhaps I am misunderstanding the design differences between the different IPC system calls. Let me back up and explain what I was working on. I was basically designing a kerberos-style IPC policy system. The IPC policy server is basically a trusted middle man which sets up communication between other processes. I originally thought that an EP would need to be allocated per pairwise communication, client to server. In this line of thinking, each child process had RW rights and a badged cap to communicate with the server. I am pretty new to this, and I didn't realize that seL4_Call wasn't just a function that combined seL4_Send and seL4_Recv. I see now that it grants the receiving process it's own cap to the EP, and then immediately Recvs from that EP. In this case, it seems like there only needs to one EP for the server. Clients have write and grant access, and the server gets read. Does the kernel ensure that the calling process gets the Reply? Or is a MITM possible in this scenario? Additionally, if a client does a normal send, instead of a call, the kernel just throws an exception when the server tries to reply. This sounds like it shouldn't be acceptable behavior. Finally, I am confused how a client with WG cap access is able to receive a reply, simply by using the call function. It should have no read access to the EP right? TL;DR I am testing other designs that don't use NBSend. I didn't realize that seL4_Call wasn't just a function that combined seL4_Send and seL4_Recv. Does the kernel ensure that the calling process gets the Reply? Thanks Richard Habeeb On Sat, Sep 24, 2016 at 2:48 AM <Gernot.Heiser@data61.csiro.au<mailto:Gernot.Heiser@data61.csiro.au>> wrote: We have just received this comment on github, which I think merits some explanation:

...

There are times when IPC needs to happen, but a sending process cannot wait around for another process to be receiving. seL4_NBSend is basically the system call for this, but it never provides any feedback whether or not IPC took place.

Gernot.Heiser＠data61.csiro.au

10:09 a.m.

New subject: Why send-only IPC doesn't and shouldn't return a success indicator

On 28 Sep 2016, at 8:34 , Anna.Lyons@data61.csiro.au<mailto:Anna.Lyons@data61.csiro.au> wrote: This is a common misunderstanding: seL4_Call doesn’t generate a cap to the ep, but a cap to the thread that the seL4_Call went through (called the ‘reply cap’). seL4_Reply(Wait) sends directly to this thread via that capability, so the IPC is guaranteed to go to the caller. What Anna describes is the implementation viewpoint. The conceptual view is that seL4_Call() generates a transient (virtual) endpoint to which the invoker of Call gets a (transient, virtual) receive cap, and is made to block on a receive from that virtual EP. The destination of the Call gets the send cap to the virtual EP (which is called the Reply cap). The virtual EP vanishes once the reply cap is invoked. Interposing on an EP should not be a problem. Gernot

Richard Habeeb

29 Sep 29 Sep

1:44 a.m.

Thanks Anna and Gernot for clarifying. I think I should be able to make it work with Call/Reply. I don't really like that the calling process isn't able to verify the identity/badge of the replying process, but I realize why this is the case now. I have more questions: Is there a way on the receiving end to know if you have received a reply cap? Could the calling process somehow maliciously cause the replying process to block on reply? Thanks again, Richard On Tue, Sep 27, 2016 at 8:10 PM <Gernot.Heiser@data61.csiro.au> wrote:

...

On 28 Sep 2016, at 8:34 , Anna.Lyons@data61.csiro.au wrote:

This is a common misunderstanding: seL4_Call doesn’t generate a cap to the ep, but a cap to the thread that the seL4_Call went through (called the ‘reply cap’). seL4_Reply(Wait) sends directly to this thread via that capability, so the IPC is guaranteed to go to the caller.

What Anna describes is the implementation viewpoint.

The conceptual view is that seL4_Call() generates a transient (virtual) endpoint to which the invoker of Call gets a (transient, virtual) receive cap, and is made to block on a receive from that virtual EP. The destination of the Call gets the send cap to the virtual EP (which is called the Reply cap). The virtual EP vanishes once the reply cap is invoked.

Interposing on an EP should not be a problem.

Gernot _______________________________________________ Devel mailing list Devel@sel4.systems https://sel4.systems/lists/listinfo/devel

-- Richard Habeeb Research Assistant, Computer Science, USF http://habeebr.bitbucket.org/

Gernot.Heiser＠data61.csiro.au

9:40 a.m.

New subject: Why send-only IPC doesn't and shouldn't return a success indicator

On 29 Sep 2016, at 1:44 , Richard Habeeb <richard.habeeb@gmail.com<mailto:richard.habeeb@gmail.com>> wrote: I think I should be able to make it work with Call/Reply. I don't really like that the calling process isn't able to verify the identity/badge of the replying process, but I realize why this is the case now. Actually, the system provides the guarantees you need: The reply message can only have been sent by invoking the reply cap, so it can only come from whoever received the original message, or whoever they handed the reply cap to. That’s as much guarantee as you can hope for. In other words, Call (ep, msg, &reply) is logically roughly equivalent to: replEp = create_ep () sendCap = mint (replEp, send_only) Send (ep, {msg,sendCap}) Recv (replyEp, &reply) destroy (replyEp) except it’s atomic, no race conditions etc I have more questions: Is there a way on the receiving end to know if you have received a reply cap? Could the calling process somehow maliciously cause the replying process to block on reply? The server probably cannot be sure that it received a reply cap (someone correct me if I’m wrong here). However, it doesn’t matter, the server doesn’t care: If no reply cap was received, the reply-cap slot remains invalid. An attempt by the server to use the reply cap will simply fail, the server can’t block that way. In the normal case of the server using seL4_ReplyRecv(), the send part of the call will silently fail (message delivered to the bitbucket) and switch to the receive phase immediately. The server won’t notice the difference. Gernot

Matt Rice

12:06 p.m.

On Wed, Sep 28, 2016 at 4:40 PM, <Gernot.Heiser@data61.csiro.au> wrote:

...

On 29 Sep 2016, at 1:44 , Richard Habeeb <richard.habeeb@gmail.com> wrote:

I think I should be able to make it work with Call/Reply. I don't really like that the calling process isn't able to verify the identity/badge of the replying process, but I realize why this is the case now.

Actually, the system provides the guarantees you need: The reply message can only have been sent by invoking the reply cap, so it can only come from whoever received the original message, or whoever they handed the reply cap to. That’s as much guarantee as you can hope for.

In other words,

Call (ep, msg, &reply)

is logically roughly equivalent to:

replEp = create_ep () sendCap = mint (replEp, send_only) Send (ep, {msg,sendCap}) Recv (replyEp, &reply) destroy (replyEp)

except it’s atomic, no race conditions etc

FWIW, I finally found where this is documented in the manual in http://sel4.systems/Info/Docs/seL4-manual-latest.pdf#section.2.2 under seL4_Call, I was looking for it in all of the places for performing a reply,

...

I have more questions:

Is there a way on the receiving end to know if you have received a reply cap? Could the calling process somehow maliciously cause the replying process to block on reply?

The server probably cannot be sure that it received a reply cap (someone correct me if I’m wrong here).

I don't see how one can, seL4_CNode_SaveCaller will set the cap to a null cap and call userError() if debugging is enabled, but it doesn't cause any actual error return code afaict. perhaps if there is a public mechanism for checking whether a cap is a null cap? I haven't been able to find a mechanism for checking cap type

...

However, it doesn’t matter, the server doesn’t care: If no reply cap was received, the reply-cap slot remains invalid. An attempt by the server to use the reply cap will simply fail, the server can’t block that way. In the normal case of the server using seL4_ReplyRecv(), the send part of the call will silently fail (message delivered to the bitbucket) and switch to the receive phase immediately. The server won’t notice the difference.

3236

Age (days ago)

3241

Last active (days ago)

List overview

Download

6 comments

4 participants

participants (4)

Anna.Lyons＠data61.csiro.au
Gernot.Heiser＠data61.csiro.au
Matt Rice
Richard Habeeb