"For literally 100% of the cases I deal with, time partitioning is completely
impractical. The inability of a time-partitioned system to adapt to
workload changes means that it is not even worth considering."
I agree, Demi. Maybe the problem is trying to solve everything with the
same hardware and software. With the current kind of hardware/CPUs it is
very difficult to address both desktop (human-operated), general-purpose
software and embedded, task-specific software, since they have, by
nature, different problems to solve. Human interaction with a computer
is not deterministic, so forget about deterministic solutions... Instead,
try to solve just the most sensitive parts of the full puzzle, so you can
use dedicated software/hardware for the jobs where errors are not an
option and keep the other pieces of the puzzle on "standard"
software/hardware. Mixing everything into one big soup of software is,
nowadays, and with the poor hardware support, an impossible mission.
On Wed, Aug 9, 2023, 19:38, Demi Marie Obenour wrote:
On 8/9/23 04:47, Gernot Heiser wrote:
On 9 Aug 2023, at 06:28, Demi Marie Obenour wrote:
and full speculative taint tracking is required.
I don’t follow. If you clean all µarch state you don’t have to worry
about speculation traces, that’s (among others) the gist of Ge et al.
Does it prevent Spectre v1? A bounds check will almost always predict
as in-bounds and that is potentially a problem.
Taint tracking does prevent Spectre v1 because the speculatively read data is guaranteed to be unobservable. Strict temporal isolation also mitigates this, but IIUC it is also incompatible with load-balancing and therefore only practical in limited cases.
Spectre v1 uses speculation to put secrets into the cache, combined with a covert timing channel to move it across security domains. Without the covert channel it’s harmless. Time protection prevents the covert channel.
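For reference, the gadget under discussion is the classic Spectre v1 pattern (Kocher et al.); a minimal C sketch, with illustrative names:

#include <stddef.h>
#include <stdint.h>

/* Classic Spectre v1 gadget (illustrative names). If the branch
 * predictor guesses "in bounds", the CPU speculatively reads array1[x]
 * even for an out-of-bounds x, then performs a load whose address
 * depends on the secret byte. The architectural result is discarded,
 * but the secret-dependent cache line stays warm -- that is the covert
 * timing channel. */
uint8_t array1[16];
size_t  array1_size = 16;
uint8_t array2[256 * 512];          /* probe array, one cache line per value */

void victim(size_t x)
{
    if (x < array1_size) {          /* almost always predicted taken    */
        uint8_t secret = array1[x]; /* speculative out-of-bounds read   */
        volatile uint8_t t = array2[secret * 512]; /* secret-dep. load  */
        (void)t;
    }
}

The attacker then times accesses to array2 to learn which line was fetched; flushing µarch state on domain switches is what breaks that final step.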
You cannot use time protection in a fully dynamic system, _especially_ not a desktop system. I should have made it clear that I was referring to dynamic systems.
Speculation taint-tracking is a complex point-defence against one specific attack pattern, which has to take a pessimistic approach, enforcing what the hardware thinks *might* be security boundaries, irrespective of what the actual security policy is.
Time protection is a general, policy-free mechanism that prevents µarch timing channels under control of the OS, which can deploy it where needed.
The problem with time protection is that it is all-or-nothing. A general purpose system _cannot_ enforce time protection, because doing so requires statically allocating CPU time to different security domains. This is obviously impossible in any desktop system, because it is the human at the console who decides what needs to run and when.
As I keep saying, the seL4 mechanism that is (unfortunately, somewhat misleadingly and for purely historic reasons) called “IPC” shouldn’t be considered a general-purpose communication mechanism, but a protected procedure call – the microkernel equivalent to a Linux system call. As such, the trust relationship is not symmetric: you use it to invoke some more privileged operation (and definitely need to trust the callee to a degree).
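To make the protected-procedure-call framing concrete, here is a minimal client-side sketch using the libsel4 API; the endpoint slot and protocol label below are hypothetical:

#include <sel4/sel4.h>

#define SERVER_EP   ((seL4_CPtr)0x10) /* hypothetical endpoint capability */
#define OP_GET_TIME 1                 /* hypothetical protocol label      */

/* The client blocks in seL4_Call() until the (more privileged, trusted)
 * server replies -- the same shape as a Linux syscall, not a
 * peer-to-peer message exchange between equals. */
seL4_Word get_time(void)
{
    seL4_MessageInfo_t info = seL4_MessageInfo_new(OP_GET_TIME, 0, 0, 0);
    info = seL4_Call(SERVER_EP, info); /* invoke and wait for the reply */
    return seL4_GetMR(0);              /* result placed by the server   */
}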
I should have said “not-mutually-trusting”, then.
Makes a big difference: Enforcement then comes down to enforcing the security policy, which may or may not require temporal isolation. If it does, there’s an (unavoidable) cost. If not then not.
Or maybe the security policy requires something between “nothing” and “full temporal isolation”.
Consider a server that performs cryptographic operations. The security policy is that clients cannot access or alter data belonging to other clients and that secret keys cannot be extracted by any client. Since the cryptographic operations are constant-time there is no need for temporal isolation, _provided that speculative execution does not cause problems_. Enforcing temporal isolation would likely cause such a large performance penalty that the whole concept is not viable.
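For concreteness, "constant-time" here means the usual coding discipline of no secret-dependent branches or memory indices, as in this minimal comparison sketch:

#include <stddef.h>
#include <stdint.h>

/* Runs in time independent of the contents of a and b: no early exit on
 * mismatch, no secret-dependent addresses -- so, architecturally, no
 * timing channel. Speculative execution can still bypass this
 * discipline, which is exactly the caveat above. */
int ct_equal(const uint8_t *a, const uint8_t *b, size_t n)
{
    uint8_t diff = 0;
    for (size_t i = 0; i < n; i++)
        diff |= a[i] ^ b[i];
    return diff == 0;
}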
fence.t (or something similar) is the mechanism you need to let the OS do its job, and it is simple and cheap to implement, and costs you no more than the L1-D flush you need anyway, as Wistoff et al demonstrated.
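As a sketch of how an OS would deploy it: fence.t is a research proposal, not a ratified RISC-V extension, so the mnemonic below assumes a patched toolchain, and the surrounding context-switch code is hypothetical:

/* Hypothetical use of the experimental fence.t instruction on a switch
 * between security domains. fence.t resets on-core microarchitectural
 * state (predictors, buffers, etc.); per the discussion here, its cost
 * hides behind the L1-D flush that temporal isolation needs anyway. */
struct domain;                       /* opaque, hypothetical */

static inline void fence_t(void)
{
    __asm__ volatile("fence.t" ::: "memory");
}

void switch_security_domain(struct domain *next)
{
    fence_t();                       /* scrub µarch state first */
    /* ... usual context-switch work: switch address space, restore
     * registers of `next`, etc. (elided) ... */
    (void)next;
}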
I missed the “costs you no more than the L1-D flush you need anyway” part. On x86, instructions like IBPB can easily take thousands of cycles IIUC.
Invalidating the L1-D cache takes 1000s of cycles, and is unavoidable for temporal isolation – D-cache timing attacks are the easiest ones to do. The point is that compared to the inevitable D-cache flush, everything else is cheap, and can be done in parallel to the D-cache flush, so doesn’t affect overall latency.
I’m not sure why IBPB should take 1000s of cycles (unless Intel executes complex microcode to do it). Resetting flip-flops is cheap. What makes the D-cache expensive is the need to write back dirty data. Other µarch state can be reset without any write-back as it caches R/O information.
There is a separate issue of indirect costs, which can be significant, but not in a time-partitioned system. If the security policy requires isolating a server from its clients, then these costs would become significant, but that's inherent in the problem.
For literally 100% of the cases I deal with, time partitioning is completely impractical. The inability of a time-partitioned system to adapt to workload changes means that it is not even worth considering. Any feasible solution needs to be able to allocate 90+% of system CPU time to security domain X, and then allocate 90+% of system CPU time to security domain Y, _without knowing in advance that these changes will happen_. Time partitioning is awesome for your static embedded systems that will only ever run workloads known in advance, but for the systems I work on, it is a complete non-starter both now and in the foreseeable future.
Would fence.t have equally catastrophic overhead on an out-of-order RISC-V processor? https://riscv-europe.org/media/proceedings/posters/2023-06-08-Nils-WISTOFF-a... seems simple to implement in hardware, but does not seem efficient. https://carrv.github.io/2020/papers/CARRV2020_paper_10_Wistoff.pdf claims to be decently efficient, but is for an in-order CPU.
It is highly efficient, and completely hidden behind the D-cache flush.
Implementation on an OoO processor isn't published yet, but confirms the results obtained on the in-order CVA6.
Also, in the future, would you mind including the full URL of any articles? I don’t know what “Wistoff et al” and “Ge et al” refer to, and my mail client is configured to only display plain text (not HTML) because the attack surface of HTML rendering is absurdly high.
All papers are listed on the time-protection project page.
[Ge et al, EuroSys’19]: https://trustworthy.systems/publications/abstracts/Ge_YCH_19.abstract
[Wistoff et al, DATE’21]: https://trustworthy.systems/publications/abstracts/Wistoff_SGBH_21.abstract
Both won Best-Paper awards, btw
I’m not surprised! For the systems that _can_ use it, it is awesome.
--
Sincerely,
Demi Marie Obenour (she/her/hers)