[re-send 2/2]
----- Forwarded message from "G. Branden Robinson"
The difference is that QubesOS does not rely on a verified hypervisor, while the hypothetical system I'm talking about will have seL4 booting, which is a verified piece of code.
I would love that, but seL4 currently is missing at least five critical security features on x86:
1. Interrupt remapping in the IOMMU.
This is the sort of thing that gets solved by putting together a small team of engineers, some very thick manuals from the manufacturer, a contact inside the vendor to address points "accidentally" omitted or erroneously documented, and a not insignificant amount of money. A Simple Matter of Programming, in other words.
2. Protection against speculative execution attacks the moment the embargo breaks.
3. The ability to either choose the appropriate mitigations itself, or to let userspace choose them.
4. Properly implemented speculative execution mitigations.
These three seem to be what you and Gernot are arguing about, and I feel unqualified to get into the middle of that.
5. Microcode update support.
This is like (1) except it will never happen with verifiability and reliability because that would either expose, or enable the replacement of, the "secret sauce", by which the profit margin on the chip is protected. And that is why we need fully open cores subject to third-party correctness verification. I've never even seen an argument that you can combine these properties with binary blobs or trade secrets. And as long as that is true we're sure to keep getting trashed by spec ex attacks or the next great innovation in system compromise.
Without all five of these, seL4 is _less_ secure than Xen, which defeats the entire purpose of using it. As per the discussion in https://github.com/seL4/seL4/issues/1108 this will not be fixed any time soon.
I may not be completely understanding your argument, but it sounds like you're saying that with Xen/QubesOS you essentially achieve compartmentalization by partitioning one process (in the Unix sense) per processor (in the cores plus shared caches sense). Once you do this, doesn't it start to look a lot like OpenVMS? This would initially be a good way to drive the KVM-switch-in-a-laptop into the market but I think it would run into scaling problems pretty quickly.
That’s why I asked about desktop-class hardware I can actually buy: the only non-x86 desktop-class hardware I know of are Macs with Apple silicon and POWER9 machines from Raptor,
...insanely expensive (5 figures USD) last I looked... And Apple won't give up secret sauce of any kind before the extinction of our species. They still curse John Sculley's name.
and seL4 supports neither. Apple silicon has lots of nonstandard IOMMUs that seL4 would need to support, and POWER is not supported at all.
For the time being, we are simply screwed. We are the cows from which economic rents are milked; this pleases the Street. The only rescue from our plight is Joseph Schumpeter's "creative destruction", opinions of which seem to vary in a consistent way depending on one's temporal distance from an IPO.
(Fun fact: you probably have some formally verified cryptographic code installed on your system _right now_, since NSS uses fiat-crypto for some of its cryptographic primitives last I checked.)
! Share a link next time. :)
To be clear: I don’t fault seL4 for focusing on embedded right now. The reward/effort ratio is vastly more favorable there.
That, and I think the larger players in the industry perceive potential risks to the status quo. The promise of lower expenses down the road due to fewer security incidents is a really hard thing to quantify, and much decision making is tied to what can be said in the next quarterly 10-Q filing with the SEC. Yes, Heartbleed can give you heartburn big time. But sufficiently large and reliable income streams salve any scald. Quality of implementation is simply not the sort of thing anybody on the financial side of any firm cares about unless it's as dramatic as an airplane falling out of the sky. (I would have said "or life support equipment not being fit for purpose", but that appears not to be the case.[1][2])
Servers are significantly harder and I suspect desktops and laptops are the worst-case scenario, because _everything_ is dynamic and subject to the whim of a human who can and will break any simplifying assumptions one has made. And this is even before one starts talking about GPUs and other accelerators, which are a huge attack surface and yet can be the difference between a system being usable and unusable for the tasks people expect of it. There is a reason Qubes OS is going to be supporting GPU acceleration at some point.
That’s why I am so pessimistic about time protection on desktop: all of the stuff I have read about it assumes at least some static partitioning, and in the desktop world, static _anything_ simply doesn’t exist.
I think it's bad, but maybe not quite as bad as you do.

I'm going to risk coming off as naïve for a moment. Or saying a bunch of stuff you (or everyone on this list) already knows.

Getting back to CS 101 fundamentals, what does a computing system do?

A. It computes things (runs instructions).
B. It remembers things (has storage).
C. It communicates with the world (performs I/O operations).

Time division multiplexing is our solution to (A) and (C) when we have more things we want to compute and communicate than we have machine resources for. This is well-studied and reasonably well controlled. We tried cooperative multitasking; it didn't work out because we couldn't solve the halting problem (but mainly because programmers are not infallible). Fine. Limited bandwidth and measurable latencies are so pervasive that we _assume_ unreliability of all I/O (but largely hide it inside network stacks or bus protocols).

That leaves (B). So the dynamic system problem boils down to the multiplexing of memory. And I venture that is mainly a problem because when DEC added virtual memory to the VAX and the Berkeley CSRG rewrote the Unix kernel to use it (thus "vmunix"), we all fell into a trap. That trap has a name. Memory overcommit.

Your userland application can't overcommit CPU cycles. You'll get preëmpted. You can't overcommit I/O, either; something will throttle you. So we've spent a few decades being fantastically reckless with memory, especially in C and C++. (And, worst of all, JavaScript, according to my htop(1).) Running out of memory was never supposed to be a problem in real life; it was rare enough that, if you were a slouch, you used the return value of malloc/calloc/realloc without checking it, and if you weren't, you just fatal-errored out, proclaiming defeat.

I expect that the next generation will start breaking the wretched habit of assuming unbounded memory availability. Here's why. We (desktop users) drastically under-use all these parallel cores in our systems. If we were to partition memory consumption at the application level for large problems, we could more easily apply divide-and-conquer strategies to them. This is not a new insight. "MapReduce" was a buzzword 20+ years ago. As often happens, stuff that got figured out at the "enterprise" level last generation trickles down to the desktop for this one. This time maybe we can do better than J2EE.

So, as I see it, the solution to the "insanely dynamic system" problem is to stop being insane. Accept a bound on memory consumption. Stop overcommitting memory, which already causes frustrating problems.[3] Use something like Massif[4] to profile your heap usage. Ask for as much as you will need at process launch, or refactor your memory-hog functions to work within one or more smaller arenas (two rough sketches follow below). See what happens.

Yes, some stuff that never used to crash will start crashing because its authors didn't bother writing with highly parallel systems in mind, or assumed that memory was infinite. We also used to watch things crash because people treated pointers and ints as equivalent. I'm old enough to remember the DEC Alpha teaching a lot of hard lessons in this area (and also with respect to unaligned memory accesses). We learned. Now, we will slowly learn to stop asking for more than we need, and how to carve up the memory arenas we use. We also may continue our slow migration away from worse-is-better behemoths like C/C++ and Unix kernels.

Am I wrong? Could be. Point me to reading material. I try not to talk too much on this list.
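To make the "accept a bound" part concrete, here is a minimal sketch in C of what I have in mind, under my own assumptions (the 512 MiB budget, the function names, and the halve-the-request fallback are all invented for illustration): the process caps its own address space with setrlimit(RLIMIT_AS) at startup, and then treats a NULL from malloc as an ordinary, recoverable signal to shrink its working set rather than as grounds to abort.

    /* Sketch: a self-imposed memory budget plus graceful allocation failure.
     * The 512 MiB cap and the halving strategy are illustrative only. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/resource.h>

    #define MEMORY_BUDGET (512UL * 1024 * 1024)    /* decided up front */

    static void cap_address_space(void)
    {
        struct rlimit rl = { .rlim_cur = MEMORY_BUDGET, .rlim_max = MEMORY_BUDGET };
        if (setrlimit(RLIMIT_AS, &rl) != 0)
            perror("setrlimit");                   /* not fatal; we just lose the cap */
    }

    /* Ask big; on failure, halve the request instead of dying, so the
     * caller can divide and conquer within whatever it actually got. */
    static void *alloc_within_budget(size_t *len)
    {
        void *p;
        while ((p = malloc(*len)) == NULL && *len > 4096)
            *len /= 2;
        return p;
    }

    int main(void)
    {
        cap_address_space();

        size_t len = 256UL * 1024 * 1024;
        void *buf = alloc_within_budget(&len);
        if (buf == NULL) {
            fprintf(stderr, "no memory even for a small chunk; giving up\n");
            return 1;
        }
        printf("working in a %zu-byte chunk\n", len);
        /* ... process the input in pieces of 'len' bytes ... */
        free(buf);
        return 0;
    }

With a strict RLIMIT_AS in place, the kernel's overcommit heuristics more or less stop mattering to this one process: malloc fails honestly at the budget, and the program has a plan for that.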
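And the "one or more smaller arenas" part, equally rough (arena_init, arena_alloc, and the 16-byte alignment are my inventions; a real implementation would worry about growth, alignment policy, and thread safety): grab the whole region once, bump-allocate out of it, and throw it away wholesale when the task is done.

    /* Sketch: a fixed-size bump arena.  One big allocation up front,
     * cheap carving inside it, wholesale release at the end. */
    #include <stdlib.h>

    struct arena {
        unsigned char *base;
        size_t         cap;
        size_t         used;
    };

    static int arena_init(struct arena *a, size_t cap)
    {
        a->base = malloc(cap);              /* the one big request */
        a->cap  = cap;
        a->used = 0;
        return a->base ? 0 : -1;
    }

    static void *arena_alloc(struct arena *a, size_t n)
    {
        size_t off = (a->used + 15) & ~(size_t)15;   /* 16-byte alignment */
        if (off > a->cap || n > a->cap - off)
            return NULL;                    /* budget exhausted: caller must cope */
        a->used = off + n;
        return a->base + off;
    }

    static void arena_reset(struct arena *a)   { a->used = 0; }
    static void arena_destroy(struct arena *a) { free(a->base); a->base = NULL; }

Give each worker (one per core, say) its own arena, and the per-task memory bound becomes explicit; the divide-and-conquer split falls out of the data structure instead of being bolted on afterwards.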
Tonight I reckon I blew through my quota for the year, and I still had something I wanted to post from the summit... :-|

Regards,
Branden

[1] https://www.fiercebiotech.com/medtech/getinge-life-support-systems-temporari...

[2] Take note of the dates mentioned in article [1] and compare them with the five-year share price history of the manufacturer.
    https://www.google.com/search?q=getinge+share+price
    If you'd bought in when the first recall started in "late 2020" and held ever since, you would not have lost money. And in fact you could have cashed out with 100+% profit in the second half of 2021.

[3] https://lwn.net/Articles/668126/

[4] https://valgrind.org/docs/manual/ms-manual.html

----- End forwarded message -----