[seL4] Re: Potential vulnerabilities

7 Nov 2024


      Wow... this requires coffee.... Thanks Indan. I'll come back when I digest
and more or less understand your very interesting explanation.

On Thursday, November 7, 2024, Indan Zupancic <indan@nul.nu> wrote:
...
Hello Hugo,
On 2024-11-07 08:09, Hugo V.C. wrote:
...
Can or can not the kernel be crashed in a exploitable way? An seL4
system halt is not a problem to me. An execution manipulation at
kernel level it would be. A halt is a halt. No problem. But seL4 is
formally verified, so, my concern is if an attacker can do more than
halting the seL4.
The bugs are normally limited to halting the kernel in both cases for all
user space setups:
- The cache invalidate causes a halt because there is no write access.
  This is a user space app calling cache maintenance syscalls on a page
  cap that is mapped read-only. Here seL4 was more permissive than the
  actual hardware. The only problem is the kernel halt, there are no other
  side effects.
- The VCPU bug only affects ARM SMP configurations (otherwise it would
have
  been found by verification). Normally the kernel has no memory mapped at
  address 0, you can check the memory map to verify if this is the case
for
  your platform. If there is no mapping for address 0, the kernel will
halt
  and there are no other side effects.
The rest of the analysis is for the highly unlikely case that you do run
an unverified kernel build on SMP ARM with HYP enabled, where an untrusted
or compromised task has a cap to a disassociated VCPU, with a kernel that
has memory mapped starting at address 0:
If for some reason it has, the NULL access itself is limited to a read of
the NULL TCB's affinity, which will be used to decide whether to send an
IPI
and to which core (offset depends on the size of arch_tcb_t, but will be
less than 4kb). If you have device memory at that address, that read may
have some side effect (although I think ARM hardware usually doesn't).
Otherwise that invalid read itself will have no side-effects and is
harmless.
If it reads a valid core affinity, then things end up harmless:
handleVCPUInjectInterruptIPI() will handle the IPI and do what was
requested,
it doesn't use or touch the NULL TCB in any way, it just does some
internal
bookkeeping.
If it reads an invalid core affinity, things get more interesting. It will
try to send an IPI to core >= NUM_CORES. This results in an out-of-bound
write
past big_kernel_lock.node_owners[] in generic_ipi_send_mask(). The write
is
limited to writing the value '1' to the third word in one 64 or 32 bytes
aligned
chunk past the big kernel lock, limited in range to about 64 or 32 * 64
bytes,
for 64-bit and 32-bit platforms respectively (less if L1 cache lines are
32 bytes).
You can check your kernel.elf binary to see what could be affected this
way if
you're interested. If you're unlucky it's within range of the kernel
stack and
the attacker can modify one stack word to the value of 1.
Assuming the attacker has no control over memory content at address
0-4kb, this
is most likely useless, as they can't choose the address, nor the value
written.
This one write is all they get, because after that the kernel will hang
forever
in ipi_wait(), without releasing the big kernel lock, so any write they
did to
kernel memory can't affect other cores either.
If the attacker has control over address range 0-4kb and all other
requirements
are met, and the attacker gets very lucky, then the VCPU bug might be
enough to
take over the system. But it's still very unlikely, as all they can write
is 1
to limited memory targets. That one write has to both escalate privileges
and
avoid the hang in ipi_wait() to be useful. They can't unlock the kernel
lock
because that requires a write of 0 to the wrong offset.
...
This kind of questions are not just for fun guys, this is what
customers will ask you every single time a non verified code bug
arises. And the infosec community needs that kind of analysis to
increase trust.
Usually customers understand their own system enough to know whether
a bug affects their system or not. An analysis like I did above is
required for your specific system before you can be assured that
known bugs don't affect your systems.
Greetings,
Indan
...
Thank you in advance 🙏
On Tuesday, November 5, 2024, Indan Zupancic <indan@nul.nu> wrote:
...
Hello Hugo,
The bugs you listed are only a problem for untrusted code. That is,
it gives
...
threads with vcpu control or vspace caps the ability to cause a
system halt,
...
effectively giving them more rights than they should have, hence it
being a
...
vulnerability. But those operations, if done in good faith, would be
user
...
space bugs.
In the case of the webserver example, if you can trigger either of
the two bugs
...
in any way, with a current kernel it will avoid the kernel crashing,
but it will
...
still mean your user space will most likely be broken, resulting in
the same end
...
user experience of a non-working system. Because as far as I know,
the webserver
...
example has no fault detection, recovery or restart mechanism
implemented.
...
Greetings,
Indan
On 2024-11-05 10:34, Hugo V.C. wrote:
...
Hi Gerwin,
thanks for the fast response! Yes, this.I already know. But I'm
interested
...
...
in this specific case:
https://docs.sel4.systems/projects/sel4webserver/
as I guess it may not vulnerable as this example (by default) does
not run
...
...
SMP seL4. For "for cache maintenance ops" I'm not sure as this qemu
environment. Any hints?
On Tuesday, November 5, 2024, Gerwin Klein <kleing@unsw.edu.au>
wrote:
...
...
...
HI Hugo,
The CHANGES.md file lists the vulnerable versions of seL4 for each
of
...
...
these (https://github.com/seL4/seL4/blob/master/CHANGES.md)
...
- for VCPU/SMP: seL4 versions 12.0.0 and 12.1.0.
- for cache maintenance ops on AArch64: all versions before 13.0.0
from
...
...
5.0.0
...
Cheers,
Gerwin
On 5 Nov 2024, at 10:02, Hugo V.C. <skydivebcn@gmail.com> wrote:
I'm forwarding this question here (tried on Mattermost Trustworthy
Systems
...
...
...
group first) hoping someone can put some light on this?
---
Hi, I'm having a look to the vulns (in areas of the kernel that
have not
...
...
...
been formally verified) patched in seL4 13.0.0.
We have:
1) "NULL pointer dereference when injecting an IRQ for a
non-associated
...
...
...
VCPU on SMP configurations." 2) "On AArch64, when seL4 runs in EL1
the
...
...
...
kernel would fault with a data abort in
seL4_ARM_Page_Invalidate_Data and
...
...
...
seL4_ARM_VSpace_Invalidate_Data when the user requested a dc ivac
cache
...
...
...
maintenance operation on a page that is not mapped writeable."
Extremely simple question: running version < 13.0.0 on top of Qemu
(in
...
...
...
example like https://docs.sel4.systems/projects/sel4webserver/)
would it
...
...
be
...
vulnerable to any of those?
---
Best,