since there are no converse suggestions/ideas by anyone, I opened up a
pull request which fixes (according to our TLB flush test) this seL4
kernel bug on x86 in SMP setups. The patch is used with 9.0.0, however
cleanly applies to latest seL4 release (10.1.1).
On 15.11.18 14:39, Alexander Boettcher wrote:
On 15.11.18 13:20, Stefan Kalkowski wrote:
yesterday I've started to develop a
low-level platform test to stress several
multi-processor aspects on top of the Genode API. Thereby, I've
recognized that on sel4/x86_64 the TLB is practically not always
cleaned properly in SMP systems.
The test spawns threads of one protection domain on each available cpu.
Each thread accesses one of the first words of a shared dataspace.
Afterwards the main thread on CPU 0 detaches and destroys the
dataspace. Unfortunately, the threads on other CPUs still successfully
access the corresponding memory instead of triggering a fault.
To be able to reproduce the problem, you might investigate the simple
test case here:
My colleague Alexander Boettcher already investigated the problem in
part, and might share his insights with you.
as far as I can tell, the invalidation is skipped at this line .
If one removes the conditional check, and ever do the invalidation call
- then the test succeeds (so, the thread faults on the remote CPU as
The line looks like a "optimization" attempt which breaks things. I have
to point out that CONFIG_SUPPORT_PCID is unset in our kernel
configuration. Nevertheless, on x86 SMP setups you have ever to notify
remote CPUs to flush/invalidate the TLB, which seems to be avoided at .
Interestingly, on ARM , at least it seems, that you handle things
unconditional in the unmapPage code handling.
I hope the information are helpful to fix the issue.
Genode Labs GmbH - Amtsgericht Dresden - HRB 28424 - Sitz Dresden
Geschäftsführer: Dr.-Ing. Norman Feske, Christian Helmuth