Hello, On 15.11.18 13:20, Stefan Kalkowski wrote:
yesterday I've started to develop a low-level platform test to stress several multi-processor aspects on top of the Genode API. Thereby, I've recognized that on sel4/x86_64 the TLB is practically not always cleaned properly in SMP systems.
The test spawns threads of one protection domain on each available cpu. Each thread accesses one of the first words of a shared dataspace. Afterwards the main thread on CPU 0 detaches and destroys the dataspace. Unfortunately, the threads on other CPUs still successfully access the corresponding memory instead of triggering a fault.
To be able to reproduce the problem, you might investigate the simple test case here:
https://github.com/skalk/genode/commit/2d2d7f982555f551227ad5dcbf9df46705d7d...
My colleague Alexander Boettcher already investigated the problem in part, and might share his insights with you.
as far as I can tell, the invalidation is skipped at this line [0]. If one removes the conditional check, and ever do the invalidation call - then the test succeeds (so, the thread faults on the remote CPU as expected). The line looks like a "optimization" attempt which breaks things. I have to point out that CONFIG_SUPPORT_PCID is unset in our kernel configuration. Nevertheless, on x86 SMP setups you have ever to notify remote CPUs to flush/invalidate the TLB, which seems to be avoided at [0]. Interestingly, on ARM [1], at least it seems, that you handle things unconditional in the unmapPage code handling. I hope the information are helpful to fix the issue. Cheers, Alex. [0] https://github.com/seL4/seL4/blob/9.0.1/src/arch/x86/kernel/vspace.c#L761 [1] https://github.com/seL4/seL4/blob/9.0.1/src/arch/arm/64/kernel/vspace.c#L140... -- Alexander Boettcher Genode Labs https://www.genode-labs.com - https://www.genode.org Genode Labs GmbH - Amtsgericht Dresden - HRB 28424 - Sitz Dresden Geschäftsführer: Dr.-Ing. Norman Feske, Christian Helmuth