Hi Indian,
-BOOT_BSS static volatile int node_boot_lock;
+static volatile int node_boot_lock;
In seL4 kernel code try_init_kernel_secondary_core every core waits on
node_boot_lock to start initialization. As you can see above node_boot_lock
has been defined as BOOT_BSS is a specific usage context of the .bss
section during the boot process. Apparently, the Morello processor
introduces some unique features and behaviors that might affect how memory
regions like the BOOT_BSS section handles cache coherence in a
multi-threaded environment. My research shows that sometimes threads don't
see node_boot_lock changes and don't start initialization. I have removed
BOOT_BSS and it is booted properly now
Thanks,
Leonid
On Fri, May 3, 2024 at 8:13 AM Indan Zupancic
Hello Leonid,
On 2024-04-25 21:37, Leonid Meyerovich wrote:
I run my seL4 application on 4 core ARMv8.
Still the new platform port I assume?
Recently I have developed some more application code (I mention it because file system size is increased an I believe it may change boot time) and from time to time I observed that boot process failed (stuck after the following print)
Does this only happen with a bigger application? And is it always stuck for a specific binary, or does the same binary sometimes work?
The application isn't started until after the kernel is done booting, so its size should not affect cores coming up or not. If it does, it may be a bug in Elfloader. If that is the case, it should either always succeed or always fail.
Some time I see that only 2 CPU have been booted (like CPU1 and CPU3, or CPU2 and CPU3) Last print is the last one in init_kernel() function
Core 0 will only finish if all other cores have done their init too.
I recommend adding some debugging to try_init_kernel_secondary_core() and init_cpu() to see what where it gets stuck.
Also be aware that debug printing may disappear if multiple cores print at exactly the same time, depending on the UART driver.
Greetings,
Indan