Hi Christian,

On 2026-01-15 10:06, Christian Bruel wrote:
It is interesting that you had to use BPIALL before ret from enable_mmu()
It's not my code, I don't know why arm_enable_mmu() does BPIALL, while arm_enable_hyp_mmu() does not.
Unfortunately I didn't find the equivalent support for the cortex-A35 AArch64, but this is not general to all ARMv8-A processors; for instance, the Cortex-A72 can control the branch target buffer with the CPUACTLR_EL1 register, which is reserved on the Cortex-A35.
Some misunderstanding here: From what you said I understood there was an equivalent instruction in AArch64. And when I looked at the TRM, I found the BPMaint bit in the 64-bit section, but missed that it's part of ID_MMFR3_EL1, "AArch32 Memory Model Feature Register 3".
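For reference, a minimal sketch of decoding that field from a raw register value. The field position (bits [11:8]) and the value meanings are from my reading of the Arm ARM, so double check them against the TRM; the helper names are made up for illustration and are not part of Elfloader. Note that the values only describe which AArch32 branch predictor operations exist, which is why the bit says nothing about AArch64.

```c
#include <stdint.h>

/* Assumption: BPMaint occupies bits [11:8] of ID_MMFR3 / ID_MMFR3_EL1. */
#define ID_MMFR3_BPMAINT_SHIFT 8u
#define ID_MMFR3_BPMAINT_MASK  0xFu

/* Hypothetical helper: extract the BPMaint field from a raw register value. */
static inline unsigned id_mmfr3_bpmaint(uint32_t id_mmfr3)
{
    return (id_mmfr3 >> ID_MMFR3_BPMAINT_SHIFT) & ID_MMFR3_BPMAINT_MASK;
}

/* Hypothetical helper: name the AArch32 operations a BPMaint value implies. */
static inline const char *bpmaint_describe(unsigned bpmaint)
{
    switch (bpmaint) {
    case 0:  return "none";            /* no branch predictor maintenance ops */
    case 1:  return "BPIALL";          /* invalidate all branch predictors */
    case 2:  return "BPIALL + BPIMVA"; /* as above, plus invalidate by VA */
    default: return "reserved";
    }
}
```

Feeding it a made-up sample value such as 0x02102211 gives a BPMaint of 2, i.e. both AArch32 operations are available.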
I am curious to see the impact of this w/a on zynqmp, if you can test it. Does it still fail sometimes?
I can't reproduce the problem when I build it myself. It seems to depend on the compiler version, or CI does something else special that I can't reproduce easily. CI was unstable for different reasons lately, so I'm waiting for it to settle down a bit before testing different potential solutions. The zynqmp problem occurs both with and without HYP, so the BPIALL shouldn't make a difference. It's probably totally unrelated, but similar in symptoms. I'll try your work-around, who knows.
Do you see the same problem when booting with one core, with unmodified code? It would help to limit this to unicore to rule out any SMP problems.
I have never seen the issue when booting without SMP. When booting with SMP I see this issue sometimes on core0, sometimes on core1.
Okay, this changes everything.

Is "Boot cpu id = " message always the same, or does it sometimes boot on core 0 and sometimes on core 1?

Did Elfloader relocate? That is, do you get this: "ELF loader relocated, continuing boot"

Have you tried flushing all instruction and data caches at the beginning of smp_boot()?

Do you get the message: "Jumping to kernel-image entry point..." Or is it skipped?

If you're booting with U-boot, try adding

dcache flush; icache flush;

before booting seL4, see if that makes a difference.
For the port, if I don't find another explanation for the issue, a proposal to implement the w/a without impacting other platforms is to specialize the arm_enable_mmu() epilogue function with a platform macro. Let's see...
More likely this is an SMP boot problem that should be fixed for everyone, otherwise other people will run into the same issue in the future for other ports. Have you tried Microkit or seL4-Rust? They have their own loaders, would be interesting to know whether they run into the same issue.

Greetings,

Indan