Hi Indan, Thanks for your insight and for sharing your experience on the zynqmp.
We have a similar issue in 32-bit mode on the zynqmp, but there it only fails sometimes. Does it consistently fail for you or just sometimes? I suspect the latter, as Cortex-A35 is mostly in-order (with some out-of-order memory accesses).
It fails consistently. With the workaround it succeeds consistently. It is interesting that you had to use BPIALL before returning from enable_mmu(). Unfortunately, I didn't find equivalent support for the Cortex-A35 in AArch64, but this is not general to all ARMv8-A processors; for instance, the Cortex-A72 can control the branch target buffer through the CPUACTLR_EL1 register, which is reserved on the Cortex-A35. I am curious to see the impact of this workaround on the zynqmp, if you can test it: does it still fail sometimes?
However, other than having unnecessarily wrong branch predictions, I don't see how clearing branch predictor state would be architecturally required. It seems more likely that the correct branch prediction causes issues that don't show up when mispredicting the branch. Do you have an aarch64 reference?
The only (incomplete) information I found is from Armv8-A Instruction Set Architecture Issue 1.1, Chapter 10: Function Calls, which explicitly states that all Cortex-A processors support branch prediction affecting the RET instruction. However, it gives no details on the different micro-architecture implementations.
Do you see the same problem when booting with one core, with unmodified code? It would help to limit this to unicore to rule out any SMP problems.
I have never seen the issue when booting without SMP. When booting with SMP, I see this issue sometimes on core0 and sometimes on core1.
Does adding BPIALL make it boot too? I guess it will because it has the same effect.
BPIALL is an AArch32 instruction, not supported on AArch64.
Or are you running some old version?
I have rebased on top of the master branch, so we are aligned. For the port, if I don't find another explanation for the issue, one proposal for implementing the workaround without impacting other platforms is to specialize the arm_enable_mmu() epilogue with a platform macro. Let's see...
Best Regards,
Christian
Greetings,
Indan