Hello,

I realize that the only officially supported SMP platform is the i.MX6, but I did see some SMP code for the zynq7000 in the tree from 6.0 through master, so I am trying to test out some SMP functionality on my zc702 board.

I am wondering if anyone has this working, because my system fails when built from master as well as from the 6.0 and compatible tags.

 

My second core seems to be coming up but the system ultimately fails and prints out:

Bo

  ot

nKgE RaNlEl L fDinAiTsAh eAdBO, RdT!r

ppFaeud lttio nugs eirn stspraucctei o

n: 0xe001d1b0                       o

FAR: 0xfffffff8 DFSR: 0x807

halting...

Kernel entry via Syscall, number: 1, Call

Cap type: 1, Invocation tag: 37

This seems to be the interleaved output of “Booting all finished, dropped to user space” from core0 and “KERNEL DATA ABORT!” from core1.

The “0xe001d1b0” seems to be the label of the “idle_thread” function.
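
For reference, here is a minimal sketch of how the printed DFSR value can be decoded. It assumes the ARMv7-A short-descriptor DFSR layout (FS[3:0] in bits 3:0, domain in bits 7:4, FS[4] in bit 10, WnR in bit 11). The names =decode_dfsr= and =fault_status_name= are made up for this example and are not part of seL4, and the name table only covers the common MMU faults.

```c
#include <stdint.h>

/* Selected fields of the ARMv7-A DFSR, short-descriptor format (assumption). */
typedef struct {
    unsigned fault_status; /* FS[4:0]: DFSR bit 10 is FS[4], bits 3:0 are FS[3:0] */
    unsigned domain;       /* DFSR bits 7:4 */
    int      is_write;     /* WnR, DFSR bit 11: 1 = write access, 0 = read access */
} dfsr_fields_t;

/* Hypothetical helper: split a raw DFSR value into its fields. */
dfsr_fields_t decode_dfsr(uint32_t dfsr)
{
    dfsr_fields_t f;
    f.fault_status = (unsigned)((((dfsr >> 10) & 0x1u) << 4) | (dfsr & 0xFu));
    f.domain       = (unsigned)((dfsr >> 4) & 0xFu);
    f.is_write     = (int)((dfsr >> 11) & 0x1u);
    return f;
}

/* Hypothetical helper: name the most common MMU fault status codes. */
const char *fault_status_name(unsigned fs)
{
    switch (fs) {
    case 0x01: return "alignment fault";
    case 0x05: return "translation fault, section (first level)";
    case 0x07: return "translation fault, page (second level)";
    case 0x09: return "domain fault, section";
    case 0x0B: return "domain fault, page";
    case 0x0D: return "permission fault, section";
    case 0x0F: return "permission fault, page";
    default:   return "other (see the fault status table in the ARM ARM)";
    }
}
```

Under these assumptions, =decode_dfsr(0x807)= gives fault status 0x07, domain 0, and WnR set, which would make the abort a write to 0xfffffff8 that hit a second-level translation fault. This only classifies the fault; it does not say which instruction caused it.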

 

While stepping through via JTAG, I have verified that core1 gets through “init_kernel” and then enters “restore_user_context”. At some point in “restore_user_context”, the fault registers are set to the values shown in the printed output. I think it is either in the c_exit_hook in restore_user_context or after the program branches to “0xFFFF0010”, which holds “ldr pc,0xFFFF0030”. This branches to the “arm_data_abort_exception” label, which goes to the “kernel_data_fault” label and then to “kernel data abort”.
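
As a sanity check on those addresses, here is a small sketch of the ARMv7-A exception vector layout. The vector offsets and the high-vector base 0xFFFF0000 are architectural; the 0x20 distance between a vector slot and its literal word is only inferred from the two addresses quoted above, and the names =vector_address=, =vector_name_at=, and =vector_literal_address= are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* ARMv7-A exception vector offsets from the vector base. */
enum arm_vector {
    VEC_RESET          = 0x00,
    VEC_UNDEFINED      = 0x04,
    VEC_SVC            = 0x08,
    VEC_PREFETCH_ABORT = 0x0C,
    VEC_DATA_ABORT     = 0x10,
    VEC_IRQ            = 0x18,
    VEC_FIQ            = 0x1C
};

#define HIGH_VECTOR_BASE 0xFFFF0000u /* vector base when SCTLR.V = 1 */
#define LITERAL_OFFSET   0x20u       /* assumption: literal words follow the 8 vector slots */

/* Address of the vector slot for exception v. */
uint32_t vector_address(uint32_t base, enum arm_vector v)
{
    return base + (uint32_t)v;
}

/* Address of the word that an "ldr pc, <literal>" in that slot would load. */
uint32_t vector_literal_address(uint32_t base, enum arm_vector v)
{
    return vector_address(base, v) + LITERAL_OFFSET;
}

/* Name of the exception whose vector slot is at addr, or NULL if there is none. */
const char *vector_name_at(uint32_t base, uint32_t addr)
{
    switch (addr - base) {
    case VEC_RESET:          return "reset";
    case VEC_UNDEFINED:      return "undefined instruction";
    case VEC_SVC:            return "supervisor call";
    case VEC_PREFETCH_ABORT: return "prefetch abort";
    case VEC_DATA_ABORT:     return "data abort";
    case VEC_IRQ:            return "irq";
    case VEC_FIQ:            return "fiq";
    default:                 return NULL;
    }
}
```

With the high-vector base, the data abort slot comes out at 0xFFFF0010 and its literal word at 0xFFFF0030, which matches the branch I see. So by the time core1 is at that address the abort has already been taken, and the faulting access must be earlier.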

I’m having trouble pinpointing exactly where the fault occurs, but it seems to be close to there.

Has anyone had similar issues with SMP? It seems to get fairly far without setting the fault registers.

 

I have tried to step through the execution over JTAG, and here are some of my (verbose) notes:

 

   | CORE0 Address | Core0 Function              | Core0 Instruction      | CORE1 Address | Core1 Function                     | Core1 Instruction    |       DFSR |       DFAR | Note |

   |---------------+-----------------------------+------------------------+---------------+------------------------------------+----------------------+------------+------------+------|

   |    0x10000000 | label: start                | =cpsid aif=            |    0xFFFFFF34 |                                    | =mvn r0,#0x0f=       | 0x00000000 | 0x00005000 |      |

   |    0x10003A2C | call: platform_init         | =bl 0x10003DD8=        |    0xFFFFFF30 |                                    | =wfe=                | 0x00000000 | 0x00005000 |      |

   |    0x10003ACC | call: smp_boot              | =bl 0x100039FC=        |    0xFFFFFF34 |                                    | =mvn r0,#0x0f=       | 0x00000000 | 0x00005000 |      |

   |    0x10003AD0 | ret: smp_boot               | =bl 0x10005C54=        |    0x10000020 | in: non_boot_core                  | =orr r0,r0,#0x40=    | 0x00000000 | 0x00005000 |    2 |

   |    0x10003ADC | =if(is_hyp_mode())=         | =beq 0x10003AF0=       |    0x10002200 | label: arm_disable_dcaches         | =push {r14}=         | 0x00000000 | 0x00005000 |      |

   |    0x10003AFC | call: arm_enable_mmu        | =bl 0x10002174=        |    0xE0006190 | in: try_init_kernel_secondary_core | =beq 0xE0001680=     | 0x00000000 | 0x00005000 |    1 |

   |    0xE0001D70 | label: init_kernel          | =push {r11,r14}=       |    0xE0001680 | in: try_init_kernel_secondary_core | =beq 0xE0001680=     | 0x00000000 | 0x00005000 |    1 |

   |    0xE0001814 | label: try_init_kernel      | =push {r11,r14}=       |    0xE0001690 | in: try_init_kernel_secondary_core | =beq 0xE0001680=     | 0x00000000 | 0x00005000 |    1 |

   |    0xE0001B80 | call: create_initial_thread | =str r0, [r11,#-0x14]= |    0xE0001680 | in: try_init_kernel_secondary_core | =beq 0xE0001680=     | 0x00000000 | 0x00005000 |    1 |

   |    0xE0001C48 | call: SMP_COND_STATEMENT    | =bl 0xE0003C20=        |    0xE0001680 | in: try_init_kernel_secondary_core | =beq 0xE0001680=     | 0x00000000 | 0x00005000 | 1, 3 |

   |    0xE0001C4C | call: SMP_COND_STATEMENT    | =bl 0xE00017D8=        |    0xE0001680 | in: try_init_kernel_secondary_core | =beq 0xE0001680=     | 0x00000000 | 0x00005000 | 1, 4 |

   |    0xE0001C50 | NODE_LOCK_SYS               | =bl 0xE0019280=        |    0xE0019288 | in: getCurrentCPUIndex             | =sub r13,r13,#0x8=   | 0x00000000 | 0x00005000 |    5 |

   |    0xE0001D1C | call: arch_pause            | =bl 0xE0019DB0=        |    0xE0019290 | in: getCurrentCPUIndex             | =str r0,[r11,#-0x8]= | 0x00000000 | 0x00005000 |    6 |

   |    0xE0001D40 | in: clh_lock_acquire        | =uxtb r3,r3=           |    0xE00037F4 | in: init_core_state                | =pop {r4,r11,pc}=    | 0x00000000 | 0x00005000 |      |

   |    0xE0001D20 | in: clh_lock_acquire        | =mov r2,#0xE800=       |    0xE0003754 | in: init_core_state                | =movw r2,#0xE8E0=    | 0x00000000 | 0x00005000 |    7 |

   |    0xE0001D40 | in: clh_lock_acquire        | =uxtb r3,r3=           |    0xE00017C4 | in: try_init_kernel_secondary      | =mov r3,#0x1=        | 0x00000000 | 0x00005000 |    8 |

   |    0xE0001D40 | in: clh_lock_acquire        | =uxtb r3,r3=           |    0xE002A06C | in: schedule                       | =push {r11,r14}=     | 0x00000000 | 0x00005000 |    9 |

   |    0xE0001D40 | in: clh_lock_acquire        | =uxtb r3,r3=           |    0xE002979C | in: activateThread                 | =push {r11, r14}=    | 0x00000000 | 0x00005000 |   10 |

   |    0xE0001D40 | in: clh_lock_acquire        | =uxtb r3,r3=           |    0xE001D24C | label: Arch_activateIdleThread     | =push {r11}=         | 0x00000000 | 0x00005000 |   11 |

   |    0xE0001D38 | in: clh_lock_acquire        | =ldr r3,[r3,#0x4]=     |    0xE0000054 | in: start                          | =b 0xE001CEC8=       | 0x00000000 | 0x00005000 |   12 |

  

 

Notes  

    1. Core1 is in a =while (!node_boot_lock)= loop

    2. In =smp_boot=, CORE1 changes after =init_cpus= (branch location: ZSR:10003A08)

       - In =smp_boot=, =boot_cpus= is called

       - This sets the =CPU_JUMP_PTR=: =*((volatile uint32_t*)CPU_JUMP_PTR) = (uint32_t)entry;=

       - calls =dsb= (data synchronization barrier)

               - After this call, CPU1 goes to =FFFFFF2C: dsb sy=

       - And then =sev=

               - After this call, CPU1 goes to the =non_boot_core= label

               - SEV

                  - SEV causes an event to be signaled to all cores within a multiprocessor system. If SEV is implemented, WFE must also be implemented.

               - WFE

                  - If the Event Register is not set, WFE suspends execution until one of the following events occurs:

                    - an IRQ interrupt, unless masked by the CPSR I-bit

                    - an FIQ interrupt, unless masked by the CPSR F-bit

                    - an Imprecise Data abort, unless masked by the CPSR A-bit

                    - a Debug Entry request, if Debug is enabled

                    - an Event signaled by another processor using the SEV instruction.

                  - If the Event Register is set, WFE clears it and returns immediately.

                  - If WFE is implemented, SEV must also be implemented.

       - After CPU0 executes =arm_enable_mmu()= from the =main= function

       - by the end of =smp_boot= core1 is just starting =non_boot_main=

    3. The =SMP_COND_STATEMENT= is calling =clh_lock_init=

    4. The =SMP_COND_STATEMENT= is calling =release_secondary_cpus=

    5. right after Core0 returned from releasing secondary cpus

       - First time Core1 has exited the loop

       - Core1's stack is

               - =getCurrentCPUIndex=

               - =init_core_state=

               - =try_init_kernel_secondary_core=

               - =init_kernel=

    6. Core0 is in a =while(big_kernel_lock.node_owners[cpu].next->value != CLHState_Granted)= loop

       - Core0 is in a static inline function =clh_lock_acquire= in =try_init_kernel=

       - Core1 is in =getCurrentCPUIndex= but being called from =tcbDebugAppend=

               - =tcbDebugAppend= is being called from a =for= loop in =init_core_state=

    7. Core0 is still in the previously mentioned while loop

       - Core1 is in =init_core_state= and has exited the for loop that called =tcbDebugAppend(NODE_STATE_ON_CORE(ksIdleThread, i))=

    8. Core0 is still in the previously mentioned while loop

       - Core1 has returned to =try_init_kernel_secondary_core= from =init_core_state= and is at the end of the function

    9. Core0 is still in the previously mentioned while loop

       - Core1 has entered the =init_kernel= call and then the =schedule= function.

    10. Core0 is still in the previously mentioned while loop

               - Core1 has entered the =activateThread= call after =schedule= in =init_kernel=

    11. Core0 is still in the previously mentioned while loop

               - Core1 seems to have dropped into the =case ThreadState_IdleThreadState:= case when switching on =switch (thread_state_get_tsType(NODE_STATE(ksCurThread)->tcbState))=

                 - This was in =activateThread=

    12. Core0 is still in the previously mentioned while loop

               - Core1 has exited =init_kernel= and is now branching to =restore_user_context=
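
To make the spin loop from note 6 easier to reason about, here is a minimal sketch of a generic CLH queue lock using C11 atomics. It is not the seL4 implementation: the type and function names (=clh_lock_t=, =clh_enqueue=, =clh_is_granted=, and so on) are hypothetical, and the acquire step is split in two only so the waiting condition can be inspected on its own.

```c
#include <stdatomic.h>
#include <stddef.h>

#define MAX_CPUS 2

enum { CLH_GRANTED = 0, CLH_PENDING = 1 };

typedef struct {
    atomic_int value; /* CLH_PENDING while the owner holds or waits for the lock */
} clh_node_t;

typedef struct {
    clh_node_t  nodes[MAX_CPUS + 1];  /* one spare node acts as the initial tail */
    clh_node_t *my_node[MAX_CPUS];    /* node this CPU enqueues on its next acquire */
    clh_node_t *pred[MAX_CPUS];       /* predecessor node this CPU waits on */
    _Atomic(clh_node_t *) tail;       /* most recently enqueued node */
} clh_lock_t;

void clh_init(clh_lock_t *l)
{
    for (int i = 0; i < MAX_CPUS; i++) {
        atomic_init(&l->nodes[i].value, CLH_PENDING);
        l->my_node[i] = &l->nodes[i];
        l->pred[i] = NULL;
    }
    /* The spare node starts out granted, so the first acquirer does not wait. */
    atomic_init(&l->nodes[MAX_CPUS].value, CLH_GRANTED);
    atomic_init(&l->tail, &l->nodes[MAX_CPUS]);
}

/* Step 1 of acquire: mark our node pending and swap it in as the new tail. */
void clh_enqueue(clh_lock_t *l, int cpu)
{
    atomic_store_explicit(&l->my_node[cpu]->value, CLH_PENDING, memory_order_relaxed);
    l->pred[cpu] = atomic_exchange_explicit(&l->tail, l->my_node[cpu],
                                            memory_order_acq_rel);
}

/* The waiting condition: has our predecessor released the lock yet? */
int clh_is_granted(clh_lock_t *l, int cpu)
{
    return atomic_load_explicit(&l->pred[cpu]->value, memory_order_acquire)
           == CLH_GRANTED;
}

/* Step 2 of acquire: spin until the predecessor's node is granted. */
void clh_acquire(clh_lock_t *l, int cpu)
{
    clh_enqueue(l, cpu);
    while (!clh_is_granted(l, cpu)) {
        /* busy-wait; a kernel would typically issue a pause/yield hint here */
    }
}

/* Release: recycle the predecessor's node, then grant our own node. */
void clh_release(clh_lock_t *l, int cpu)
{
    clh_node_t *mine = l->my_node[cpu];
    l->my_node[cpu] = l->pred[cpu];
    atomic_store_explicit(&mine->value, CLH_GRANTED, memory_order_release);
}
```

In a lock of this shape, a CPU that keeps spinning is waiting for the node enqueued just before its own to be marked granted. If the kernel lock behaves the same way, Core0 staying in that loop would mean the predecessor node it picked up has not been released yet.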

 

 

To test everything out I am using the  “camkes-sols-master” manifest and building the “CAmkES Hello World application with events and dataports”.

The changes I made are:

·        I edited the top level CAmkES file to set the affinity for two separate cores.

·        I raised the “Max Number of CPU nodes” setting to 2.

The rest of the config is pretty standard. I have it attached to this message.

The FSBL and ps7_init script I use are the standard ones created for the zc702 from the 2017.2 version of the Xilinx XSDK.

I am booting over JTAG: I first run the ps7_init script, then flash the FSBL, and then the “capdl-loader-experimental-image-arm-zynq7000” image that was built.

 

I am wondering whether anyone is using a modified FSBL or ps7_init that does something else, whether there is a config value that I missed, or whether this is still in development. If it is still in development, I’d like to work with whoever is working on it.

 

Thanks,

Jesse Millwood