Hi Leonid,
I forgot to ask: what seL4 version are you using? Is it from the Data61 GitHub repo? If you also use CAmkES, which version is it? It would be helpful if you could isolate the 2 VMs, so we could have a smaller system to understand. Based on your description, my understanding is that you use one VMM running on core 0 to manage two VMs running on core 1 and core 2?
Is the R5 the always-on core? What does the application do?
Regards,
Yanyan
-----Original Message-----
From: Leonid Meyerovich
Sent: Friday, 4 October 2019 8:26 PM
To: Shen, Yanyan (Data61, Kensington NSW) ; devel@sel4.systems
Subject: RE: [seL4] Zynq UltraScale+ locks up after hours running
Hi Yanyan,
Yes, the VMM is running on core 0; it creates 2 VMs and runs them on core 1 and core 2. The VMs don't have any access to hardware, and they receive virtual timer and maintenance interrupts.
I am not sure I can stop running all processes on core 3. I will think about it, but in that case the whole condition will change. Also, I probably didn't mention that the R5 also runs an application, which communicates through OpenAMP with an application running on core 3.
Thanks,
Leonid
-----Original Message-----
From: Shen, Yanyan (Data61, Kensington NSW)
Sent: Friday, October 4, 2019 8:16 AM
To: Leonid Meyerovich ; devel@sel4.systems
Subject: RE: [seL4] Zynq UltraScale+ locks up after hours running
Hi Leonid,
What do you mean by "hypervisor on core 0"? Do you mean the VMM? I assume you create a VMM for each running VM, and also pin the VMMs to the corresponding physical cores? If so, core 1 and core 2 should also receive virtual timer interrupts and VGIC maintenance interrupts. Is it possible for you to stop running the processes on core 3 and just keep running the VMs on different cores? Do the VMs have any access to physical hardware, for instance, clocks or watchdogs?
Regards,
Yanyan
-----Original Message-----
From: Leonid Meyerovich
Sent: Thursday, 3 October 2019 8:45 PM
To: Shen, Yanyan (Data61, Kensington NSW) ; devel@sel4.systems
Subject: RE: [seL4] Zynq UltraScale+ locks up after hours running
Hi Yanyan,
I am running the initial task and the hypervisor on core 0.
The hypervisor creates 2 VMs and runs them on cores 1 and 2.
Core 3 runs 7 processes that communicate through notification objects and shared memory (in pairs). One process on core 3 implements a UART-based connection (this is a PL UART; Rx uses an interrupt). One of the core 3 processes also runs the SADA driver (which also uses an interrupt).
The VMs communicate with the rest of the system through 'virtual channels' - exceptions and shared memory.
All hardware interrupts are processed by core 0 (please correct me if I am wrong). But as far as I understand, the PL2 physical timer interrupt runs on every core.
Every process prints some messages to the terminal. I have never seen a message printed only partially when the system locks up; the process finishes printing its message and is never scheduled again.
I also print some messages from inside the ISR after receiving a timer interrupt (the interrupt counter, once a second), and I don't see these messages once the system locks up.
Thank you,
Leonid
-----Original Message-----
From: Shen, Yanyan (Data61, Kensington NSW)
Sent: Thursday, October 3, 2019 2:44 AM
To: Leonid Meyerovich ; devel@sel4.systems
Subject: Re: [seL4] Zynq UltraScale+ locks up after hours running
Hi Leonid,
Could you provide a bit more detail about your software configuration? For instance, do you have multiple VMs running on dedicated hardware cores?
How are the VMs and processes configured?
Also, do you mean there were no interrupts at all on any of the four cores?
Regards,
Yanyan
On Wed, 2019-10-02 at 16:01 +0000, Leonid Meyerovich wrote:
Hello,
We are running the seL4 microkernel on a 4-core Zynq UltraScale+ (zcu102
board). The implementation includes multiple processes, a hypervisor and
virtual machines running on dedicated cores. After several hours of
running (it could be 2 or even 8 hours) the whole microkernel locks up.
After some investigation I have found that no interrupts are generated
anymore - at least, no interrupts reach the ISR.
Inside the ISR I have monitored the PL2 Physical Timer Control register,
which drives the scheduler, and didn't find any problems - it stays
enabled and not masked.
I would appreciate any ideas or directions for approaching this problem.
Thank you,
Leonid
This message and all attachments are PRIVATE, and contain information
that is PROPRIETARY to Intelligent Automation, Inc. You are not
authorized to transmit or otherwise disclose this message or any
attachments to any third party whatsoever without the express written
consent of Intelligent Automation, Inc. If you received this message
in error or you are not willing to view this message or any
attachments on a confidential basis, please immediately delete this
email and any attachments and notify Intelligent Automation, Inc.
_______________________________________________
Devel mailing list
Devel@sel4.systems
https://sel4.systems/lists/listinfo/devel