Hi:
> So you would expect that the reported metric would scale following Amdahl's law based on the proportion of an operation that is serialized inside the kernel lock which would potentially vary across platforms.
> It cannot simply be throughput, else the doubled delay should be reflected in a significantly reduced throughput, but it has almost no effect.
1. This is what I also feel confused, I think whether it is depend on the implementation of ipc_normal_delay
ipc_normal_delay will call below code:
delay = OVERHEAD_FIXUP(REXP(id) * current_delay_cycle, overhead);
insert a command such as 'zigset(...)'for every core, The ziggurat method for generating random variables, when double delay, the random variables coefficient will reduce the delay influence?
2. " the proportion of an operation that is serialized inside the kernel lock ", CLH lock is serialized between cores, if 4 cores will have big different performance on different ARM platform?
> For 3 or 4 cores the combined latency of two IPCs is larger than the 500cy delay and you expect lock contention, resulting in reduced scaling, while it should still scale almost perfectly with the 1000cy delay.
> > > Addition:
> > Our seL4_Call performance is same with other platform
> > XXXX IMX8MM_EVK_64 TX2_64
> > seL4_Call 367(0) 378(2) 492(16) client->server, same vspace, ipc_len is 0
> > seL4_ReplyRecv 396(0) 402(2) 513(16) server->client, same vspace, ipc_len is 0
1. the seL4_Call and seL4_ReplyRecv are measured on SMP kernel
2. or 3 or 4 cores the combined latency of two IPCs is larger than the 500cy delay and you expect lock contention ---- that's what I thought at first, But It can't be explained very well on IMX8MM_EVK_64, base on the test seL4_Call data, when run on 3 cores and 4cores, IMX8MM_EVK_64 still have a good scalability, so I think not all seL4_Call and ReplyRecv flow which cost maybe 300 or 400 is locked(like el trap), TX2 should have a good scalability as expected when run on 3 or 4 cores, but it did not show good, from data below:1497740(3core, 500cycles),1545872(4core)
> > My test results below:
> > ARM platform
> > Test item XXX IMX8MM_EVK_64 TX2
> > mean(Stddev)
> > 500 cycles, 1 core 636545(46) 625605(29) 598142(365)
> > 500 cycles, 2 cores 897900(2327) 1154209(44) 994298(94)
> > 500 cycles, 3 cores 1301679(2036) 1726043(65) 1497740(127)
> > 500 cycles, 4 cores 1387678(549) 2172109(12674) 1545872(109)
> > 1000 cycles, 1 core 636529(42) 625599(22) 597627(161)
> > 1000 cycles, 2 cores 899212(3384) 1134110(34) 994437(541)
> > 1000 cycles, 3 cores 1297322(5028) 1695385(45) 1497547(714)
> > 1000 cycles, 4 cores 1387149(456) 2174605(81) 1545716(614)
> I notice your standard deviations for 2 and 3 cores are surprisingly high (although still small in relative terms).
>
> Did you try running the same again? Are the numbers essentially the same or are multiple runs all over the shop?
1. Yes,I test several times,I always get the same results, have little different
2. I think the stddev is small relative to our test results, IMX8MM_EVK_64 also have a big stddev(500 cycles, 3core) from the data.
-----邮件原件-----
发件人: devel-request(a)sel4.systems [mailto:devel-request@sel4.systems]
发送时间: 2021年12月7日 11:45
收件人: devel(a)sel4.systems
主题: Devel Digest, Vol 127, Issue 5
Send Devel mailing list submissions to
devel(a)sel4.systems
To subscribe or unsubscribe via email, send a message with subject or body 'help' to
devel-request(a)sel4.systems
You can reach the person managing the list at
devel-owner(a)sel4.systems
When replying, please edit your Subject line so it is more specific than "Re: Contents of Devel digest..."
Today's Topics:
1. Re: Use TimeServer by Group Components Questions (Kent Mcleod)
2. Re: Incorporating seL4 into your project, risc-v architecture
(Kent Mcleod)
3. Re: some performance problem when test 4 cores SMP benchmark of seL4bench project 答复: Devel Digest, Vol 127, Issue 1
(Kent Mcleod)
4. Re: Use TimeServer by Group Components Questions
(15852538526(a)139.com)
5. Re: Use TimeServer by Group Components Questions (Kent Mcleod)
6. Re: some performance problem when test 4 cores SMP benchmark of seL4bench project 答复: Devel Digest, Vol 127, Issue 1
(Gernot Heiser)
----------------------------------------------------------------------
Message: 1
Date: Tue, 7 Dec 2021 13:03:00 +1100
From: Kent Mcleod <kent.mcleod72(a)gmail.com>
Subject: [seL4] Re: Use TimeServer by Group Components Questions
To: 15852538526(a)139.com
Cc: devel <devel(a)sel4.systems>
Message-ID:
<CA+-ozWeJgZm3B902EKBU6dUGxy6wU3Cp92jEUYwRKNFcVS=7DA(a)mail.gmail.com>
Content-Type: text/plain; charset="UTF-8"
On Fri, Dec 3, 2021 at 2:51 PM <15852538526(a)139.com> wrote:
>
> I got an problem when I use TimeServer with Group Components. I wish more than one component in the same Group Components, that can use TimeServer. While the cdl only have one ep for all components in the Group to communicate to TimeServer with an badge value that not equal to zero. And TimeServer must to distinguish requester by the badge, so only one component can receive TimeServer's response. How can I resolve this problem?
The TimeServer component cannot be used as part of a Group component.
The Group component mechanism isn't designed to transparently work for any component and the Connection types that the TimeServer uses are not compatible with the Group mechanism.
> _______________________________________________
> Devel mailing list -- devel(a)sel4.systems To unsubscribe send an email
> to devel-leave(a)sel4.systems
------------------------------
Message: 2
Date: Tue, 7 Dec 2021 13:14:30 +1100
From: Kent Mcleod <kent.mcleod72(a)gmail.com>
Subject: [seL4] Re: Incorporating seL4 into your project, risc-v
architecture
To: so_s(a)gmx.de
Cc: devel <devel(a)sel4.systems>
Message-ID:
<CA+-ozWeU7Bi-byXhsvo9jnQ_tcn90U7QRa8_hh39p2rVgUcKzA(a)mail.gmail.com>
Content-Type: text/plain; charset="UTF-8"
On Fri, Dec 3, 2021 at 8:14 AM <so_s(a)gmx.de> wrote:
>
> Dear seL4 devs and community.
>
> I try to create a simple project using the seL4 micro kernel. For setting up my environment I followed this (https://github.com/manu88/SeL4_101) and this (https://docs.sel4.systems/projects/buildsystem/incorporating.html) article. As kernel I'm using the last release of SeL4 (https://github.com/seL4/seL4/releases/tag/12.1.0). I created some kind of user application, like Manu88 did with a simple Hello World C-Program. While building with the delivered init-build.sh script I got a lot of repeating errors like this:
> CMake Error at tools/cmake-tool/helpers/simulation.cmake:179 (add_custom_command):
> Error evaluating generator expression:
>
> $<TARGET_PROPERTY:rootserver_image,KERNEL_IMAGE_NAME>
>
> Target "rootserver_image" not found.
> Call Stack (most recent call first):
> CMakeLists.txt:22 (GenerateSimulateScript) While going through
> Manu88s guide I saw that he also got this errors. But after adding a CMakeList.txt file inside his user code, and declaring it as rootserver inside the CMakeList.txt file (I just copied the contents from here: https://github.com/manu88/SeL4_101/blob/master/projects/Hello/CMakeLists.txt) he got rid of the errors. I still get them and searching through the internet could not find a solution for me. Bellow I added the folder structure of my project.
> Project/
> ├── kernel/
> ├── projects/
> │ ├── Hello/
> │ ├── musllibc/
> │ ├── utils_libs/
> │ ├── seL4_libs/
> ├── build/
> ├── tools/
> │ └── cmake-tool/
> │ └── elfloader-tool/
> ├── init-build.sh -> tools/cmake-tool/init-build.sh
> ├── CMakeLists.txt
> ├── application_settings.cmake ->
> tools/cmake-tool/helpers/application_settings.cmake
> ├── settings.cmake
>
> So I cannot tell where the issue lies. Any pointers would be appreciated.
If you are using
https://github.com/manu88/SeL4_101/blob/master/projects/Hello/CMakeLists.txt
inside projects/Hello/ then you need to make sure that the top level CMakeLists inside Project/ is either a symlink to tools/cmake-tool/default-CMakeLists.txt or is similar to https://github.com/manu88/SeL4_101/blob/master/CMakeLists.txt. With this file layout, ../init-build.sh will end up calling the following CMake invocation: `cmake -DCMAKE_TOOLCHAIN_FILE=../kernel/gcc.cmake -G Ninja -DSEL4_CACHE_DIR=../.sel4_cache -C "../settings.cmake" "../"`.
What is the contents of your settings.cmake file?
>
> Kind regards,
> Sophia
> _______________________________________________
> Devel mailing list -- devel(a)sel4.systems To unsubscribe send an email
> to devel-leave(a)sel4.systems
------------------------------
Message: 3
Date: Tue, 7 Dec 2021 13:43:48 +1100
From: Kent Mcleod <kent.mcleod72(a)gmail.com>
Subject: [seL4] Re: some performance problem when test 4 cores SMP
benchmark of seL4bench project 答复: Devel Digest, Vol 127, Issue 1
To: Gernot Heiser <gernot(a)unsw.edu.au>
Cc: "devel(a)sel4.systems" <devel(a)sel4.systems>
Message-ID:
<CA+-ozWcihN6my4WOyPbpCcfp3=rksYSnnEgZE2kc6c-BV4Kymw(a)mail.gmail.com>
Content-Type: text/plain; charset="UTF-8"
On Thu, Dec 2, 2021 at 9:28 PM Gernot Heiser <gernot(a)unsw.edu.au> wrote:
>
>
>
> > On 2 Dec 2021, at 16:17, yadong.li <yadong.li(a)horizon.ai> wrote:
> >
> > First, I got the data of IMX8MM_EVK_64 and TX2 from
> > https://github.com/seL4/sel4bench/actions/runs/1469475721#artifacts,
> > the sel4bench-results-imx8mm_evk file and sel4bench-results-tx2 file, unpack the file out, I find xxxx_SMP_64.json Secondly, the test is the smp benchmark form sel4bench-manifest project, the source file is sel4bench/apps/smp/src/main.c The test scenario look like below:
> > A pair thread of ping-pong on the same core, the ping thread will
> > wait for "ipc_normal_delay" time then send 0 len ipc message to pong
> > thread, then return. I think the 500 cycles mean how long
> > ipc_normal_delay will really delay
>
> > The above scenario will test on one core, or mutil core. If we run 4 cores, every core will have a ping thread and a pong thread run like above description, then record the sum of all cores ping-pong counts.
>
> ok, but what is the metric reported? [Apologies for not being on top
> of the details of our benchmarking setups.]
Looking at the sel4bench smp benchmark implementation, the metric is the total number of "operations" in a single second. An operation is a round trip intra address space seL4_Call + seL4_ReplyRecv between 2 threads on the same core with each thread delaying for the cycle count before performing the next operation. After 1 second of all cores performing these operations continuously and maintaining a core-local (on a separate cache line) count, the total number of operations is added together and reported as the final number. So you would expect that the reported metric would scale following Amdahl's law based on the proportion of an operation that is serialized inside the kernel lock which would potentially vary across platforms.
>
> It cannot simply be throughput, else the doubled delay should be reflected in a significantly reduced throughput, but it has almost no effect.
>
> > I think this experiment is used to illustrate in multi core, our seL4 kernel big lock will not affect mutli-core performance, am I right ?
>
> Not quite. As there’s only one big lock, only one core can execute the kernel at any time. If one core is in the IPC while another core is trying to IPC, even though both IPCs are core-local, the second will have to wait until the first gets out of the lock.
>
> As the delay is higher than the syscall latency, you’d expect perfect scalability from one core to two (with the lock essentially synchronising the threads). For 3 or 4 cores the combined latency of two IPCs is larger than the 500cy delay and you expect lock contention, resulting in reduced scaling, while it should still scale almost perfectly with the 1000cy delay. This is exactly what you see for the i.MX8 and the TX2.
>
> > Addition:
> > Our seL4_Call performance is same with other platform
> > XXXX IMX8MM_EVK_64 TX2_64
> > seL4_Call 367(0) 378(2) 492(16) client->server, same vspace, ipc_len is 0
> > seL4_ReplyRecv 396(0) 402(2) 513(16) server->client, same vspace, ipc_len is 0
>
> OK, so baseline performance is good. But are these measured on a single-core or SMP kernel (i.e. is locking included)?
>
> > My test results below:
> > ARM platform
> > Test item XXX IMX8MM_EVK_64 TX2
> > mean(Stddev)
> > 500 cycles, 1 core 636545(46) 625605(29) 598142(365)
> > 500 cycles, 2 cores 897900(2327) 1154209(44) 994298(94)
> > 500 cycles, 3 cores 1301679(2036) 1726043(65) 1497740(127)
> > 500 cycles, 4 cores 1387678(549) 2172109(12674) 1545872(109)
> > 1000 cycles, 1 core 636529(42) 625599(22) 597627(161)
> > 1000 cycles, 2 cores 899212(3384) 1134110(34) 994437(541)
> > 1000 cycles, 3 cores 1297322(5028) 1695385(45) 1497547(714)
> > 1000 cycles, 4 cores 1387149(456) 2174605(81) 1545716(614)
>
> I notice your standard deviations for 2 and 3 cores are surprisingly high (although still small in relative terms).
>
> Did you try running the same again? Are the numbers essentially the same or are multiple runs all over the shop?
>
> There are some issues with our benchmarking methodology. Fixing up sel4bench is one of the projects I’d like to do if I got a student for it, or maybe someone from the community would want to help?
>
> But just from looking at the data I’m not sure that’s the issue here.
>
> Gernot
> _______________________________________________
> Devel mailing list -- devel(a)sel4.systems To unsubscribe send an email
> to devel-leave(a)sel4.systems
------------------------------
Message: 4
Date: Tue, 07 Dec 2021 03:06:38 -0000
From: 15852538526(a)139.com
Subject: [seL4] Re: Use TimeServer by Group Components Questions
To: devel(a)sel4.systems
Message-ID:
<163884639883.1070957.2394639262660252936(a)mattermost.seL4.systems>
Content-Type: text/plain; charset="utf-8"
Thank you very much for answering my question. I still have a doubt about the TimeServer. I can manually modify the cdl file and then I get the the expected source file(capdl_spec.c) from using capdl-parse. I also need to change the return value of timeserver <IF>_notification API. I have try and I can run the program as I expected. Is there any way to create adapted cdl file automatic by modifying the ADL parse project?
I am looking forward to getting your answer again. Thank you very much.
my ADL file:
import <std_connector.camkes>;
import <global-connectors.camkes>;
import <TimeServer/TimeServer.camkes>;
component Client {
control;
uses Timer timeout;
}
assembly {
composition {
group grp {
component Client c1;
component Client c2;
}
component TimeServer time_server;
connection seL4TimeServer ts1(
from grp.c1.timeout, to time_server.the_timer);
connection seL4TimeServer ts2(
from grp.c2.timeout, to time_server.the_timer);
}
configuration {
time_server.timers_per_client = 1;
}
}
------------------------------
Message: 5
Date: Tue, 7 Dec 2021 14:23:54 +1100
From: Kent Mcleod <kent.mcleod72(a)gmail.com>
Subject: [seL4] Re: Use TimeServer by Group Components Questions
To: 15852538526(a)139.com
Cc: devel <devel(a)sel4.systems>
Message-ID:
<CA+-ozWestyqAM4wy+x359CLZLf13PzdCRBNFSfPY-3nyo9Y6wA(a)mail.gmail.com>
Content-Type: text/plain; charset="UTF-8"
On Tue, Dec 7, 2021 at 2:09 PM <15852538526(a)139.com> wrote:
>
> Thank you very much for answering my question. I still have a doubt about the TimeServer. I can manually modify the cdl file and then I get the the expected source file(capdl_spec.c) from using capdl-parse. I also need to change the return value of timeserver <IF>_notification API. I have try and I can run the program as I expected. Is there any way to create adapted cdl file automatic by modifying the ADL parse project?
There isn't a supported way to modify the generated cdl file as CAmkES expects to generate a full system specification.
What is your motivation for trying to make the timeserver a group component instead of a separate component? As part of a group component it will still have its own kernel objects for its threads, code, data and heap memories so you still have the costs of isolation but without the benefits because it still shares the virtual address space and capability space with other components in the same group.
> I am looking forward to getting your answer again. Thank you very much.
>
> my ADL file:
> import <std_connector.camkes>;
> import <global-connectors.camkes>;
> import <TimeServer/TimeServer.camkes>;
>
> component Client {
> control;
> uses Timer timeout;
> }
>
> assembly {
> composition {
> group grp {
> component Client c1;
> component Client c2;
> }
> component TimeServer time_server;
>
> connection seL4TimeServer ts1(
> from grp.c1.timeout, to time_server.the_timer);
> connection seL4TimeServer ts2(
> from grp.c2.timeout, to time_server.the_timer);
> }
>
> configuration {
> time_server.timers_per_client = 1;
> }
> }
> _______________________________________________
> Devel mailing list -- devel(a)sel4.systems To unsubscribe send an email
> to devel-leave(a)sel4.systems
------------------------------
Message: 6
Date: Tue, 7 Dec 2021 03:45:09 +0000
From: Gernot Heiser <gernot(a)unsw.edu.au>
Subject: [seL4] Re: some performance problem when test 4 cores SMP
benchmark of seL4bench project 答复: Devel Digest, Vol 127, Issue 1
To: "devel(a)sel4.systems" <devel(a)sel4.systems>
Message-ID: <17FC6BC4-3407-44AE-B7AA-4623EFF76988(a)unsw.edu.au>
Content-Type: text/plain; charset="utf-8"
On 7 Dec 2021, at 13:43, Kent Mcleod <kent.mcleod72(a)gmail.com> wrote:
>
> Looking at the sel4bench smp benchmark implementation, the metric is
> the total number of "operations" in a single second. An operation is
> a round trip intra address space seL4_Call + seL4_ReplyRecv between 2
> threads on the same core with each thread delaying for the cycle count
> before performing the next operation. After 1 second of all cores
> performing these operations continuously and maintaining a core-local
> (on a separate cache line) count, the total number of operations is
> added together and reported as the final number. So you would expect
> that the reported metric would scale following Amdahl's law based on
> the proportion of an operation that is serialized inside the kernel
> lock which would potentially vary across platforms.
Thanks for the explanation, Kent.
Observations:
1) The metric is essentially independent of the delay. Looking at the single-core figures for the i/MX8, I get 1598.5 ns in both cases, the difference being 15ps. Doesn’t make sense to me.
2) Assuming this processor runs at the 1.8GHz it seems speced for, this corresponds to 2877 cycles, which is huge, even if the 1000cy delay is subtracted!
3) As I said before, intra-AS IPC is a meaningless metric we should never use (but that’s incidental to the particular thing we want to measure here).
4) Having to do these calculations to understand the numbers is a sure indication that the results are presented in an unsuitable form.
I can’t see how these figures make sense.
Gernot
------------------------------
Subject: Digest Footer
_______________________________________________
Devel mailing list -- devel(a)sel4.systems To unsubscribe send an email to devel-leave(a)sel4.systems
------------------------------
End of Devel Digest, Vol 127, Issue 5
*************************************
Hello seL4 community,
FOSDEM 2022 will be an online event (again) with the traditional
Microkernel devroom as part of it. Thanks to Martin Decky & my colleague Sebastian Sumpf for
engagement! The original Call for Papers follows.
Best regards,
Alexander.
--
Alexander Boettcher
Genode Labs
https://www.genodians.org - https://www.genode.org
------------------------------------------------------------------------------------------------------
Microkernel and Component-based OS Devroom at FOSDEM 2022
CALL FOR PARTICIPATION
On-line version of this CfP: http://fosdem.microkernel.info/
The developers and users of several free and open source
microkernel-based and component-based operating systems will meet
on-line as part of FOSDEM 2022 [1] and will share a developer room.
The devroom is currently looking for content in the form of talks and
activities related to the area of microkernel-based, unikernel-based,
component-based operating systems and similar topics.
The devroom has been preliminarily scheduled to Saturday February 5th
2022. The devroom will take place during European day hours.
Possible topics of the devroom include, but are not limited to:
* introduction of a specific OS or framework
* design of subsystems and the general architecture of an OS
* languages, tools and toolchains used
* enabling support for hardware (architectures, device drivers) and
programming languages
* development processes, debugging
* maintenance, testing and release engineering
* safety, security and robustness
* trends and challenges
* research and open questions
* community and governance
* use cases, experiences and status updates
* best practices and lessons learned
* demos
This is a call for your participation. We kindly ask you to submit
your proposals no later than on December 28th 2021. Please use the
Pentabarf web site [2] to submit your proposals. If you already have
a user account in Pentabarf, please do not create a new one.
Addtionally, make sure to select the Microkernel and Component-based
OS Devroom as the track when submitting your talk and include at
least the following information in your proposal:
* title of your talk
* your full name
* short abstract of your talk (one or two paragraphs)
* duration of your talk (at least 20 minutes and no longer than 50
minutes)
* your short bio
* your photo
The talks will be pre-recorded in advance in January and streamed
automatically during the event. This means that you will need to
complete and submit your talk recording by January 15th 2022.
However, each speaker should be also present on-line just after the
streaming of their talk for a live Q&A. You can specify your prefered
time slot of your talk. The communication language of the devroom is
English.
The official devroom schedule (along with the accepted talks) will be
announced on December 31st 2021 on the devroom's mailing list [3] and
the speakers will be notified via e-mail. The schedule will also be
published on the FOSDEM web site.
For any comments, questions and suggestions (e.g. suggestions for a
different type of event during the devroom), please do not hesitate
to use the devroom's mailing list [3].
About the Devroom
Since the first Microkernel OS Devroom at FOSDEM 2012 [4], this
devroom has been a part of each following FOSDEM (with slight
variations of the name). The focus gradually widened to include
component-based, unikernel-based and other operating systems. By now,
it has become a somewhat institutionalized tradition for the open
source operating systems community and it is one of the few places
where microkernel enthusiasts and people with different views on how
operating systems should work meet regularly.
To this date, over a dozen projects have participated in one way or
another. Many of the projects face similar challenges but come up
with partially different solutions. Therefore, the goal of the
devroom is to bring the various projects together, let them exchange
ideas, cross-pollinate and socialize.
Extraordinary Circumstances in 2022 (Again)
Due to the extraordinary mode of organization of FOSDEM 2022 (being
an on-line event again), the devroom talks will be required to be
pre-recorded in January and streamed during the event (with a live
Q&A).
The talk videos will be published under Creative Commons license by
FOSDEM. By submitting a recorded talk, the speaker agrees to have it
made available publicly indefinitively. For organizational purposes,
we also need contact information of the speakers of accepted talks.
Please check the official CfP page [5] of the devroom regularly. We
will keep updating it as technical guidelines will be set by the
FOSDEM organizers.
Microkernel Dinner
Sadly, it will not be possible to arrange our traditional microkernel
dinner and other in-person social gatherings in Brussels this time
again. However, we plan a closing event of the devroom as a loose
approximation of the microkernel dinner. The details will be
announced later. We encourage the on-line participats to bring their
own food and beverages.
About FOSDEM
FOSDEM [6] is a two-day event organized by volunteers to promote the
widespread use of free and open source software. FOSDEM is widely
recognized as one of the best such conferences worldwide. FOSDEM
covers a wide spectrum of free and open source software and hardware
projects and offers a platform for people to collaborate.
To this end, FOSDEM has set up developer rooms (devrooms) where teams
can meet and showcase their projects. Devrooms are a place for teams
to discuss, hack and publicly present latest directions, lightning
talks, news and proposals. Besides developer rooms, FOSDEM also
offers main tracks, lightning talks, certification exams and project
stands.
In recent years, FOSDEM has been hosting more than 5000 developers
annually at the ULB Solbosch campus. Due to the circumstances, FOSDEM
2022 will be organized as an on-line event. The participation in
FOSDEM is totally free, although the organizers gratefully accept
donations and sponsorship. No registration is necessary, but
attendees are expected to follow the FOSDEM's code of conduct [7].
Important Dates Recap
* 2021-12-28: Deadline for talk proposal submission
* 2021-12-31: Schedule published and speakers notified of acceptance
* 2022-01-15: Deadline for talk pre-recoding submission
* 2022-02-05: Devroom taking place on-line
Contact
In case of any comments, questions and suggestions, please do not
hesitate to contact the devroom organizers via the devroom's mailing
list [3].
The primary organizers of the devroom in 2022 are:
* Sebastian Sumpf
* Martin Decky
Links
[1] https://fosdem.org/2022/
[2] https://penta.fosdem.org/submission/FOSDEM22/
[3] https://lists.fosdem.org/listinfo/microkernel-devroom
[4] https://archive.fosdem.org/2012/schedule/track/microkernel_os_devroom.html
[5] http://fosdem.microkernel.info/
[6] https://fosdem.org/
[7] https://fosdem.org/2022/practical/conduct/
I got an problem when I use TimeServer with Group Components. I wish more than one component in the same Group Components, that can use TimeServer. While the cdl only have one ep for all components in the Group to communicate to TimeServer with an badge value that not equal to zero. And TimeServer must to distinguish requester by the badge, so only one component can receive TimeServer's response. How can I resolve this problem?
Hi professor Heiser:
To understand what’s going on I’d need to know what these numbers are:
- what is being measured, and what’s the 500/100cy parameter?
- which web site are the “official” numbers from, they aren’t at https://sel4.systems/About/Performance/
First, I got the data of IMX8MM_EVK_64 and TX2 from https://github.com/seL4/sel4bench/actions/runs/1469475721#artifacts, the sel4bench-results-imx8mm_evk file and sel4bench-results-tx2 file, unpack the file out, I find xxxx_SMP_64.json
Secondly, the test is the smp benchmark form sel4bench-manifest project, the source file is sel4bench/apps/smp/src/main.c
The test scenario look like below:
A pair thread of ping-pong on the same core, the ping thread will wait for "ipc_normal_delay" time then send 0 len ipc message to pong thread, then return. I think the 500 cycles mean how long ipc_normal_delay will really delay
The above scenario will test on one core, or mutil core. If we run 4 cores, every core will have a ping thread and a pong thread run like above description, then record the sum of all cores ping-pong counts.
I think this experiment is used to illustrate in multi core, our seL4 kernel big lock will not affect mutli-core performance, am I right ?
Addition:
Our seL4_Call performance is same with other platform
XXXX IMX8MM_EVK_64 TX2_64
seL4_Call 367(0) 378(2) 492(16) client->server, same vspace, ipc_len is 0
seL4_ReplyRecv 396(0) 402(2) 513(16) server->client, same vspace, ipc_len is 0
Thank you for your help
-----邮件原件-----
发件人: devel-request(a)sel4.systems [mailto:devel-request@sel4.systems]
发送时间: 2021年12月2日 9:00
收件人: devel(a)sel4.systems
主题: Devel Digest, Vol 127, Issue 1
Send Devel mailing list submissions to
devel(a)sel4.systems
To subscribe or unsubscribe via email, send a message with subject or body 'help' to
devel-request(a)sel4.systems
You can reach the person managing the list at
devel-owner(a)sel4.systems
When replying, please edit your Subject line so it is more specific than "Re: Contents of Devel digest..."
Today's Topics:
1. Subscription (Xin Wang)
2. some performance problem when test 4 cores SMP benchmark of seL4bench project
(yadong.li)
3. Re: some performance problem when test 4 cores SMP benchmark of seL4bench project
(Gernot Heiser)
----------------------------------------------------------------------
Message: 1
Date: Wed, 1 Dec 2021 06:52:45 +0000
From: Xin Wang <xin.wang(a)bst.ai>
Subject: [seL4] Subscription
To: "devel(a)sel4.systems" <devel(a)sel4.systems>
Message-ID: <BL0PR18MB2146092BCA3DBADBC26984ADF7689(a)BL0PR18MB2146.nam
prd18.prod.outlook.com>
Content-Type: text/plain; charset="gb2312"
Hi sirs,
Subscription
Thanks,
从 Windows 版邮件<https://go.microsoft.com/fwlink/?LinkId=550986>发送
------------------------------
Message: 2
Date: Wed, 1 Dec 2021 14:58:17 +0000
From: yadong.li <yadong.li(a)horizon.ai>
Subject: [seL4] some performance problem when test 4 cores SMP
benchmark of seL4bench project
To: "devel(a)sel4.systems" <devel(a)sel4.systems>
Message-ID: <2ae9929de02e481796d3d697182842c3(a)horizon.ai>
Content-Type: text/plain; charset="gb2312"
Hi,
Now, I meet some performance problem when test 4 cores SMP benchmark of seL4bench on our platform.
Out platform is XXX, But I get the test data of IMX8MM_EVK_64 and TX2 platform from seL4 website, I think they are official statistics.
My test results below:
ARM platform
Test item XXX IMX8MM_EVK_64 TX2
mean(Stddev)
500 cycles, 1 core 636545(46) 625605(29) 598142(365)
500 cycles, 2 cores 897900(2327) 1154209(44) 994298(94)
500 cycles, 3 cores 1301679(2036) 1726043(65) 1497740(127)
500 cycles, 4 cores 1387678(549) 2172109(12674) 1545872(109)
1000 cycles, 1 core 636529(42) 625599(22) 597627(161)
1000 cycles, 2 cores 899212(3384) 1134110(34) 994437(541)
1000 cycles, 3 cores 1297322(5028) 1695385(45) 1497547(714)
1000 cycles, 4 cores 1387149(456) 2174605(81) 1545716(614)
From these compare data:
1. When test smp bench on one core, the performance of several platform is similar
2. When test smp bench on muti core, the result of IMX8MM_EVK_64 is beauty, the result of 4 cores is 3.47 times as good as 1 core, I think it’s good
3. But the platform of TX2 has some different performance, the result of 2 cores is 1.66 times as good as 1 core, I still think is good, But the result of 3 cores almost have the same ping-pong count with 4 cores, why add one core, the count result not add as our expected ?
4. The performance of our platform is badly, on our platform, the result of 3 cores almost also have the same ping-pong count with 4 cores, and our count result of 4 cores just 2 times as good as one core, I think it is very bad
5. I want to know what are the possible causes of the badly performance about our platform XXX and TX2 ?
------------------------------
Message: 3
Date: Wed, 1 Dec 2021 21:09:41 +0000
From: Gernot Heiser <gernot(a)unsw.edu.au>
Subject: [seL4] Re: some performance problem when test 4 cores SMP
benchmark of seL4bench project
To: "devel(a)sel4.systems" <devel(a)sel4.systems>
Message-ID: <720E9728-1079-4455-BB0E-34A7C5CE88F4(a)unsw.edu.au>
Content-Type: text/plain; charset="utf-8"
Hi Yandong,
To understand what’s going on I’d need to know what these numbers are:
- what is being measured, and what’s the 500/100cy parameter?
- which web site are the “official” numbers from, they aren’t at https://sel4.systems/About/Performance/
Gernot
On 2 Dec 2021, at 01:58, yadong.li<http://yadong.li/> <yadong.li(a)horizon.ai<mailto:yadong.li@horizon.ai>> wrote:
Hi,
Now, I meet some performance problem when test 4 cores SMP benchmark of seL4bench on our platform.
Out platform is XXX, But I get the test data of IMX8MM_EVK_64 and TX2 platform from seL4 website, I think they are official statistics.
My test results below:
ARM platform
Test item XXX IMX8MM_EVK_64 TX2
mean(Stddev)
500 cycles, 1 core 636545(46) 625605(29) 598142(365)
500 cycles, 2 cores 897900(2327) 1154209(44) 994298(94)
500 cycles, 3 cores 1301679(2036) 1726043(65) 1497740(127)
500 cycles, 4 cores 1387678(549) 2172109(12674) 1545872(109)
1000 cycles, 1 core 636529(42) 625599(22) 597627(161)
1000 cycles, 2 cores 899212(3384) 1134110(34) 994437(541)
1000 cycles, 3 cores 1297322(5028) 1695385(45) 1497547(714)
1000 cycles, 4 cores 1387149(456) 2174605(81) 1545716(614)
From these compare data:
1. When test smp bench on one core, the performance of several platform is similar 2. When test smp bench on muti core, the result of IMX8MM_EVK_64 is beauty, the result of 4 cores is 3.47 times as good as 1 core, I think it’s good 3. But the platform of TX2 has some different performance, the result of 2 cores is 1.66 times as good as 1 core, I still think is good, But the result of 3 cores almost have the same ping-pong count with 4 cores, why add one core, the count result not add as our expected ?
4. The performance of our platform is badly, on our platform, the result of 3 cores almost also have the same ping-pong count with 4 cores, and our count result of 4 cores just 2 times as good as one core, I think it is very bad 5. I want to know what are the possible causes of the badly performance about our platform XXX and TX2 ?
_______________________________________________
Devel mailing list -- devel(a)sel4.systems<mailto:devel@sel4.systems>
To unsubscribe send an email to devel-leave(a)sel4.systems<mailto:devel-leave@sel4.systems>
------------------------------
Subject: Digest Footer
_______________________________________________
Devel mailing list -- devel(a)sel4.systems To unsubscribe send an email to devel-leave(a)sel4.systems
------------------------------
End of Devel Digest, Vol 127, Issue 1
*************************************
Hi,
Now, I meet some performance problem when test 4 cores SMP benchmark of seL4bench on our platform.
Out platform is XXX, But I get the test data of IMX8MM_EVK_64 and TX2 platform from seL4 website, I think they are official statistics.
My test results below:
ARM platform
Test item XXX IMX8MM_EVK_64 TX2
mean(Stddev)
500 cycles, 1 core 636545(46) 625605(29) 598142(365)
500 cycles, 2 cores 897900(2327) 1154209(44) 994298(94)
500 cycles, 3 cores 1301679(2036) 1726043(65) 1497740(127)
500 cycles, 4 cores 1387678(549) 2172109(12674) 1545872(109)
1000 cycles, 1 core 636529(42) 625599(22) 597627(161)
1000 cycles, 2 cores 899212(3384) 1134110(34) 994437(541)
1000 cycles, 3 cores 1297322(5028) 1695385(45) 1497547(714)
1000 cycles, 4 cores 1387149(456) 2174605(81) 1545716(614)
From these compare data:
1. When test smp bench on one core, the performance of several platform is similar
2. When test smp bench on muti core, the result of IMX8MM_EVK_64 is beauty, the result of 4 cores is 3.47 times as good as 1 core, I think it’s good
3. But the platform of TX2 has some different performance, the result of 2 cores is 1.66 times as good as 1 core, I still think is good, But the result of 3 cores almost have the same ping-pong count with 4 cores, why add one core, the count result not add as our expected ?
4. The performance of our platform is badly, on our platform, the result of 3 cores almost also have the same ping-pong count with 4 cores, and our count result of 4 cores just 2 times as good as one core, I think it is very bad
5. I want to know what are the possible causes of the badly performance about our platform XXX and TX2 ?