Hi Professor Heiser,

> To understand what’s going on I’d need to know what these numbers are:
> - what is being measured, and what’s the 500/100cy parameter?
> - which web site are the “official” numbers from, they aren’t at https://sel4.systems/About/Performance/

First, I got the data for IMX8MM_EVK_64 and TX2 from https://github.com/seL4/sel4bench/actions/runs/1469475721#artifacts, in the sel4bench-results-imx8mm_evk and sel4bench-results-tx2 files. After unpacking them, I found the xxxx_SMP_64.json files.

Second, the test is the SMP benchmark from the sel4bench-manifest project. The source file is sel4bench/apps/smp/src/main.c.

The test scenario looks like this:
A ping-pong pair of threads runs on the same core. The ping thread waits for the "ipc_normal_delay" time, sends a 0-length IPC message to the pong thread, and then returns. I think the 500 cycles is how long ipc_normal_delay actually delays.
This scenario is run on one core or on multiple cores. If we run on 4 cores, every core has a ping thread and a pong thread running as described above, and the benchmark records the sum of the ping-pong counts of all cores.

I think this experiment is meant to show that, on multiple cores, the seL4 kernel big lock does not hurt multi-core performance. Am I right?
Addition: our seL4_Call performance is comparable to the other platforms.

                 XXXX      IMX8MM_EVK_64   TX2_64
seL4_Call        367(0)    378(2)          492(16)    client->server, same vspace, ipc_len is 0
seL4_ReplyRecv   396(0)    402(2)          513(16)    server->client, same vspace, ipc_len is 0

Thank you for your help.

-----Original Message-----
From: devel-request@sel4.systems [mailto:devel-request@sel4.systems]
Sent: 2 December 2021 9:00
To: devel@sel4.systems
Subject: Devel Digest, Vol 127, Issue 1

Today's Topics:

   1. Subscription (Xin Wang)
   2. some performance problem when test 4 cores SMP benchmark of seL4bench project (yadong.li)
   3. Re: some performance problem when test 4 cores SMP benchmark of seL4bench project (Gernot Heiser)

----------------------------------------------------------------------

Message: 1
Date: Wed, 1 Dec 2021 06:52:45 +0000
From: Xin Wang <xin.wang@bst.ai>
Subject: [seL4] Subscription
To: "devel@sel4.systems" <devel@sel4.systems>
Message-ID: <BL0PR18MB2146092BCA3DBADBC26984ADF7689@BL0PR18MB2146.namprd18.prod.outlook.com>
Content-Type: text/plain; charset="gb2312"

Hi sirs,

Subscription

Thanks,

------------------------------

Message: 2
Date: Wed, 1 Dec 2021 14:58:17 +0000
From: yadong.li <yadong.li@horizon.ai>
Subject: [seL4] some performance problem when test 4 cores SMP benchmark of seL4bench project
To: "devel@sel4.systems" <devel@sel4.systems>
Message-ID: <2ae9929de02e481796d3d697182842c3@horizon.ai>
Content-Type: text/plain; charset="gb2312"

Hi,

I have run into a performance problem when running the 4-core SMP benchmark of seL4bench on our platform.
Our platform is XXX. I got the test data for the IMX8MM_EVK_64 and TX2 platforms from the seL4 website, so I think they are official statistics.

My test results are below (ARM platforms, mean(stddev)):

Test item               XXX              IMX8MM_EVK_64     TX2
500 cycles, 1 core      636545(46)       625605(29)        598142(365)
500 cycles, 2 cores     897900(2327)     1154209(44)       994298(94)
500 cycles, 3 cores     1301679(2036)    1726043(65)       1497740(127)
500 cycles, 4 cores     1387678(549)     2172109(12674)    1545872(109)
1000 cycles, 1 core     636529(42)       625599(22)        597627(161)
1000 cycles, 2 cores    899212(3384)     1134110(34)       994437(541)
1000 cycles, 3 cores    1297322(5028)    1695385(45)       1497547(714)
1000 cycles, 4 cores    1387149(456)     2174605(81)       1545716(614)

From this comparison:
1. When the SMP benchmark runs on one core, the performance of the platforms is similar.
2. When it runs on multiple cores, the IMX8MM_EVK_64 scales very well: the 4-core result is 3.47 times the 1-core result, which I think is good.
3. The TX2 behaves differently. The 2-core result is 1.66 times the 1-core result, which I still think is good, but the 3-core result has almost the same ping-pong count as the 4-core result. Why does adding one more core not increase the count as we expected?
4. The performance of our platform is bad. The 3-core result also has almost the same ping-pong count as the 4-core result, and our 4-core count is only about 2 times the 1-core count. I think that is very bad.
5. I want to know the possible causes of the bad performance on our platform XXX and on the TX2.
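The scaling figures quoted in the list above can be recomputed from the table. The following is only a small sketch for checking the arithmetic: it uses the mean ping-pong counts of the 500-cycle rows, and the names `counts` and `scaling` are illustrative and are not part of sel4bench.

```python
# Mean ping-pong counts for the 500-cycle runs, copied from the table above.
# Index 0 is the 1-core run, index 3 is the 4-core run.
counts = {
    "XXX":           [636545, 897900, 1301679, 1387678],
    "IMX8MM_EVK_64": [625605, 1154209, 1726043, 2172109],
    "TX2":           [598142, 994298, 1497740, 1545872],
}


def scaling(per_core_counts):
    """Return a list of (speedup, efficiency) tuples, one per core count.

    speedup    = count with n cores / count with 1 core
    efficiency = speedup / n (1.0 would be perfect linear scaling)
    """
    base = per_core_counts[0]
    return [
        (count / base, count / base / n)
        for n, count in enumerate(per_core_counts, start=1)
    ]


for platform, values in counts.items():
    row = ", ".join(
        f"{n} cores: {speedup:.2f}x ({efficiency:.0%})"
        for n, (speedup, efficiency) in enumerate(scaling(values), start=1)
    )
    print(f"{platform:14s} {row}")
```

With these inputs the 4-core speedup comes out at about 3.47 for IMX8MM_EVK_64 and about 2.18 for XXX, and the 2-core speedup for TX2 at about 1.66. Going from 3 to 4 cores adds roughly 3% on TX2 and roughly 7% on XXX, which is the flattening described in points 3 and 4.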
------------------------------

Message: 3
Date: Wed, 1 Dec 2021 21:09:41 +0000
From: Gernot Heiser <gernot@unsw.edu.au>
Subject: [seL4] Re: some performance problem when test 4 cores SMP benchmark of seL4bench project
To: "devel@sel4.systems" <devel@sel4.systems>
Message-ID: <720E9728-1079-4455-BB0E-34A7C5CE88F4@unsw.edu.au>
Content-Type: text/plain; charset="utf-8"

Hi Yadong,

To understand what’s going on I’d need to know what these numbers are:
- what is being measured, and what’s the 500/100cy parameter?
- which web site are the “official” numbers from, they aren’t at https://sel4.systems/About/Performance/

Gernot

------------------------------

Subject: Digest Footer

_______________________________________________
Devel mailing list -- devel@sel4.systems
To unsubscribe send an email to devel-leave@sel4.systems

------------------------------

End of Devel Digest, Vol 127, Issue 1
*************************************