[seL4] Re: some performance problem when test 4 cores SMP benchmark of seL4bench project Reply: Devel Digest, Vol 127, Issue 1

7 Dec 2021

      On 7 Dec 2021, at 13:43, Kent Mcleod <kent.mcleod72@gmail.com> wrote:
...
Looking at the sel4bench smp benchmark implementation, the metric is
the total number of "operations" in a single second.  An operation is
a round trip intra address space seL4_Call + seL4_ReplyRecv between 2
threads on the same core with each thread delaying for the cycle count
before performing the next operation.  After 1 second of all cores
performing these operations continuously and maintaining a core-local
(on a separate cache line) count, the total number of operations is
added together and reported as the final number. So you would expect
that the reported metric would scale following Amdahl's law based on
the proportion of an operation that is serialized inside the kernel
lock which would potentially vary across platforms.
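
A minimal sketch of the scaling Kent describes, assuming the textbook form of Amdahl's law. The serial fractions and the single-core rate below are hypothetical placeholders, not measured seL4 values, and the function names are made up for illustration.

```python
def amdahl_speedup(serial_fraction: float, cores: int) -> float:
    """Amdahl's law: speedup on `cores` cores when `serial_fraction`
    of each operation is serialized (here: spent under the kernel lock)."""
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be in [0, 1]")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / cores)


def expected_total_ops(single_core_ops_per_s: float,
                       serial_fraction: float,
                       cores: int) -> float:
    """Total operations per second summed over all cores, under this model."""
    return single_core_ops_per_s * amdahl_speedup(serial_fraction, cores)


if __name__ == "__main__":
    single_core_rate = 1_000_000.0  # hypothetical ops/s, not a measurement
    for s in (0.0, 0.1, 0.5):       # hypothetical serialized fractions
        total = expected_total_ops(single_core_rate, s, cores=4)
        print(f"serial fraction {s:.1f}: {total:,.0f} ops/s on 4 cores")
```

Under this model a longer per-operation delay lowers the serialized fraction, so the reported total should move closer to linear scaling as the delay grows.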

Thanks for the explanation, Kent.

Observations:

1) The metric is essentially independent of the delay. Looking at the single-core figures for the i.MX8, I get 1598.5 ns per operation in both cases, the difference being 15 ps. That doesn’t make sense to me.

2) Assuming this processor runs at the 1.8 GHz it seems to be specced for, this corresponds to 2877 cycles, which is huge, even if the 1000-cycle delay is subtracted!

3) As I said before, intra-AS IPC is a meaningless metric we should never use (but that’s incidental to the particular thing we want to measure here).

4) Having to do these calculations to understand the numbers is a sure indication that the results are presented in an unsuitable form.
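
The conversions behind observations 1 and 2 can be sketched as follows. This assumes the 1.8 GHz clock mentioned above and a single 1000-cycle delay per operation; the helper names are made up for illustration.

```python
def ns_per_op(ops_per_second: float) -> float:
    """Convert the benchmark's ops-per-second metric into ns per operation."""
    return 1e9 / ops_per_second


def ns_to_cycles(duration_ns: float, clock_ghz: float) -> float:
    """Convert a duration in ns to CPU cycles (1 GHz = 1 cycle per ns)."""
    return duration_ns * clock_ghz


CLOCK_GHZ = 1.8       # assumed clock rate, as in observation 2
NS_PER_OP = 1598.5    # single-core figure quoted in observation 1
DELAY_CYCLES = 1000   # configured delay; one delay per operation is assumed
DIFF_PS = 15          # difference between the two delay settings

# Cost of one operation in cycles: about 2877.
cycles_per_op = ns_to_cycles(NS_PER_OP, CLOCK_GHZ)

# Still about 1877 cycles after subtracting the delay.
cycles_without_delay = cycles_per_op - DELAY_CYCLES

# The 15 ps difference is about 0.027 cycles, far below one cycle.
diff_cycles = ns_to_cycles(DIFF_PS / 1000.0, CLOCK_GHZ)

if __name__ == "__main__":
    print(f"cycles per operation:   {cycles_per_op:.1f}")
    print(f"minus the delay:        {cycles_without_delay:.1f}")
    print(f"15 ps expressed cycles: {diff_cycles:.3f}")
```

Seen this way, changing the delay moved the per-operation time by a small fraction of a single cycle, which is why the two results look like the same measurement.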

I can’t see how these figures make sense.

Gernot

Gernot Heiser