On 7 Dec 2021, at 13:43, Kent Mcleod <kent.mcleod72@gmail.com> wrote:
Looking at the sel4bench SMP benchmark implementation, the metric is the total number of "operations" completed in a single second. An operation is a round-trip, intra-address-space seL4_Call + seL4_ReplyRecv between 2 threads on the same core, with each thread delaying for the delay cycle count before performing the next operation. All cores perform these operations continuously for 1 second, each maintaining a core-local count (on a separate cache line); the per-core counts are then added together and reported as the final number. So you would expect the reported metric to scale following Amdahl's law, based on the proportion of an operation that is serialized inside the kernel lock, which could vary across platforms.
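As a rough illustration of the scaling expectation described above, here is a minimal sketch of Amdahl's law applied to the summed operation count. The function names and the serial fraction used in the example are hypothetical; the real proportion of an operation spent inside the kernel lock is not given in this thread and would have to be measured per platform.

```python
def amdahl_speedup(serial_fraction: float, cores: int) -> float:
    """Ideal speedup on `cores` cores when `serial_fraction` of each
    operation is serialized (here: the part spent holding the kernel lock)."""
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be in [0, 1]")
    if cores < 1:
        raise ValueError("cores must be >= 1")
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / cores)


def expected_total_ops(single_core_ops: float, serial_fraction: float, cores: int) -> float:
    """Expected summed operation count over all cores, assuming the
    single-core count scales by the Amdahl speedup."""
    return single_core_ops * amdahl_speedup(serial_fraction, cores)


if __name__ == "__main__":
    # Hypothetical serial fraction, chosen only to show the shape of the curve.
    assumed_serial_fraction = 0.25
    for n in (1, 2, 4):
        print(n, round(amdahl_speedup(assumed_serial_fraction, n), 3))
```

Under this model a longer delay between operations lowers the serialized proportion, so the multi-core totals would be expected to scale better as the delay grows.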
Thanks for the explanation, Kent.

Observations:

1) The metric is essentially independent of the delay. Looking at the single-core figures for the i.MX8, I get 1598.5 ns in both cases, the difference being 15 ps. That doesn’t make sense to me.

2) Assuming this processor runs at the 1.8 GHz it seems to be specced for, this corresponds to 2877 cycles, which is huge, even if the 1000-cycle delay is subtracted!

3) As I said before, intra-AS IPC is a meaningless metric that we should never use (but that’s incidental to the particular thing we want to measure here).

4) Having to do these calculations to understand the numbers is a sure indication that the results are presented in an unsuitable form.

I can’t see how these figures make sense.

Gernot
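For anyone wanting to reproduce the arithmetic in observation 2, a minimal sketch follows. The 1.8 GHz clock is the assumption stated in the message, not a measured value, and the helper name is hypothetical. The delay is subtracted once, as in the message; whether it should count once or once per thread is not settled in this thread.

```python
def ns_to_cycles(duration_ns: float, clock_ghz: float) -> float:
    """Convert a duration in nanoseconds to CPU cycles at a given clock rate.
    1 GHz is one cycle per nanosecond, so this is a plain multiplication."""
    return duration_ns * clock_ghz


per_op_ns = 1598.5    # single-core time per operation, from the message
clock_ghz = 1.8       # assumed clock rate (the value the part seems specced for)
delay_cycles = 1000   # configured delay, subtracted once as in the message

total_cycles = ns_to_cycles(per_op_ns, clock_ghz)   # about 2877 cycles
remaining_cycles = total_cycles - delay_cycles      # about 1877 cycles

if __name__ == "__main__":
    print(f"cycles per operation:   {total_cycles:.1f}")
    print(f"after subtracting delay: {remaining_cycles:.1f}")
```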