Periodically sampling without resetting is fine; it just exposes the issue you noticed, where a core that never runs anything reports incorrect utilisation. But I think the incorrect idle utilisation can easily be fixed separately, and you can keep sampling the way you are.

For a quick solution I would be happy with something close to your change; just, instead of changing the core whose idle time is sampled, add an additional field to the returned information and return both the idle utilisation of the caller's core and that of the core of the invoked thread.
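
A minimal sketch of what that could look like in benchmark_track_utilisation_dump(), assuming a new, purely hypothetical IPC buffer slot BENCHMARK_IDLE_TCBCPU_UTILISATION (the slot name is made up for illustration):

/* Hedged sketch, not an actual patch: fill two idle slots instead of
 * one. BENCHMARK_IDLE_TCBCPU_UTILISATION is a hypothetical new IPC
 * buffer index; BENCHMARK_IDLE_UTILISATION keeps its current meaning. */
buffer[BENCHMARK_TCB_UTILISATION] = tcb->benchmark.utilisation;
/* Unchanged: idle utilisation of the caller's core. */
buffer[BENCHMARK_IDLE_UTILISATION] =
    NODE_STATE(ksIdleThread)->benchmark.utilisation;
#if CONFIG_MAX_NUM_NODES > 1
/* New: idle utilisation of the core the invoked thread is affine to. */
buffer[BENCHMARK_IDLE_TCBCPU_UTILISATION] =
    NODE_STATE_ON_CORE(ksIdleThread, tcb->tcbAffinity)->benchmark.utilisation;
#else
buffer[BENCHMARK_IDLE_TCBCPU_UTILISATION] =
    NODE_STATE(ksIdleThread)->benchmark.utilisation;
#endif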

Long term we need a better way to add debug/benchmarking/profiling/etc. invocations to the kernel, as the current setup makes adding or changing invocations a bit tedious. In that better world we could easily add an invocation for requesting just the idle time of a specific core.

Adrian

On Tue 29-Aug-2017 10:41 PM, Alexander Boettcher wrote:
Hi Adrian,

On 28.08.2017 02:31, Adrian.Danis@data61.csiro.au wrote:
For the first change you mention, it probably does make more sense to return the idle time of the requested thread, although honestly there's no reason it couldn't just return both. The only reason it doesn't give all the idle times is to keep the encoding of the structures simple, given that the number of cores is arbitrary. The main thing I'm curious about with your change is how you are resetting the idle thread utilisation. Or are you just continually sampling utilisation from system start?
We just sample periodically for now: we remember the time values per
thread and the idle time values per CPU, and compute the difference to
the previous sample the next time we take one. So we start sampling once
and never reset, actually. Is this an issue?
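
For illustration, the scheme boils down to something like this (all
names here are made up, just to show the idea):

#include <stdint.h>

#define NUM_CPUS 4 /* example value */

/* Idle counters as sampled last time; the kernel-side counters grow
 * monotonically from system start, so a per-interval value is just the
 * difference between two successive samples -- no reset needed. */
static uint64_t prev_idle[NUM_CPUS];

/* cur_idle[cpu]: accumulated idle time of that CPU now;
 * interval: wall-clock length of the sampling period (same unit). */
static void sample_idle(uint64_t const *cur_idle, uint64_t interval,
                        unsigned *idle_percent)
{
    for (unsigned cpu = 0; cpu < NUM_CPUS; cpu++) {
        uint64_t delta = cur_idle[cpu] - prev_idle[cpu];
        prev_idle[cpu] = cur_idle[cpu];
        idle_percent[cpu] = interval ? (unsigned)(delta * 100 / interval)
                                     : 0;
    }
}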

It looks like the CPU idle times (one could also say the
"execution"/sleep time of the idle thread per CPU) can mainly be
requested as a "side effect" of requesting the execution time of an
arbitrary thread on a specific CPU, if I read/understood the code correctly.

The main point is that we would like to request the idle time of every
CPU from one and the same thread (in our case in the roottask) on the
boot CPU, which is currently not possible. Adding a dummy thread per CPU
that doesn't execute any actual code is odd, but works. These threads
serve as capability handles to request (with the patch) the idle times
of the remote CPUs.
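
For illustration, the usage on our side then looks roughly like this (a
hedged sketch; the libsel4 wrapper and header names are our assumptions
and error handling is omitted):

#include <sel4/sel4.h>
#include <sel4/benchmark_utilisation_types.h> /* header name assumed */

/* dummy_tcb is the TCB cap of a never-started thread whose affinity was
 * set to the remote CPU beforehand (e.g. via seL4_TCB_SetAffinity).
 * With the patch, dumping its utilisation also yields that CPU's idle
 * time. */
static uint64_t remote_idle_utilisation(seL4_CPtr dummy_tcb)
{
    seL4_BenchmarkGetThreadUtilisation(dummy_tcb);
    /* The kernel writes 64-bit values into the caller's IPC buffer. */
    return ((uint64_t *)seL4_GetIPCBuffer())[BENCHMARK_IDLE_UTILISATION];
}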

So what would be the way to go to add such a feature upstream?
(In principle I can also live with the answer that we host the patch
just on our side.)

Best,

Alex.

Regarding the idle times of unused cores, this should be easy enough; I think the boot code just needs to initialise the information properly. Currently I believe the initialisation is somewhat 'not done' under the assumption that benchmark_utilisation_switch / benchmark_track_reset_utilisation will get called.
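
A hedged sketch of what I mean (the benchmark field names here are
assumptions based on the tracking code, not a verified patch):

/* Run once at boot, per core: seed the idle threads' benchmark state so
 * idle time is accounted from the start rather than only after
 * benchmark_utilisation_switch() has been called. The
 * schedule_start_time field name is an assumption. */
for (word_t core = 0; core < CONFIG_MAX_NUM_NODES; core++) {
    tcb_t *idle = NODE_STATE_ON_CORE(ksIdleThread, core);
    idle->benchmark.utilisation = 0;
    idle->benchmark.schedule_start_time = timestamp(); /* assumed helper */
}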

Adrian

On Fri 25-Aug-2017 10:37 PM, Alexander Boettcher wrote:

Hello,

we use the seL4 benchmark interface to feed Genode's trace
infrastructure with information about the CPU utilization of threads and
of idle times.

In general the integration was reasonably straightforward.

One minor point we had: it seems that the idle time of a CPU can only be
requested by a thread running on the very same CPU. Requesting the idle
time of a remote CPU (when the calling thread is on another CPU) is not
supported, right?

In principle we could start a thread on each CPU just for the sake of
requesting the idle utilisation time, which, however, looks like overkill.

We "kind of" circumvent the issue for us by changing the syscall
handling in the kernel in that regard, that the CPU number of the
requested target thread is taken instead that of the caller thread, e.g.
like this:

--- a/src/benchmark/benchmark_utilisation.c
+++ b/src/benchmark/benchmark_utilisation.c
@@ -36,6 +36,11 @@ void benchmark_track_utilisation_dump(void)
 
     tcb = TCB_PTR(cap_thread_cap_get_capTCBPtr(lu_ret.cap));
     buffer[BENCHMARK_TCB_UTILISATION] = tcb->benchmark.utilisation; /* Requested thread utilisation */
+#if CONFIG_MAX_NUM_NODES > 1
+    if (tcb->tcbAffinity < ksNumCPUs)
+        buffer[BENCHMARK_IDLE_UTILISATION] = NODE_STATE_ON_CORE(ksIdleThread, tcb->tcbAffinity)->benchmark.utilisation; /* Idle thread utilisation */
+    else
+#endif
     buffer[BENCHMARK_IDLE_UTILISATION] = NODE_STATE(ksIdleThread)->benchmark.utilisation; /* Idle thread utilisation */
 
 #ifdef CONFIG_ARM_ENABLE_PMU_OVERFLOW_INTERRUPT

With this change we still have to create a thread on every CPU (since we
need a capability for the syscall), but at least these threads need not
be actively running.

Does this change (the patch) make sense to you, and is it worthwhile to
adopt in general on your side? Or would you advise/envision another
approach/direction?

(E.g. specifying the CPU number for the idle times directly via the
syscall, or having an explicit syscall just for CPU idle times [though
there are no specific idle-thread capabilities to pass as a syscall
parameter], etc.)
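
For the first variant, a purely hypothetical kernel-side helper
(reusing the names from the patch above) could be as small as this:

/* Purely hypothetical: look up the idle utilisation by CPU index
 * instead of via a thread capability. */
#if CONFIG_MAX_NUM_NODES > 1
static inline uint64_t idle_utilisation_of(word_t cpu)
{
    return cpu < ksNumCPUs
           ? NODE_STATE_ON_CORE(ksIdleThread, cpu)->benchmark.utilisation
           : NODE_STATE(ksIdleThread)->benchmark.utilisation;
}
#endif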


Another question:

Currently it seems that no idle CPU times are provided as long as no
user thread has actively run on that specific CPU. We can handle it, no
problem in general; however, we would have to adjust generic code (which
runs on 8 different kernels) specifically for seL4 to handle this minor
case.
Could this possibly be changed (easily)?


Thanks,

Alex.