The server attempting to gate-keep and ensure that
there’s enough time left for the request would be quite complex and probably expensive, as
it would need a full model of its own execution. It would also be very pessimistic, given
the pessimism of worst-case execution-time (WCET) analysis on pipelined processors with
caches. Much better to make the client responsible for its own fate and kick it out if it
By “misbehave”, you mean call the server with insufficient remaining budget? If it's
infeasible for the server to know whether the remaining budget is sufficient, how is it
feasible for the client to know?
An alternative would be to assign two scheduler contexts to the network driver: the
standard 10% high-priority context and a 100% low-priority context, and run the driver
whenever either context
allows. Then average total utilization could rise to 90% (with 20% for the network driver)
to avoid throttling the bulk transfer.
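As a rough sketch of that arithmetic (the 10%/70%/20% figures are just the ones from the example above; the function is illustrative, not a scheduler model):

```python
def driver_share(hi_budget=0.10, other_load=0.70, demand=0.20):
    """Max CPU share the driver gets under the two-context scheme:
    a guaranteed high-priority budget, plus whatever slack its 100%
    low-priority context can soak up once higher-priority work has
    run, capped by the driver's actual demand."""
    slack = max(0.0, 1.0 - other_load - hi_budget)
    return min(demand, hi_budget + slack)

# With 70% other critical load and 20% driver demand, the driver gets
# its full 20%, so average total utilization rises to 90%.
share = driver_share()
total = 0.70 + share
```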
Not sure how realistic this scenario is. You seem to assume that somehow there is a
notion of packet priority on the network?
No; the network is saturated, but not entirely by the bulk transfer. The bulk transfer
can't monopolize the CPU, so it can't block other threads from sending/receiving
traffic too. I assume the critical thread's traffic is small and gets through without
a problem. The point of my example was just that demand for network-driver service
exceeded what the driver could supply with a fixed 10% CPU-utilization limit, and that
available CPU time was left unused instead of being used to satisfy the demand.
In your scenario, the driver, handling low and high
traffic, would have to be trusted and assured to high criticality.
If the high-criticality client depends on network access, then it must trust the network
driver to provide that access, but that's the case not only in my scenario, but also
in yours or any other one. But the client doesn't have to trust the network driver not
to monopolize the CPU in either scenario. So I'm not sure what distinction you're drawing.
All doable, but I don’t think I’d want this sort of
complexity and extra concurrency control in a critical system.
Suppose the network and NIC hardware are fast enough, relative to the CPU, that the
network driver can't saturate them even at 100% CPU utilization. How do you choose the
utilization to allocate to the driver? Do you add up the worst-case data rates of the
critical (i.e. non-slack) threads, figure out the CPU utilization that the driver needs to
sustain that total rate, and statically allocate accordingly?
Why give the
network driver an independent budget at all? Even if the driver needs a short period, it
still makes sense to run using the client's budget. It's pointless to avoid
dropping packets if the client is too slow to consume them anyway, which means the client
must be guaranteed suitable utilization; therefore, simply guarantee enough additional
utilization to cover the cost of the network driver's service, and when the driver
transceives data on the client's behalf, charge the client for the time.
Not that simple: the driver needs to execute on an interrupt (or are you assuming polled
I/O?). The model supports passive drivers: a driver can wait both for clients IPCing it
(with SC donation) and on a Notification (semaphore) that delivers the interrupt and also
donates a scheduling context.
I made a mistake; the network driver does need an independent budget, but only a very
small one, and my main idea still works.
Suppose client thread C and network driver D have periods Cp and Dp and budgets Cb and Db.
C authorizes D to use some part Cb_d of Cb, and the system accordingly creates a new
scheduler context C_D_sc for use by D, with period Dp and budget Cb_d*Dp/Cp (thus with the
same max CPU utilization Cb_d/Cp that C delegated, and budget normalized to D's
period). Whenever C_D_sc is invoked, the time used in it is then deducted from the budget
for the next invocation of C's scheduler context. Thus, C just chooses a limit; it
doesn't delegate a fixed amount of utilization, so it still gets to use whatever D doesn't use.
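In code form, the budget normalization is just (the concrete numbers are hypothetical, only for illustration):

```python
def delegated_sc_budget(Cp, Cb_d, Dp):
    """Budget for the new scheduler context C_D_sc: the utilization
    Cb_d/Cp that C delegated, renormalized to D's period Dp."""
    return Cb_d * Dp / Cp

# Hypothetical numbers: C has period 100 ms and delegates 10 ms of its
# budget; D's period is 5 ms, so C_D_sc gets 0.5 ms per 5 ms period.
budget = delegated_sc_budget(Cp=100.0, Cb_d=10.0, Dp=5.0)

# The delegated utilization is preserved: budget/Dp == Cb_d/Cp.
```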
When packets arrive from the network, D discovers them (while running in its main
scheduler context) either immediately due to an interrupt, or by polling after at most Dp.
Db is just long enough for D to examine the headers (to determine the destination clients)
of as many packets as can arrive during Dp. For each client, D switches to the
corresponding context C_D_sc and delivers C's packets.
When switching to C_D_sc, deduct from its budget (and thus ultimately from C's
scheduler context) the small amount of time that D consumed in its main scheduler context
on C's behalf, which is the number of packets being delivered times the per-packet
header-reading latency; this way, D effectively never independently utilizes the CPU so
long as all incoming packets have willing and able recipients. This is why I originally
said D needs no independent budget; I just forgot about the initial time needed to figure
out which client to charge. For packets that are corrupted, or destined for dead
destination clients, or for clients with expired delegated time, etc, D does independently
utilize the CPU, just enough to examine and drop the packets.
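A minimal sketch of that receive-side accounting (the cost constants and the dict-based budget bookkeeping are made up for illustration; a real driver would charge scheduler contexts, not Python dicts):

```python
HEADER_COST = 1   # hypothetical per-packet header-examination time (us)
DELIVER_COST = 4  # hypothetical per-packet delivery time (us)

def receive(packets, client_budgets):
    """For each arriving packet, D reads the header in its main SC.
    Deliverable packets are charged (header + delivery time) to the
    owning client's delegated context; D pays only to examine and
    drop packets with no willing and able recipient."""
    d_time = 0
    for dest in packets:
        cost = HEADER_COST + DELIVER_COST
        if client_budgets.get(dest, 0) >= cost:
            # On switching to C_D_sc, the header time is charged back
            # to the client along with the delivery time.
            client_budgets[dest] -= cost
        else:
            # Corrupted / dead destination / delegated time expired:
            # D independently spends just the header time and drops it.
            d_time += HEADER_COST
    return d_time
```

For example, with `client_budgets = {"C": 10}` and packets `["C", "C", "X"]`, C pays for its two packets and D pays only to examine and drop the unclaimed one.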
In contrast to data reception, data transmission has trivial accounting. C simply calls D
synchronously, which then runs as an ordinary server for that invocation; if C's
remaining budget is insufficient to send the packet, then the packet is dropped. Or C can
call asynchronously, and D can process the send in C_D_sc, since that accounting mechanism
is necessary anyway for data reception.
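The synchronous send case then reduces to a one-line budget check (again a sketch with hypothetical costs, not a real IPC path):

```python
def transmit(client_budget, packet_cost):
    """Synchronous send: D runs as an ordinary server on C's own
    scheduler context, so if C's remaining budget can't cover the
    send, the packet is simply dropped and nothing is charged."""
    if client_budget < packet_cost:
        return client_budget, False            # dropped
    return client_budget - packet_cost, True   # sent, time charged to C
```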
However, in all those scenarios you’re making the
driver high-crit, and my example was all about it being low.
Even in your example it's high-crit in the sense that regardless of when or how often
it runs, it could misbehave and drop packets that high-crit threads depend on. So I assume
you're just saying that in your example it's low-crit in the sense that it
can't monopolize the CPU. But in my scenario, it can't monopolize the CPU either.
Maybe my mistake about the budget misled you. Sorry for moving the goalposts.
I’m not sure I’m still following your scenario, and
what it has to do with your initial idea that a server should gate-keep client time.
Separate idea. The latter just brought the former to mind.