Hi Anna,

While analyzing some time-related APIs, I've noticed that the x86 RT kernel assumes the TSC frequency is constant and well represented by the measurement made at init against the PIT. It also exposes only microseconds, using the internal calibration to convert to/from ticks. This behavior concerns me: from what I can determine, an uncalibrated, non-temperature-controlled TSC can be off by up to 200 ppm. I'd rather have all the public APIs take ticks and let user space decide how its timing requirements map onto ticks.

This came up when I wanted to measure how much running time a process/thread has used. Ideally that's kept as a running count of ticks, but in the current API one can only observe microseconds consumed, converted by a potentially lossy ticksToUs (lossy when the rate isn't exactly known or the tick->us conversion isn't an exact integer multiplication). Having the tick count directly would be more useful to me.
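To make the lossiness concrete, here's a rough sketch of the round-trip error I mean. This is not the kernel's actual code; tsc_khz, the truncating conversions, and the sample values are just assumptions for illustration:

    #include <stdint.h>
    #include <stdio.h>

    /* Hypothetical calibration result: TSC frequency in kHz as measured
     * against the PIT at init. 2399987 kHz is a plausible value for a
     * nominal 2.4 GHz part; the real number comes from calibration. */
    static const uint64_t tsc_khz = 2399987;

    /* ticks -> microseconds, truncating integer division. */
    static uint64_t ticks_to_us(uint64_t ticks)
    {
        return (ticks * 1000) / tsc_khz;
    }

    /* microseconds -> ticks, also truncating. */
    static uint64_t us_to_ticks(uint64_t us)
    {
        return (us * tsc_khz) / 1000;
    }

    int main(void)
    {
        uint64_t ticks = 123456789;
        uint64_t us = ticks_to_us(ticks);
        uint64_t back = us_to_ticks(us);
        /* The round trip drops ticks even before considering how far the
         * calibrated tsc_khz is from the true rate. */
        printf("%llu ticks -> %llu us -> %llu ticks (lost %llu)\n",
               (unsigned long long)ticks, (unsigned long long)us,
               (unsigned long long)back, (unsigned long long)(ticks - back));
        return 0;
    }

With those made-up numbers the round trip loses ~1500 ticks; if user space only ever sees microseconds, that precision is gone for good.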

I'm also somewhat concerned about using the PIT to estimate the TSC rate: I can't find anything about the PIT's expected error, and any slop there will influence all subsequent kernel behavior until reboot. Potential alternatives are the ACPI PM timer and the HPET; both run at higher frequencies, which would tighten the error bounds on the rate estimate.
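As a back-of-the-envelope comparison of the quantization part of that error (the +/-1-tick assumption and the 10 ms calibration window are mine, not what the kernel does, and HPET rates vary by platform):

    #include <stdio.h>

    /* If the calibration loop mis-counts by +/-1 tick of the reference
     * timer over a 10 ms window, how many ppm of error does that put on
     * the estimated TSC rate? 14.318 MHz is a common HPET rate. */
    int main(void)
    {
        const double window_s = 0.010;
        const struct { const char *name; double hz; } refs[] = {
            { "PIT    ", 1193182.0 },
            { "ACPI PM", 3579545.0 },
            { "HPET   ", 14318180.0 },
        };
        for (size_t i = 0; i < sizeof refs / sizeof refs[0]; i++) {
            double ref_ticks = refs[i].hz * window_s;
            double ppm = 1e6 / ref_ticks;  /* +/-1 tick, in parts per million */
            printf("%s: ~%.1f ppm from +/-1 tick over %.0f ms\n",
                   refs[i].name, ppm, window_s * 1000.0);
        }
        return 0;
    }

That only bounds the quantization, of course, not whatever systematic error the reference crystal itself has, which is the part I can't find numbers for.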

Let me know what you think! If it's useful at all, I've written up some stuff about the work that inspired these thoughts: https://gitlab.com/robigalia/meta/blob/master/blog/_drafts/2016-12-23-what-time-is-it.adoc

Merry Christmas,
--
cmr
http://octayn.net/