Hi Mark,
On 24 Jun 2021, at 04:23, Mark Jones
wrote: On the verification side, I've long been interested in the question of how we protect against covert channels resulting from hardware state components that are undocumented, or that don't work as intended, or that are not captured in a formal model. [There was a news item in the last few weeks about a (seemingly harmless) flaw of this kind in the Apple M1 processor, which is probably why this was on my mind again.]
That is indeed an interesting question and it is hard to achieve. One way is doing vulnerability research to figure out where things go wrong on the hardware side and what that space even really looks like. Yuval has found many interesting side channels that way, e.g. Spectre and Meltdown are among those. Essentially this is validating the model and investigating where it breaks. The pure formal verification answer is then always a more detailed model that does have that hidden state. This works for some things (e.g. we found a missing register in one of our models that way back in the day), but it's not really feasible if there is no documentation from the hardware side or the hardware misbehaves. The other way is figuring out what we can do without full knowledge of the hardware. We have started work on that and have a few ideas that we're formalising. The end result of that if everything works out should be closing large classes of timing side channels *without* knowing the exact hardware details. The idea is that we need enough documentation to determine if an instruction *might* change some unspecified deeper hardware state (which might affect timing), and we need enough instructions/mechanisms to either partition that hardware state (as with cache colouring) or reset that hardware state (as with cache flushing). If you have both, and have a precise enough measure of time, you can build OS mechanisms that close the channels that result from that hardware state. (More in that link that Gernot sent) This would solve a good chunk of the problem -- there are more channels, such as power etc, but one thing at a time ;-) What that will still not solve is completely undocumented registers and hardware state, which I'm not sure you can do much about if you're working purely in software. It's not like these don't exist, there is a reason complex chips like modern x86 implementations have a hard time for security evaluations. You can systematically search for them as with for instance https://github.com/xoreaxeaxeax/sandsifter, but you can't know that it is complete. It's a fact of live that you must guard against some hardware-brokenness, but you do need some grounding in reality to achieve anything. Cheers, Gerwin