By normal I mean code that wasn't predicted. Predicted code execution is "abnorm...

netsharc · on Jan 4, 2018

To use a really stupid analogy, imagine a cook in a Mafia-involved restaurant, and you're a cop (you're the adversary). There's a whiteboard in the boss' office with a number you want to know.

So you come there every day and say, "Hey, if the lights at (some address) are on, I want a cheese pizza, otherwise I want pasta." In the beginning the cook sends his boy to check the lights at that address. The boy takes 20 minutes to go and check, and the cook has to wait all that time before he can start cooking. The boy always says "The lights are on", so after a while the cook still sends the boy, but he anticipates the answer "the lights are on" and starts making a pizza right away. Once in a while (maybe once every 6 months) the boy returns saying the lights are off, so after 20 minutes of making the pizza, the cook throws it away and makes you pasta.

After doing this a lot, you say, "Hey, if the lights at (some address) are on, I want a pizza with the topping from the box numbered according to the number written on your boss' whiteboard. Otherwise, I want pasta." You know the lights will be off, but he starts making a pizza anyway. You can't tell which topping he uses, but you notice he got a box from the fridge and put it next to his pizza-making table. The boy returns, the cook throws away the pizza and makes you pasta. Then you ask him, "Hey, I have another order, can I get a pizza with a topping from box 1?" If he goes to the fridge to get that box, then you know the box he picked up earlier (while he was guessing that the boy would say the lights are on) was not box 1, so you ask for another pizza with a topping from box 2, etc. But if he doesn't go to the fridge, you know he picked up that box previously (because he predicted the boy would say the lights are on), and that's how you figure out what number is written on the boss' whiteboard: when the cook doesn't go to the fridge, it's because the box with that number is already on his table.

The whiteboard is the secret memory area, taking the box from the fridge is reading from some RAM address to cache (if it's already in cache, no need to read from RAM), and I hope the speculative execution is clear.

This analogy describes (very roughly) one of the exploits in Spectre.

mikeash · on Jan 4, 2018

Imagine some code running in some walled-off privileged space (like the kernel) that looks something like this:

  if(i < length) {
    value = data[i]
    doStuff(lookupTable[value])
  }

Let's say you can run this code with an arbitrary value for `i` (maybe it's the parameter to a system call). From a high-level view, this code is perfectly safe. If `i` is out of bounds, nothing happens. If it's in bounds, the system does something with the data at i, presumably something you're allowed to do.

But what really happens on a modern CPU when `i` is out of bounds? It's possible that the value for `length` is not available and has to be loaded from memory, which can take several hundred CPU cycles in the worst case. It's also possible that the CPU will make a guess that the `if` branch will be taken. If that happens, then execution looks something like this:

  try to compare i < length
  can't, because length is not yet available
  guess the branch will be taken, begin speculatively executing it:

  initiate the loading of length
  load data[i]
  load lookupTable[value]
  ...possibly some more stuff...

  eventually, the load of length completes
  oops, turns out i >= length
  roll back all of the speculative stuff above

Normally none of this is visible. But the memory loads have side effects: they modify the CPU's caches, and the rollback does not undo that. By attempting to load different locations in memory and looking at how long it takes, you can figure out which locations got cached during that speculative execution phase. Since one of those locations was determined by an out-of-bounds load of data[i], that means you can use this to figure out the data stored at an arbitrary location in this privileged space.
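To make the timing step concrete, here is a rough toy simulation in Python. It is not an exploit and touches no real hardware: the cache is just a set of addresses, the cycle counts are made up, and every name in it (ToyCPU, victim, probe, leak, the address constants) is hypothetical. Real attacks also have to deal with cache-line granularity and noisy timings, which this sketch ignores.

```python
# Toy model of the cache side channel; no real hardware timing is involved.
# All names and constants below are hypothetical, for illustration only.

SLOW, FAST = 200, 4    # made-up cycle counts for a cache miss and a cache hit

DATA_BASE = 0          # data[] lives at addresses 0..3
LENGTH = 4
SECRET_ADDR = 10       # a privileged byte somewhere past the end of data[]
TABLE_BASE = 1000      # lookupTable[] has one entry per possible byte value


class ToyCPU:
    def __init__(self, memory):
        self.memory = memory   # address -> byte value
        self.cache = set()     # addresses currently cached

    def load(self, addr):
        """Return (value, cycles); the load leaves addr in the cache."""
        cycles = FAST if addr in self.cache else SLOW
        self.cache.add(addr)
        return self.memory[addr], cycles


def build_memory(secret):
    memory = {DATA_BASE + k: k for k in range(LENGTH)}
    memory[SECRET_ADDR] = secret
    memory.update({TABLE_BASE + v: 0 for v in range(256)})
    return memory


def victim(cpu, i):
    """The bounds-checked code, with the branch guessed as taken."""
    # Speculative part: runs before the comparison with LENGTH resolves.
    value, _ = cpu.load(DATA_BASE + i)
    cpu.load(TABLE_BASE + value)
    # The check resolves. Out of bounds: the result is discarded,
    # but the cache contents are not rolled back.
    if i < LENGTH:
        return value
    return None


def probe(cpu):
    """Time a load of every lookupTable entry; the fast one was cached."""
    timings = {v: cpu.load(TABLE_BASE + v)[1] for v in range(256)}
    return min(timings, key=timings.get)


def leak(secret):
    cpu = ToyCPU(build_memory(secret))
    victim(cpu, SECRET_ADDR - DATA_BASE)   # out-of-bounds index
    return probe(cpu)
```

In this model, leak(42) hands back 42 even though victim() itself returned nothing for the out-of-bounds index: the only information that escapes is which lookupTable entry loads fast.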

It requires speculative execution of a branch that ends up not being taken, because if the speculative execution was correct then you haven't gained access to anything you're not supposed to know, and if there was no speculative execution then you never load anything out of bounds.
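The three cases in that last paragraph can be laid out side by side in a small Python sketch. Again this is only a toy model under my own assumptions: table_entries_cached and its parameters are hypothetical names, and "cached" is just a set standing in for the real cache.

```python
# Toy model only; table_entries_cached and its parameters are hypothetical names.

def table_entries_cached(memory, length, i, speculate):
    """Return the lookupTable indices that end up cached after one call.

    memory holds data[0..length-1] followed by bytes the caller may not read.
    speculate=True models a CPU that runs the branch body before the
    bounds check resolves; False models one that waits for the check.
    """
    cached = set()
    if speculate or i < length:
        value = memory[i]    # the load of data[i], possibly out of bounds
        cached.add(value)    # loading lookupTable[value] caches that entry
    return cached


memory = [10, 20, 30, 40, 99]   # four in-bounds bytes, then a secret byte (99)
length = 4

# Speculation was correct: only in-bounds data is exposed.
correct_guess = table_entries_cached(memory, length, i=2, speculate=True)
# No speculation: the out-of-bounds load never happens.
no_speculation = table_entries_cached(memory, length, i=4, speculate=False)
# Wrong guess: the secret byte picks which table entry gets cached.
wrong_guess = table_entries_cached(memory, length, i=4, speculate=True)
```

Only the last case leaves a cache footprint that depends on the secret byte, which is why the attack needs a mispredicted branch.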

This is simplified and probably wrong in some aspects, but that's the rough idea.