Analyzing Core i9-9900K performance with Spectre and Meltdown mitigations (anandtech.com)
206 points by pplonski86 on Jan 2, 2019 | 79 comments



Conclusion at the end is fairly brutal:

“The long and short of matters then is that based on the testing we've done thus far, it doesn't look like Coffee Lake Refresh recovers any of the performance the original Coffee Lake loses from the Meltdown and Spectre fixes. Coffee Lake was always less impacted than older architectures, but whatever performance hit it took remains in the Refresh CPU design.”


Forgive me, as I'm nowhere near knowledgeable about CPUs, so my terminology will be way off.

For any CPU designed with the expectation of using the old method of memory access prediction without any protections... can we expect it will ever show a significant performance recovery?

I guess I always assumed the answer was no.


(Someone please correct me if I'm wrong) Without adding additional hardware, likely not significant.

The way you avoid some of the impacted scenarios (at modest performance impact) is with additional hardware or microarchitecture changes.

Basically, the task is 'Ensure processor state, as observed by another process, never changes because of speculative execution branches.'

Which is a high bar to meet, especially if you want to simultaneously optimize your execution unit utilization.
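To make that concrete (my own illustration, not from the article): the classic Spectre v1 "bounds check bypass" gadget looks roughly like the C below. The names are hypothetical; the point is that even though the out-of-bounds read is architecturally rolled back, it leaves a secret-dependent line of array2 in the cache, which other code can later detect by timing.

    #include <stddef.h>
    #include <stdint.h>

    /* Illustrative Spectre v1 gadget (hypothetical names, not a full PoC).
     * If the branch predictor has been trained with in-bounds values of x,
     * the CPU may speculatively run the body for an out-of-bounds x, read a
     * secret byte, and use it to index array2, pulling one cache line into
     * the cache before the misprediction is rolled back. */
    uint8_t array1[16];
    size_t  array1_size = 16;
    uint8_t array2[256 * 4096];          /* one page per possible byte value */

    void victim_function(size_t x)
    {
        if (x < array1_size) {                          /* bounds check     */
            uint8_t secret = array1[x];                 /* speculative read */
            volatile uint8_t t = array2[secret * 4096]; /* cache footprint  */
            (void)t;
        }
    }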


That is pretty obvious if you consider this is just another tweak of the venerable Skylake architecture. By now we have said goodbye to the reasonable thermals we enjoyed for a while because, well, if you deliver six of the same cores they'll consume 50% more at the same clock, so on one hand you slow down the chip when all cores run, and on the other you just allow it to consume more.


And Skylake itself wasn't a new design either, but rather a tweak of Haswell (prominent changes were in the uncore), which in turn was largely identical to SB/IB.


It's been benchmarked that from Sandy Bridge to Kaby Lake IPC only grew about 20% (https://www.hardocp.com/article/2017/01/13/kaby_lake_7700k_v...), but power efficiency has increased brutally: this was a 35W CPU in 2011 (https://browser.geekbench.com/processors/381) and this is a Y-series (4.5-7W) CPU from 2017 (https://browser.geekbench.com/processors/1822).


Skylake was a reasonably big core architecture upgrade (a "tock" in Intel terminology). Several new instructions (XSAVE, AVX512, etc.) were introduced. Perhaps you meant Broadwell?


Consumer Skylake and derivatives didn't get AVX512; that's reserved for the server parts that arrived two years later.


How did they confirm that the mitigations weren't still being used? If you're still using separate page tables for the kernel, well of course it's going to remain slow. The point of the fixed silicon is so you don't need the mitigation, not that the mitigation gets faster.


There's a screenshot linked to in the comments [1] showing how you can query whether the mitigations are enabled, and showing the results for a 9900k.

1: https://www.anandtech.com/comments/13659/analyzing-core-i9-9... - screenshot comes from anandtech.com
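(That screenshot is from a Windows utility. For the Linux side, kernels since 4.15 report the same status under /sys/devices/system/cpu/vulnerabilities/; here is a minimal sketch of reading it, assuming those sysfs files are present.)

    #include <stdio.h>

    /* Print the kernel's reported status for each CPU vulnerability.
     * Assumes a Linux kernel (4.15+) exposing these sysfs files; each file
     * holds one line such as "Mitigation: PTI" or "Not affected". */
    int main(void)
    {
        const char *names[] = { "meltdown", "spectre_v1", "spectre_v2",
                                "spec_store_bypass", "l1tf" };
        char path[128], line[256];

        for (size_t i = 0; i < sizeof names / sizeof names[0]; i++) {
            snprintf(path, sizeof path,
                     "/sys/devices/system/cpu/vulnerabilities/%s", names[i]);
            FILE *f = fopen(path, "r");
            if (!f) {
                printf("%-18s: (not reported by this kernel)\n", names[i]);
                continue;
            }
            if (fgets(line, sizeof line, f))
                printf("%-18s: %s", names[i], line);  /* keeps its newline */
            fclose(f);
        }
        return 0;
    }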


The "fixed silicon" at least so far mostly seems to be hardware implementations of the existing software mitigation techniques with very similar performance characteristics.


Can we test this? Can we install an unpatched OS and observe the same performance penalty?

In the case of meltdown, it seems unlikely the CPU is maintaining its own shadow page table. How would it do that?


You could compile a Linux kernel with the Spectre defenses disabled. I assume mitigations included in userspace software could also be patched out in the same way for testing.


I agree, I remember reading this article and thinking something was odd...how could you implement page table isolation in hardware? The Spectre improvement could plausibly be more similar to software mitigation...


And that's a shame, since in principle, the hardware can do better than software. I've heard of approaches ranging from better cache partitioning to transactional commit to cache on instruction retirement. I can only imagine that Intel is working on these systemic fixes in the next big microarchitecture revision while continuing to apply cruder hacks on older cores that can't, say, alter low-level instruction retirement much.


Performance is all nice and such. Did anyone validate that the new processors actually mitigate Spectre and Meltdown?


They mitigate Spectre specifically but not speculative execution bugs in general, which it seems will be with us for the foreseeable future.


At least we'll regain some of the performance lost to Spectre mitigations once MS ships their retpoline-patched kernel.

https://mspoweruser.com/windows-10-19h1-will-reduce-the-impa...


Don't most machines with this kind of CPU in them run something other than Windows?


If you're talking about the i9-9900, it's a very high clock frequency part with "only" 8 cores and part of the consumer line (no ECC support). I'd actually think most people who have it run Windows and use it for things like gaming.


Most maybe, but there's still quite a large fleet of Windows servers out there.


I've seen JS performance on synthetic benchmarks drop somewhat over the past months. Perhaps JS JITs like V8 rely to some degree on specific branch prediction properties.


I wonder how much Intel knew of this "bug" and went ahead and shipped with it because of the speed increases.


Knowledge about this type of attack dates back to the 90s (possibly earlier), but it isn't entirely clear whether the engineers who developed the "protection check after [speculative] load" were aware of that. I would argue though that "check after load" should have smelled bad to people intimately familiar with CPU design.

It should be noted though that at the time neither sharing processors with strangers / across trust boundaries nor executing arbitrary crap in a VM were common activities. Memory protection and such were mostly viewed as a technique to increase reliability, not to provide actual security.


I want to say I'm not giving them a pass, but I think you can go overboard, "know" that "hey, in theory someone could", and security yourself into never doing anything.

Accordingly, how much they knew about the nature of it, and their ability to predict it, is really the question, and IMO it's kind of a hard one to know or judge (unless there are some memos out there).

Granted, in an age of little to no consideration given to security in so many places... I wouldn't be surprised by anything.


Working closely with Intel and others on these issues, I have seen zero evidence that anyone realized the security implications and shipped anyway. Zero evidence.


This bug dates to a design from around 1993, when security, multicore, and SMT didn't really exist.


I suppose the question or insinuation would be whether it was discovered by Intel (or someone else) in the meantime?


The CIA and NSA called, they said "Yeah".


They should make a "miss me yet?" meme for Itanium.


What does this mean for developers buying laptops or workstations in the next year? Is AMD or Raptor looking like a better choice, or is Intel still looking good even after the hit?

I'm reading that workstations, for example, might not need to worry for the most part, unless a package gets compromised or some browser exploit makes it through.


I built a workstation with a Threadripper earlier this year, and I couldn't be happier so far. The single-core performance advantage of Intel parts isn't that big, and you get a ton of cores and PCIe lanes in exchange.


How would you write a program to use these exploits? Unless I'm mistaken, I'm under the impression you have to talk directly to the processor through the kernel in order to do any of these exploits. Would you have to write the code in assembler or use a special library to do the predictive branching?


Your code always talks directly to the CPU. Your program 'just' runs in an environment where it is not allowed to do a few things the kernel is allowed to.

Probing the CPU for the timing of memory accesses you don't have access to, or getting it to leave traces somewhere you do have access to, doesn't require the kernel. That's the problem.


Yes, you'd write some assembly or C. Or you could start from existing demos: https://samsclass.info/123/proj14/spectre.htm
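(Not from that demo, but for a sense of what the measurement side looks like: underneath these PoCs is just a cache-timing probe, i.e. flush a line, let the victim speculatively touch it or not, then time a reload. A rough sketch for x86 with GCC/Clang; the cached/uncached threshold varies by machine.)

    #include <stdint.h>
    #include <stdio.h>
    #include <x86intrin.h>   /* _mm_clflush, _mm_mfence, __rdtscp */

    /* Time one load of *p in TSC cycles. A short time means the line was
     * already cached (e.g. touched speculatively); a long time means it
     * came from DRAM. This is the "reload" half of Flush+Reload. */
    static uint64_t time_load(volatile uint8_t *p)
    {
        unsigned aux;
        _mm_mfence();
        uint64_t start = __rdtscp(&aux);
        (void)*p;                               /* the probed access */
        uint64_t end = __rdtscp(&aux);
        _mm_mfence();
        return end - start;
    }

    int main(void)
    {
        static uint8_t probe[4096];

        probe[0] = 1;                           /* warm the cache line */
        printf("cached:   %llu cycles\n",
               (unsigned long long)time_load(&probe[0]));

        _mm_clflush(&probe[0]);                 /* evict it ("flush") */
        _mm_mfence();
        printf("uncached: %llu cycles\n",
               (unsigned long long)time_load(&probe[0]));
        return 0;
    }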


You don't need to write assembly or C; it is possible to perform an exploit using any high-resolution clock, like the one JavaScript in most browsers provided until recently.


I would love to see a benchmark for VM performance before and after.


TLDR: hardware patch is as slow as the software one.

Except now it's a bit worse since I cannot disable the patch to recuperate lost performance.


> I cannot disable the patch to recuperate lost performance.

Knowing the actual, demonstrated risks... why would you do this?

I'm not trying to devalue your position. I'm trying to understand your risk calculation.

---

Edit: Good catch, humans. The thought of running code in an unexposed, isolated, largely trusted environment didn't cross my mind; I was more focused on the environments I'm used to (where everything is connected and nothing is trusted). That said, I'd argue that a database backend to any typical webapp definitely qualifies as exposed.


I'm going with the herd immunity for my personal stuff.

The calculation is basically:

their_waste = likelihood enough people have the mitigations enabled (not tech enough to disable them) so that "bad people" will not waste time developing exploits for the tiny number of unprotected people like me (herd protecting me)

my_risk = likelihood of "bad people" actually finding me and being able to run their code

their_reward = likelihood of them actually finding something meaningful and valuable in the memory they can manage to dump

oops = (my_risk * their_reward) / their_waste

I am assuming my_risk and their_reward to be low and their_waste to be high, so oops will be acceptably low (hopefully :p)

Wish me luck!


Server logs are full of scripts and people trying to penetrate services with old, known bugs that have been widely neutralized for years.... it doesn't cost much to try one exploit.


So far there haven't been any mass exploits for these CPU bugs targeting personal systems like mine in the wild.

The most risk comes from web browsers, as you can execute (constrained) attack code in those. That's why browser vendors were quick to disable SharedArrayBuffer and developed further mitigations that hinder exploiting CPU bugs like these.

I may reconsider when browser-based exploits become a real thing that is in widespread use.

People probing my ports and exposed services running on my machines is far less of an issue since I don't run any service designed to run attacker supplied (but sandboxed) code like a browser does. If somebody managed to run code anyway (RCE) then I probably would have other problems than just worrying about somebody running spectre exploit code ;)

As it stands right now, the prime targets are shared execution environments running untrusted sandboxed code, aka cloud providers needing to worry that customer A's VM doesn't dump the memory of customer B's VM running on the same hardware.


Not everyone is running any untrusted code. If you're running (for example) a physics simulation, the mitigation doesn't gain you much.


Fortunately, the performance of these kinds of CPU-bound workloads is almost completely unaffected by the mitigations, so you might as well enable them anyway.


One of them is "Disable hyperthreading," which absolutely has a severe penalty.


For some workloads it is better to have hyperthreading disabled.


For some workloads.


Actually, I've seen benchmarks to the contrary. As usual with benchmarks, there's no useful data to understand them, specifically no performance counter data. I've yet to see a good analysis and haven't been able to do it myself. Many large-scale computations actually aren't CPU-bound, except insofar as they spin in MPI; some do plenty of filesystem I/O, but at least with something like PVFS2, that can be handled just in user space on the compute node.


The sort of HPC clusters with which I'm familiar run plenty of what I'd call untrusted code, and are multi-access with arbitrary student users and not-infrequently-compromised credentials. That said, there seems to be a fairly small attack surface the way I'd set up compute nodes, even if they're not single-job/node; especially if maximum job times are a day or two. I probably wouldn't turn on the mitigations on compute nodes.


There are non-internet facing workloads that would benefit from the additional performance.


I completely agree when it comes to running, say, a VPS.

If you're running (or using) a service where thousands of businesses rely on the ability to run their code and their data on your machines without any of your other customers being able to access it, yeah, security is priority #1.

On my personal workstation, though, what are they going to get? My credit card number? That's my bank's problem. I'm not particularly worried about targeted attacks; if my competitor or customers got everything on my hard drive, little would change for any of us. Force me to restore from backup? Losing my email password would be bad, but that's partially what 2-factor is for.

I have a tiny chance of getting a few hours of inconvenience if someone completely owns my PC. That's not worth all my work happening a little bit slower all the time.


> On my personal workstation, though, what are they going to get?

I think it depends on the context. I felt similarly until I discovered just how many machines attackers would pivot through in real-world attacks featuring strong adversaries. Preventing these attacks on every machine is a strength-in-depth measure.


> what are they going to get?

Your data is likely less important in that context compared to your device as a fractional resource or a pivot. (which I believe is largely zerkten's point)


Consider a server application that doesn't run arbitrary untrusted code and doesn't have meaningfully separate privilege levels.

You can't leverage any speculation exploit without code execution, and there's nothing left to exploit on the box once you have a shell.


The database backend does NOT qualify... If your DB machine is exposed to the outside world, using this exploit is overkill and there are plenty of other easier vectors of attack.


Simple. I'm pretty sure I won't be the first to fall victim to an exploit that is purely academic at this point. Why on earth should I take a large performance penalty on my own PC to mitigate an attack that I'm pretty sure (a) will never be a problem for me and (b) will almost certainly arrive with plenty of warning if it does?

This whole business is massively, massively overhyped from the point of view of individual workstation users. Not every system needs to be locked down like NORAD. Doing so is a failure of basic threat modeling.


If it is on my personal machine, I can't accept an 8-15% performance hit (I don't know the real impact; that's just the first range I found).


My database layer doesn’t run any untrusted code. I’d like the perf back there. Then audit the shit out of any users running there.


I'm talking out of my behind here, but I was always under the impression that some stuff we use casually is built with the assumption that it's not under heavy attack, too.

E.g., I thought consumer-grade video cards were pretty darn insecure. I don't know where I picked up that idea, but if that's true, then the idea of having the option to run "insecurely" for certain things makes sense.

Not sure if we can trust an average user with this, but if the video card thing is true, we already do.


I thought these exploits already required code execution on your machine. (Maybe I'm wrong about that.) If untrusted code is already running on your machine, your system is already compromised. So I don't see the big deal about these exploits, except in the context of hosted VMs.


It's not that they don't require code execution; it's that there is more code execution happening in things that are supposed to be sandboxed than most people generally anticipate. It's not just VMs.

For example, how many of the map editors for various games are Turing-complete? If you download a custom map from a random peer, you may be executing "sandboxed" code. Can it pull off a timing attack?

And the elephant in the room is presumably JavaScript.


You'd still need a communication channel to the outside world that is available to the attack code/map or else it cannot exfiltrate the data it dumped.


In a multiplayer game where each of the peers is constantly sending the others data, that seems like a surmountable problem.


The map engine "executing" the map has no access to the network layer of the game; or at least it shouldn't.


What games don't have maps with manipulable objects that would need to have their state synced over the network? A barrel existing/having been exploded is one bit, the precise position of an object is quite a few more, etc.


> In this paper, we present NetSpectre, a generic remote Spectre variant 1 attack. For this purpose, we demonstrate the first access-driven remote Evict+Reload cache attack over network, leaking 15 bits per hour.

https://misc0110.net/web/files/netspectre.pdf


> leaking 15 bits per hour.

Exactly. 15 bits per hour, in an artificial environment with minimised noise, after untold amounts of preparatory work were already performed to analyse the software running on the target machine.

Here, have 15 bytes from a random process running on my machine (I just randomly attached a debugger, scrolled through memory arbitrarily, and copied them):

d1 e1 81 f9 fe ff 00 00 76 05 b9 fe ff 00 00 66 89

What are they? I don't know. Maybe you're really lucky and it's a key to something, or a password hash... but what? The above would've taken 8 hours to read using that attack. Now you should see the level of unconcern I have about this. Someone who is being targeted would care more, but I don't believe I, or indeed the majority of users, am important enough to be in such a position.

In much the same way I'm not going to install bars over every window of my house.


15 bits per hour means that in 136 hours you could potentially exfiltrate a 2048-bit private key. Actually, make it 120 hours, since the rest at that point is brute-forceable.

I imagine Gmail's HTTPS certificate, or a Microsoft code signing key, or Linus' GPG key, or being able to impersonate some government agency or messaging server, are well worth 5 days of this.


Yes, of course; as I said above, this is something only high-value targets need to worry about, and even then I think it's not that high up on the list of risks. 5 days is just to read the data, and there's a considerable amount of preparatory work involved in setting up this attack: figuring out what to read and where it is, which is just as hard, if not harder, than figuring out how to read it through Spectre.

The authors of that paper have the massive advantage of knowing exactly the software running on the target system and its environment, something which an attacker in the real world is unlikely to have, unless the attacker already has such familiarity with the system that it seems far easier to exfiltrate data via some other means than trying to find and set up this very slow side channel. Everything has to be set up just right for this to work. Otherwise you might still manage to read something, but it's completely useless.

(High-value private keys in companies are likely to be in HSMs anyway, in which case they're completely inaccessible to attacks like these.)


Not really. How do you know where that 2048-bit private key is? You might have to read through the entire address space. On a relatively small server with 16 gigabytes of RAM, it'll only take you about a million years to exfiltrate the entire thing...

Let's say you luck out and only need to read the first 100 megabytes of memory... you're still talking thousands of years.
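(Back-of-the-envelope arithmetic behind those figures, at the paper's 15 bits/hour:)

    \[
    \frac{2048\ \text{bits}}{15\ \text{bits/hr}} \approx 137\ \text{hr},\qquad
    \frac{100\ \text{MB}\times 8\ \text{bits/byte}}{15\ \text{bits/hr}} \approx 5.3\times10^{7}\ \text{hr} \approx 6000\ \text{yr},\qquad
    \frac{16\ \text{GiB}\times 8\ \text{bits/byte}}{15\ \text{bits/hr}} \approx 9.2\times10^{9}\ \text{hr} \approx 10^{6}\ \text{yr}
    \]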


> d1 e1 81 f9 fe ff 00 00 76 05 b9 fe ff 00 00 66 89

That's amazing, I have the same combination on my luggage!


Are you a missileer who is confusing your luggage code with your ICBM missile code?


What a horrible test. They tested without hyperthreading turned on. Spectre and Meltdown are risks BECAUSE of hyperthreading. It makes zero sense to test the performance impact of the fixes with the major component of the problem turned off.


Neither Spectre nor Meltdown are related to hyperthreading.


I will look it up later, but I thought hyperthreading increases it.

Perhaps I'm mixing something up, but I thought Intel removed SMT from the newer generations (is removing it) because of it.


There have been other recent vulnerabilities, like the ax/ah thing described in [1] or TLBleed that have relied on SMT, but not Meltdown or the original Spectre variants.

[1] http://gallium.inria.fr/blog/intel-skylake-bug/


You're right that Spectre and Meltdown are not related to SMT, but that doesn't invalidate the wider point about hyper-threading and side-channel issues I think.

Parent may have simply meant TLBleed/L1TF (Foreshadow) instead of Meltdown/Spectre.


I'm not trying to invalidate any wider point about hyperthreading, and the original article wasn't trying to make one.

It was specifically about Spectre and Meltdown mitigations which are unrelated to hyperthreading, so testing with or without hyperthreading is fine. Bringing up hyperthreading here is like bringing up how a diet high in salt is unhealthy, someone pointing out that salt has nothing to do with the original article, and then a final comment "Yeah, but that doesn't invalidate the wider point that we should consume less salt!".


Running the chips in less than ideal performance settings does seem fishy. I wonder if the perf gap is wider or narrower with HT on.



