Hyper-Threading has been a source of security concerns for a decade now, and vulnerabilities in existing HT implementations have been trickling out over the last few years. Unlike the Management Engine or TrustZone, at least we can disable Hyper-Threading (at the cost of a ~30% performance hit).
Also, HT is not such a great performance win - on a few different 4-core/8-thread machines I had access to, loading all 8 threads to "100% CPU" (whatever that means) usually only delivered 20-30% faster computation than with HT off (4 cores / 4 threads) - which is in line with your 30% number.
And that's an improvement - some 15 years ago, with similar computational loads, most of my tests ran 10-20% faster with HT off (2 cores / 2 threads) than with HT on (2 cores / 4 threads) - there just wasn't enough cache to support that many threads.
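To make the kind of test I'm describing concrete, here's a minimal sketch (C with pthreads; the thread counts and the arithmetic loop are illustrative assumptions, not my original workload). Compile with cc -O2 -pthread, then time it with 4 threads and again with 8 on a 4-core/8-thread part:

    /* Fixed total amount of arithmetic split across N threads; compare
       wall-clock time for N = 4 (one thread per core) vs N = 8 (HT). */
    #include <pthread.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    static unsigned long long per_thread;

    static void *spin(void *arg) {
        (void)arg;
        volatile double x = 1.0001;
        /* Dependent FP chain: a single thread can't keep the FP unit
           busy every cycle, which is exactly the gap HT tries to fill. */
        for (unsigned long long i = 0; i < per_thread; i++)
            x = x * 1.0000001 + 1e-9;
        return NULL;
    }

    int main(int argc, char **argv) {
        int nthreads = argc > 1 ? atoi(argv[1]) : 4;   /* run with 4, then 8 */
        if (nthreads < 1 || nthreads > 64) return 1;
        per_thread = 2000000000ULL / nthreads;         /* same total work */

        pthread_t t[64];
        struct timespec a, b;
        clock_gettime(CLOCK_MONOTONIC, &a);
        for (int i = 0; i < nthreads; i++)
            pthread_create(&t[i], NULL, spin, NULL);
        for (int i = 0; i < nthreads; i++)
            pthread_join(t[i], NULL);
        clock_gettime(CLOCK_MONOTONIC, &b);

        printf("%d threads: %.2f s\n", nthreads,
               (b.tv_sec - a.tv_sec) + (b.tv_nsec - a.tv_nsec) / 1e9);
        return 0;
    }

If the 8-thread run is only ~20-30% faster than the 4-thread run despite the tools showing 800% vs 400% CPU, that's the effect I'm talking about.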
A 20-30% increase is a BIG increase for a hardware feature, though. The cost of hyperthreading in transistors mostly amounts to the larger total register set; the whole point is that the rest of the decode/dispatch/execute/retire pipeline is shared.
How is 20%-30% not a great performance win? If I told you today that there's One Simple Trick you can use on your computer to instantly get 20%-30% more performance, wouldn't you do it in a heartbeat?
If your workload is already well parallelized, then yes, 20% is quite significant. However, parallelizing properly over 8 threads rather than 4 has its own costs.
The thing that bothers me most is that on this processor, 800% CPU and 500% CPU both amount to roughly 5 x 100% CPU of actual throughput, which makes everything very hard to reason about when planning capacity.
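To put illustrative numbers on it (assuming HT buys roughly 25% extra throughput on a 4-core/8-thread part, in line with the figures upthread):

    4 busy threads -> tools report 400% ("half utilized")  -> ~4.0 cores' worth of real throughput
    8 busy threads -> tools report 800% ("fully utilized") -> ~5.0 cores' worth of real throughput

So the second half of the utilization gauge buys you roughly 25% more work, not another four cores' worth, and a capacity plan that assumes otherwise is off by a factor of ~1.6.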
I think you're misunderstanding what HT is. It's not true parallelism; it just hides latency by extracting some extra superscalar parallelism. You can't expect it to give you actual linear improvements in performance, because it's just an illusion.
I understand that very well. But none of the standard tools that manage CPU understand that, and most people don't either.
If I had a nickel for every time I had to explain why "You are at 50% CPU now, but you can't actually run twice as many processes on this machine and get the same runtime", I'd be able to buy a large frappuccino or two at Starbucks.
Perhaps I'm uninformed though - is there a tool like htop which would give me an idea of how close I am to maxing out a CPU?
No, there isn't. But if you understand that, I don't get why you think 20% isn't a good performance boost, especially considering the rate of return for power and area in silicon.
Because many people believe it is a 100% improvement, plan/budget accordingly, and then look for help.
As far as silicon/power goes, it is nice, but IIRC (I am not involved in purchasing anymore) it used to cost over 50% more in USD for those 20% in performance, back when non-HT parts were common.
You ignored the price issue, which was measurable and real, but also:
It used to be my job. Does "because people fall for deceptive marketing, waste money, and then waste my time trying to salvage their reputation" sound better?
I think it isn't viable with hardware whose timing behavior is non-deterministic. That means dedicated caches, or no caches at all; dedicated, guaranteed memory speeds and latencies; dedicated processing units. The untrusted code cannot be affected by other code, otherwise that other code leaks its usage patterns across.
Dang: "Hyper-Threading technology, as used in FreeBSD and other operating systems that are run on Intel Pentium and other processors, allows local users to use a malicious thread to create covert channels, monitor the execution of other threads, and obtain sensitive information such as cryptographic keys, via a timing attack on memory cache misses."
Also, found elsewhere:
"According to Linus Torvalds and others on linux-kernel this is a theoretical attack, paranoid people should disable hyper threading"
Yes. Intel dismissed it at the time, saying that "nobody would ever have untrusted code running on the same hardware on which cryptographic operations are performed".
30% performance hit? I'm sure that heavily depends on the workload... and I'm also sure you can lose performance with HT on, depending on the workload as well.
That would make sense. My understanding is that with a CPU pegged at 100%, hyper-threading won't be super beneficial, since the logical cores aren't real cores, just smarter scheduling - you can't really schedule a 100% load better. For latency-sensitive applications it makes more sense: you don't have the CPU pegged, you just want a faster response.
Sure you can. You can do math while another HT is waiting for memory. Sometimes you can even multiplex use of multiple ALUs, or one HT can do integer work while another does floating point.
It's actually under high multithreaded load that HT shines, especially if that load is heterogeneous or memory-latency bound.
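If you want to see the overlap directly, here's a rough sketch of that kind of heterogeneous pair (C with pthreads; pinning to logical CPUs 0 and 1 is an assumption - check /sys/devices/system/cpu/cpu0/topology/thread_siblings_list for your actual SMT sibling pair). Time each thread running alone, then both together on the same sibling pair; the combined run should take far less than the sum of the two:

    /* One thread chases pointers through a buffer much larger than L3
       (stalls on memory), the other runs a dependent FP chain (stalls on
       the FP unit). SMT can overlap the two on one physical core. */
    #define _GNU_SOURCE
    #include <pthread.h>
    #include <sched.h>
    #include <stdio.h>
    #include <stdlib.h>

    #define NODES (1UL << 24)            /* 16M entries * 8 bytes = 128 MB */

    static size_t *chain;

    static void pin(int cpu) {           /* assumed sibling pair: 0 and 1 */
        cpu_set_t set;
        CPU_ZERO(&set);
        CPU_SET(cpu, &set);
        pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
    }

    static void *memory_bound(void *arg) {
        (void)arg;
        pin(0);
        size_t i = 0;
        for (long n = 0; n < 20000000L; n++)
            i = chain[i];                /* each load depends on the last */
        return (void *)i;                /* keep the chase from being optimized out */
    }

    static void *compute_bound(void *arg) {
        (void)arg;
        pin(1);
        volatile double x = 1.0;
        for (long n = 0; n < 1000000000L; n++)
            x = x * 1.0000001 + 1e-9;    /* no memory traffic at all */
        return NULL;
    }

    int main(void) {
        chain = malloc(NODES * sizeof *chain);
        if (!chain) return 1;
        /* Sattolo shuffle: one big cycle, so the walk defeats the
           prefetcher (rand() is crude but fine for a sketch). */
        for (size_t i = 0; i < NODES; i++) chain[i] = i;
        for (size_t i = NODES - 1; i > 0; i--) {
            size_t j = (size_t)rand() % i;
            size_t tmp = chain[i]; chain[i] = chain[j]; chain[j] = tmp;
        }
        pthread_t a, b;
        pthread_create(&a, NULL, memory_bound, NULL);
        pthread_create(&b, NULL, compute_bound, NULL);
        pthread_join(a, NULL);
        pthread_join(b, NULL);
        return 0;
    }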
I too was once under the misapprehension that HT was "just smarter scheduling", until I took a university course in microarchitecture that explained how Simultaneous Multithreading actually works in terms of maximising utilisation of various types of execution units. I wonder why "smarter scheduling" became a common understanding.
Disabling hyper-threading is highly unlikely to produce a 30% performance hit. Most highly optimized software disables or avoids hyper-threading because doing so increases performance.
Hyper-threading tends to benefit the performance of applications that have not been optimized, and therefore presumably are also not particularly performance sensitive in any case.
In highly-parallel workloads like rendering (ray tracing) where pipeline stalls due to loads happen quite regularly, it's fairly easy to get 20-35% speedups with HT.