Hacker News

"We postulate that the original study produced good results because it evaluated the new scheduler in a machine whose performance bottleneck lay outside the execution core. The problems introduced by the steering logic were, as a result, hidden."

Any new microarchitecture needs to first answer that big question -- how is the core going to deal with memory?




Fun fact: the Mill's proposed method for hiding that latency, "deferred loads", has already been done by Duke's Architecture Group: http://people.duke.edu/~bcl15/documents/huang2016-nisc.pdf (warning: PDF link).

The big gain? A measly ~8%.


This doesn't decouple across function calls like the Mill claims to, right? Though to be fair, doing that optimization on existing C/C++ programs would be difficult.


I don't think that's enough of a difference to matter. YMMV, but nothing indicates that it should be.



