Hacker News

The problem is that on superscalar processors the correspondence between the number of instructions and speed breaks down, partly because the processor does its own optimization on the fly and can do multiple things in parallel.

A programmer should be careful about second-guessing the compiler. And a compiler should be careful about second-guessing the processor.
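A rough sketch of what I mean, with made-up function names (sum_single and sum_four are just for illustration). The second version executes more instructions per element, but its four additions don't depend on each other, so a superscalar core can potentially overlap them. Whether it's actually faster depends on the CPU and on what the compiler already does with the simple loop, so measure rather than assume.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fewest instructions: one accumulator, so every add waits on the previous one. */
uint64_t sum_single(const uint32_t *a, size_t n) {
    assert(a != NULL || n == 0);
    uint64_t s = 0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* More instructions: four independent accumulators that the core may run in parallel. */
uint64_t sum_four(const uint32_t *a, size_t n) {
    assert(a != NULL || n == 0);
    uint64_t s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += a[i];
        s1 += a[i + 1];
        s2 += a[i + 2];
        s3 += a[i + 3];
    }
    /* Handle the 0 to 3 leftover elements. */
    for (; i < n; i++)
        s0 += a[i];
    return s0 + s1 + s2 + s3;
}
```

Both return the same sum. An optimizing compiler may well turn the first loop into something like the second (or vectorize it) on its own, which is the second-guessing problem in a nutshell.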




I'm not sure if you're implying this is premature optimisation. It isn't.

It's a performance-sensitive standard-library function, the kind of thing that deserves optimisation in assembly. It's also the kind of problem that can be accelerated with SIMD, but that necessarily means more complex code. That's why the standard library implementations aren't always dead simple.
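To give a feel for the complexity trade-off, here's a minimal sketch that is not taken from any real standard library. It assumes a made-up task (finding the first zero byte in a buffer of known length) and uses the portable word-at-a-time trick instead of real SIMD instructions. find_zero_simple and find_zero_wide are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Dead simple: checks one byte per iteration. */
size_t find_zero_simple(const unsigned char *buf, size_t n) {
    assert(buf != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (buf[i] == 0)
            return i;
    return n; /* no zero byte found */
}

/* Word-at-a-time: checks 8 bytes per iteration, at the cost of more code. */
size_t find_zero_wide(const unsigned char *buf, size_t n) {
    assert(buf != NULL || n == 0);
    const uint64_t ones  = 0x0101010101010101ULL;
    const uint64_t highs = 0x8080808080808080ULL;
    size_t i = 0;
    for (; i + 8 <= n; i += 8) {
        uint64_t w;
        memcpy(&w, buf + i, 8); /* avoids alignment and aliasing problems */
        /* Nonzero exactly when at least one byte of w is zero. */
        if ((w - ones) & ~w & highs)
            break;
    }
    /* Pin down the exact position, and cover the tail shorter than 8 bytes. */
    for (; i < n; i++)
        if (buf[i] == 0)
            return i;
    return n;
}
```

Both return the index of the first zero byte, or n if there is none. The wide version already needs a bit trick, a tail loop, and care about alignment, and a real SIMD implementation adds more on top of that.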

Here's a pretty in-depth discussion [0]. They discuss CPU throttling, caches, and being memory-bound.

[0] https://news.ycombinator.com/item?id=18260154



