Hacker News new | past | comments | ask | show | jobs | submit login

The instruction encoding talk starts with comparison between Mill, DSP and Haswell and tries to explain the basic math. The Mill is a DSP that can run normal, "general purpose" code better - 10x better - than an OoO superscalar. The Mill used in the comparison - one for your laptop - is able to issue 8 SIMD integer ops and 2 SIMD FP ops each cycle, plus other logic.



I was strictly replying to the Intel FLOPs claim of the parent comment. I have only a faint idea how the Mill CPU works, so I can't really compare against it.

From the little I have read, the Mill CPU looks like a cool idea, but I'm skeptical about the claims. I'd rather see claims of efficiency on particular kernels (this can be cherry-picked too, but at least it will be useful to somebody) than pure instruction decoding/issuing numbers. Those are like peak FLOPs: depending on the rest of the architecture they can become effectively impossible to achieve in reality. In any case, I'm looking forward to hearing more about this.


Apologies, I was replying to the thread in general and not your post in particular.

Art has now published the 33 pipeline breakdown on the "Gold" Mill here: http://ootbcomp.com/topic/introduction-to-the-mill-cpu-progr...

A key thing generally is that vectorisation on the Mill is applicable to almost all while loops, so is about speeding up normal code (which is 80% loops with conditions and flow of control) as well as classic math.




Consider applying for YC's W25 batch! Applications are open till Nov 12.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: