
Apple CPUs are quite sophisticated: wide and deep out-of-order "brainiac" designs with state-of-the-art branch predictors.

There is nothing simple about them. The only reason they don't reach desktop-level performance is that the architecture has been optimized for a lower frequency target to reduce power consumption.

A desktop-optimized design would probably be slightly narrower (so that decoding is feasible within a smaller time budget) and possibly deeper to accommodate the higher memory latency. Having said that, the last generation is not very far from reasonable desktop frequencies and might work as-is.




Compare die shots of the two. Even after you correct for the density of the 7nm process, the A12's predictor is a few times smaller than that of recent Intel Core designs.


5 minutes of Googling didn't return any die shots of either Skylake or the A12 with labelled predictors. Do you have any pointers?

Also, I know nothing about the details, but I expect that most of the predictor consists of CAM memory used to store the historical information. I doubt that, without internal knowledge, it is possible to distinguish it reliably from other internal memories.


CAM is expensive and requires some kind of replacement logic. I believe that branch predictors are still implemented as straight direct-mapped (one-way associative) RAM, often even without any kind of tagging, and that the only true CAM in the CPU core is the TLB.
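For illustration, here's a minimal sketch (in C, not any particular CPU's design) of what such an untagged, direct-mapped predictor table looks like: a plain array of 2-bit saturating counters indexed by low PC bits. Aliasing between different branches is simply tolerated rather than paying for tags or a CAM lookup; the table size and index hash below are arbitrary assumptions.

```c
/* Sketch of a bimodal branch predictor: a direct-mapped, untagged table
 * of 2-bit saturating counters. Sizes and indexing are illustrative. */
#include <stdint.h>

#define BP_ENTRIES 4096                      /* power of two, assumed size */
static uint8_t bp_table[BP_ENTRIES];         /* 2-bit counters, values 0..3 */

static unsigned bp_index(uint64_t pc) {
    return (pc >> 2) & (BP_ENTRIES - 1);     /* low PC bits, no tag check */
}

int bp_predict(uint64_t pc) {                /* returns 1 = predict taken */
    return bp_table[bp_index(pc)] >= 2;
}

void bp_update(uint64_t pc, int taken) {
    uint8_t *c = &bp_table[bp_index(pc)];
    if (taken  && *c < 3) (*c)++;            /* saturate at strongly taken */
    if (!taken && *c > 0) (*c)--;            /* saturate at strongly not-taken */
}
```

Because there are no tags, two branches that hash to the same slot silently share a counter; that occasional mis-steer is cheaper than the comparators and replacement logic a CAM would need.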



