Low level stuff in ARM (coranac.com)
23 points by dacav on Oct 2, 2008 | 3 comments



The author seems to be freaking out a wee bit much over branching, which, last time I checked, is quite cheap on the shallow-pipeline ARM cores on the market. Branches get more expensive as the pipeline gets deeper, which is why they matter so much on desktop CPUs (although a little less now than in the days of Intel's NetBurst cores). But for ARM? Meh.

A much bigger deal on ARM is the cache. The L1 caches are very small, and there is no L2. Keeping the working set small for both instructions and data is hugely important, which means that tricks like these aren't always a win if you're using them inline (where every copy adds to code size) instead of (e.g.) calling a min() function.

And (someone correct me if I'm misremembering) ARM doesn't have a physically tagged cache, which means the caches can't survive a change in memory domain like a system call. I know for a fact that syscalls on my Motorola A780 (XScale CPU, Linux 2.4 kernel) cost 20k cycles or more.

The bottom line is that I think the author is missing the point. These are elegant assembly hacks, but they aren't really where performance-conscious programmers need to focus their efforts.
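
To make the min() comparison concrete, here is a minimal sketch in C of the two styles being weighed: a plain min function and a branchless bit trick. The function names are hypothetical and not taken from the article, and whether either version wins on a given ARM core depends on the compiler and on cache behaviour, so treat this as an illustration rather than a benchmark.

```c
#include <stdint.h>

/* Plain version. Compilers targeting ARM can often turn this into a compare
   followed by a conditionally executed move, so it may not branch at all. */
int32_t min_plain(int32_t x, int32_t y)
{
    return (x < y) ? x : y;
}

/* Branchless bit trick. -(x < y) is all ones when x < y and zero otherwise,
   so the mask keeps either (x ^ y) or 0 before the final XOR with y. */
int32_t min_branchless(int32_t x, int32_t y)
{
    int32_t mask = -(int32_t)(x < y);
    return y ^ ((x ^ y) & mask);
}
```

Both functions return the same result for any pair of inputs. The practical difference is code size and readability: one shared min_plain costs a call, while the trick pasted inline at every use site grows the instruction working set.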


ARMv6 (and higher) does have a physically tagged cache.

Another thing that helps avoid cache flushes on context switches is the presence of a PID bit in the page table.


This is pretty neat stuff.



