
Intel has `rep movs`, which is recommended for general use, but you can beat it for sizes < 128B using a specialized loop. A very good general-purpose memcpy would just branch on the size, going to either a specialized small copy or `rep movs` for larger sizes.
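For illustration, here is a rough sketch of that size-based dispatch in C. The names `sketch_memcpy`, `copy_small`, `copy_large` and `SMALL_COPY_MAX` are made up for this example, the 128-byte cutoff is just the figure mentioned above rather than a measured value, and the byte loop only stands in for a properly tuned small-copy routine. It assumes GCC or Clang inline asm on x86-64 and falls back to the library `memcpy` elsewhere.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical cutoff, taken from the "< 128B" figure above; the real
   crossover point depends on the microarchitecture. */
#define SMALL_COPY_MAX 128

/* Small-size path: a plain byte loop standing in for a tuned small copy. */
static void copy_small(unsigned char *dst, const unsigned char *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
}

/* Large-size path: `rep movsb` on x86-64 with GCC/Clang, memcpy otherwise.
   Relies on the ABI guarantee that the direction flag is clear. */
static void copy_large(unsigned char *dst, const unsigned char *src, size_t n)
{
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
    __asm__ volatile("rep movsb"
                     : "+D"(dst), "+S"(src), "+c"(n)
                     :
                     : "memory");
#else
    memcpy(dst, src, n);
#endif
}

/* Branch on the size: small copies take the loop, everything else `rep movsb`. */
void *sketch_memcpy(void *dst, const void *src, size_t n)
{
    if (n < SMALL_COPY_MAX)
        copy_small((unsigned char *)dst, (const unsigned char *)src, n);
    else
        copy_large((unsigned char *)dst, (const unsigned char *)src, n);
    return dst;
}
```

Like `memcpy`, this sketch assumes the source and destination do not overlap.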

The glibc version is just bananas. https://sourceware.org/git/?p=glibc.git;a=blob_plain;f=sysde...

Compare the newer ERMS implementation: https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/x86...




What's weird is that if you git blame a bit, a lot of the elaborate stuff was contributed by Intel.


Intel is at least as guilty of chasing misleading benchmarks as anyone else in the history of our field.


No AVX2-optimized version? I'm disappointed…

Edit: it's rolled into the erms version.



