
I haven't paid attention for the past decade. Do modern C/C++ compilers generate decent AVX-512 code if told to do so, or do you still need to write it by hand with intrinsics or similar?



The short answer is "Yes, sometimes".

Clever hand-written SIMD code is still consistently better, sometimes dramatically better. But generally speaking, I've found Clang to be pretty good at auto-vectorizing code when I've "cleared the path" so to speak, organizing the data access in ways that are SIMD friendly.
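As a rough sketch of what "clearing the path" can look like: a loop over contiguous arrays, unit stride, no branches, and a no-aliasing promise. The function name `saxpy` and its parameters are hypothetical, chosen for illustration, and `__restrict` is a compiler extension rather than standard C++. Whether a given compiler actually emits 512-bit instructions for this depends on its version and target flags (for example `-O3 -march=skylake-avx512 -mprefer-vector-width=512` with Clang), so treat it as something to check, not a guarantee.

```cpp
#include <cstddef>

// Hypothetical example: y[i] = a * x[i] + y[i] over contiguous float arrays.
// __restrict (a compiler extension in Clang, GCC, and MSVC) promises that
// x and y do not overlap, so the compiler need not emit runtime alias checks.
void saxpy(float a, const float* __restrict x, float* __restrict y,
           std::size_t n) {
    // Unit-stride access, a simple trip count, and no branches in the body:
    // the shape auto-vectorizers tend to handle best.
    for (std::size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

With Clang, adding `-Rpass=loop-vectorize` prints a remark for each loop it vectorized, which is a quick way to confirm what happened before reading the disassembly.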

On the Windows platform, in my experience MSVC is a disaster in terms of auto-vectorization. I haven't been able to get it to consistently vectorize anything above toy examples.


Those aren't the only two options. You can use libraries built to take advantage of SIMD, or you can use ISPC, which is designed specifically for SIMD programming.


Those languages are too SIMD-hostile.


Besides C#, what languages do you think are not SIMD-hostile?


Languages whose semantics were designed with parallelism and this kind of optimization in mind: ISPC, OpenCL, Chapel, Futhark, etc.


Thanks. With ISPC and OpenCL it's a given. I was thinking more of general-purpose programming languages where it is easy to exploit CPU-side SIMD.



