
All software will use V once it is available. Any operation done in a loop over fixed-size elements can have V applied to it. It is much more powerful than SIMD.



Any software that uses memcpy will use V once glibc is updated.


I must admit, I've yet to find a performance-relevant loop that I can't do in SIMD unless it also has dependencies on previous iterations, in which case no magic instruction set is going to help unless it's capable of time travel.
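To make the distinction concrete, here is a minimal sketch of the two kinds of loop. The function names and the particular recurrence are illustrative assumptions, not loops taken from the comment above.

```c
#include <stddef.h>

/* Independent iterations: y[i] depends only on x[i], so any SIMD or
   vector instruction set can process several elements at once. */
void scale(size_t n, float a, const float *x, float *y) {
    for (size_t i = 0; i < n; i++)
        y[i] = a * x[i];
}

/* Loop-carried dependency: y[i] needs the value produced by the
   previous iteration, so iterations cannot simply run side by side. */
void recurrence(size_t n, float a, const float *x, float *y) {
    float prev = 0.0f;
    for (size_t i = 0; i < n; i++) {
        prev = a * prev + x[i];
        y[i] = prev;
    }
}
```

Widening the vector unit speeds up the first loop but does nothing for the second as written, since each step waits on the one before it.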


LLVM's Polly[0] can often do magic there, by renumbering[1] the iteration space. Where variable-vector-length instructions help is in decoupling the chunk size from the machine code, because they take care of the remainder that doesn't fit in whole vectors/chunks in a length-agnostic fashion. That way you get at least most of the gains from wider vector units without needing to change the code.

[0]: https://polly.llvm.org/
[1]: https://en.wikipedia.org/wiki/Polytope_model
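As a rough, hand-written illustration of what "renumbering the iteration space" can mean (this is not Polly output, and the grid size `N` and function names are made up for the example): a loop nest where every cell depends on its upper and left neighbours can be re-indexed along anti-diagonals, after which the inner loop has no dependencies between its iterations.

```c
#include <stddef.h>

#define N ((size_t)5)  /* hypothetical grid size for the example */

/* Original order: a[i][j] reads a[i-1][j] and a[i][j-1], so both the
   i loop and the j loop carry a dependency. */
void fill_rowmajor(int a[N][N]) {
    for (size_t i = 1; i < N; i++)
        for (size_t j = 1; j < N; j++)
            a[i][j] = a[i - 1][j] + a[i][j - 1];
}

/* Renumbered order: walk anti-diagonals d = i + j. Every cell on
   diagonal d reads only cells on diagonal d - 1, so the iterations of
   the inner loop are independent of each other. */
void fill_wavefront(int a[N][N]) {
    for (size_t d = 2; d <= 2 * (N - 1); d++) {
        size_t lo = d > N - 1 ? d - (N - 1) : 1;  /* keeps j <= N - 1 */
        size_t hi = d - 1 < N - 1 ? d - 1 : N - 1; /* keeps j >= 1 */
        for (size_t i = lo; i <= hi; i++) {
            size_t j = d - i;
            a[i][j] = a[i - 1][j] + a[i][j - 1];
        }
    }
}
```

Both functions compute the same values; only the order in which cells are visited changes. Whether the reordered inner loop is then profitable to vectorize depends on the memory access pattern, which this sketch ignores.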


It's not that you couldn't have used fixed-width SIMD; the point is that it can rescale to the SIMD width automatically.
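A minimal sketch of that rescaling, emulated in plain C rather than written with real vector intrinsics. `hw_max_elems` and `set_vl` are hypothetical stand-ins for the hardware vector length and for the kind of "give me up to this many elements" request that a vector-length-agnostic ISA provides; the inner `for` stands in for a single vector operation.

```c
#include <stddef.h>

/* Hypothetical: elements per hardware vector. On real hardware this is
   a property of the machine, not of the compiled code. */
static size_t hw_max_elems = 8;

/* Hypothetical helper: grant min(remaining, hardware maximum). */
static size_t set_vl(size_t remaining) {
    return remaining < hw_max_elems ? remaining : hw_max_elems;
}

/* y[i] += a * x[i], strip-mined with no separate tail loop. */
void saxpy_vla(size_t n, float a, const float *x, float *y) {
    while (n > 0) {
        size_t vl = set_vl(n);           /* the last trip just gets a shorter vl */
        for (size_t i = 0; i < vl; i++)  /* stands in for one vector op */
            y[i] += a * x[i];
        x += vl;
        y += vl;
        n -= vl;
    }
}
```

The loop body never mentions a concrete width, so changing `hw_max_elems` from 8 to 4 or 64 changes the number of trips but not the code or the result. With fixed-width SIMD the width is baked into the instructions and the leftover elements need their own tail handling.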


Right, but portable code != portable performance. See also: OpenCL.

There's also the observation that "keep SIMD vectors small but many" (e.g. Apple's ARM chips) is superior to "super long vectors" (Intel AVX-512), as it is much more flexible while delivering similar performance for tasks that are amenable to larger vectors. Having an architecture pushing towards the latter seems a retrograde step to me.



