Hacker News new | past | comments | ask | show | jobs | submit login
10~17x faster than what? A performance analysis of Intel' x86-SIMD-sort(AVX-512) (github.com/voultapher)
20 points by Voultapher on June 10, 2023 | hide | past | favorite | 10 comments



Nice.

Effect of frequency downclocking when AVX-512 instructions get used?

https://en.wikipedia.org/wiki/Advanced_Vector_Extensions#Dow...


Cross-posting from https://news.ycombinator.com/item?id=36273544:

We looked into this [1] and conclude:

- throttling is a non-issue on Xeon Gold/Platinum;

- AVX-512 startup overhead can hurt on Skylake, but AVX-512 is still a net win for data sizes >= 100 KiB.

- Startup overhead is a non-issue on Icelake and AMD Zen4.

1: https://github.com/google/highway/blob/master/hwy/contrib/so...


I spent the last couple months working on this writeup, looking forward to feedback and questions. Hope you find this insightful.


Thank you for such detailed analysis! Just curious to know which version of x86-simd-sort did you benchmark: release v1.0 or the top of main current branch? (I'm the author of x86-simd-sort).


Hi, hope you found this analysis helpful. I vendored the code Feb 16 2023 from the main branch.


Of course! Appreciate all the time you put in. I added a few more optimizations to qsort after that (see https://github.com/intel/x86-simd-sort/pull/33), just wanted to know if your analysis took that into account.


Still using the simple way of getting the pivot though.


No matter how sophisticated the pivot selection is, you can always risk having some degenerate worst case. I recommend having something like a heapsort fallback after a certain recursion limit is reached, as do pdqsort, ipnsort and vqsort(I'm a little fuzzy what their fallback is, but they have one).


Yes, vqsort does indeed resort to Heapsort after too many recursions. I'd be surprised if that happens on real data, though, because we apply a lot more effort to the pivot selection. Would be curious to see any input distribution that triggers a fallback.





Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: