
I've been relying on a Julia port of version 2 (version 3 is out now). Version 3 added a lot of new functions, and I believe it improved performance on many of the old ones.

It is much faster (when vectorized) than what you get in base Julia, but lags behind gcc (glibc) and the Intel compiler's vectorized math libraries in performance.




Curious whether you've compared it to https://github.com/chriselrod/LoopVectorization.jl? If I understand right, it is a pure-Julia attempt to use many of the same tricks.


Compare my username with that of the author of that library ;).

For special functions, LoopVectorization relies on SLEEFPirates.jl, which is a fork of SLEEF.jl, a Julia port of version 2 of SLEEF. Most of the changes in SLEEFPirates are so that it works when you use llvm-vectors as arguments, but I also switched to using Estrin's rather than Horner's method of evaluating polynomials for a few functions (which more recent SLEEF versions did as well).
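
For anyone unfamiliar with the Horner vs. Estrin distinction, here is a minimal sketch in C for a degree-7 polynomial. The function names `horner7` and `estrin7` and the coefficient layout (`c[0]` is the constant term) are made up for illustration; this is not code from SLEEF or SLEEFPirates. Horner is one long chain of dependent multiply-adds, while Estrin pairs up terms so that several of them can be computed independently, at the cost of a few extra multiplications.

```c
/* Horner's method: p(x) = c0 + x*(c1 + x*(c2 + ... + x*c7)).
   Every step depends on the previous one (a serial dependency chain). */
double horner7(const double c[8], double x)
{
    double acc = c[7];
    for (int i = 6; i >= 0; i--)
        acc = acc * x + c[i];
    return acc;
}

/* Estrin's scheme for the same polynomial.
   The four pair terms are independent of each other, and so are the
   two quad terms, so a CPU can overlap their evaluation. */
double estrin7(const double c[8], double x)
{
    double x2 = x * x;
    double x4 = x2 * x2;

    double p01 = c[0] + c[1] * x;
    double p23 = c[2] + c[3] * x;
    double p45 = c[4] + c[5] * x;
    double p67 = c[6] + c[7] * x;

    double q0 = p01 + p23 * x2;   /* c0 + c1 x + c2 x^2 + c3 x^3 */
    double q1 = p45 + p67 * x2;   /* c4 + c5 x + c6 x^2 + c7 x^3 */

    return q0 + q1 * x4;
}
```

Both functions evaluate the same polynomial, but the rounding can differ slightly in general because the operations happen in a different order.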

The code is pure Julia (or Julia + LLVM call; either way it does not need any external dependencies aside from Julia itself). It does need performance work, but I have many higher priorities at the moment.


Hah! Sorry, didn't cross my mind. And thanks for the details.


glibc has SIMD-vectorized math functions? I don't quite understand what is being compared here. Any specifics you can share?


Yes, here is the source for 8 double precision logs (with avx512) in glibc, for example: https://github.molgen.mpg.de/git-mirror/glibc/blob/20003c498... The "sysdeps/x86_64/fpu/multiarch" directory contains many of these functions.

You'll need at least GCC 8 to use them automatically, as well as the -ffast-math flag: https://godbolt.org/z/PL26up


Correction: gcc 6 and 7 create them too. They just have 7 unrolled calls to log finite before (and then again after) a loop surrounding the vectorized call.



