Hacker News

I've dredged through Julia, Numba, JAX, and Futhark looking for a way to get good CPU performance in the absence of a GPU, and I'm not really happy with any of them. Especially given how many of them want you to lug LLVM along.

A recent simulation code, compiled with gcc's OpenMP SIMD support and run on a 13900K, matched the performance of jax.jit on an RTX 4090. This case worked because the overall computation can be structured into pieces that fit in L1/L2 cache, but I had to spend a ton of time writing the C code, whereas jax.jit was effortless.

So I'd still like to see something like this but which really works for CPU as well.




Agreed, JAX is specialized for GPU computation -- I'd really like similar capabilities with more permissive constructs, maybe even co-effect tagging of pieces of code (which part goes on GPU, which part goes on CPU), etc.

I've thought about extending JAX with custom primitives and a custom lowering process to support constructs that work on CPU but not on GPU. But if I did that and wanted a nice programmable substrate, I'd need to define my own version of abstract tracing, because permissive CPU constructs would necessarily imply array-type permissiveness like dynamic shapes.
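To make the abstract-tracing point concrete, here is a minimal illustration (assuming an installed `jax`) of what tracing specializes on: `jax.make_jaxpr` runs the function on abstract values that carry only shape and dtype, which is exactly why dynamic shapes don't fit the existing tracing machinery.

```python
import jax
import jax.numpy as jnp

def f(x):
    # Traced with an abstract value: only x's shape/dtype are known,
    # and the resulting jaxpr is specialized to that concrete shape.
    return jnp.sum(x * 2.0)

# The printed jaxpr records f32[3] -- a fixed shape baked in at trace time.
print(jax.make_jaxpr(f)(jnp.ones(3)))
```

A CPU-oriented substrate with dynamic shapes would need abstract values that don't pin the shape down like this, hence the need for a custom tracing layer.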

You start heading towards something that looks like Julia -- the problem (for my work) with Julia is that it doesn't support composable transformations like JAX does.

Julia + JAX might be offered as a solution -- but it's quite unsatisfying to me.



