
If most of your code is numpy stuff, will you actually see a speedup from PyPy? (I mean hypothetically, once NumPyPy is properly optimized.)



Yes. Even ignoring that most people need a loop now and then, pure NumPy-using code has a lot of potential.

Consider "a + b + c + d". For large arrays, the problem is that it creates many temporary results of same size as original arrays that must be streamed over the memory bus. And since FLOPs are free and your computation is limited by memory bandwidth, you pay a large penalty for using NumPy (that gets worse as expression gets more complex).

Or "a + a.T"... here you can get lots of speedup using basic tiling techniques, to fully use cache lines rather than read a cache line only to get one number and discard it.

And so on. For numeric computation there are large gains (on the order of 2-10x) to be had from using the memory bus properly, which NumPy can't take advantage of. That's why projects like numexpr and Theano focus mainly on speeding up non-loop NumPy-using code.
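
For example, numexpr's evaluate() takes the expression as a string and computes it in cache-sized blocks, avoiding the full-size temporaries (the exact speedup depends on array size and your memory bandwidth):

    import numpy as np
    import numexpr as ne

    n = 10**7
    a, b, c, d = (np.random.rand(n) for _ in range(4))

    plain = a + b + c + d                 # three full-size temporaries
    fused = ne.evaluate("a + b + c + d")  # evaluated blockwise

    assert np.allclose(plain, fused)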


In my experience, you'll have some parts of your code where you have to write a for loop or something else that cannot be (cleanly) expressed using numpy, and that's where performance starts to suffer. The current solution is to use something like cython, but if pypy simplifies that, I think that's great.
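
A hypothetical example of the kind of loop I mean -- a recurrence where each element depends on the previous one, so there's no clean single numpy call for it:

    import numpy as np

    def ewma(x, alpha):
        # Exponentially weighted moving average: y[i] depends on y[i-1],
        # so the loop is inherently sequential. In CPython this is slow;
        # cython or pypy can bring it close to C speed.
        y = np.empty_like(x)
        y[0] = x[0]
        for i in range(1, len(x)):
            y[i] = alpha * x[i] + (1.0 - alpha) * y[i - 1]
        return y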


It depends a lot. If it's just large linear algebra operations, it won't matter, since all the work will be done by BLAS.

If it's lots of small operations, I think pypy can inline them and you might see a significant speedup.
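
Roughly the contrast I mean (the sizes are arbitrary):

    import numpy as np

    # One big matrix multiply: almost all the time is spent inside BLAS,
    # so the interpreter (CPython or pypy) barely matters.
    A, B = np.random.rand(2000, 2000), np.random.rand(2000, 2000)
    C = np.dot(A, B)

    # Many tiny operations: per-call dispatch overhead dominates,
    # which is where a JIT like pypy could plausibly help.
    vs = [np.random.rand(3) for _ in range(100000)]
    total = 0.0
    for v in vs:
        total += np.dot(v, v)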



