One of the biggest problems with switching to PyPy is knowing which libraries are supported and what the best options are. For example, how should we connect to MySQL? Or use memcache? Or use Redis?
It would also be great to have some stories from people who run PyPy in production about what their experience has been (especially on large, high-traffic sites/apps).
I'm mostly missing Qt GUI support myself. Hopefully cppyy (http://doc.pypy.org/en/latest/cppyy.html) matures, as then it will be possible to use C++ libraries automatically with little or no binding work.
It'd be great to have a site listing this, but #pypy on freenode is a good place to ask such questions — there's normally something readily available (although in some cases only suboptimal solutions).
Unfortunately it's not a silver bullet for performance.
I have a relatively short Python program that does statistical analysis. It also contains a lot of string matching and is deeply recursive. I just added a benchmark mode, which gives the following (consistent) result:
Python 2.7.3: set solved in 16 seconds
PyPy 2.0b2: set solved in 33 seconds
(any initial IO is not part of the time, so the problem is entirely compute-bound and it pegs the CPU at 100%)
If there's something obvious I'm overlooking that would make PyPy faster I'd love to hear about it.
While I'm not a PyPy developer, I know they treat any Python* code running slower on PyPy than on CPython as a bug. They love to help with it if you stop by #pypy on freenode IRC.
It's best if you can give them a self-contained example they can download and run themselves to track down what is going on. If not (because it is closed source or such), they may or may not be able to do much, but they are still more than happy to try to give some advice at least.
Janzert
* Pure Python, that is. Certain methods of interfacing with extension modules will probably always be rather slow, and the solution is generally either to get rid of the extension module or to rewrite the interface to it using cffi.
Consider factoring your code into a benchmark and submitting it to the PyPy team. They have an awesome attitude toward this. From http://pypy.org/performance.html:
We generally consider things that are slower on PyPy than CPython to be bugs of PyPy. If you find some issue that is not documented here, please report it to our [bug tracker](https://bugs.pypy.org/) for investigation.
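For anyone wondering what a self-contained benchmark might look like, here is a minimal sketch. The names load_input, solve, and benchmark are made up for illustration, and the recursive Fibonacci workload is only a stand-in for the real program. It keeps setup outside the timed region and repeats the run, since the first pass on PyPy typically includes JIT warm-up.

```python
import sys
from timeit import default_timer as timer


def load_input():
    # Hypothetical stand-in for the I/O phase; deliberately excluded from timing.
    return list(range(20))


def solve(data):
    # Hypothetical stand-in workload: deeply recursive and compute-bound.
    def fib(n):
        return n if n < 2 else fib(n - 1) + fib(n - 2)

    return [fib(n) for n in data]


def benchmark(repeats=3):
    # Load once, then time only the compute phase on each repeat.
    data = load_input()
    timings = []
    for _ in range(repeats):
        start = timer()
        solve(data)
        timings.append(timer() - start)
    return timings


if __name__ == "__main__":
    print(sys.version)
    for i, seconds in enumerate(benchmark()):
        print("run %d: %.3f s" % (i + 1, seconds))
```

Running the same file under both interpreters and posting the per-run numbers, not just a single total, makes it easier to tell warm-up cost from steady-state slowness.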
The PyPy memory usage also keeps increasing linearly during the benchmark, reaching 220MB at the end. The CPython version is using exactly 23MB for the entire run with zero fluctuation. I'm guessing the recursive nature of the program is killing PyPy.
Are you by chance using s += x to create strings? If so, stop. Anyway, my crystal ball is in service right now, you should really give us a chance to look at it.
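To spell out the s += x point, here is a minimal sketch comparing the two ways of building a string. The function names are made up for illustration. CPython can often extend a string in place when nothing else references it, so the += loop tends to look cheap there; PyPy does not rely on that trick, and the loop can end up copying the accumulated string on every step.

```python
from timeit import default_timer as timer


def build_with_concat(parts):
    # Repeated += may copy the whole accumulated string each time,
    # which is quadratic in the worst case.
    s = ""
    for p in parts:
        s += p
    return s


def build_with_join(parts):
    # Join all pieces in a single pass; linear in the total length.
    return "".join(parts)


def time_call(func, parts):
    # Return the wall-clock seconds taken by one call.
    start = timer()
    func(parts)
    return timer() - start


if __name__ == "__main__":
    parts = [str(i) for i in range(100000)]
    print("concat: %.4f s" % time_call(build_with_concat, parts))
    print("join:   %.4f s" % time_call(build_with_join, parts))
```

Both functions return the same string, so swapping one for the other is a safe change to try before digging deeper.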
Don't want to kick someone when they're obviously down. But are you losing $$ because of PyPy's lack of support for Python 3? Or is this a case of wanting to be "one of the cool kids" using the latest stuff?
I only ask because I have yet to get an answer as to how the lack of python 3 support for <my pet framework/tech> is hurting anyone at this moment.
FTR, I'm looking forward to the GIL-less implementation, to see how it performs, more than to language syntax updates. But I'm happily getting work done with Python 2.7.3 in the meantime.
What's the nature of the performance regressions? Is everything just universally 5% slower, or did the benchmark scores drop by 5% because some particular operations/workloads are slower in this release? If the latter, which ones?
Basically the way we did the JIT/greenlet integration involved restructuring how frames are represented in the JIT, which was very slightly slower.
This was needed to support greenlets fully, so on its own it might be worth it. However, it also gives us the ability to do some more (very creative) optimizations, which should let us buy that performance back, and more.
Cool, thanks. I'll probably run one of my projects' benchmark set later today to see how this stacks up... I suspect that if some of the numpy and cffi stuff is enough faster, I may still be better off on this release than the last. Either way, gevent support seems worth it to me. Kudos on the release.
We include a builtin _continuation module, which is the foundation for greenlets and stackless. PyPy compiles just fine if you don't include that though.