I really want pypy to be the future of Python along with python3. I haven't used it in a while but the performance improvements I could get by switching from python somecode.py to pypy somecode.py were magical.
So what is the best way I can help support pypy3? Are there any easy issues I can help contribute to?
The page mentions pypy3's current status as of August 2016: "We are soon releasing a beta supporting Python 3.3. For the next full year, though, see PyPy gets funding from Mozilla for Python 3.5 support. Individual donations through here are still welcome, and go towards the same goal, which is upgrading PyPy to support Python 3.x (which really means 3.3/3.5 by now). Thanks to all our past contributors! Your money has been put to good use so far."
I don't want PyPy to be the future of Python. Python 3.5 (asyncio, os.scandir, functools part of stdlib, to name a few) was a big step in the right direction for those of us still locked in Python 2, so we can say to our bosses "hey, look at all the good things we're missing". Python 3.6 with compact dicts working ~4x faster than old dicts[0, 1] is also huge for that -- plus, since dicts are used all over the place in Python, this may have a global impact on all Python performance (don't quote me on that, but it's a possible side-effect).
My biggest issue with PyPy is library/modules incompatibility. I work in forensics, and I honestly tried using most of the python based tools with PyPy, and many just don't work because of lib/module dependencies. The few that do, may or may not run faster, depending on your use case and even the case (some have slower initialization but process faster, so with a big enough target file you get things done quicker). And while I am sort of a developer, I also have to solve cases, so I don't have enough time to sit and port all the libraries of 20 different tools to work in PyPy.
And if performance is still a need (it always is in forensics, with more cases with larger drives/memory coming all the time), there's Cython, and with it we've gotten speedups ranging from the few percent (about 14% from "removing the interpreter") to 20-30x faster -- which has the side-effect of revealing yet another bottleneck in the application, but hey, now you know.
So, seeing the things that are coming to Python 3, I sincerely hope PyPy exists and goes on living, but I would much rather CPython be the future of Python -- because it is good, compatible (mostly), it has the right tools, and things work.
> My biggest issue with PyPy is library/modules incompatibility.
Addressing this has been a big focus for PyPy lately. You should consider re-evaluating PyPy. numpy (arguably one of the heaviest users of C extensions) works on PyPy with very nearly 100% of its tests passing.
We tested last week. The distorm3 and yara bindings for Python are the big things that just didn't work (yet again) with our test case (the volatility memory forensics framework). We got them installed with pip, but they just didn't work. Volatility still runs without them, it just complains a bit.
Also, startup time of volatility exploded for no apparent reason (4 seconds CPython vs 12 seconds PyPy, latest 2.7 branch). Real analysis time was indeed faster (~30 seconds CPython vs ~15 seconds PyPy), but half the plugins won't work.
Plus, in the end (startup+analysis), both CPython and PyPy got close results (though arguably PyPy was about 15% faster).
Might be worth reporting since you have a reproducible failure and a use case for which the performance is not satisfactory. You can tune some of the JIT behavior with command line options -- that might make it kick in sooner and you might see a net benefit.
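A rough sketch of what that tuning can look like, in case it helps: PyPy takes a `--jit` command line option (`pypy --jit help` lists the parameters), and the same parameters can be set from inside a program through the `pypyjit` module. The snippet below asks for a lower `threshold` so loops get compiled sooner. The value 200 and the `hot_loop` workload are made up for illustration, the string form of `set_param` is as I remember it from the PyPy docs, and on CPython the tuning step is simply skipped.

```python
import platform
from timeit import default_timer


def tune_jit(params="threshold=200"):
    """Apply JIT parameters when running on PyPy; return False elsewhere.

    The parameter string is an illustrative assumption, not a recommendation.
    """
    try:
        import pypyjit  # this module only exists on PyPy
    except ImportError:
        return False
    pypyjit.set_param(params)
    return True


def hot_loop(n):
    """Stand-in workload: a small pure-Python arithmetic loop."""
    total = 0
    for i in range(n):
        total += (i * i) % 7
    return total


if __name__ == "__main__":
    tuned = tune_jit()
    start = default_timer()
    hot_loop(2000000)
    elapsed = default_timer() - start
    print("%s (jit tuned: %s): %.3f s"
          % (platform.python_implementation(), tuned, elapsed))
```

Running it once with and once without the `tune_jit()` call under PyPy would show whether an earlier JIT kick-in helps a short-lived run; for a real tool like volatility the interesting number is startup plus analysis, not the loop alone.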
When I was still locked to Python 2 I really wanted to use PyPy but couldn't due to CPython requirements. Now we're using Python 3.5 async stuff everywhere.
I wonder if I'll ever get to use PyPy in anger or will something in Python 3.6+ distract me (those f-strings look pretty fancy, dammit).
I am new to the python ecosystem. I use python for solving problems on hackerrank and noticed that switching from python3 to pypy3 can mean the difference between timing out on some problems vs. getting it to work.
Can someone eli5 how pypy achieves such a difference and why that improvement cannot be contributed back to python3?
PyPy uses a JIT (just-in-time) compiler, as opposed to the static interpreter of CPython (the default/canonical implementation of Python from python.org). What this means is that the PyPy interpreter uses runtime information about your program to compile your Python code into very efficient machine code on the fly. Because this compilation is done at runtime, PyPy has lots of data about things like variable types, how often each function gets called, etc, and it can use that data to generate really fast code in many cases.
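To make that concrete, here is a minimal sketch of the kind of code where a JIT tends to pay off: a tight pure-Python loop whose variables stay plain integers the whole time. The `count_primes` function and the limit of 100000 are just illustrative choices, not anything from PyPy itself. Run the same file with `python3` and with `pypy3` and compare the printed times on your own machine.

```python
import platform
import timeit


def count_primes(limit):
    """Count primes below `limit` by naive trial division.

    Deliberately unoptimized: the point is a hot loop of simple integer
    operations, which is what a tracing JIT can turn into machine code.
    """
    count = 0
    for n in range(2, limit):
        d = 2
        is_prime = True
        while d * d <= n:
            if n % d == 0:
                is_prime = False
                break
            d += 1
        if is_prime:
            count += 1
    return count


if __name__ == "__main__":
    elapsed = timeit.timeit(lambda: count_primes(100000), number=1)
    print("%s: %.2f s" % (platform.python_implementation(), elapsed))
```

CPython interprets every iteration of the inner `while` loop bytecode by bytecode; an implementation with a JIT can notice the loop is hot and that `n` and `d` are always ints, and compile it once.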
PyPy is a completely different implementation of Python, so it's impossible to simply contribute improvements from it to CPython. To answer the question I think you're asking, though, the main reason we can't just have PyPy replace CPython as the default implementation isn't strictly technical: The Python community (Guido in particular) has decided to keep the reference implementation (CPython) as simple and consistent as possible, so that it's easy to contribute to and doesn't have any surprising behavior. PyPy, by comparison, is an incredibly complex piece of software. Even if you ignore the fact that it would make contributions harder, its performance is, in many scenarios, actually significantly worse than CPython's, because its optimizations rely on specific patterns within your code. I think it's mostly a good thing that PyPy is its own project and not the default implementation of Python, because it allows Python (the language) to grow much faster than it would otherwise.
I hope that answers your question. I apologize to anyone who knows more about the details than I do if I over-simplified things. :)
The complexity of the JIT isn't in pypy's implementation of python. This is probably the most important part of the pypy project: it's a python implementation written in a simple subset of python plus some standard hints, and an independent piece that generates a JIT for things written in that subset.
PyPy is a different implementation of a python interpreter; it includes a JIT that rewrites hot code into assembler. See www.pypy.org for more info. The JIT cannot be tacked on to CPython (what you call python3) since it is part of the interpreter.
Personally while I appreciate what PyPy are trying to do, for a lot of real world use cases you can get really impressive performance out of python using the right libraries (see: the entire Python scientific computing stack).
You get impressive performance out of python the same way you get impressive performance out of any dynamic language (with few exceptions): make sure your hot paths are executed by code written in C. The nice thing about PyPy is that it makes your pure Python code a lot faster. For instance, I wrote some log-parsing code in pure Python that got a 2.5x speedup when run under PyPy.
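For a feel of what "pure Python log parsing" means here, a small sketch (not the code I actually ran): the log format, `make_lines` and `count_statuses` are invented for the example, and the only thing it shows is a per-line loop that stays in Python rather than dropping into a C library.

```python
import platform
import timeit
from collections import Counter


def make_lines(n):
    """Generate synthetic log lines in a made-up 'METHOD path status' format."""
    statuses = (200, 200, 200, 404, 500)
    return ["GET /item/%d %d" % (i, statuses[i % 5]) for i in range(n)]


def count_statuses(lines):
    """Tally status codes with a plain per-line Python loop."""
    counts = Counter()
    for line in lines:
        method, path, status = line.split(" ")
        counts[int(status)] += 1
    return counts


if __name__ == "__main__":
    lines = make_lines(200000)
    elapsed = timeit.timeit(lambda: count_statuses(lines), number=5)
    print("%s: %.2f s for 5 passes"
          % (platform.python_implementation(), elapsed))
```

Whether you see anything like 2.5x depends on the workload; string-heavy code with little arithmetic can gain much less than a numeric loop.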
Add Julia to the list, since it's directly competing with Python on scientific computing, and the other two aren't really that much (maybe Go a little).
If you are talking about scientific computing, sure, compare PyPy, R, and Julia (and probably add cython, numba and company to make a meaningful comparison).
But the OP seems to be interested in web dev where Julia does not deserve the same attention as PyPy, node, and Go.
Julia might be intended for a scientific audience, but there doesn't appear to be any reason why it couldn't address server-side needs. It just needs the right library (like Python did).
By the time Julia hits 1.0, I wouldn't be surprised if someone had developed a decent networking library.
There won't be breaking changes from 3.3->3.5, so the code changes are the same either way. I guess skipping a release saves the effort of making the release/prioritizing things that appeared in 3.4.
You can implement your own languages on PyPy, so make it 4 :)
Breaking changes I would include: remove all the deprecated bits of subprocess or replace it entirely, remove the zip, ftp and other modules in favour of virtual filesystems with a consistent API, and finally try to make all the APIs symmetrical (e.g. open always paired with close, push with pop, etc).