Hacker News
If PyPy is 6.3 times faster than CPython, why not just use it? (stackoverflow.com)
140 points by neokya on Sept 22, 2013 | hide | past | favorite | 40 comments



[actual real code story example]

I wrote two approaches to the same problem.

The first approach uses simple python data structures and greedy evaluation. It runs under CPython in 0.15 seconds. Running under pypy takes 1.2 seconds. pypy is 8x slower.

The second approach (using the same data) builds a big graph and visits nodes v^3 times. Running under CPython takes 4.5 seconds. Running under pypy takes 1.6 seconds. pypy is almost 3x faster.

So... that's why. "It depends." But—it's great we have two implementations of one language where one jits repetitive operations and the other evaluates straight-through code faster.
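The parent's code isn't shown, but as a hypothetical illustration, the kind of type-stable hot loop a tracing JIT rewards looks something like this (a toy sketch, not the parent's program):

```python
# Hypothetical illustration (not the parent's actual code): a hot,
# type-stable inner loop is the pattern a tracing JIT like PyPy's
# rewards, while short scripts dominated by startup and varied data
# structures often don't recoup the warmup cost.
def count_paths(n):
    total = 0
    for i in range(n):
        for j in range(n):
            total += (i * j) % 7  # identical integer ops every pass
    return total

print(count_paths(300))
```

Under CPython every iteration re-dispatches the same bytecodes; a tracing JIT compiles the repeated trace once and reuses it, which is why the graph-walking approach above wins on PyPy while the short greedy one doesn't.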


I have to echo this sentiment here. Every time I see a post about PyPy being fast, I think, "Hmm, perhaps I should try out this package I'm working on and see if it performs better." After getting a PyPy environment working---sometimes by installing forks that are PyPy-compatible---I almost always end up with real-world uses that are noticeably slower with PyPy as opposed to regular ol' CPython.

I may not be coding to PyPy's strengths, but I've gone through this process on several different packages that I've released and I tend to see similar results each time. I want to try and use PyPy to make my code faster, but it just doesn't seem to do it with real code I'm using.


Please file bugs. We can't fix issues we don't know exist.


> Because PyPy is a JIT compiler, its main advantages come from long run times and simple types (such as numbers).

It is not inherent to JIT compilers that they need long running times or simple types to show benefit. LuaJIT demonstrates this. Consider this simple program that runs in under a second and operates only on strings:

  vals = {"a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n", "o"}

  for _, v in ipairs(vals) do
    for _, w in ipairs(vals) do
      for _, x in ipairs(vals) do
        for _, y in ipairs(vals) do
          for _, z in ipairs(vals) do
            if v .. w .. x .. y .. z == "abcde" then
              print(".")
            end
          end
        end
      end
    end
  end

  $ lua -v
  Lua 5.2.1  Copyright (C) 1994-2012 Lua.org, PUC-Rio
  $ time lua ../test.lua 
  .
  
  real	0m0.606s
  user	0m0.599s
  sys	0m0.004s
  $ luajit -v
  LuaJIT 2.0.2 -- Copyright (C) 2005-2013 Mike Pall. http://luajit.org/
  $ time ./luajit ../test.lua 
  .
  
  real	0m0.239s
  user	0m0.231s
  sys	0m0.003s
LuaJIT is over twice the speed of the (already fast) Lua interpreter here for a program that runs in under a second.

People shouldn't take the heavyweight architectures of the JVM, PyPy, etc. as evidence that JITs are inherently heavy. It's just not true. JITs can be lightweight and fast even for short-running programs.

EDIT: it occurred to me that this might not be a great example, because LuaJIT isn't actually generating assembly here and is probably winning just because its platform-specific interpreter is faster. However, it is still instrumenting the code's execution and paying the costs of attempting to find traces to compile. So even with these JIT-compiler overheads it still beats the plain interpreter, which is only interpreting.


PyPy also manages to speed this program up (or at least, what I understand this program to be):

    Alexanders-MacBook-Pro:tmp alex_gaynor$ time python t.py
    .

    real    0m0.202s
    user    0m0.194s
    sys 0m0.007s
    Alexanders-MacBook-Pro:tmp alex_gaynor$ time python t.py
    .

    real    0m0.192s
    user    0m0.184s
    sys 0m0.008s
    Alexanders-MacBook-Pro:tmp alex_gaynor$ time python t.py
    .

    real    0m0.198s
    user    0m0.190s
    sys 0m0.007s
    Alexanders-MacBook-Pro:tmp alex_gaynor$ time pypy t.py
    .

    real    0m0.083s
    user    0m0.068s
    sys 0m0.013s
    Alexanders-MacBook-Pro:tmp alex_gaynor$ time pypy t.py
    .

    real    0m0.083s
    user    0m0.068s
    sys 0m0.013s
    Alexanders-MacBook-Pro:tmp alex_gaynor$ time pypy t.py
    .

    real    0m0.082s
    user    0m0.067s
    sys 0m0.013s
    Alexanders-MacBook-Pro:tmp alex_gaynor$ cat t.py
    def main():
        vals = {"a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n", "o"}

        for v in vals:
            for w in vals:
                for x in vals:
                    for y in vals:
                        for z in vals:
                            if v + w + x + y + z == "abcde":
                                print(".")

    main()


That's not the same program.

You're using a set literal ({}) instead of a list ([]) or a tuple (()).
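The difference is easy to see interactively: a `{…}` literal with elements builds a set, which drops ordering and collapses duplicates, so looping over it is a different program from looping over a list or tuple:

```python
# A {…} literal with elements builds a set, not a list: iteration
# order is arbitrary and duplicates collapse.
s = {"a", "b", "a"}
l = ["a", "b", "a"]
print(type(s).__name__, len(s))  # set 2
print(type(l).__name__, len(l))  # list 3
```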


Just tried it. Using a list instead of a set literal sped up the program slightly (~350 ms -> ~330 ms for python and ~155 ms -> ~140 ms for pypy on my computer).


I recommend using a set of toy programs like those from the Benchmarks Game. Some nested loops with a little bit of string concatenation won't really tell you anything useful.


wow, so much faster than lua


You can't compare measurements taken on different computers; for all you know the OP has a potato and I have a speed-demon toaster.


Don't knock potatoes. I do all of my hardcore data analysis on my Very Large Potato Array.


They probably don't have the same hardware.

On the same hardware I get:

lua 5.1.5: ~480 ms

python 2.7.5: ~330 ms

luajit 2.0.2: ~190 ms

pypy 2.0.2: ~140ms


Same programs, same computer:

  ~$ time python t.py
  .
  
  real	0m0.350s
  user	0m0.304s
  sys	0m0.024s
  ~$ time lua t.lua
  .

  real	0m0.255s
  user	0m0.248s
  sys	0m0.004s
  ~$ time luajit t.lua
  .

  real	0m0.157s
  user	0m0.144s
  sys	0m0.012s


Which versions of lua and python? Because when I ran the programs python was faster than lua.


Python 2.7.3, Lua 5.1.5, LuaJIT 2.0.2 on Linux 3.2.0, Ubuntu 12.04.3 LTS on a 2.53GHz i3 (64bit)


I unfortunately had to edit the program after I took the measurements because HN cut off my program horizontally. So I removed one or two letters from the list.


Another example is all modern JS engines — they are JITing compilers very much tuned for the short-runtime use-case (often to the detriment of long-runtime use-case).


I actually experimented a while ago by running a long-running Twisted-based daemon on top of PyPy to see if I could squeeze more speed out. PyPy did indeed vastly increase the speed versus the plain Python version, but once I discovered that Twisted was using select/poll by default and switched it to epoll, my performance issues with the original CPython version were gone (and PyPy couldn't use Twisted's epoll at the time).

Another major issue was that running the daemon under PyPy used about 5 times the memory that the CPython version did. This was a really old version of PyPy, though, so they have probably fixed some of this memory greediness.


Which versions of Twisted and which OS were you using? I'm asking because all the latest Twisted releases use epoll by default.


It's worth noting that PyPy also supports epoll (and kqueue), and has for a few versions.


I remember looking at that, but Twisted's epoll reactor was a C extension at the time. It looks like Twisted 12.1.0 switched to using the epoll provided by the Python base library, but that was released about a year after I was originally installing this daemon (and I was installing everything from apt, so add another year to the age of the packages I got).


This was with Twisted 10.1.0 and Python 2.6.6. I remember being in extreme disbelief when I found that it was using select/poll instead of epoll (who does that, especially when they already have epoll support?). I ended up writing this:

  for reactor in ["epoll", "kq"]:
      try:
          rn = reactor + "reactor"
          getattr(__import__("twisted.internet", fromlist=[rn]), rn).install()
          print "Auto-selecting reactor: " + reactor
          break
      except ImportError:
          pass


A couple of years back when I was deploying twisted using Debian packages, it was not using epoll by default.


Yep, this was on a Debian 6.0.2 install, with packages from apt.


Because CPython came first.

Because Python isn't about performance.

Because it's not really 6.3 times faster for most (any?) use cases.

Because VMs are misunderstood as superfluous abstractions where a good interpreter should be instead.

Because VMs are understood as superfluous abstractions where a good OS should be instead.

...

And most of all, because better hardware costs less than the extra man-hours involved in the transition.


Re: any? use cases. While dramatic speedup is not too common, MyHDL, a hardware description and verification language written in Python, is known to run 10 times faster on PyPy.

http://www.myhdl.org/doku.php/performance

I also remark that MyHDL's simulation is competitive (on PyPy) with the open source Icarus Verilog, in case you wonder why anyone would write HDL in Python.


PyPy has LOTS of problems with 3rd-party libraries. If you want to deploy it in production you'll have to check that each one of them does exactly what you need, and oftentimes you'll be surprised at how broken things are.

We are using PyPy for some of our services (where it is doing about x3 faster than CPython), while for some others (Django UI - at least the way we are using it) we found that PyPy is actually slower, so we are sticking with CPython.

Unfortunately the PyPy team has not made it a priority to test PyPy with Django. It is one thing to have a cookie-cutter test suite that measures simple use cases, and an entirely different matter to test how well it can run a whole stack of apps.


Why don't you help out then, instead of complaining?


I do not understand this kind of comment. Even if Open Source is a great thing, some people just want to use the tool. They are not interested or do not have the time to participate in the project.


Then you have no right to complain.

Much open source exists because people, in the course of scratching their own itch, happen to release something that might be of use to others.

When others contribute it helps make the solution more generic/robust as it is guided towards meeting multiple requirement sets.

Do you have any idea how shit it was writing software in the 1990s without the breadth of tools we have today that are open source and permissively licensed?

PyPy is a smallish project with very limited funding. If you try it out and it doesn't help you complete your goals, find another way. That might be to make it better somehow; if you don't have the resources for that, find another way to meet your business goals.


Please stop replying to people like this. It's extremely discouraging to people, and not helpful to the PyPy project (of which I'm one of the developers).

People have a right to have a problem with our software without trying to fix it themselves.


I did help out by reporting the issues in detail.

I am very thankful to PyPy developers for all the help they provided though as I said it did not address all of our concerns, which is why our PyPy deployment is more limited than what we wish it could be.

We are not Google or Facebook. We cannot invest resources in hacking internals of every low-level component of our stack. We are happy to contribute feedback and test suggested solutions, but we have to stay focused on our own product and business.


I would like to point out that while (C)Python's reliance on C seems to be the problem here, it is not inherent to general use of C either. Again, Lua vs. LuaJIT proves that a JIT implementation can be a drop-in replacement for a non-JIT one.

On Gentoo, I tend to force applications to link against LuaJIT and it works just fine.

This message was written in LuaKit under AwesomeWM, and my alt+tab shows VLC, Wireshark and MySQL Workbench running, all on LuaJIT with some level of success (most are flawless). None of those applications (AFAIK) officially supports LuaJIT.


LuaJIT isn't a perfect drop-in, however, as it has various limitations that base Lua doesn't (in addition to the obvious ones if you're using Lua 5.2 features, which LuaJIT doesn't support).

In my case it's because LuaJIT has address-space limitations that standard Lua does not, due to its use of NaN-encoding for pointers. There are some inputs on which LuaJIT simply runs out of memory (or rather, address space) that work fine under standard Lua.
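A rough sketch of the NaN-boxing idea (illustrative constants only, not LuaJIT's actual bit layout): a pointer is stuffed into the mantissa payload of a quiet NaN, so only addresses that fit below the tag bits can be represented, which is where the address-space limit comes from:

```python
import math
import struct

# Illustrative NaN-boxing sketch (NOT LuaJIT's actual bit layout):
# store a pointer in the payload bits of a quiet NaN. Only addresses
# that fit below the quiet-NaN tag bits can be boxed.
QNAN = 0x7ff8000000000000  # sign=0, exponent all ones, quiet bit set

def box_ptr(ptr):
    # assumes ptr fits below the quiet-NaN tag bits
    return struct.unpack("<d", struct.pack("<Q", QNAN | ptr))[0]

def unbox_ptr(d):
    bits = struct.unpack("<Q", struct.pack("<d", d))[0]
    return bits & ~QNAN

addr = 0x1234ABCD  # hypothetical heap address
boxed = box_ptr(addr)
print(math.isnan(boxed))      # True: it looks like a NaN to the FPU
print(hex(unbox_ptr(boxed)))  # 0x1234abcd: the pointer round-trips
```

An allocation whose address doesn't fit in the payload simply can't be encoded, even though the OS would happily hand it out.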

[For my app the speedup from LuaJIT isn't so great anyway, so it's just a minor annoyance.]


Lua 5.2 features, which LuaJIT doesn't support

Just being pedantic: LuaJIT supports some 5.2 features. Search for "5.2" on this page: http://luajit.org/extensions.html


It's a great question. I'd say for myself I've been hesitant to use CPython or PyPy simply because their documentation seems focused on the extremely technical, rather than a person just trying it out for the first time.

I know Python, and I know C. But I'm worried about ending up down rabbit holes in PyPy and its competitors. I've not been able to find a really solid tutorial, or to parse the docs very well.

Perhaps it's just me, though. That's always possible. I just see a large barrier to adoption.


When you say CPython, do you mean Cython? I've only recently learned what "Python" really is myself, and it's easy to miss the difference between these two:

http://en.wikipedia.org/wiki/CPython

http://en.wikipedia.org/wiki/Cython


I meant Cython, yes.


I'm successfully using PyPy in production, for data processing. The most important dependencies: the redis driver and Beautiful Soup.

             PyPy   cPython
  jobs/sec    ~60      ~8  
  mem usage   1.5G     2G
When using lxml on cPython, jobs/sec increased to 10 (at the time lxml wasn't supported by PyPy; now it is). I really encourage giving PyPy a try.


Will CPython always remain the reference implementation?



