Not that I'm a big fan of Jython, but including startup time in a benchmark is only useful for very short-running command-line tools. It says nothing about the speed of the JIT or of the code being tested.
I thought about that, but decided against subtracting out startup time. This is a real-world benchmark of how long it takes to run some actual code that I care about. Startup time counts, but the cumulative execution time will be dominated by time taken to run the slower programs, not startup time in the trivial ones, so I don't think I'm counting it too much.
But I think next time I do this I'll increase the max runtime from 1 minute to 3. That will keep more of the slower programs in the benchmark, and make startup time count that much less. Without actually removing it, because that seems too artificial and contrived to me.
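Roughly the kind of harness I have in mind, as a sketch (the interpreter commands, solver file names, and cutoff are placeholders, not my actual setup):

    import os
    import signal
    import subprocess
    import time

    # Sketch: run every solver under every interpreter, measure wall-clock
    # time with startup included, and drop a solver entirely if it blows
    # the cutoff under any interpreter.
    INTERPRETERS = ["python", "jython", "ipy", "pypy"]  # assumed command names
    SOLVERS = ["euler_001.py", "euler_002.py"]          # assumed file names
    CUTOFF = 180.0  # seconds; the proposed 3-minute limit

    def run_one(interp, solver):
        start = time.time()
        proc = subprocess.Popen([interp, solver])
        while proc.poll() is None:
            if time.time() - start > CUTOFF:
                os.kill(proc.pid, signal.SIGKILL)
                return None  # timed out
            time.sleep(0.1)
        return time.time() - start

    totals = dict((i, 0.0) for i in INTERPRETERS)
    for solver in SOLVERS:
        times = dict((i, run_one(i, solver)) for i in INTERPRETERS)
        if None in times.values():
            continue  # exclude solvers that time out anywhere
        for interp, elapsed in times.items():
            totals[interp] += elapsed
    print(totals)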
Sure, and I did mention command line utilities. The issue I see with the benchmark is that it compares apples and oranges. For some programs it tests JIT performance and for others it tests startup time. If I want to test startup time, I just use a hello world program. To test JIT/interpreter performance, I try to exclude startup time.
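For reference, isolating startup time really is that simple: time a do-nothing invocation and subtract it. A sketch (the command name is an assumption):

    import subprocess
    import time

    # Sketch: estimate bare interpreter startup by timing a do-nothing
    # script; subtract this from benchmark runs if you want JIT/interpreter
    # speed in isolation.
    def startup_time(interp, runs=10):
        total = 0.0
        for _ in range(runs):
            start = time.time()
            subprocess.call([interp, "-c", "pass"])
            total += time.time() - start
        return total / runs

    print(startup_time("jython"))  # assumed to be on PATH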
Hello world programs only ever exercise the bare runtime, though; real utilities have to load libraries after the runtime loads, which incurs further delays (and those may depend on JIT/interpreter speed if the libraries aren't pre-compiled).
The same can be said about benchmarking numeric vs. IO code. Is it apples and oranges and bananas then? Most programs have a mix of everything. Whether Euler's problems are a good mix that represents your workload is up to you to decide.
In addition to what fauigerzigerk said, if you run a lot of command line utilities where startup time is relevant, there are established solutions to that (Nailgun, for example), so it still isn't a fair comparison. If you want a comparison of state-of-the-art solutions, which this post obviously aims to be, put Nailgun into the mix.
Those utilities are often shell-scripted to run many times in a row. At least in interactive use, the multiplied startup times can grow annoyingly long.
You could use one of those Java background daemons to do that, but anyway, what I was trying to say is just that this benchmark doesn't test JIT or interpreter performance in the case of Jython.
Also, the HotSpot VM needs warmup to reach maximum performance, since it takes time to detect and JIT-compile the performance-critical parts of the code with full optimizations.
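Something like this, as a sketch, is the usual way to keep warmup out of the measured loop (the workload here is just a stand-in):

    import time

    # Sketch: run the workload unmeasured so a JIT can detect and compile
    # the hot code, then time the steady state.
    def solve():
        return sum(i * i for i in range(100000))

    for _ in range(20):    # warmup iterations, not measured
        solve()

    start = time.time()
    RUNS = 100
    for _ in range(RUNS):  # steady-state measurement
        solve()
    print((time.time() - start) / RUNS)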
Has anyone done any recent memory benchmarks with PyPy?
I'd like to see the magnitude of the memory trade-off for using a JIT compiler. As a web developer my programs are mostly IO-bound, not CPU-bound. I'm also bootstrapping and trying to squeeze as much as I can out of my 512MB linode.
PyPy's objects are smaller than CPython's, but the steady-state interpreter is larger, and the JIT adds some additional overhead due to bookkeeping and generated machine code.
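If you want a number for your own workload, comparing peak RSS of the same script under both interpreters is a reasonable rough measure. A sketch (Unix only; Linux reports ru_maxrss in kilobytes, OS X in bytes):

    import resource

    # Sketch: allocate something representative of your app, then report
    # peak resident set size. Run the same script under CPython and PyPy
    # and compare.
    data = [{"id": i, "name": "user%d" % i} for i in range(100000)]

    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    print("peak RSS: %d" % peak)  # KB on Linux, bytes on OS X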
LuaJIT is a nigh untouchable work of art, we might never see a faster dynamic language JIT.
PyPy is significant because of the nature of Python itself: the language specification is much more complex than Lua and there are currently many more users and applications.
[the more complicated the language specification, the harder it's going to be to prove things about, the harder it's going to be to write a compiler]
Lisp implementations have been performing at a similar level for years now - many with the aid of AOT compilation, some not. Currently, Racket and SBCL are comparable to LuaJIT on the Alioth microbenchmarks - faster at some and slower at others.
> LuaJIT is a nigh untouchable work of art, we might never see a faster dynamic language JIT.
Well, Mike Pall disagrees, in that he says there are still a lot of possible optimizations. Also, the process of hand-optimizing for a specific architecture can in theory be automated and decoupled. You're totally right about Python's complexity, though.
But the speed difference is still too big. Even if PyPy is two times faster than CPython, LuaJIT remains an order of magnitude faster in most cases; they are comparable only when the calculation bottleneck is bignum routines rather than the rest of the language.
The ones I've read are relatively straightforward Lua, though.
Some shootout programs look really hairy compared to normal code in their language. (The "optimized Haskell" shootout programs were that way at one point, though I haven't followed it for a while.) With Lua / LuaJIT, that doesn't seem to be the case.
Besides, Mike Pall is using some of the shootout benchmarks to tune LuaJIT, so it's not surprising he has many of the top submissions.
I think it has more to do with how his runtime performs. :)
Tuning Lua code really isn't that hard; the language is tiny and has both semantics and performance characteristics that are easy to reason about accurately.
There's a good sample chapter from _Lua Programming Gems_ on Lua performance tuning (http://www.lua.org/gems/sample.pdf), FWIW. That and a good profiler will get you far.
You want me to port all my Project Euler solutions from Python to Lua so that I can tell you that LuaJIT is much faster than any Python implementation, which you already know?
Not at all, I'm just asking for comments from anybody who has experience with both. I used to use Python quite a bit, but switched to Lua a few years ago and PyPy really hasn't been on my radar.
That's really nice for PyPy, although the Euler problems are mainly numerical tests. Most of the work in many programmers' "real world" code is string handling, which CPython is very good at since all its string libs are implemented in C. So I'd like to see a comparison with a better benchmark, I guess :)
IronPython is at 2.6. PyPy and Jython are at 2.5. psyco is at 2.6.
Most of my Euler solvers are compatible with Python 2.5. The ones that don't work in 2.5 end up getting excluded from the benchmark, because the benchmark only shows programs that worked and finished in less than a minute on every tested Python.
epoll was new in CPython 2.6. PyPy currently targets Python 2.5; we have a branch where we're working towards Python 2.7 support (we skipped a step), and that will include epoll support.
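For context, this is the select.epoll API that 2.6 added and that the 2.7 branch will need to expose; a minimal, Linux-only sketch:

    import select
    import socket

    # Sketch of the epoll interface added in CPython 2.6 as select.epoll.
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.setblocking(0)
    server.bind(("127.0.0.1", 0))
    server.listen(5)

    ep = select.epoll()
    ep.register(server.fileno(), select.EPOLLIN)

    for fd, event in ep.poll(1):  # wait up to 1s for readiness events
        if fd == server.fileno():
            conn, addr = server.accept()
            conn.close()

    ep.unregister(server.fileno())
    ep.close()
    server.close()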
Hey, don't get me wrong, it's great that you're making strides, and I still hope the project is a success.
My viewpoint comes as someone who was very excited initially at the thought of a drop-in replacement for CPython whose goal was to be "faster than C". It's been a very long time, and it seems like PyPy still has a long way to go to achieve production-readiness, so it's hard to continue being excited about the project.