
It's interesting to see these 2-9% improvements from version to version. They are always talked about with disappointment, as if they are too small, but they also keep coming, with each version being faster than the previous one. I prefer a steady 10% per version over breaking things in the hope of bigger numbers. Those percentages add up!



I envy these small and steady improvements!!

I spent about one week implementing PyPy's storage strategies in my language's collection types. When I finished the vector type modifications, I benchmarked it and saw the ~10% speed-up claimed in the paper¹. The catch is that performance increased only for unusually large vectors, with thousands of elements. Small vectors were actually slowed down by about the same amount. For some reason I decided to press on and implement it for my hash table type too, which is used everywhere. That slowed the entire interpreter down by nearly 20%. The branch is still sitting there, unmerged.
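
For anyone curious, the core idea is roughly: keep a specialized unboxed representation while the collection stays homogeneous, and fall back to a generic boxed one the moment it doesn't. A toy Python sketch of that shape (made-up class names, nothing like the real implementation in the paper or in my branch):

    import array

    class IntStrategy:
        """Compact unboxed storage, used while every element is a machine int."""
        def new_store(self):
            return array.array('q')          # 64-bit signed ints, unboxed
        def can_hold(self, value):
            return isinstance(value, int) and -2**63 <= value < 2**63

    class GenericStrategy:
        """Boxed storage for anything else."""
        def new_store(self):
            return []
        def can_hold(self, value):
            return True

    class Vector:
        def __init__(self):
            self.strategy = IntStrategy()
            self.store = self.strategy.new_store()

        def append(self, value):
            if not self.strategy.can_hold(value):
                # Strategy switch: copy the unboxed ints into boxed storage.
                self.strategy = GenericStrategy()
                self.store = list(self.store)
            self.store.append(value)

    v = Vector()
    v.append(1)        # stays in the compact int representation
    v.append("two")    # forces the switch to the generic strategy

The per-operation strategy dispatch is exactly the kind of overhead that can eat the win on small collections.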

I can't imagine how difficult it must have been for these guys to write a compiler and succeed at speeding up the Python interpreter.

¹ https://tratt.net/laurie/research/pubs/html/bolz_diekmann_tr...


This is happening mostly because Guido left, right? The take that CPython should be a reference implementation and thus slow always aggravated me (because, see, no other implementation can compete, since every package depends on CPython quirks; so now we're removing the GIL from CPython rather than migrating to PyPy, for example).


Partly, yes, but do note he is still very much involved with the faster-cpython project via Microsoft. Google faster cpython and van rossum to find some interviews and talks. You can also check out the faster-cpython project on github to read more.


It's fascinating to me that this process seems to rhyme with the path PHP took, with HHVM being built as a second implementation, proving that PHP could be much faster -- and the main project eventually adopting similar approaches. I wonder if that's always likely to happen with languages as big as these? Can a new implementation ever really compete?


Probably. Without a second implementation proving it out, the bureaucracy can write it off as not possible, and the demand may be lower just because users don't know what they're missing.


similar with vim vs neovim as well


Neovim does have a significant share of users, though.


Can you elaborate? Genuinely interested.


Not the parent, but here's a blog post about it: https://geoff.greer.fm/2015/01/15/why-neovim-is-better-than-...


Guido is still involved, but he's no longer the BDFL.



> no longer the BDFL

the irony :/


They should have ceremonially defenestrated him from a ground floor window.


“Dictator” means you get to arbitrarily change the rules. In his case, to end his own term early.


And the "for life" part?


He decided it was for life, just not his.


He dictated it wasn’t the case anymore


Pypy actually works really well. It could probably get even farther if people knew about it more.

Test your packages on Pypy, people.


The only trouble is C packages, which are really, really common, right? So you take a performance hit or something (or so I heard).

I guess that now that the GIL is going away, PyPy will become better at handling packages with native code, like numpy.


In my experience, PyPy works with basically everything these days. I remember having some struggles with a weird Fortran-based extension module a few years ago, but it might work now too.

Most C extension modules should work in PyPy; there's just a performance hit depending on how they're built (cffi is the most compatible).

https://doc.pypy.org/en/latest/faq.html#do-c-extension-modul...


The point of using C extensions is to have better performance. Python is already slow in general; code that uses such extensions typically depends on that performance to not be unbearable (data science, for example).

People wanting to use PyPy usually do so because they want better performance. Having a performance hit while using PyPy is disconcerting.

I was speculating that in the future C extensions in PyPy would be faster, but I now see that the GIL is actually unrelated to this performance hit. Anyway, it's really a pity.


I get what you are saying, but normally this wouldn't matter too much, right? You will have a small number of calls into the C extension that together do a lot of work, so as a percentage the FFI overhead is small.


C extensions are fast if you use cffi. It's when PyPy has to emulate CPython's C API to run extensions written against CPython that it is slow.

Please do not phrase that as a failure of PyPy. That is so weird.
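
For anyone who hasn't used it, the cffi route looks roughly like this (minimal sketch; ffi.dlopen("m") assumes a system where libm is loadable under that name, e.g. Linux):

    from cffi import FFI

    ffi = FFI()
    ffi.cdef("double sqrt(double x);")   # declare the C function we want
    libm = ffi.dlopen("m")               # ABI mode: load the shared library directly

    print(libm.sqrt(2.0))                # a call PyPy's JIT optimizes well,
                                         # unlike calls through C-API emulation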


It's fine with C packages these days; it's increasingly rare to find libraries that it won't work with.

That said, it has happened often enough I'm cautious about where I use it. It would suck to be dependent on pypy's excellent performance, and then find I can't do something due to library incompatibility.


This is way better than before, when no C packages worked.

Now a lot of C packages work, and where they don't it's worth raising bugs: with PyPy, but also against the downstream program; occasionally they can use something else if it looks like the fix will take a while.


Seems a bit silly to think that: Guido is still involved with Python, and in fact is the one heading the Faster CPython project at Microsoft, which is responsible for many of these improvements.


As a compiler, Python is an optimized (C/kernel) implementation of parse generation. JIT (PyPy) is a method that parses a 'trace' variation of grammar instead of syntax, where the optimizer compiles source objects that are not limited to Python code.

Its goal is to create as many 'return' instructions as it can decide to.

The GIL is merely a CPython problem, but synchronization can also be a compilation problem.


Because it took 10 years for Python 3 to become as fast as Python 2 while being more strict. 2-9% means it will be another 10 years before Python 3 is significantly faster.

Ref: https://mail.python.org/pipermail/python-dev/2016-November/1...


5.5% compounded over 5 years is a bit over 30%: not a huge amount but an easily noticeable speed-up. What were you thinking of when you typed “significantly faster”?
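
Back-of-the-envelope, assuming a flat 5.5% per release and one release a year:

    print(1.055 ** 5)   # ~1.307, i.e. a bit over 30% faster after five releases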


Compounding a decrease works differently than an increase. If something gets 10% faster twice it actually got 19% faster. In other words, the runtime is 90% of 90%, i.e. 81%.


In a certain sense, decreases are reciprocals of increases. You can calculate the reciprocal of each “faster”, or you can work with the figures as given and what you want is the reciprocal of the final result. This follows from the elementary fact that division is the same as multiplication by the reciprocal and we are only treating multiplication and division.


Not if “faster” refers to computation rate rather than runtime, in which case it becomes 100/81 i.e. 23% faster.


Yes, but I’ve literally never heard anyone say that.


It's not really possible to tell if someone saying "50% faster" means a 50% speed increase or a 50% time decrease.

Even in other contexts it can be ambiguous.

Yesterday I drove 60mph, today I drove 50% faster.

Yesterday I got there in 1 hour, today I got there 50% faster.

It's not really possible to tell them apart without looking at the numbers.


You've never seen anyone say "x% faster" where x is a number larger than 100? I find that hard to believe.


Not for a programming language, because it's extremely rare for the computation rate to increase rather than the amount of work needed to compute something to decrease.

If you've rewritten something to make better use of cache lines, stopped saturating memory bandwidth, etc., then sure, you've increased the computation rate. But that's rarely how these language-specific optimizations happen.


I have, in technical descriptions of compilers.


> If something gets 10% faster twice it actually got 19% faster

21%, not 19%.

It is 1.1 * 1.1 = 1.21

You are right in the opposite direction. If it got 10% slower, then it is 0.9 * 0.9 = 0.81 = 19% slower


It's easy to see there must be something wrong with this since if you get 10% faster 8 times, we can be sure that this doesn't mean you're 114% faster (1.1^8 = 2.14). You can't get more than 100% faster!

When you say something is 10% faster, what you mean is it took 10% less time to finish. So 19% is correct.


100 percent faster means doubling the speed, just like a 100 percent salary increase means doubling the salary, or a 100 percent car speed increase means doubling its speed.

Unless one uses a somewhat more esoteric definition of speed, in which a 50 percent car speed increase makes it go from 100 km/h to 200 km/h, such that it arrives in half the time.


The reference really doesn't say what you posted. It just says that Python 3.6 was up to 45% faster than 2.7 on some benchmarks and up to 54% slower on some others (which the author of the suite considered largely unrealistic).

Where is the evidence that Python 3 was significantly slower than 2.7 before 3.6?


That link is pretty clear that it's a non-realistic benchmark that results in a Python 3 slowdown. (And there are many benchmarks that are faster in Python 3.)


What! Why? (I couldn’t figure it out from your link)


The link seems fairly clear to me. One explanation given is that Python 3 represents all integers in a "long" type, whereas Python 2 defaulted to small ints. This gave (gives?) Python 2 an advantage on tasks that manipulate lots of small integers. Most real-world Python code isn't like this, though.
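
You can see the difference directly (quick illustrative snippet, not from the benchmark suite):

    print(type(2 ** 10))    # Python 2: <type 'int'>,  Python 3: <class 'int'>
    print(type(2 ** 100))   # Python 2: <type 'long'>, Python 3: <class 'int'>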

Interestingly they singled out pyaes as one of the worst offenders. I've also written a pure-python AES implementation, one that deliberately takes advantage of the "long" integer representation, and it beats pyaes by about 2000%.


It's not the size of the improvement that's regrettable, it's the absolute performance after the improvement.

No one is disappointed by V8's 6-8% improvement with Maglev. [1]

Because V8 is (for a scripting language) insanely fast.

And Python is not, unfortunately.

[1] https://v8.dev/blog/holiday-season-2023


What are some good stats/benchmarks on the rough order of magnitude of python vs js perf these days?


https://benchmarksgame-team.pages.debian.net/benchmarksgame/...

In many cases, Node.js is an order of magnitude faster than CPython.

(Acknowledged: You could write your Python in C.)

(Acknowledged: PyPy exists.)


Someone please compare 3.13 to 2.3! I’d love to see how far we’ve come.


Good idea! It can be done fairly easily by people who are good with changelogs.

FWIW, the most recent changelog is at https://docs.python.org/3.13/whatsnew/3.13.html


I suspect parent meant a performance comparison...


I think it's more an issue of framing sometimes causing confusion, i.e. the ultimate trajectory is probably "crushingly slow → slow" rather than fastness coming in to it.

(h/t https://news.ycombinator.com/item?id=35906158)


Well I think they even multiply, making it even better news!


I'd rather they add up. Minus 5% runtime here, another minus 5% there... Soon enough, Python will be so fast my scripts terminate before I even run them, allowing me to send messages to my past self.


log(2)÷log(1.1) ~= 7.27, so in principle sustained 10% improvements could double performance every 7 releases. But at some point we're bound to face diminishing returns.


0.9^x = 0.5

x ln 0.9 = ln 0.5

x = ln 0.5 / ln 0.9

x ≈ 6.5788

So decreasing runtime by 10% 6.5788 times results in the code running in half the original time.
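
Or, checking both figures in this sub-thread numerically (illustrative snippet):

    import math

    print(math.log(0.5) / math.log(0.9))   # ~6.58 cuts of 10% off the runtime halve it
    print(math.log(2) / math.log(1.1))     # ~7.27 speed-ups of 10% double the speed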


I think their number is the right one. Ten percent faster is not a ten percent decrease in runtime, it's about a nine percent decrease.



