It's interesting to see these 2-9% improvements from version to version. They are always talked about with disappointment, as if they are too small, but they also keep coming, with each version faster than the previous one. I'd take a steady 10% per version over breaking things in the hope of bigger numbers. Those percentages add up!
I spent about a week implementing PyPy's storage strategies in my language's collection types. When I finished the vector type modifications, I benchmarked it and saw the ~10% speed-up claimed in the paper¹. The catch is that performance only increased for unusually large vectors, with thousands of elements; small vectors actually slowed down by about the same amount. For some reason I decided to press on and implement it for my hash table type too, which is used everywhere. That slowed the entire interpreter down by nearly 20%. The branch is still sitting there, unmerged.
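For anyone unfamiliar with the idea, here's a rough sketch of what a storage strategy looks like (just an illustration, not PyPy's implementation and not my branch): the vector keeps unboxed machine ints while its contents stay homogeneous, and falls back to generic object storage the moment they don't.

    # Toy sketch of the storage-strategy idea.
    import array

    class StrategyVector:
        def __init__(self):
            self._ints = array.array("q")   # unboxed 64-bit ints (homogeneous strategy)
            self._objects = None            # generic fallback strategy

        def append(self, value):
            if self._objects is None:
                if isinstance(value, int) and -2**63 <= value < 2**63:
                    self._ints.append(value)
                    return
                # strategy switch: box everything, then use plain object storage
                self._objects = list(self._ints)
            self._objects.append(value)

        def __getitem__(self, i):
            store = self._ints if self._objects is None else self._objects
            return store[i]

    v = StrategyVector()
    v.append(1)
    v.append(2)
    v.append("three")   # triggers the switch to the object strategy
    print(v[0], v[2])   # 1 three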
I can't imagine how difficult it must have been for these guys to write a compiler and succeed at speeding up the Python interpreter.
This is happening mostly because Guido left, right? The take that CPython should be a reference implementation and thus slow always aggravated me (because, see, no other implementation can compete: every package depends on CPython quirks, to the point that we're now removing the GIL from CPython rather than migrating to PyPy, for example).
Partly, yes, but do note he is still very much involved with the faster-cpython project via Microsoft. Google "faster cpython" and "van rossum" to find some interviews and talks. You can also check out the faster-cpython project on GitHub to read more.
It's fascinating to me that this process seems to rhyme with the path PHP took, with HHVM being built as a second implementation that proved PHP could be much faster -- and the main project eventually adopting similar approaches. I wonder if that's always likely to happen with languages as big as these? Can a new implementation ever really compete?
Probably. Without a second implementation proving it out, the bureaucracy can write it off as not possible, and demand may be lower simply because users don't know what they're missing.
In my experience, PyPy works with basically everything these days. I remember having some struggles with a weird Fortran-based extension module a few years ago, but it might work now too.
Most C extension modules should work in PyPy; there's just a performance hit depending on how they're built (cffi is the most compatible).
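For reference, the cffi route really is small -- a minimal ABI-mode example, essentially the one from the cffi docs (the only assumption is that cffi is installed), and the same code runs unchanged on CPython and PyPy:

    from cffi import FFI

    ffi = FFI()
    ffi.cdef("int printf(const char *format, ...);")   # declare the C signature we want
    C = ffi.dlopen(None)   # the standard C library (dlopen(None) doesn't work on Windows)
    arg = ffi.new("char[]", b"world")
    C.printf(b"hello, %s\n", arg)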
The point of using C extensions is to get better performance. Python is already slow in general; code that uses such extensions typically depends on that performance to not be unbearable (data science, for example).
People who want to use PyPy usually do so because they want better performance. Taking a performance hit while using PyPy is disconcerting.
I was speculating that in the future C extensions in PyPy would be faster, but I now see that the GIL is actually unrelated to this performance hit. Anyway, it's really a pity.
I get what you're saying, but normally this shouldn't matter too much, right? You'll have a small number of calls into the C extension that together do a lot of work, so as a percentage the FFI overhead is small.
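A quick way to see the amortization, using math.sqrt and numpy's vectorized sqrt as stand-ins for "one native call per element" versus "one call for the whole batch" (assumes numpy is installed; the absolute numbers don't matter, only the ratio):

    import math
    import timeit

    import numpy as np

    data = list(range(100_000))
    arr = np.array(data, dtype=float)

    # many cheap crossings into native code: per-call overhead paid 100,000 times
    per_element = timeit.timeit(lambda: [math.sqrt(x) for x in data], number=10)
    # one crossing that does all the work: overhead paid once
    batched = timeit.timeit(lambda: np.sqrt(arr), number=10)

    print(f"per-element: {per_element:.3f}s  batched: {batched:.3f}s")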
It's fine with C packages these days; it's increasingly rare to find libraries it won't work with.
That said, it has happened often enough I'm cautious about where I use it. It would suck to be dependent on pypy's excellent performance, and then find I can't do something due to library incompatibility.
This is way better than before, when no C packages worked.
Now a lot of C packages work, and where they don't it's worth raising bugs: with PyPy, but also in the downstream program; occasionally they can use something else if it looks like the fix will take a while.
Seems a bit silly to think that. Guido is still involved with Python, and is in fact the one heading the Faster CPython project at Microsoft, which is responsible for many of these improvements.
As a compiler, Python is an optimized (C/kernel) implementation of parse generation. A JIT (PyPy's) is a method that parses a 'trace' variation of the grammar instead of the syntax, where the optimizer compiles source objects that are not limited to Python code. Its goal is to create as many 'return' instructions as it can decide to. The GIL is merely a CPython problem, but synchronization can also be a compilation problem.
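If "trace" sounds abstract, here is a deliberately tiny illustration of the general idea only (a toy; PyPy's tracing JIT is vastly more sophisticated): run a hot piece of code once in the interpreter, record the operations that actually executed, and compile that straight-line trace into a specialized function.

    def run_and_trace(ops, env):
        # interpret a tiny "bytecode" once, recording what actually ran
        trace = []
        acc = None
        for name, arg in ops:
            trace.append((name, arg))
            if name == "load":
                acc = env[arg]
            elif name == "add_const":
                acc = acc + arg
            elif name == "store":
                env[arg] = acc
        return trace

    def compile_trace(trace):
        # turn the recorded linear trace into straight-line Python source
        lines = ["def fast(env):"]
        for name, arg in trace:
            if name == "load":
                lines.append(f"    acc = env[{arg!r}]")
            elif name == "add_const":
                lines.append(f"    acc = acc + {arg!r}")
            elif name == "store":
                lines.append(f"    env[{arg!r}] = acc")
        ns = {}
        exec("\n".join(lines), ns)
        return ns["fast"]

    env = {"x": 0}
    trace = run_and_trace([("load", "x"), ("add_const", 1), ("store", "x")], env)
    fast = compile_trace(trace)
    fast(env)
    print(env["x"])   # 2: incremented once while tracing, once by the compiled trace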
Because it took 10 years to get Python 3 to be as fast as Python 2 while being stricter. At 2-9% per release, it will take another 10 years for Python 3 to become significantly faster.
5.5% compounded over 5 years is a bit over 30%: not a huge amount but an easily noticeable speed-up. What were you thinking of when you typed “significantly faster”?
Compounding a decrease works differently than an increase. If something gets 10% faster twice, it actually got 19% faster: the runtime is 90% of 90%, i.e. 81%.
In a certain sense, decreases are reciprocals of increases. You can take the reciprocal of each “faster” as you go, or you can work with the figures as given and take the reciprocal of the final result. This follows from the elementary fact that division is just multiplication by the reciprocal, and we are only dealing with multiplication and division here.
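Putting the two readings from this subthread side by side, just as arithmetic:

    faster_each_release = 0.10                       # "10% faster", read as 10% less runtime
    runtime_factor = (1 - faster_each_release) ** 2  # 0.9 * 0.9 = 0.81

    time_saved = 1 - runtime_factor            # 0.19   -> "19% faster" (time-saved reading)
    throughput_gain = 1 / runtime_factor - 1   # ~0.235 -> "23.5% faster" (speed/throughput reading)

    print(f"{time_saved:.1%} less time, {throughput_gain:.1%} higher throughput")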
Not for a programming language, because it's extremely rare for the computation rate to increase rather than for the amount of work needed to compute something to decrease.
If you've rewritten something to make better use of cache lines, stopped saturating memory bandwidth, etc., then sure, you've increased the computation rate. But that's rarely how these language-specific optimizations happen.
It's easy to see there must be something wrong with this: if you get 10% faster 8 times, that surely doesn't mean you're 114% faster (1.1^8 ≈ 2.14). You can't get more than 100% faster!
When you say something is 10% faster, what you mean is it took 10% less time to finish. So 19% is correct.
100 percent faster means doubling the speed, just like a 100 percent salary increase means doubling the salary, or a 100 percent increase in a car's speed means doubling its speed.
Unless one uses a somewhat more esoteric definition of speed, in which a 50 percent increase in a car's speed takes it from 100 km/h to 200 km/h, so that it arrives in half the time.
The reference really doesn't say what you posted. It just says that python 3.6 was up to 45% faster than 2.7 on some benchmarks and up to 54% slower on some others (which the author of the suite considered largely unrealistic).
Where is the evidence that Python 3 was significantly slower than 2.7 before 3.6?
That link is pretty clear that it's a non-realistic benchmark that results in a Python 3 slowdown. (And there are many benchmarks that are faster in Python 3.)
The link seems fairly clear to me - one explanation given is that Python 3 represents all integers with the "long" type, whereas Python 2 defaulted to small ints. This gave (gives?) Python 2 an advantage on tasks that involve manipulating lots of small integers. Most real-world Python code isn't like this, though.
Interestingly, they singled out pyaes as one of the worst offenders. I've also written a pure-Python AES implementation, one that deliberately takes advantage of the "long" integer representation, and it beats pyaes by about 2000%.
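Not my actual code, but the trick being described looks roughly like this: because Python 3 ints are arbitrary precision, a whole 16-byte block can be packed into one int and combined with key material in a single bitwise operation instead of byte by byte.

    # Sketch of the bignum trick only -- this is not AES, just the packing idea.
    block = bytes(range(16))                 # 16-byte state
    round_key = bytes(reversed(range(16)))   # 16-byte key material

    b = int.from_bytes(block, "big")
    k = int.from_bytes(round_key, "big")

    mixed = (b ^ k).to_bytes(16, "big")      # one big-int XOR instead of 16 small ones
    print(mixed.hex())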
I think it's more an issue of framing sometimes causing confusion, i.e. the ultimate trajectory is probably "crushingly slow → slow" rather than fastness coming into it.
I'd rather they add up. 5% less runtime here, another 5% there... Soon enough, Python will be so fast my scripts will terminate before I even run them, allowing me to send messages to my past self.
log(2)÷log(1.1) ~= 7.27, so in principle sustained 10% improvements could double performance every 7 releases. But at some point we're bound to face diminishing returns.