How is an interpreter expressed in Rust: what parts can be safe and what must be unsafe? For example, the GC conceptually has access to every object at every allocation point; how does that work with the borrow checker?
Depends on the details. If you are just doing a classic AST interpreter with no particular focus on speed, there's no need for unsafe code at all, especially if you do what Python does and just use refcounting, though I have not implemented a cycle detector personally...
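To make that concrete, here is a minimal tree-walking interpreter sketch (in Python for brevity; the node names are hypothetical, not RustPython's). The same shape translates to entirely safe Rust: an enum for the node types and `Rc` for shared values, with no raw pointers anywhere.

```python
from dataclasses import dataclass

# Two hypothetical AST node kinds: number literals and binary operations.
@dataclass
class Num:
    value: float

@dataclass
class BinOp:
    op: str
    left: object
    right: object

def eval_node(node):
    """Walk the tree recursively, evaluating each node."""
    if isinstance(node, Num):
        return node.value
    if isinstance(node, BinOp):
        l, r = eval_node(node.left), eval_node(node.right)
        return {"+": l + r, "-": l - r, "*": l * r}[node.op]
    raise TypeError(f"unknown node: {node!r}")

# (1 + 2) * 4
tree = BinOp("*", BinOp("+", Num(1), Num(2)), Num(4))
print(eval_node(tree))  # 12
```

Nothing here needs shared mutable aliasing, which is why the borrow checker is not an obstacle for this style of interpreter; the tension only appears once you want a tracing GC that scans the whole heap.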
I think it's valuable to ask "stupid questions" like this. Can I try this out? Yes, but without looking at the page, I estimate that it'll take me 30 minutes minimum to get Rust downloaded and installed and figure out how to build the project. That's not something I'll do on a short break from work, for example. If somebody with some experience with the project chimes in, then I'll be satisfied with that for now, and maybe enticed to try it when I do have the time. I'm gonna guess that there are quite a few lurkers with the same question, some curiosity, and not enough time or motivation to answer it for themselves immediately.
Or, if it also contained an implementation of a Rails-workalike, Iron Snake [1]. Sometimes I want to start a project just to use a neat name I happened to come up with.
Probably nothing. Python isn't slow because C is slow. It's slow because of all the boxing and runtime type checks, which are built into the core language semantics. There are other reasons that could be addressed and maybe the developers of this interpreter addressed them, but I doubt it. If they could get rid of the global interpreter lock, that'd be kind of a holy grail.
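Both of those costs are visible from inside stock CPython itself: `dis` shows the single generic dispatch opcode behind every `+` (the interpreter must check both operands' runtime types on each execution), and `sys.getsizeof` shows the boxing overhead. Opcode names vary by version (`BINARY_ADD` before 3.11, `BINARY_OP` after), so the check below accepts either.

```python
import dis
import sys

def add(a, b):
    # One source-level "+", but the interpreter must inspect both boxed
    # operands' types at runtime before it can pick an implementation.
    return a + b

ops = [ins.opname for ins in dis.get_instructions(add)]
print(ops)               # includes BINARY_OP (3.11+) or BINARY_ADD (earlier)
print(sys.getsizeof(1))  # a boxed int is ~28 bytes, not 8
```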
The other problem with rewriting the interpreter in Rust, though, is that you lose all the already-fast third-party modules like NumPy, PyTorch, TensorFlow, and whatnot, which all rely on the internal API to call into C, Fortran, and CUDA. That will hold until someone writes a CUDA interface in Rust and a BLAS implementation at least equal to OpenBLAS, if not MKL; but that doesn't seem to be a priority for NVIDIA or the scientific and numerical computing communities, who seem to be focused on Julia if they're not happy with Python, R, and MATLAB.
PyPy shows what you can do (and what you can't do) if you are willing to leave the relatively simple interpreter design behind. (native dependencies are its main problem too: having to present the CPython C API limits what you can do, especially if you can't really look into the code of the dependency)
> If they could get rid of the global interpreter lock, that'd be kind of a holy grail.
The threading semantics and general dislike for global state seem like they'd make that more likely, although the C API could make that a more difficult proposition, even in its modern "restricted" forms.
> now you lose all the already actually fast third party modules like NumPy, PyTorch, TensorFlow, and what not that all rely on having the internal API to call C, Fortran, and CUDA
Do you? Rust can work with C ABIs; why wouldn't it be able to do a passthrough for Python?
Python's C API basically exposes the interpreter's internal objects (and these are very well-designed internal objects!). Anyone who wants to support C-extension modules written for CPython needs to either keep their internal representation compatible with CPython's interpreter representation (which severely constrains what you can do in your new interpreter), or add some kind of translation layer in between.
It's not an issue of being able to call C code in general, which almost anything can do. The codebases for these modules use CPython API bindings to do it, though, and those won't be available in an alternative interpreter, so you'd need to completely re-write all of these libraries to use different bindings to interface between the underlying BLAS and CUDA libs and your interpreter.
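For contrast, here is the easy part: calling plain C from any language with a C FFI, sketched with `ctypes` and the system math library (the library-name lookup is platform-dependent; the Linux soname fallback is an assumption). None of this touches the CPython object API. What NumPy and friends additionally depend on is that object API (`PyObject` layout, `Py_INCREF`, the buffer protocol), and that is what an alternative interpreter would have to emulate.

```python
import ctypes
import ctypes.util

# Load the system C math library; fall back to a common Linux soname.
libm = ctypes.CDLL(ctypes.util.find_library("m") or "libm.so.6")
libm.sqrt.restype = ctypes.c_double
libm.sqrt.argtypes = [ctypes.c_double]

# A direct C call: no CPython-specific API involved at all.
print(libm.sqrt(2.0))
```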
Which they can do, but that is a heck of a lot more work than just writing a Python interpreter.
Pretty much. In principle Rust (or any other language) could recreate the interface, but it really limits how you could design the interpreter. You either make the interface really expensive or basically just transliterate the CPython implementation into your language.
Being written in Rust vs. C shouldn't affect speed much in either direction; it's really the implementation details that matter. The JIT would likely speed things up, though, since standard CPython does not have one. That said, if you're looking for speed with Python you're better off with PyPy, mostly because of its JIT, which has been in development for a while now.
Depends. Nuitka, the Python compiler, takes the Python code, translates it to C, then compiles it. Not only does it result in a standalone Python program, it can also be up to 4 times faster. Of course, it does compile the Python code as well, so it's not just a matter of implementation. RustPython, for its part, allows compiling it all to WebAssembly, which, while not native binary, is pretty fast.
Sure, but as I see it, both of those are implementation details.
Standard CPython first compiles to bytecode; then the interpreter translates every instruction, as it is run, into the correct instructions for the local platform.
A JIT like this one, or PyPy's, compiles to bytecode, then compiles (some of?) that bytecode to the correct local instructions before running it. This means it doesn't need to translate every instruction as it runs.
Projects like Nuitka/Cython compile ahead of time for the platform. So you would either distribute binaries, or have them be compiled at install time.
In any case, the only point I was making is that if they made a one-to-one translation of CPython to RustPython it would not make much of a difference. However, the fact that they are working on a JIT definitely would.
As far as wasm goes, I believe it is just building the entire RustPython interpreter as a WASM package, which I guess would let you load it into a browser and then use it to interpret your Python code. I suspect this approach would likely be slower than all of the above, or at the very least be highly dependent on the browser it is running in.
Another point to consider: the Rust implementation here likely doesn't support any of CPython's old APIs, many of which involve contending the GIL or are otherwise not great for performance. This is all speculation, but that alone might result in some nice performance boosts.
If you're hitting the GIL then yeah, that could be a big win. Though I suspect that the amount people actually hitting the GIL is significantly lower than the amount of people talking about it.
It's been a while since I've written a Python extension, but my understanding is that every single use of `Py_INCREF` or `Py_DECREF` (or anything that transitively uses them) hits the GIL, since those reference counts need to be protected by a lock.
Almost every Python extension that I've ever read is littered with those calls (macros, technically), which I'd expect adds up to a decent amount of contention.
You're right, but it's only an issue when the application programmer is using threads. So basically it matters for anyone doing CPU-intensive work that can be broken up into chunks, needs to share memory, and who doesn't want to take the time to write their own extension for that part.
Practically speaking, I/O is usually the bigger bottleneck. Or you can change your algorithm to not share memory, then use multiprocessing, or a task scheduler like celery to spread it over multiple machines.
There are a ton of people who hear about someone running into GIL issues and they think it is affecting them, when they're not doing anything that would be helped by removing it. A JIT compiler on the other hand, would speed up every Python program.
I only talk about the GIL when I try and write some parallel code in Python and get a -10% performance benefit. As a corollary, I only use profanity when speaking of the GIL. But I don't do that often, because I usually skip straight to C++ extensions for code that wants parallelism.
I see this repository is working on a JIT, so that could lead to speedups over the standard CPython implementation. I don't see any benchmarks for how this compares to PyPy, though, which also has a JIT.
The main weakness of this repo currently seems to be the lack of support for numpy/pandas, and compatibility in general [0].
Why? I mean, what benefit does GIL-lessness provide now that there is asynchronicity all over the place? Preemptive multithreading is a giant and endless source of bugs, and most of the time you can just use some faster language if speed is truly what you need.
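The asynchronicity referred to here: for I/O-bound concurrency, `asyncio` interleaves tasks cooperatively on a single thread, so the GIL never enters the picture. A small sketch with simulated I/O (the task names and delays are made up):

```python
import asyncio

async def fetch(name, delay):
    # Simulated I/O: the event loop runs other tasks while this one waits.
    await asyncio.sleep(delay)
    return name

async def main():
    # Both "requests" overlap; gather preserves the argument order.
    return await asyncio.gather(fetch("a", 0.02), fetch("b", 0.01))

results = asyncio.run(main())
print(results)  # ['a', 'b']
```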
Years ago I wrote a Rust program (pre-1.0) that linked to a Python library a co-worker wrote. We got deadlocks. It turned out we were also linking the system OpenSSL, which internally uses Python to deal with CA certificates, and the GIL was the culprit. So that's a why.
It really sucks that the Gilectomy thing went nowhere. It looks like perhaps the only viable way to remove GIL would require breaking compatibility, and it's kind of too soon after Py2 -> Py3.
According to Raymond Hettinger, it was easy to get rid of the GIL and replace it with a bunch of smaller locks that allowed true concurrency - but the single-thread performance was dramatically worse due to all the extra locking and unlocking, and that wasn't something they were willing to sacrifice.
Are you sure it was Raymond? At least Larry Hastings (the person who was attempting the Gilectomy) said something similar, but even then he was actually OK with that as long as multithreaded performance was better; unfortunately, even that turned out worse. The main problem is Python's garbage collection through reference counting: changing the counter requires locks, which then force a cache flush every time.
There was an attempt (I think in PyPy) to use Software Transactional Memory (STM), which would solve this problem, but apparently it is still difficult to do, and it looks like it did not succeed.
A Python Interpreter Written in Rust - https://news.ycombinator.com/item?id=19064069 - Feb 2019 (194 comments)