How is an interpreter expressed in Rust: what parts can be safe and what must be unsafe? For example, the GC conceptually has access to every object at every allocation point; how does that work with the borrow checker?
Depends on the details. If you are just doing a classic AST interpreter with no particular focus on speed, there's no need for unsafe code at all, especially if you do what Python does and just use refcounting, though I have not implemented a cycle detector personally...
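To make that concrete, here is a minimal tree-walking interpreter sketch (in Python for brevity; the node names are hypothetical, not RustPython's). The same shape translates to entirely safe Rust: an enum for the node types and `Rc` for shared values, with no raw pointers anywhere.

```python
from dataclasses import dataclass

# Two hypothetical AST node kinds: number literals and binary operations.
@dataclass
class Num:
    value: float

@dataclass
class BinOp:
    op: str
    left: object
    right: object

def eval_node(node):
    """Walk the tree recursively, evaluating each node."""
    if isinstance(node, Num):
        return node.value
    if isinstance(node, BinOp):
        l, r = eval_node(node.left), eval_node(node.right)
        return {"+": l + r, "-": l - r, "*": l * r}[node.op]
    raise TypeError(f"unknown node: {node!r}")

# (1 + 2) * 4
tree = BinOp("*", BinOp("+", Num(1), Num(2)), Num(4))
print(eval_node(tree))  # 12
```

Nothing here needs shared mutable aliasing, which is why the borrow checker is not an obstacle for this style of interpreter; the tension only appears once you want a tracing GC that scans the whole heap.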
I think it's valuable to ask "stupid questions" like this. Can I try this out? Yes, but without looking at the page, I estimate that it'll take me 30 minutes minimum to get Rust downloaded and installed and figure out how to build the project. That's not something I'll do on a short break from work, for example. If somebody with some experience with the project chimes in, then I'll be satisfied with that for now, and maybe enticed to try it when I do have the time. I'm gonna guess that there are quite a few lurkers with the same question, some curiosity, and not enough time or motivation to answer it for themselves immediately.
Or, if it also contained an implementation of a Rails-workalike, Iron Snake [1]. Sometimes I want to start a project just to use a neat name I happened to come up with.
Probably nothing. Python isn't slow because C is slow. It's slow because of all the boxing and runtime type checks, which are built into the core language semantics. There are other reasons that could be addressed and maybe the developers of this interpreter addressed them, but I doubt it. If they could get rid of the global interpreter lock, that'd be kind of a holy grail.
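Both of those costs are visible from inside stock CPython itself: `dis` shows the single generic dispatch opcode behind every `+` (the interpreter must check both operands' runtime types on each execution), and `sys.getsizeof` shows the boxing overhead. Opcode names vary by version (`BINARY_ADD` before 3.11, `BINARY_OP` after), so the check below accepts either.

```python
import dis
import sys

def add(a, b):
    # One source-level "+", but the interpreter must inspect both boxed
    # operands' types at runtime before it can pick an implementation.
    return a + b

ops = [ins.opname for ins in dis.get_instructions(add)]
print(ops)               # includes BINARY_OP (3.11+) or BINARY_ADD (earlier)
print(sys.getsizeof(1))  # a boxed int is ~28 bytes, not 8
```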
The other problem with rewriting the interpreter in Rust, though, is that you lose all the already-fast third-party modules like NumPy, PyTorch, TensorFlow, and whatnot, which all rely on the internal API to call into C, Fortran, and CUDA. That will hold until someone writes a CUDA interface in Rust and a BLAS implementation at least equal to OpenBLAS, if not MKL; but that doesn't seem to be a priority for NVIDIA or the scientific and numerical computing communities, who seem to be focused on Julia if they're not happy with Python, R, and MATLAB.
PyPy shows what you can do (and what you can't do) if you are willing to leave the relatively simple interpreter design behind. (native dependencies are its main problem too: having to present the CPython C API limits what you can do, especially if you can't really look into the code of the dependency)
> If they could get rid of the global interpreter lock, that'd be kind of a holy grail.
The threading semantics and general dislike for global state seem like they'd make that more likely, although the C API could make that a more difficult proposition, even in its modern "restricted" forms.
> now you lose all the already actually fast third party modules like NumPy, PyTorch, TensorFlow, and what not that all rely on having the internal API to call C, Fortran, and CUDA
Do you? Rust can work with C ABIs; why wouldn't it be able to do a passthrough for Python?
Python's C API basically exposes the interpreter's internal objects (and these are very well-designed internal objects!). Anyone who wants to support C-extension modules written for CPython needs to either keep their internal representation compatible with CPython's interpreter representation (which severely constrains what you can do in your new interpreter), or add some kind of translation layer in between.
It's not an issue of being able to call C code in general, which almost anything can do. The codebases for these modules use CPython API bindings to do it, though, and those won't be available in an alternative interpreter, so you'd need to completely re-write all of these libraries to use different bindings to interface between the underlying BLAS and CUDA libs and your interpreter.
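For contrast, here is the easy part: calling plain C from any language with a C FFI, sketched with `ctypes` and the system math library (the library-name lookup is platform-dependent; the Linux soname fallback is an assumption). None of this touches the CPython object API. What NumPy and friends additionally depend on is that object API (`PyObject` layout, `Py_INCREF`, the buffer protocol), and that is what an alternative interpreter would have to emulate.

```python
import ctypes
import ctypes.util

# Load the system C math library; fall back to a common Linux soname.
libm = ctypes.CDLL(ctypes.util.find_library("m") or "libm.so.6")
libm.sqrt.restype = ctypes.c_double
libm.sqrt.argtypes = [ctypes.c_double]

# A direct C call: no CPython-specific API involved at all.
print(libm.sqrt(2.0))
```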
Which they can do, but that is a heck of a lot more work than just writing a Python interpreter.
Pretty much. In principle Rust (or any other language) could recreate the interface, but it really limits how you could design the interpreter. You either make the interface really expensive or basically just transliterate the CPython implementation into your language.
Being written in Rust vs. C shouldn't affect speed much in either direction; it's really the implementation details that matter. The JIT would likely speed things up, though, since standard CPython does not have one. That said, if you're looking for speed with Python you're better off with PyPy, mostly because of its JIT, which has been in development for a while now.
Depends. Nuitka, the Python compiler, takes the Python code, translates it to C, then compiles it. Not only does it result in a standalone Python program, it can also be up to 4 times faster. Of course, it does compile the Python code as well, so it's not just a matter of implementation. RustPython, for its part, allows compiling it all to WebAssembly, which, while not native binary, is pretty fast.
Sure, but as I see it, both of those are implementation details.
Standard CPython first compiles to bytecode; then the interpreter translates every instruction, as it is run, into the correct instructions for the local platform.
A JIT like this one, or PyPy's, compiles to bytecode, then compiles (some of?) that bytecode to the correct local instructions before running it. This means it doesn't need to translate every instruction as it runs.
Projects like Nuitka/Cython compile ahead of time for the platform. So you would either distribute binaries, or have them be compiled at install time.
In any case, the only point I was making is that if they made a one-to-one translation of CPython to RustPython it would not make much of a difference. However, the fact that they are working on a JIT definitely would.
As far as wasm goes, I believe it is just building the entire RustPython interpreter as a WASM package, which I guess would let you load it into a browser and then use it to interpret your Python code. I suspect this approach would likely be slower than all of the above, or at the very least be highly dependent on the browser it is running in.
Another point to consider: the Rust implementation here likely doesn't support any of CPython's old APIs, many of which involve contending the GIL or are otherwise not great for performance. This is all speculation, but that alone might result in some nice performance boosts.
If you're hitting the GIL then yeah, that could be a big win. Though I suspect that the amount people actually hitting the GIL is significantly lower than the amount of people talking about it.
It's been a while since I've written a Python extension, but my understanding is that every single use of `Py_INCREF` or `Py_DECREF` (or anything that transitively uses them) hits the GIL, since those reference counts need to be protected by a lock.
Almost every Python extension that I've ever read is littered with those calls (macros, technically), which I'd expect adds up to a decent amount of contention.
You're right, but it's only an issue when the application programmer is using threads. So basically it matters for anyone doing CPU-intensive work that can be broken up into chunks, needs to share memory, and who doesn't want to take the time to write their own extension for that part.
Practically speaking, I/O is usually the bigger bottleneck. Or you can change your algorithm to not share memory, then use multiprocessing, or a task scheduler like celery to spread it over multiple machines.
There are a ton of people who hear about someone running into GIL issues and they think it is affecting them, when they're not doing anything that would be helped by removing it. A JIT compiler on the other hand, would speed up every Python program.
I only talk about the GIL when I try and write some parallel code in Python and get a -10% performance benefit. As a corollary, I only use profanity when speaking of the GIL. But I don't do that often, because I usually skip straight to C++ extensions for code that wants parallelism.
I see this repository is working on a JIT, so that could lead to speedups over the standard CPython implementation. I don't see any benchmarks for how this compares to PyPy, though, which also has a JIT.
The main weakness of this repo currently seems to be the lack of support for numpy/pandas, and compatibility in general [0].
Why? I mean, what benefit does GIL-lessness provide now that there is asynchronicity all over the place? Preemptive multithreading is a giant and endless source of bugs, and most of the time you can just use some faster language if speed is truly what you need.
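The asynchronicity referred to here: for I/O-bound concurrency, `asyncio` interleaves tasks cooperatively on a single thread, so the GIL never enters the picture. A small sketch with simulated I/O (the task names and delays are made up):

```python
import asyncio

async def fetch(name, delay):
    # Simulated I/O: the event loop runs other tasks while this one waits.
    await asyncio.sleep(delay)
    return name

async def main():
    # Both "requests" overlap; gather preserves the argument order.
    return await asyncio.gather(fetch("a", 0.02), fetch("b", 0.01))

results = asyncio.run(main())
print(results)  # ['a', 'b']
```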
Years ago I wrote a Rust program (pre-1.0) that linked to a Python library a co-worker wrote. We got deadlocks. It turned out we were also linking the system OpenSSL, which internally uses Python to deal with CA certificates, and the GIL was the culprit. So that's a why.
It really sucks that the Gilectomy thing went nowhere. It looks like perhaps the only viable way to remove GIL would require breaking compatibility, and it's kind of too soon after Py2 -> Py3.
According to Raymond Hettinger, it was easy to get rid of the GIL and replace it with a bunch of smaller locks that allowed true concurrency - but the single-thread performance was dramatically worse due to all the extra locking and unlocking, and that wasn't something they were willing to sacrifice.
Are you sure it was Raymond? At least Larry Hastings (the person who was attempting the Gilectomy) said something similar, but even then he was actually OK with that as long as multithreaded performance was better; unfortunately, even that turned out worse. The main problem is Python's garbage collection through reference counting: changing the counter requires locks, which then force a cache flush every time.
There was an attempt (I think in PyPy) to use Software Transactional Memory (STM), which would solve this problem, but apparently it is still difficult to do, and it looks like it did not succeed.
A Python Interpreter Written in Rust - https://news.ycombinator.com/item?id=19064069 - Feb 2019 (194 comments)