Hacker News
Codon: Python Compiler (usenix.org)
83 points by joak on May 8, 2023 | 50 comments



Just for reference,

* Nuitka[0] "is a Python compiler written in Python. It's fully compatible with Python 2.6, 2.7, 3.4, 3.5, 3.6, 3.7, 3.8, 3.9, 3.10, and 3.11."

* Pypy[1] "is a replacement for CPython" with built-in optimizations such as on-the-fly JIT compilation.

* Cython[2] "is an optimising static compiler for both the Python programming language and the extended Cython programming language... makes writing C extensions for Python as easy as Python itself."

* Numba[3] "is an open source JIT compiler that translates a subset of Python and NumPy code into fast machine code."

* Pyston[4] "is a performance-optimizing JIT for Python, and is drop-in compatible with ... CPython 3.8.12"

[0] https://github.com/Nuitka/Nuitka

[1] https://www.pypy.org/

[2] https://cython.org/

[3] https://numba.pydata.org/

[4] https://github.com/pyston/pyston


While we’re at it:

* Mypyc [0] “compiles Python modules to C extensions. It uses standard Python type hints to generate fast code. Mypyc uses mypy to perform type checking and type inference.

Mypyc can compile anything from one module to an entire codebase. The mypy project has been using mypyc to compile mypy since 2019, giving it a 4x performance boost over regular Python.”

[0] https://github.com/mypyc/mypyc


... and also now there's Mojo too (https://www.modular.com/mojo). Not the same but seems relevant.


Yeah, it's gotten to the point where staying on top of the various performance extensions/runtimes/hacks is its own sub-domain of expertise within Python.

Intuitively, numba and cython still feel like the most interesting/relevant.

I recall talk a few years ago about the possibility of writing packages in "numba", i.e., targeting the subset of Python and numpy that numba supports.

Occasionally I check in on numba's releases, and they seem to be slowly but surely supporting more of Python, such that surely it will start to make sense to talk about the "numba" language. Has anyone got a tighter grip on this than I do?


I would say the most popular way to speed up a Python program today is not through a compiler anymore, but to use Rust for hot loops and maturin (https://pypi.org/project/maturin/) to seamlessly call it from Python and distribute packages.

I get the benefit of being able to "just use Python" and still gain a speedup, though. Plus there are many situations where Rust is not an option (no time to learn such a complex language, or Rust is not approved by the security team).


The other edge is where your heatmap is just all red, and so "speed up hot loops" isn't going to get it done, and at that point somebody needs to bite the bullet and {hire people who know / learn} an AOT language with good performance.

or

it turns out the reason the heatmap is all red is that you were solving categorically the wrong problem, e.g. you decided to use machine vision to figure out what's in the packaging, while your competitors are just scanning the EAN-13 barcode.


All the other options that were mentioned are open-source and available for public use. They actually exist.

Mojo on the other hand has an empty Github repo, no downloads, and no public access. You need to get approval from Modular Inc before you can even use a demo of Mojo.

Clicking the "Get started with Mojo" or the "Request access" buttons takes you to the same page with a form that asks for your full name and email address, but giving them your personal info does not actually give you access to Mojo. It just shows you the following message until someone at Modular Inc vets your application.

> We will contact you as soon as we are ready to onboard you into our early preview program.

All the other options mentioned in this thread actually have downloads and repos that you can check out, and you don't need to fill out an application form to "Request access". They exist, right now.


Yeah I didn't mean to advocate for Mojo necessarily. I use Python for projects and like a lot about it, but often feel as if its implementations are kind of fragmented in many respects, and Mojo seems like it could potentially further that trend for me at least, at the current time. Other projects are certainly more open and established.

For me personally, what I want to see is a completely open, working, high-performance Python compiler that is 100% compatible with the official spec, in the sense that you could take any code from anywhere and it would work just as well on the compiler as on the official interpreter. Maybe this means creating an official Python 4.0 spec that's a superset of 3.0, or something, but right now I'm not sure I see anything quite like that.

I could see Mojo being that eventually but you're right that at the moment it's far from that.


Careful though, while Mojo is exciting, it's very tied to the modular platform for now, unlike the other projects.


While we're at it:

* Taichi "is embedded in Python and uses modern just-in-time (JIT) frameworks (for example LLVM, SPIR-V) to offload the Python source code to native GPU or CPU instructions, offering the performance at both development time and runtime."

https://www.taichi-lang.org/


Very interested in a benchmark between Taichi and numba, since both of them use LLVM as a backend.


Cython lets you incrementally add types to your Python code and transpile it to C. Its best feature is the HTML output that highlights which parts of your code still have an external dependency and are thus not directly translatable to C.

Numba has an amazing @jit decorator. It can even build on numpy functions. Once I had a simple sum loop and it was compiled down to O(1).
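To illustrate the kind of optimization being described, here's a plain-Python sketch (not Numba code) of what the LLVM backend can do to an accumulation loop: recognize it and replace it with its closed form, turning O(n) into O(1).

```python
def sum_loop(n):
    # The naive O(n) accumulation loop, as you'd write it in Python
    total = 0
    for i in range(n):
        total += i
    return total

def sum_closed_form(n):
    # The O(1) closed form that LLVM's loop optimizer can derive
    return n * (n - 1) // 2

print(sum_loop(10_000) == sum_closed_form(10_000))  # True
```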

Pyston is very cool, it outputs C++ code that you can statically build into your project. Still needs libpython.

Pypy speeds up plain Python code. Packages can even be installed with pip. I tried it on the neat-python package: a solid 40% improvement.

Nuitka is different. It is a packaging system which embeds the python interpreter and your code into a distributable binary.


Not to forget CPython's own faster-cpython project, which aims at JIT compilation. [0]

Also, JAX[1] and PyTorch[2] come with JIT compilation specifically aimed at GPU kernels, "fusing" multiple higher-level operations.

And NumPy/SciPy (also) use Pythran[3], an AOT compiler not too dissimilar to Numba.

[0] https://github.com/faster-cpython/cpython

[1] https://jax.readthedocs.io/

[2] https://pytorch.org/docs/stable/generated/torch.compile.html

[3] https://pythran.readthedocs.io/
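The "fusing" these JITs do can be sketched in plain Python (no JAX/PyTorch needed; the functions here are illustrative): instead of materializing an intermediate array between two elementwise operations, a fused kernel applies both in a single pass.

```python
def unfused(xs):
    # Two passes over the data, one intermediate list materialized
    doubled = [x * 2.0 for x in xs]
    return [d + 1.0 for d in doubled]

def fused(xs):
    # One pass, no intermediate: roughly what a fusing JIT emits
    return [x * 2.0 + 1.0 for x in xs]

print(fused([1.0, 2.0, 3.0]))  # [3.0, 5.0, 7.0]
```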

I think some useful classification criteria would be:

- does it replace running code in Python (either its own interpreter or a compiler), or does it speed up certain bits,

- does it aim to faithfully implement Python, or does it intentionally diverge in semantics,

- does it provide low-level semantics (where numba and pythran shine) or higher-level ones (e.g. what PyTorch and JAX do),

- target architectures (CPU, GPU offloading, ...)


And Jax, and now Mojo, and while we’re at it, Graal.Python, and all sorts of other forgotten ones.


A Wikipedia entry with a side-by-side comparison would seem appropriate at this stage.


Two questions:

1) Why so many Python compilers - should python be more open/modular so different JITs and other functionality can be plugged in/chosen at install or runtime?

2) Why has no one named a compiler "Monty"?


They don't have the same goals, nor characteristics.

Numba and Cython are rarely used to write an entire program; you usually apply them to your bottleneck and compile just that part.

Also, JITs like numba or pypy compile while the program runs, whereas projects like cython and nuitka compile ahead of time. The performance characteristics are very different, since the former need a warmup and the latter don't, but the latter need to be built manually.

Solutions like cython or mypyc require type hinting to get the best speedup.
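For instance, here's the sort of annotated function mypyc or Cython can compile efficiently (a sketch; since it uses plain Python syntax with standard type hints, it also runs unchanged under CPython):

```python
def dot(xs: list[float], ys: list[float]) -> float:
    # With the hints, an AOT compiler can emit unboxed float
    # arithmetic instead of generic object-protocol dispatch.
    total: float = 0.0
    for x, y in zip(xs, ys):
        total += x * y
    return total

print(dot([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]))  # 32.0
```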

All of them except nuitka target better performance, some in general (pyston, mypyc, cython), some targeting numerical calculations specifically (numba).

Nuitka doesn't care about any of that: it takes regular Python code and compiles it. While it does provide some speedup (up to 4x), that's not the goal at all. The goal of nuitka is to enable you to ship your Python program as a standalone executable to the end user.

It's the most reliable way to do it, as, just like the tagline says, it's "extremely compatible". I've yet to find a Python program that nuitka couldn't compile, including one with PyQt + numpy. It's very good at it.


And Cinder


Their fannkuch benchmark seems to be a bit dishonest. They claim an enormous perf delta on https://exaloop.io/benchmarks.html, but fannkuch uses factorial a lot, and they define factorial with a very small (n=20) lookup table: https://github.com/exaloop/codon/blob/fb461371613049539654c1...

Disclaimer: I've worked on several Python runtimes and compilers, but I'm not by any means out to get Codon. Just happened across this by accident while looking at their inline LLVM, which is neat.


Can you elaborate on why using the lookup table makes this objectionable/dishonest? Factorial of 20 is the largest factorial that fits in a 64-bit integer. The factorial function you linked is only defined on Codon's int type (64-bit signed), so it covers the full range of the output datatype.
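A quick check of that claim in plain Python:

```python
import math

# 20! fits in a signed 64-bit integer; 21! does not.
print(math.factorial(20))          # 2432902008176640000
print(math.factorial(20) < 2**63)  # True
print(math.factorial(21) < 2**63)  # False
```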


It's a targeted micro-optimisation that doesn't say anything about Codon's more general-case numeric performance. Unless the other implementations use the same algorithm, it's not comparing like with like.


It's not that targeted. Factorial is a reasonable thing to compute, and optimising for the values that fit in a 64-bit integer is a reasonable thing to do.

If I read this in some code, I wouldn't think "Huh, that's a specialised feature, I bet it's attacking a specific benchmark" I would think, "Oh, that makes sense".


Looks like CPython does the same thing for small n.

https://github.com/python/cpython/blob/a9c6e0618f26270e2591b...


Repo for more details: https://github.com/exaloop/codon

> What is Codon?

> Codon is a high-performance Python compiler that compiles Python code to native machine code without any runtime overhead. Typical speedups over Python are on the order of 10-100x or more, on a single thread. Codon's performance is typically on par with (and sometimes better than) that of C/C++. Unlike Python, Codon supports native multithreading, which can lead to speedups many times higher still. Codon grew out of the Seq project.

> What isn't Codon?

> While Codon supports nearly all of Python's syntax, it is not a drop-in replacement, and large codebases might require modifications to be run through the Codon compiler. For example, some of Python's modules are not yet implemented within Codon, and a few of Python's dynamic features are disallowed. The Codon compiler produces detailed error messages to help identify and resolve any incompatibilities.

> Codon can be used within larger Python codebases via the @codon.jit decorator. Plain Python functions and libraries can also be called from within Codon via Python interoperability.


Last I heard Codon was a python-like language rather than actual Python and the author suggests something similar. Headline seems pretty misleading.


It was only technically Python-like, but enough like it that you could conceivably call them the same thing.


Codon can't even run the whole Python standard library. It changes integer semantics from arbitrary width to fixed width. Exaloop's own docs refer to the process of converting code between the two as "porting".

Seems like if it isn't Pythonic enough to run Python code correctly, it probably shouldn't be called "Python".


If it accepts a subset of Python, it's technically Python: any Codon-Python program is a Python program. Not full Python, but Python nonetheless...

My understanding:

The goal is not to compile existing Python code but to write new Python code meant to be compiled, in order to be faster. Limited use, but still useful.


It is trivially easy to write a Codon program which runs in both Codon and Python but behaves differently in a way that is not considered a bug in either language. That means it is not a subset, unless by that you mean that each language can run a subset of the other's programs. In which case C and JavaScript are the same language too.


Is there an example of this? I know there are several areas where CPython has some funky undefined behavior, but those aren't particularly notable because they're by definition implementation-dependent.


Aren't fixed-width integers, as compared to arbitrary-width ones, an example of this?


The 2**64 example given here is exactly right. You don't need undefined behavior or language lawyering to hit the difference; you just need a 65-element bitset.


print(repr(2**64))
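That one-liner diverges because CPython integers are arbitrary-precision while Codon's default int is a signed 64-bit machine word. A plain-Python sketch of the two behaviours (the wrap64 helper is hypothetical, simulating fixed-width wraparound):

```python
def wrap64(n):
    # Interpret n modulo 2**64 as a signed 64-bit integer,
    # simulating fixed-width machine arithmetic.
    n &= (1 << 64) - 1
    return n - (1 << 64) if n >= (1 << 63) else n

print(2 ** 64)          # CPython: 18446744073709551616
print(wrap64(2 ** 64))  # fixed 64-bit semantics: 0
```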


I will pass. From the license:

Terms

The Licensor hereby grants you the right to copy, modify, create derivative works, redistribute, and make non-production use of the Licensed Work.


Yeah... I was kinda keeping an eye on the project until I realized that they want to sell you a production license to actually use it.


I've found it useful to embed scripting languages in host languages and remote control them to get the right mix of static/dynamic (fast/slow) code, rather than the other way around (calling host languages from scripting languages).

But embedding Python is unfortunately a pretty big deal from what I've seen, and not as ergonomic as it could be since it's not designed to be used that way.


If I understand this correctly, it takes Python code and makes it as fast as C by compiling Python code into binary like compiling C to binary.

I want something like this for JavaScript. Given how much stuff is written in JavaScript now, that would have a huge impact on the performance of so many apps.


For Javascript, what about that method would make it more performant than Javascript's JITs?


As I understand it, JavaScript's JIT isn't quite as performant as binary compiled from a low level language like C. I'm presuming that Codon's performance is closer to that than the performance of JavaScript's JIT, but maybe that isn't correct?


How do you do, fellow python compilers?


So many compilers; it sorta reminds me of JS. Could it be that this language too has major flaws? If so, why does everyone keep using it, only to find out much later that they need to call out to C to do the heavy lifting, only to eventually find out again that even that is not fast enough? So they need to compile all the rest of the Python parts too.

Seems like a roundabout way to use these lower-level languages without actually acknowledging that fact. An incredible amount of effort has been wasted making Python a viable language. At least the JS people have the excuse that there was no other option.


The sad state of affairs is that C is just that hard to use for most people, so the roundabout way is pretty much the only way they can manage.


There's a ton of middle ground though. It just didn't catch on as much. Unfortunately, ecosystem is still king.


> why does everyone keep using it only to find out much later that they need to call out to C to do the heavy lifting only to eventually find out again that even that is not fast enough

Because most people/companies cannot plan a program from start to finish without actually building it. Python is fast to program in, significantly faster than C/C++. If you are making a prototype, you need to be fast to allow rapid change.

For most situations, that is good enough. For the 1% where it isn't, you bust out to C. For the 0.1% where that's not good enough, you write it again from scratch using something C-like. (Yes, yes, Rust exists, hello Rustaceans, yes, yes, I have heard about your lord and saviour. No, I'm not going to rewrite everything in it, not just yet. Yes, please do leave a leaflet.)

Having said that, python isn't perfect. I miss static typing, or at least a strict mode.


I think a major contributor is the "Python is slow" meme. Sure, it is slow compared to C or something, but most of the time the time you save by implementing in Python massively outweighs any performance gains. But since everyone keeps saying that "Python is slow", it pushes some people to go and try to make it fast.

Good example of above is this line from you (highlight mine)

>An incredible amount of effort has been wasted making Python a *viable* language.

What does it even mean for a language to be "viable"? Is fast execution the only criteria? There are good reasons why we aren't writing all software in C or Assembly for maximal potential performance.


Sincere question: I wonder what fraction of Python users bump into these limitations often enough for it to be a problem. I do find Python to be more pleasant than C, though I use them in different domains -- desktop and embedded -- so it's not a direct comparison.


Every time I read about a language describing itself as faster than C, I cringe and stop caring.


Why? Depending on the semantics of the language the compiler can make better optimizations than a C compiler could.


It says "a few of Python's dynamic features are disallowed". What are those?


Why not use 'mojo'!



