Hacker News
Writing Python inside Rust (m-ou.se)
187 points by mcp_ on April 17, 2020 | 85 comments



I know this project may just be for fun, but with WASM targets for all sorts of languages, I'm hoping we get to a future where mixing and matching different languages for different parts of your program will be seamless. Imagine starting a project in an easy language, then migrating pieces to a faster "bare metal" language as needed in a super piecemeal way. Same with moving pieces to a safer language as the project grows, slowly expanding the boundaries of the safe bits as appropriate.



>Just like the JVM and CLR I guess. (List_of_JVM_languages, List_of_CLI_languages)

I don't see the JVM & CLR, even with the dozens of language choices on top of those virtual machine runtimes, as giving you ways to do "bare metal" programming. I guess one could arguably squint and say Java's JNI or C#'s P/Invoke and "unsafe{}" get you some "bare metal" capabilities, but that's a stretch.

I go back to the gp's sentence: "Imagine starting a project in an easy language, then migrating pieces to a faster "bare metal" language as needed in a super piecemeal way."

For example, JVM/Java doesn't have value types[1] (yet), and those are required for a programmer's control of exact and efficient memory layout in many bare metal domains. It doesn't matter what flavor of JVM language you use, because you're ultimately limited by the JVM's capabilities, which cause excessive pointers-to-pointers that are inappropriate for some high-performance code.
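To make the value-type point concrete, here's a small sketch using Python's ctypes (chosen only because it exposes C-style layout; the `Point3D` struct is illustrative, not from any real CAD codebase): a struct of three doubles occupies exactly 24 contiguous bytes, and an array of them is one flat block with no per-element object headers or pointer chasing.

```python
import ctypes

class Point3D(ctypes.Structure):
    # Explicit, C-compatible layout: three contiguous doubles,
    # no per-object header and no references to chase.
    _fields_ = [
        ("x", ctypes.c_double),
        ("y", ctypes.c_double),
        ("z", ctypes.c_double),
    ]

assert ctypes.sizeof(Point3D) == 24        # 3 * 8 bytes, flat
points = (Point3D * 1000)()                # 1000 points in one contiguous block
assert ctypes.sizeof(points) == 24 * 1000  # no per-element indirection
```

A pre-Valhalla JVM `Point3D[]`, by contrast, is an array of references, each pointing at a separately heap-allocated object with its own header.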

In C#, some future tech like CoreRT might open up some more "bare metal" programming possibilities but that's not production yet[2]. I remember you couldn't even develop Windows Explorer right-click menu shell extensions in C# because it was a bad idea to load up 2 different versions of the .NET Framework runtime (until .NET 4.0). That's not even that low a level of programming, and yet C++ had no such limitation.

I use C# as much as I can, but I still have to do 50% of my projects in C++ because JVM/CLR are not "bare metal" enough.

[1] https://en.wikipedia.org/wiki/Criticism_of_Java#Compound_val...

[2] https://github.com/dotnet/corert#user-content-net-core-runti...


Since when is WebAssembly bare metal?!?

Apparently you missed the Windows 8, 8.1 and 10 train regarding your "bare metal" compilation from C# and VB.NET code.

.NET 4.0 was released in 2010 and I have been able to use C++ as a .NET language since 2001, so ....

In fact one of my first .NET projects, in 2002, was to integrate a C++ RPC library into Managed C++, long since replaced by C++/CLI with the .NET 2.0 release.

Going back to Java, IBM, Aicas, PTC, Gemalto will happily sell you Java compilers that generate AOT native code for embedded deployment targets, not to mention what is running on my phone, where 95% of the OS APIs are exposed only via Java and where your beloved C and C++ code needs to use JNI to access them.


>Since when is WebAssembly bare metal?!?

Not sure what that is referring to.

>C++ as .NET language

We seem to be talking about 2 different things. Using "C++" style syntax (C++/CLI) in a managed language with GC is not what many systems programmers call "bare metal". I thought it was clear from context that my mention of C++ is traditional "real C++" such as gcc/clang/MSVC and not the C++/CLI.

>Going back to Java, IBM, Aicas, PTC, Gemalto will happily sell you Java compilers that generate AOT native code

But that's not the JVM runtime though. The JVM is what you originally wrote and that's the scope of what I was replying to: ("Just like the JVM and CLR I guess. (List_of_JVM_languages ...)")

The _JVM_ doesn't really give you low-level bare-metal programming and it doesn't matter what flavor of JVM-language you choose to run on top of it.

And at the risk of further muddying up the discussion with the tangent subject of Java AOT... do any of those Java compilers give true value type semantics, or is it still references with pointer chasing? I'm not familiar with those compilers.

>and where your beloved C and C++ code

"beloved"?!? Can we tone this down a bit? I'm just trying to clarify that the JVM & CLR really don't span the entire spectrum of programming all the way down to "bare metal" in the way low-level programmers typically use that phrase. I thought I was making a neutral and factual statement. I.e. I'm not interested in an emotional flamewar.


If I had to guess, I'd say OP meant that just like CLR or the JVM, WebAssembly couldn't be used to target bare metal and since Wasm is discussed in the original post, there's no practical difference between it and CLR/JVM for language intermixing at present - it's just another "common language runtime" if you will.

But I agree that the tone used wasn't the best.


Maybe the tone wasn't the best one, but I would like to know how WebAssembly is bare metal programming to start with.

I would also like to know what compiling natively to machine code has to do with having value type semantics in the source language.


>, but I would like to know how WebAssembly is bare metal programming to start with.

We are getting into "splitting hairs" territory but let me attempt to untangle this thread because it seems to be hung up on what "bare metal" means.

Yes, if we're using "bare metal" to only mean real semiconductor chip, WASM is not that. It's an abstract virtual machine. So yes, in that strict sense, WASM is analogous to JVM and CLR.

But.....

I'm charitably interpreting gp's comment (6gvONxR4sf7o) and he's using WASM as his _relative_ (not absolute) perspective of _that_ being "bare metal". Ok, if we play along with that, WASM is not analogous to the JVM/CLR because it is lower level[1]. Thus a non-managed language like C++ can more easily target WASM-flavor-of-bare-metal for high performance than managed C++/CLI can target the .NET CLR.

Yes, it's a subtle difference. WASM is more "bare-metal-ish" than the JVM ... relatively speaking. I just don't think JVM languages can really do the same thing as WASM, since Google/Mozilla/Apple/MS specifically engineered WebAssembly to be a target for low-level bare-metal languages like C/C++. In contrast, Sun & James Gosling deliberately didn't engineer JVM Java bytecode to be a compilation target for low-level C/C++.

This means something cpu-intensive like AutoCAD or possibly Adobe Premiere Pro can hypothetically be written to target WASM and will perform better than if those apps were re-written in Java to target a Java web browser plugin. E.g. Java's JVM doesn't have value types and that architecture choice is very unfriendly to storing/manipulating millions of 3d points for a CAD program. In contrast, WASM's architecture opens up a few more "bare-metal-ish" programming domains.

The various choices of JVM languages like Kotlin/Clojure/JRuby/etc actually don't address what WASM is attempting to accomplish.

[1] https://www.quora.com/How-does-Java-bytecode-compare-to-WASM...


Yet, GraalVM compiles LLVM bitcode and WASM bytecode just fine.

https://www.graalvm.org/docs/reference-manual/languages/llvm...

https://www.graalvm.org/docs/reference-manual/languages/wasm...

A JVM and respective JIT compiler all written in Java.

And I still don't understand what WASM does for C++ that the CLR doesn't do, given that I can write straight C89 or C++ with C++/CLI, just like using gcc or clang doesn't force me to use their language extensions.


Same idea, yeah, but hopefully with better common adoption. For various reasons users didn't adopt those (e.g. numpy on the JVM would be a ton of effort, especially a decade ago). But the web is an irresistible force, maybe. I'm hopeful (though not necessarily optimistic).


But they did, just not with the languages that are loved on HN as "taking over the world".

Most people in JVM land use a mix of Java, Kotlin, Scala, Clojure, Groovy, JRuby.

Whereas on CLR land it is C#, F#, VB.NET and C++/CLI (for low level stuff).

Naturally those that want to kill Java, or rather not touch Windows, aren't aware of this.


Every single language I've seen that has been 'loved on HN as "taking over the world"' has had C FFI interop, whether it is Node, Rust, Python, Ruby, C#, Java, Haskell, Zig, Crystal, Julia, Lua, etc.

Web assembly is the browser C FFI, not some high level platform like Java or .Net. Your examples aren't comparable.


Failure to understand what bytecode-based execution runtimes are all about, it seems.

Also failure to understand that a C ABI does not exist; rather it is the OS ABI of an OS written in C, and other OSes, not written in C, don't have such a thing as a C ABI across all languages.

Examples of such OSes: IBM i, z/OS, Unisys ClearPath, UCSD, Classic Mac OS, Native Oberon, Mesa/Cedar, Windows (plenty of stuff is .NET/COM/UWP nowadays), Android, ChromeOS, Garmin OS, and the Web.

So no, it isn't the browser's C FFI; none of the major browsers has even been written in C for the past 20 years.


You have repeatedly missed the point of the discussion because you are hyper-focused on the implementation details that are irrelevant to the end user. I honestly don't know how you jumped from the JVM and CLR to Unisys Clearpath (twice) and Mesa/Cedar except as a red herring. The topic of discussion is Web Assembly which implies a modern browser.

The majority of browsers now support Web Assembly and about half the global population has a web browser and access to the internet - and now, access to an actual universal bytecode based execution runtime by nature of being part of the web browser standards instead of an OS feature or framework installation or (god forbid) Oracle TOS.

The C FFI part was an analogy. The whole point of Web Assembly is that it can't call out to just any library on the OS.


Except it can, because WebAssembly has long since stopped being a browser-only story.

By the way, JVM on the browser, Flash CrossBridge and PNaCL were there first in what concerns "universal bytecode based execution runtime by nature of being part of the web browser".


I know I haven't worked with Java for the past 6 years (time flies!), but back then neither of the 2 companies I worked for used anything other than Java on the JVM. In fact, until Kotlin came along I only met one guy working with Scala, and nobody for the other languages. It was in France though, not in the US.

But I think your argument holds for the CLR, where I've seen the 4 languages you mention being used together in the same code base.


I know you know this, but I’ll write it for other readers: there are even more languages on the CLR if we count the ones not supported by MS. Also, native C libraries, and thus any language that can target C, can be used as well on Windows, Mac OS and Linux with P/Invoke.


Hence why I posted a link to the language list on my first comment. :)


The problem is that, inherently, a lot of the time when you want to drop down to C/C++ you're doing numerical code that would really benefit from SIMD operations, and many of those tend not to be portable (as far as I'm aware, basic SIMD is just in the proposal stage for WASM).


Controlling memory layout of your data structures really goes a long way for tight loop performance. SIMD might be just the icing on top of the cake.


I wonder if we'll ever get to a future where all code is write-only, immutable, and thrown away after compilation.

The choice of language would then become a matter of how the compiled code is presented and how replacement code is written.

The runtime and the type system can't be replaced this way though.


I can't tell if you're joking or not. Why would I delete the source code of every program after writing it?


I mean, delete the source, but store an intermediate representation that can be decompiled into any other language for reading.

Languages with good type systems and tooling support a workflow where you mostly rely on Intellisense hints and docs, and never read the code itself.

That, but taken to its logical conclusion: if you never read the source code, you don't need to store the source code.

The missing part is the ability to define algorithms as modifications of existing algorithms.


So, at first I thought this approach is doomed, because you lose comments, and what looks like a comment to Rust may not look like one to Python. Example:

  x = y // 2 # floor division
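A runnable illustration of that ambiguity (plain Python; the Rust-side behavior is described in the comments):

```python
# In Python, `//` is the floor-division operator:
y = 7
x = y // 2  # floor division
assert x == 3

# Rust's tokenizer, however, treats `//` as the start of a line comment,
# so a macro reconstructing source from tokens alone would see only `x = y`
# on this line -- a different (and silently wrong) Python program.
```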
But then I decided to look at the docs:

https://doc.rust-lang.org/proc_macro/struct.Span.html

And I noticed source_text, which "preserves the original source code, including spaces and comments"!!!

Why not just use this from the start then?? Seems like the easy way out, no?

(Disclaimer: I don't know Rust, can't even write hello world.)


The author likely just barely missed its introduction. While the article was written recently, the implementation it talks about was first published in early April 2019, right about when source_text was first introduced into nightly.


I am the one who contributed Span::source_text to Rust. My motivation for doing that was exactly the same as the author of the blog post: to help implement the cpp! macro, which embeds C++ within Rust just like this python! macro. https://docs.rs/cpp/ However, I still can't make use of it today because it is not stable yet, and even proc_macro2 does not implement it (and this is not trivial to implement there).


Wow. The fact that you care about this being stable implies that you plan to use this "for real"; the fact that you actually implemented this implies you care enough to put in some real work. So you must have a use case in mind, and I am not imaginative enough to guess what it might be. Is it just about convenience so you don't need to bother with putting your C++ code in a different file? Or is there more to it?


I'm using this to write Qt bindings: https://github.com/woboq/qmetaobject-rs/

> Is it just about convenience so you don't need to bother with putting your C++ code in a different file?

Yes, mostly. I find that having the code in place makes a big difference. I do not like useless levels of indirection and context switches while coding. This way is much better than having to edit three files (the .cpp, the ffi module, and the caller) each time I want to make a call into C++, while making sure they are in sync.


That actually makes sense! Cool!


Using Span to get the token location is exactly what I needed! Thanks. I wrote a blog post[0] the other day about making a css macro that compiles into Rust for use in Yew, a front-end React-like framework.

> However, in rust, there’s no way to differentiate .a.b with .a .b

Now I know that the above is incorrect. I would have never thought of spans so I thank you again

[0] https://conradludgate.com/posts/yew-css/#what-are-the-downsi...


The blog post shows two Python snippets starting with "if True:" and different indentation and says the snippets "have a different meaning". However, in this case the difference between the snippets is mainly in their syntax and not in their meaning. The example would have been better if "if True:" was replaced by "if False:" or "if foo:".

        if True:
            x()
        y()


        if True:
            x()
            y()
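With the commenter's suggested `if False:`, the difference becomes observable rather than purely syntactic; a quick check:

```python
calls = []

def x():
    calls.append("x")

def y():
    calls.append("y")

# Dedented y(): runs no matter what the condition evaluates to.
if False:
    x()
y()
assert calls == ["y"]

calls.clear()

# Indented y(): skipped together with x() when the condition is false.
if False:
    x()
    y()
assert calls == []
```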


Good point! Updated.


What is the use case for embedding Python code in a Rust program?


There could be no reasonable use case for something like this, and yet it would still have artistic value and this would still be an interesting article.

It's a brilliant hack and we are on Hacker News after all.


That's exactly what I thought when I read the title. It's very frustrating how many landing pages, project readmes, blog posts, etc. don't answer the question "why?". I usually need to know that before going further when I come across something that I've never heard of before. If they don't have the "why?" somewhere easy to find, I just close the tab.


You know, that might not be unintentional. They might want those who don't see the value to just close the tab and move on. That way only those who see the value keep reading. I imagine it helps avoid having to engage in (often endless) arguments about the validity of the use case.


I am quite surprised by your message. This is HN and even before or outside of that, for years I have myself dived into totally random projects simply because "why not?".

To try random ideas is part of learning and when you have something to share, just do that. There will always be like minded someone to pick up.


Whenever I see someone ask these "whys", I have the urge to show them this.

https://www.youtube.com/watch?v=Y4hOIgRPlNU


It's a bit of a guilty pleasure for me to read these kinds of weird things with no practical application. It's just a little refreshing to read about something technically possible, but frivolous (no offense meant to the OP, of course), just because it can be done.


"Because it can be done" is a very solid answer to "Why?"


It helps me to step outside my own context and view a familiar use case, problem, or solution with an unfamiliar perspective.


Because it was there.


I’m just throwing out ideas but what if I wanted to take working python code and convert it to rust? Could I use this to start as the baseline and start replacing bits of it and see that it still produces the same output?


There was a recent ATP.fm podcast with Chris Lattner of Swift fame. He talked about the topic of embedding Python code in Swift for TensorFlow and it overlaps with your suggestion.


Swift for Tensorflow has implemented Python interoperability, rather than embedded Python code as in this article. A Swift program can import a Python module and interact with Python objects and functions as if they were native Swift objects (minus strong typing, of course). See https://github.com/tensorflow/swift/blob/master/docs/PythonI...


Rust has had Python interop through the Python C API since around 2015 [1]. It's pretty low-hanging fruit for any language that supports the C FFI. Rust-cpython has had simple macros for interop for ages [2], and there's even a library that uses serde macros to encode/decode pickled objects [3].

This article is just a fun hack using Rust macros.

[1] https://github.com/dgrunwald/rust-cpython

[2] http://dgrunwald.github.io/rust-cpython/doc/cpython/macro.py...

[3] https://docs.rs/serde-pickle/0.6.0/serde_pickle/
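As a tiny illustration of why C FFI interop is "low-hanging fruit", here is the same idea from the Python side, using ctypes to call straight into libc (this assumes a platform where find_library can locate the C library, e.g. Linux):

```python
import ctypes
import ctypes.util

# Locate and load the platform's C library.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Call a plain C function through the FFI;
# int arguments and return values are the ctypes default.
assert libc.abs(-5) == 5
```

The Python C API that rust-cpython wraps is exactly this kind of C-level surface, just going in the other direction.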


A friend of mine has a similar project called PyOxidizer that's intended for building standalone executables with Python+Rust: https://pyoxidizer.readthedocs.io/en/stable/

His primary use for it is building distributable binaries for Mercurial, which is written primarily in Python.


Plugins for example? Game scripts? Think of use cases for embedding Lua into a C program.


Maybe interfacing w an ml lib like pytorch or something?


I suppose you could, but I believe that there are direct bindings from Rust to PyTorch at https://github.com/LaurentMazare/tch-rs. I haven't used it personally, but I've heard fairly good things about its correctness and responsive maintainer.


I have written a transformer-based (BERT, XLM-RoBERTa) sequence labeler/lemmatizer/dependency parser in Rust with tch-rs [1] and it is great! Some parts do not feel rusty. E.g. dynamic typing of tensors, you get a Tensor rather than Tensor<f32>, but that's more due to how libtorch itself works. But it's a very straightforward API that exposes most of libtorch. The maintainer is also very nice and responsive.

I was also amazed how much is implemented in libtorch itself as opposed to the Python wrapper, which makes much of the Torch functionality available to other languages.

[1] https://github.com/stickeritis/sticker2 https://github.com/stickeritis/sticker-transformers


Someone might have jumped that hurdle but the point is it circumvents that class of hurdle. What about the next framework which doesn't have bindings? What about leveraging existing Pytorch code that's written in Python?


For pytorch it would probably be easier to do something based off the pytorch C++ interface.


Sure, I've never even used pytorch. But I'm just saying python has a fair bit of data analysis and ml tools that probably haven't found native homes in rust


I only had a quick scan through the article, but could this be used to create an executable for some existing Python code?


What would the point of that be versus wrapping it in a bash script? Obfuscation?


Python is missing a nice simple way to ship native executables. If you have to send people a package to unpack that's an extra step. In theory you can use freeze, but configuring it all is a pain.


I've only used it for small GUIs (a couple of files & dependencies) but `pyinstaller --onefile` is about as easy as it gets.


You can use pip to install the executable source files. Pip is preinstalled on most Linux distributions and macOS.


If imports worked I could see it being very useful for making graphs. I've heard of people serializing data in other languages and then using Python to plot it.


Imports work fine. The main reason I wrote this was to be able to use the Python matplotlib library in Rust:

https://twitter.com/m_ou_se/status/1120577172438233088


I can confirm this--Rust's data visualization and plotting libraries are still pretty early stage. You can generally do 2D bar charts and scatter plots pretty well, but once you start jumping into more complex representations or three dimensions, you start running into some serious barriers.


At FOSDEM there was a talk about boosting Python with Rust [1]. Might be interesting for the people checking this.

[1] https://fosdem.org/2020/schedule/event/python2020_rust/


Is it common to embed one language's code directly inside another language like this?

Lua is often used tightly-coupled to C, and it doesn't have the significant-whitespace issues that Python shows here. Even so, I've only ever seen it used in separate `.lua` files, never embedded directly within `.c` files.


I don’t think it’s common, nor is it a best practice, as you can’t apply static analysis to your embedded code. But it’s really cool that Rust allows it, and it can be an ad-hoc solution when you need to write a piece of code fast inside of Rust.


While I've seen people doing this, I've never done it. I've done the opposite in a small open source project of mine (arguably abandoned at this point due to lack of time and contributors). That said, there is huge potential here: your code may require some small operation that is cheap to execute but would take you 2 minutes to write in Python, while doing the same in Rust would take you an hour. I can think of some use cases for this. Nice article!


Why call Python from Rust? I think calling Rust from Python would make more sense. Use it to optimize Python functions, like Python libraries do with C (i.e. numpy).


Suppose I’m building an application in Rust that processes a lot of data, and for one of the steps I want to run that data through a Python tool like SpaCy or Flair.

How would I go about doing that? I could put the Python code behind a little http API and call it that way, but that’s a bunch of overhead and extra stuff to maintain just to analyse some text. If I embed said Python tools in my Rust code then I can call those tools with significantly less overhead and complexity.


Why does it have to have a little http API? Why not just spawn the python tool?


And communicate with it how?

If you’re suggesting spawning things over shell/cmd line I’m of the opinion that this a generally bad idea.


Just fork and exec. Or if you fancy, posix_spawn a separate process. Then communicate over pipes, or if you have a lot of outputs from these tools, collect results from the file system.

Really, where does this fear of spawning things come from?


Python module loading tends to be quite slow. Loading Tensorflow on our Heroku production environment takes 10 seconds (5 on my laptop). A client of mine running Rust and Python on Embedded Linux device found that loading numpy/pandas modules took over 5 seconds (laptop was under 1), and their computation took just x00 milliseconds. 10x overhead...

So the spawning of a short-lived subprocess approach has massive overhead, only suitable for multi-second workloads.


What's wrong with forking and using a pipe or socket?


Spawning an external binary does not necessarily involve the shell (if it did, how could the shell itself spawn things? It’d be shells all the way down)


> And communicate with it how?

The simplest approach is via a pipe to the process' stdin/stdout, I guess. Of course you have to (de)serialize your data, but you would have to do the same if you went the HTTP route, which seems far more complex. Furthermore, the suggested solution probably has a nice wrapper in the language's stdlib (e.g. `check_output()` in Python land).
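A minimal sketch of that pipe-based approach (the inline child script here just stands in for the external Python tool):

```python
import json
import subprocess
import sys

# Child script: read JSON from stdin, write JSON to stdout.
# Inlined for the example; in practice this would be the external tool.
child = (
    "import sys, json\n"
    "data = json.load(sys.stdin)\n"
    "json.dump({'doubled': [2 * n for n in data['nums']]}, sys.stdout)\n"
)

# Spawn the subprocess and exchange serialized data over pipes.
result = subprocess.run(
    [sys.executable, "-c", child],
    input=json.dumps({"nums": [1, 2, 3]}),
    capture_output=True,
    text=True,
    check=True,
)
assert json.loads(result.stdout) == {"doubled": [2, 4, 6]}
```

No shell is involved, and the same pattern works from Rust with `std::process::Command` and piped stdio.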


Ok, at that point, why not just write the Python directly in your Rust application?


Because the code smell reeks? You've greatly increased complexity. Did you read about debugging the whitespace? What a mess.


I wrote this [0] up a couple of weeks ago, which is a hack to put shell code in your rust, with reasons why you would want to.

[0]: https://neosmart.net/blog/2020/self-compiling-rust-code/


I'm having issues seeing the whole post, when I scroll down i just see white, and when I click the subject headers from the menu it jumps down half the page and the mouse wheel scroll gets locked. I'm using Chrome on MacOS.


Lua is a better fit for Rust, because it doesn't have the threading restrictions of (C)Python. With Lua you can run a thousand parallel vms in the same process if you want.

But the macro hacks are impressive!


So I am not the only one with questions regarding its use case? I can't seem to conjure any.


Quickly introducing battle-tested data science frameworks, snippets and patterns without having to replicate them in Rust.


There's a typo where stringify! is referred to as strinfigy!

Cool article btw!


Thanks! Updated.


Am I the only person having trouble reading this blog? That color scheme makes my eyes bleed.

It seems like it's a light gray background with a slightly (not much) darker gray text. The contrast and thin font weight is terrible.

And the pink is practically vibrating on the page.

At least the blue is okay.



