One of the painful aspects of WASM is that there are no blocking calls. You can't say "wait for the next event"; instead you must return to the outermost event loop and wait to be called back.
How does Python-in-WASM work around that? For example, how does `for line in sys.stdin:` work if you can't actually block on stdin?
Emscripten has some support for this via the "asyncify" transform, which layers on additional control flow so that code can return all the way up the call stack and then "rewind" back down into it. But this bloats the code (and is also buggy), so maybe it's not being used.
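As a rough analogy only (this is not how Emscripten implements Asyncify), Python generators show the same "unwind to the event loop, then rewind into the call stack" shape. The names `read_line`, `read_record`, `program` and `event_loop` are made up for this sketch:

```python
from collections import deque

def read_line():
    # Suspend the whole call chain and ask the outer loop for a line.
    line = yield "need-input"
    return line

def read_record():
    # An intermediate frame: it has to be suspendable too, which mirrors why
    # every function on the path to a blocking call needs transforming.
    first = yield from read_line()
    second = yield from read_line()
    return (first, second)

def program(output):
    record = yield from read_record()
    output.append(record)

def event_loop(task, pending_input):
    """Drive `task`, resuming it each time an input line is available."""
    request = next(task)  # run until the first suspension
    try:
        while True:
            assert request == "need-input"
            # In a browser we would return to the real event loop here
            # and be called back later with the line.
            request = task.send(pending_input.popleft())
    except StopIteration:
        pass

output = []
event_loop(program(output), deque(["alpha", "beta"]))
print(output)  # [('alpha', 'beta')]
```

The code inside `program` reads as if it blocks, but control actually returns to `event_loop` at every `yield`.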
Yes, currently input goes into a prompt() and it doesn't output anything unless you hit "Cancel" on the prompt, definitely a bad time.
Python allows you to reach in and replace the core interpreter loop, so this may be an avenue to have our own asyncify-like function pop out to JS land and restore state correctly (which we can be smart about since we are the interpreter).
This is definitely the hardest part of getting Python to work. Well, hardest after the hardest part of building a compiler toolchain like Emscripten :)
In WebAssembly.sh (https://github.com/wasmerio/webassembly.sh) they run WASM binaries in a Web Worker and then use `SharedArrayBuffer` to block the WebWorker while the main thread does some work (e.g. collect input). You could use a similar solution.
When building Runno (https://runno.dev) I forked off that project and did a bunch of other things on top to get blocking to work in Safari and non-cross-origin-isolated contexts.
Ultimately I think it's JavaScript's (or whichever host language) responsibility to block when the binary calls out (if that is the expected semantics).
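To make the pattern above concrete, here is a minimal sketch using Python threads as a stand-in. In a browser the worker side would call `Atomics.wait` on an `Int32Array` backed by a `SharedArrayBuffer` and the main thread would call `Atomics.notify`; the `InputChannel` class below is hypothetical and only mimics that handshake with `threading.Condition`:

```python
import threading

class InputChannel:
    """Stand-in for a SharedArrayBuffer plus Atomics.wait/Atomics.notify."""

    def __init__(self):
        self._cond = threading.Condition()
        self._line = None

    def blocking_read(self):
        # Worker side: sleep until the host publishes a line (no spinning).
        with self._cond:
            while self._line is None:
                self._cond.wait()
            line, self._line = self._line, None
            return line

    def publish(self, line):
        # Host side: store the line and wake the sleeping worker.
        with self._cond:
            self._line = line
            self._cond.notify()

channel = InputChannel()
received = []

def worker():
    # Plays the role of the WASM binary running in a Web Worker.
    received.append(channel.blocking_read())

thread = threading.Thread(target=worker)
thread.start()
channel.publish("hello from the main thread")
thread.join()
print(received)  # ['hello from the main thread']
```

From the worker's point of view `blocking_read()` is an ordinary synchronous call, which is what a binary expecting blocking stdin needs.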
Does the SharedArrayBuffer approach offer any way to tell the OS scheduler to wait? Because the only waiting method I know of is busy waiting, aka wasting cycles at 100% of a CPU core. In normal processes you can call a sleep function, which saves CPU cycles, but there is no synchronous method for that available in javascript.
It's on the embedder (Wasm VM) to provide this functionality. I'm working on a Wasm runtime [0] that is written in Rust and uses stack switching to allow you to call Rust async functions as if they were blocking. This keeps the Wasm bytecode simple (blocking), but at the same time provides high performance i/o.
There is also a proposal to bring stack switching to the browser.
I know Absurd SQL[0] uses SharedArrayBuffer and Atomics to turn the async IndexedDB into sync for use by Wasm. I wonder if it’s possible to use that here too, although it’s obviously a little different?
In my experience, asyncify works pretty well everywhere but Safari on macOS/ARM or iOS/ARM, where the stack sizes are too small to be useful. You do want to be a bit careful about where you block, which can minimize the number of functions that need to be transformed.
If you've found any bugs in Asyncify please file them! There are no open issues atm about any general bugs, aside from some corner cases with features like dynamic linking.
The PyPy team demoed a networked multiplayer browser-compiled Python game at EuroPython in 2006. Anyone remember the details? Seems it has dropped off the face of Google.
Yep! I definitely want to build on and learn from existing patched versions of Python running in the web. Do you know what you folks do for synchronous I/O calls?
Of course. It will be nice to have the upcoming Python 3 version of Ren'Py based on the main branch, so we don't have to maintain patches.
Right now, most of the I/O is synchronous - the files are downloaded to the browser before a game starts, so all of the calls are fast, so far as I can tell, as they're happening within the browser.
Output is through SDL, and there's a call into a cython-defined function that calls __emscripten_sleep, with the path of calls to it listed in the ASYNCIFY_WHITELIST. That's the only place we block. (It's a bit late, so I might be misremembering the exact emscripten function.)
It's a pity WASM and all the tech around it wasn't available 20 years ago when Javascript was added to the browser, we'd all be using python now and node.js wouldn't exist at all ;)
Javascript was not fast until Google and others poured money into it because it was the only choice for the browser. Presumably Python would have had the same experience.
For Netscape in 1995, I believe Python would have been a better choice than JavaScript, but Lua would have been an even better choice, given how much smaller, simpler, and more efficient Lua is, and the eventual excellence of LuaJIT. (If only Lua indexed its array from 0 instead of 1...)
But Python and Lua didn't "look like Java" enough for Netscape.
>In 1990, Sun played with the idea of putting a PostScript interpreter in the SunOS kernel.
>Like NeWS was the Network extensible Window System, so NeFS was the Network extensible File System, or NFS 3.0.
>It was actually a great idea, just a wee bit before its time, and very poorly named and positioned!
>For example: If you want to make a copy of a file on the server, you can send a PostScript program that runs in the kernel and copies the file locally on the server in the kernel with ZERO context switches, instead of sending it over the net to the client, then back from the client to the server. Even if you rsh'ed the user command "cp" on the server, it would still incur context switching, but if your copy loop was running in the kernel then it didn't need to switch in and out and in and out for every block it copied.
>There are more examples of why it's a great idea in the paper.
>This comparison of NeWS to AJAX also applies NeFS, which is like kernel NeWS with file operations instead of a graphics library -- it also saves you lots of user/kernel context switches even if you're not doing any networking:
>>NeWS was architecturally similar to what is now called AJAX, except that NeWS coherently:
>>- used PostScript code instead of JavaScript for programming.
>>- used PostScript graphics instead of DHTML and CSS for rendering.
>>- used PostScript data instead of XML and JSON for data representation.
>It didn't go over very well because the unenlightened philistines of the time couldn't get their head around an API to the file system that wasn't compatible with creat open close read write and ioctl.
>Network Extensible File System Protocol Specification
>1.0 Introduction
>The Network Extensible File System protocol (NeFS) provides transparent remote access to shared file systems over networks. The NeFS protocol is designed to be machine, operating system, network architecture, and transport protocol independent. This document is the draft specification for the protocol. It will remain in draft form during a period of public review. Italicized comments in the document are intended to present the rationale behind elements of the design and to raise questions where there are doubts. Comments and suggestions on this draft specification are most welcome.
But if not PostScript, Python, or Lua, then at least Netscape didn't use TCL in the browser. Around 1994, long after NeWS and right before Java, Sun announced they were going to make TCL the official scripting language of the world wide web, which triggered RMS into kicking off the Great TCL War:
>And with that diplomatically worded message, RMS kicked off The Infamous TCL War.
That was Stallman's response to Sun bombastically pushing TCL as the official scripting language of the web, BEFORE Live Oak / Java was a widely known (or evangelized) thing.
>At the point anybody started talking about a Java/TCL bridge, it was already all over for TCL becoming the "ubiquitous scripting language of the Internet".
>Sun's unilateral anointment of TCL as the official Internet scripting language triggered RMS's "Why you should not use Tcl" message, which triggered the TCL War, which triggered Sun to switch to Java.
>After the TCL war finally subsided, Sun quietly pushed TCL aside and loudly evangelized Java instead. The TCL community was quite flustered and disappointed after first winning the title "ubiquitous scripting language of the Internet" and then having the title yanked away and given to Java.
>Any talk of bridges were just table scraps for TCL, the redheaded bastard stepchild sitting outside on the back porch in the rain, smoking a cigarette and commiserating with NeWS and Self.
>Tom Lord's description of what happened is insightful and accurate:
>[...] Mr. Ousterhout had, a few years prior, developed Tcl while on the faculty of UC Berkeley - mainly, I think, to have a handy tool for other research and only secondarily as an experiment in language design. And he topped it off with Tk. Tcl/Tk took off in a huge way. It was easy to understand. The source code, written in Mr. Ousterhout's methodical and lucid style, was a joy to read. At the time, about the most convenient option for developing a GUI to run on a unix system was to write C code against the Motif toolkit - an ugly, expensive, and frequently disappointing process. With Tcl/Tk in hand, people started handing out new "mini-GUIs" for this and that, like candy. Tcl/Tk started to find application in some rather intense areas, like, for example, the "control station" software for some oil rigs. It was a smash hit.
>Meanwhile, I don't think I'm letting too many cats out of the bag here, the informal Silicon Valley social network of well placed hackers were quietly and unofficially circulating some very interesting confidential whitepapers from Sun Microsystems. One of their researchers, a fellow called Mr. Gosling, had dusted off a language he'd once led the design of called "Oak". Oak was originally intended for use in embedded systems. Its basic premise was that devices ought to be Turing complete and hackable, whenever possible. Oak's approach to statically verifiable byte-code comes from that origin. Mr. Gosling came out of Carnegie Mellon University and the attitude behind Oak was popular there. As one grad student had quipped a few years earlier: "If a light switch isn't Turing Complete I don't even want to touch it."
>In light of the rising star of web browsers, the folks at Sun conceived the notion of offering up a derivative of Oak to serve as the extension language for browsers. (It is probably worth mentioning here that Mr. Gosling was earlier well known for making one of the very first unix versions of Emacs.) Oak was re-named "Java" and the rest of its history is fairly well known.
>I've read, since then, that up to around that point Brendan Eich had been working on a Scheme-based extension language for Netscape Navigator. Such was the power of the hegemony of the high level folks at Sun that the word came down on Mr. Eich: "Get it done. And make it look like Java." Staying true to his sense of Self, he quickly knocked out the first implementation of Mocha, later renamed Javascript. This phenomenon of Sun's hegemony influencing other firms turns out to be a small pattern, as you'll see.
>Mr. Ousterhout was hired by Sun (later he would spin off a Tcl-centric start-up). The R&D team there developed a vision:
>Java would be the heavy-lifting extension language for browsers. The earliest notions of the "browser as platform" and "browser as Microsoft-killer" date back to this time. Tcl, Sun announced, was to become the "ubiquitous scripting language of the Internet". Yes, they really pimped that vision for a while. And it was "the buzz" in the Valley. It was that pronouncement from the then-intimidating Sun that led to the Tcl wars.
>Mr. Eich, bless his soul, brute-forced past them, abandoning Scheme and inventing Javascript. [...]
Ah yes, it's a shame every website doesn't have to download and execute a python interpreter so it can slowly read scripts. Having the browser freeze after clicking a button would be a way better experience.
WASM is meant to allow for fast/efficient programs written in low level languages like C/Rust to run in the browser, not subject the client with slow clunky experiences because the developer only learned Python.
Guido responded and wondered if this could be integrated into github.dev for Python work in the browser without remote compute. That’s a very cool idea, but I wonder if this would work any better with the usual suspects (pandas, numpy, matplotlib) than the other attempts like pyodide which make more modifications to CPython.
Yes exactly - numpy, matplotlib etc available in the browser via sensible JS interfaces could significantly impact the front-end ecosystem IMO. Exciting times and great work from the community.
I imagine the biggest downside of this approach is simply the size of the CPython implementation. Does anyone know how big it is when compiled to WASM?
I wonder if anyone has tried the same approach using MicroPython.
> I’d expect MicroPython (or Lua/mruby/etc) could be an order of magnitude smaller. Still larger (and slower) than just using JavaScript, though.
Fengari [0], a Lua interpreter written in JS, is a little over 200 KB. (And it was intentionally written in JS [1] because of a variety of reasons that made WASM not work that well.)
200 KB isn't that bad of a price to pay to switch languages, on most websites. It'll be about the cost of a single image added to the page. And it's fairly performant.
For most sites, the costs in terms of requests and performance will be negligible compared to what you're trying to achieve.
And Fengari makes it nice and easy to interact with JS, too. Using React with Lua's syntax was what sold me on it. No ecosystem lockout, like I'd expect with most WASM ports.
I once had to squeeze CPython down for embedding into a mobile app. I ran our workload under strace so I could include only the needed parts of the stdlib, and ended up with just under 3MB zipped. That's probably about the theoretical size limit.
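The comment above used strace for this. A cruder in-process alternative (a different technique, named plainly) is to diff `sys.modules` around the workload. In this sketch `modules_imported_by` and `example_workload` are hypothetical names, and modules already imported before the snapshot will not show up in the result:

```python
import sys

def modules_imported_by(workload):
    """Return the names of modules first imported while `workload` runs."""
    before = set(sys.modules)
    workload()
    return sorted(set(sys.modules) - before)

def example_workload():
    # Hypothetical workload: anything your application actually does.
    import json
    json.dumps({"ok": True})

needed = modules_imported_by(example_workload)
print(needed)  # depends on what was already loaded in this process
```

Unlike strace, this only sees Python-level imports, so shared libraries and data files opened directly are missed.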
This is it. This is the beginning of a revolution. Prepare your fork-picks.
Jokes aside, the JS ecosystem really needs a competitor. Web developers have been cutting corner after corner for decades, with ever increasing disregard for performance and memory consumption.
Now with both Python and Rust in the browser, things may change for the better.
This looks exciting. If anyone knows how to, perhaps they can report how much space this uses in the browser? Some benchmarks? Someday soon I hope we can use this for developing code on the browser to aid with web applications.
It is definitely too early for benchmarks; this is an "I got it working!" update.
The original data file with all of the standard library was a bit over 200MB. Slashing what isn't going to be run in the browser (e.g. tkinter) and zipping the standard library got it down to about 20MB. There is probably more that could be removed, and there are modules we don't need to build that we currently do. There are other things we can do like set the less frequently used modules to be loaded asynchronously.
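Zipping works because CPython can import modules straight from a zip archive placed on `sys.path`. A minimal sketch of that mechanism, where the helper `import_from_zip` and the module name `zipped_greeter` are made up for illustration and this is not the actual build setup:

```python
import importlib
import os
import sys
import tempfile
import zipfile

def import_from_zip(module_name, source):
    """Pack `source` into a zip archive and import the module from it."""
    workdir = tempfile.mkdtemp()
    archive = os.path.join(workdir, "bundle.zip")
    with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.writestr(module_name + ".py", source)
    # A zip file on sys.path is searched much like a directory.
    sys.path.insert(0, archive)
    try:
        return importlib.import_module(module_name)
    finally:
        sys.path.remove(archive)

greeter = import_from_zip(
    "zipped_greeter",
    "def greet():\n    return 'hello from a zip'\n",
)
print(greeter.greet())  # hello from a zip
```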
While I doubt this will be production ready "soon", I do hope to keep working on fixing bugs and such.
Brython is a complete re-implementation of Python, thus it doesn't support some features/libraries (at least, it didn't when I tried it last), and is not compatible with C extensions.
The demo I put in the tweet is the same code as when you type `python3` in the terminal, just running in the browser. So it is much more compatible and is mostly [1] feature complete.
[1] minus whatever libraries are likely never to be used that we ripped out
WASM can already access the DOM by calling out into Javascript. AFAIK interface types wouldn't change that approach much, except that the mapping between numeric ids and Javascript object references doesn't need to be handled by the JS shim anymore.
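The "mapping between numeric ids and Javascript object references" is essentially a handle table. A sketch of the idea in Python, where `HandleTable` and `set_text` are hypothetical names and a plain dict stands in for a DOM node:

```python
class HandleTable:
    """Maps small integer ids to host objects, as a JS shim would for DOM nodes."""

    def __init__(self):
        self._objects = {}
        self._next_id = 1

    def register(self, obj):
        # Hand out an integer that can be passed across the WASM boundary.
        handle = self._next_id
        self._next_id += 1
        self._objects[handle] = obj
        return handle

    def lookup(self, handle):
        return self._objects[handle]

    def release(self, handle):
        # The guest must say when it is done, or the object is kept alive.
        del self._objects[handle]

table = HandleTable()

def set_text(handle, text):
    # An "imported function": the guest only ever sees the integer handle.
    table.lookup(handle)["text"] = text

node = {"tag": "p", "text": ""}
handle = table.register(node)
set_text(handle, "hello")
print(handle, node)  # 1 {'tag': 'p', 'text': 'hello'}
table.release(handle)
```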
So the answer to the question "Can WASM access the DOM?" is both yes and no, always has been, and probably always will be ;)
The beauty of WebAssembly is that you don't need Google's permission to add support. Just send your Wasm blob to the browser and Chrome's existing Wasm runtime will just run it.
My knowledge is years out of date, but does WASM still require the application to request its maximum memory footprint up-front? Granted, that's what Sun/Oracle's JVM has been doing to allocate its heap from the OS for well over a decade, but I'm also not aware if WASM is able to use the equivalent of madvise() to tell the browser/OS that it's fine to unmap a region of memory and map it back zeroed-out when it's next needed.
Yep, you need to specify the maximum memory amount up-front. It's defined in "WebAssembly memory pages", and each page is 64 KiB. You need to specify an initial and a maximum amount. The WebAssembly module can call memory.grow() to grow it by a given number of pages until it reaches the maximum. You can't "un-grow" or decrease the amount of allocated memory, though.
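The page arithmetic can be sketched with a toy model. `LinearMemory` below is a hypothetical Python class, not a real WebAssembly API; it mimics the `memory.grow` instruction, which returns the previous size in pages or -1 when the request would exceed the maximum:

```python
PAGE_SIZE = 64 * 1024  # bytes per WebAssembly page (64 KiB)

class LinearMemory:
    """Toy model of a WASM linear memory with initial and maximum page counts."""

    def __init__(self, initial_pages, maximum_pages):
        if initial_pages > maximum_pages:
            raise ValueError("initial size exceeds maximum")
        self.maximum_pages = maximum_pages
        self.data = bytearray(initial_pages * PAGE_SIZE)

    @property
    def pages(self):
        return len(self.data) // PAGE_SIZE

    def grow(self, delta_pages):
        """Grow by `delta_pages`; return the old size in pages, or -1 on failure."""
        previous = self.pages
        if previous + delta_pages > self.maximum_pages:
            return -1
        self.data.extend(bytes(delta_pages * PAGE_SIZE))
        return previous

memory = LinearMemory(initial_pages=1, maximum_pages=3)
print(memory.grow(1), len(memory.data))  # 1 131072
print(memory.grow(2), memory.pages)      # -1 2
```

There is deliberately no `shrink` method, matching the point that allocated memory cannot be given back.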
You'd need a pretty solid reason to want users to download the Python engine to run your code in their browser every time they visit an updated version of your site though. "I like writing Python better than JS" would be a sucky excuse.
If anyone does choose to do this I hope they spend a significant amount of effort making their caching and code splitting optimal.
Say what you will about Jupyter notebooks and all that, but the talent pool for Python is still at a higher level. Then, this could be the worse-is-better equilibrium, but there's also a market-for-lemons situation regarding web dev these days.
My impression was that top-dollar was being paid for web-devs, with competition from some of the biggest tech giants driving the trend. Not sure that's a market for lemons, unless you are talking about the lower-end of web dev.
What's the caching story like? Is it possible to cache the Python interpreter in one (unchanging between apps) blob and then send another blob with your app-specific code? I’m imagining a world where lots of apps want to use WASM Python but don’t want to have to ship the whole interpreter with their page.
Cross-Origin resource caching has been disabled by all modern browsers by now. So your page could use the same cached Python interpreter again and again, but you and example.com would each have to download the interpreter, even if it comes from the same URL.
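A simplified model of why that happens: the cache key includes the top-level site, not just the resource URL. `PartitionedCache` and the URL below are hypothetical, and real browsers key on more than these two fields:

```python
class PartitionedCache:
    """Toy HTTP cache keyed by (top-level site, resource URL)."""

    def __init__(self):
        self._entries = {}
        self.downloads = 0

    def fetch(self, top_level_site, url):
        key = (top_level_site, url)
        if key not in self._entries:
            # Cache miss: this site has never loaded the resource itself.
            self.downloads += 1
            self._entries[key] = f"<contents of {url}>"
        return self._entries[key]

cache = PartitionedCache()
interpreter_url = "https://cdn.example/python.wasm"  # hypothetical URL

cache.fetch("a.example", interpreter_url)  # first visit: download
cache.fetch("a.example", interpreter_url)  # same site again: cache hit
cache.fetch("b.example", interpreter_url)  # different site: download again
print(cache.downloads)  # 2
```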
I don’t see how that could be done without nerfing a lot of the power of JS. JS was bred by the web and is entirely async. Python has no concept of callbacks or promises etc. JS is perfect for frontend design because that’s what it was created from the ground up for.
Python has always had the ability to do callbacks (ie. first class functions), though admittedly it's not a common pattern and the stdlib doesn't support it for IO (also its anonymous function syntax is poor).
There have been several libraries that provide the concept of a promise, but with modern python these are built into the stdlib with native async/await support, though many interfaces are still sync-only (though you can work around that with things like gevent that monkey-patch those interfaces to work with an event loop under the hood).
I agree that python would be a poor fit for the kind of uses that JS in the browser usually serve, but it's really a matter of ecosystem, not core language design.
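For illustration, both styles mentioned above fit in a few lines of standard-library Python. The function names are invented for this sketch:

```python
import asyncio

def on_done(result, log):
    # A plain callback: functions are first-class values in Python.
    log.append(f"callback got {result}")

def compute_with_callback(x, callback, log):
    callback(x * 2, log)

async def compute_async(x):
    await asyncio.sleep(0)  # yield to the event loop once
    return x * 2

async def main():
    # gather() runs both coroutines concurrently on one event loop.
    return await asyncio.gather(compute_async(1), compute_async(2))

log = []
compute_with_callback(21, on_done, log)
results = asyncio.run(main())
print(log, results)  # ['callback got 42'] [2, 4]
```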
i don't know the story under wasm, but i looked into what it would take to embed python into a browser years ago.
the hard part at the time was obviously all the hooks between the dom and the javascript runtime as well as concurrency story. python 2 was not built to be driven by callbacks, which is how the whole browser/javascript ecosystem works.
Nah, just catching up with what was already possible with ActiveX, Applets, Flash and PNaCl, just more cross browser and political acceptance across all parties.
Careful, you should reread the HN posting guidelines - specifically "Please don't use HN primarily for promotion. It's ok to post your own stuff occasionally, but the primary use of the site should be for curiosity." This site is for discussion, not promoting your startup, and a large portion of your comments have been just that.
Thanks for the feedback, but I'm doing this to try to help game developers solve a major problem, half of the show HN and 100% of the launch HN are startups trying to make money anyways and the point you're trying to make never gets brought up.
Think about it this way - if it solves a very real pain point for developers, which is the primary demographics of this site, then why try to suppress that..?
Maybe take that to heart then, and figure out what's missing in your messaging (or audience? there's intersection with game development here, but this is not a forum for game development).