>the initial version of WebAssembly is a terrible target if your language relies on the presence of a garbage collector.
Lack of GC was one of the appealing parts of WASM to me. Keep it simple. It's good to be careful about how you use memory on your visitors' machines, so you'd better spend a lot of thought on memory management.
Counter-intuitively, I feel as though GC results in better memory usage in this situation.
Option 1: you rely on every single developer to be careful and make no memory-leaking mistakes.
Option 2: you take this problem out of the hands of the specific application devs and give it to experts in memory management, who write a GC.
If this platform is going to be the future of distributed computing, just imagine the number of terrible devs who will be releasing their code upon you.
> Option 1: you rely on every single developer to be careful and make no memory-leaking mistakes.
Nobody is expecting every single developer to carefully write their own garbage collector. WASM is a compilation target for other languages. The language you're compiling defines how memory management works, be that C / Zig (do it yourself), Rust (compile-time borrow checker), Swift/ObjC (RC/ARC) or Go/JS/Java/C# (full garbage collector built into the language).
For GCed languages like Go or C#, the design choice WASM makes is between:
1. The compiler injects its own garbage collector into the created webassembly modules, just like it does when compiling native binaries.
or 2. The webassembly virtual machine provides a garbage collector that any compiled code can use.
The benefit of (1) is that it makes webassembly virtual machines simpler and safer. (This is what the GP comment wants). (Also compilers can do language-specific optimizations to their own GC.)
But (2) has a bunch of benefits too:
- The GC can be higher performance (it's not written in webassembly).
- Interoperability between WASM bundles written in different languages is much better. E.g. it'd be easier for Go in a WASM bundle to talk to JS in the browser, or for Go and C# code to interoperate via WASM.
- The .wasm modules for GCed languages will be much smaller, since they don't need to compile in a garbage collector.
The Go/Java/C# you write will be the same. The difference is who provides the garbage collector. Your compiler, or the WASM virtual machine?
This comment repeats a fundamental error and misunderstanding: WASM is (currently) not able to be a compilation target for an efficient GC implementation. It lacks features needed to be able to implement GCs reasonably, like memory barriers.
So 1. is out of scope if you don't want a bloated super slow language runtime running on top of WASM (like e.g. Blazor).
The issue has been known since the very beginning. Nobody in charge is willing to solve it.
Also, 2. is not happening, even though it has been discussed for many years.
But that actually makes "sense" from the WASM people's perspective: WASM was never meant to be concurrency to JS in the browser!
WASM is merely a performance booster for where JS sucks (like numerical code).
That WASM will let you use other languages (and their ecosystems) in the browser is just a daydream of some. The companies behind "the web platform" won't give up on their billions worth of investments in the JS ecosystem.
So no, GC languages other than JS won't ever run "natively" in the browser in a meaningful way. WASM is crippled on purpose in this regard, and as long as the current stakeholders continue to control "the web platform" this won't change.
> WASM is (currently) not able to be a compilation target for an efficient GC implementation.
Even if it was, there are other problems that mean the dream of being able to use alternative languages and runtimes on the web is far off, if it even ever happens at all.
As a hobby I'm writing a design doc for an alternative, non-web system. It enumerates some of those problems and proposes solutions, along with addressing other dissatisfying aspects of the web (e.g. the unimplementably large size of the web specs).
It's designed to be a very lightweight set of layered specs and projects that are way cheaper to implement than HTML5, can be developed and deployed incrementally whilst providing value from the start, and which places other runtimes on a level playing field vs HTML. It also addresses many of the details you need to tackle for any serious web-like system, such as transience, sandboxing, a tabbed WM, cache privacy, portability and hyperlinking.
I sent it to Ian Hickson, who found it interesting, but of course the sticking point is funding models. Being way cheaper to implement than the web doesn't mean it costs nothing to implement. The web benefits from the largesse of rich patrons; any alternative would need to either find a patron or find some business model that lets it grow in quiet corners until it's strong enough to be fully competitive.
> This comment repeats a fundamental error and misunderstanding
Unless I'm mistaken, you seem to be vigorously agreeing with me about performance:
I said that implementing a GC inside WASM right now is possible (eg Blazor, wasmer-go) but slower and bigger than if a GC was built into the wasm virtual machine.
You said:
> WASM is (currently) not able to be a compilation target for an efficient GC implementation. It lacks features needed to be able to implement GCs reasonably, like memory barriers. So 1. is out of scope if you don't want a bloated super slow language runtime running on top of WASM (like e.g. Blazor).
... which reads to me like, "it's possible (e.g. Blazor), but doing it the current way makes it bloated and super slow". I agree!
> Also 2. is not happening, even it was discussed also since many years.
As another commenter pointed out, wasm-GC is in the implementation phase. It's already supported in Firefox and Chrome, though in both cases behind a feature flag.
What I understood was that GC in WASM "is totally possible right now". Which it isn't.
> As another commenter pointed out, wasm-GC is in the implementation phase. Its already supported in Firefox and Chrome, though in both cases behind a feature flag.
I don't believe anything meaningful will happen there. The "GC support" was "announced" right when WASM was introduced. Half a decade later, nothing has happened. A high-end GC is even right there, in the JS runtime, and all that would be needed to use it is handing over a handful of API wrappers.
I read a little on that topic last year because I wanted to know the current state and why it's taking forever to implement this triviality. But all I can see, everywhere, are stalling tactics… People keep coming up with endless "but"s. For years now. More or less since the first day WASM has existed.
I therefore came to the conclusion: this multi-language promise of WASM just won't happen (in any meaningful way). The people behind "the web platform" (Google) are mostly not interested in making web applications just a poor man's "Java WebStart" and "the web platform" just an arbitrarily replaceable language runtime. The moment you could run any language (and its ecosystem) on the web, there wouldn't be any real incentive to invest in web apps on "the web platform"—and Google's empire would fall apart.
> What does concurrency have to do with any of this?
Nothing I guess. :-D
I am not a native speaker. I fell for a "false friend"...
I wanted to say: "WASM was never meant to be competition to JS in the browser."
> I wanted to say: "WASM was never meant to be competition to JS in the browser."
I hear what you’re saying, but there’s no evil javascript lobby group running around trying to stop other languages from becoming viable in the browser. Google doesn’t care - they don’t make less money from advertising if Go becomes a viable language for frontend web applications. Google, probably more than any of the other big tech companies, is led by engineers. And I think lots of googlers really dislike javascript and would love to have other viable options.
So why has it taken years to get GC in wasm? After attending a few IETF meetings, I'm increasingly convinced that decisions take time proportional to the number of people in the room. The wasm working group includes all the browser vendors - Google, Microsoft, Apple, Mozilla - and a lot of other companies and individuals. That's going to make any big changes to wasm take longer than everyone wants, even if everyone is on board with the proposal.
Rust is suffering from the same thing. Their inclusive decision making process has made language evolution slow to a crawl in the last few years as more and more people have put up their hands to get involved. There’s too many cooks in the kitchen and they’re getting in each other’s way.
Another take on this is Hanlon’s Razor: Never ascribe to malice what can be explained by stupidity. Nobody is trying to undermine the GC proposal. It’s just slow going. Implementations exist already (behind feature flags). Hopefully wasm-gc is released before the end of the year. We’ll see.
It lowers the surface area for bugs (and thus vulnerabilities). WASM is gloriously simple right now. Adding a garbage collector dramatically increases the security surface area.
I do believe that the tradeoff is well worth it though. We can write very high quality software (think of JVM, V8, etc) when the incentives are different than the 163748th CRUD app.
Otherwise we could just as well use a trivial brainfuck interpreter as target, that won’t have a vulnerability ever.
Do you have an opinion on which direction is more likely? Naively I'd say 2 would be better, since it'd require less language-specific toolchain work and give a better dev experience for the end developer.
Option 1 - Webassembly without a built-in garbage collector - exists today. You can compile Go or C# to WASM today, and it will output a WASM module with an embedded garbage collector.
There's a draft proposal to add a built in garbage collector (option 2) to WASM. I have no idea what the current status is - maybe someone involved can chime in. I suspect some version of the WASM GC spec will land eventually.
When it does, languages like Go, C# and Java should start to become competitive with javascript as frontend web languages.
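To make that concrete, here's a rough TypeScript sketch of what option 1 looks like from the browser today, assuming a module built with Go's standard wasm target and loaded through its wasm_exec.js glue. File names and the ambient `Go` declaration are just placeholders for that glue, not anything WASM itself defines.

```typescript
// Rough sketch of option 1 as it exists today: a Go program built with
//   GOOS=js GOARCH=wasm go build -o main.wasm
// ships the whole Go runtime, including its garbage collector, inside main.wasm.
// The browser only supplies linear memory plus the wasm_exec.js glue.
declare const Go: new () => {
  importObject: WebAssembly.Imports;
  run(instance: WebAssembly.Instance): Promise<void>;
};

async function runGoModule(): Promise<void> {
  const go = new Go(); // glue object provided by Go's wasm_exec.js
  const { instance } = await WebAssembly.instantiateStreaming(
    fetch("main.wasm"),  // module with the Go GC compiled in
    go.importObject      // imports the embedded runtime expects
  );
  await go.run(instance); // the module's own GC now runs inside its linear memory
}
```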
WASM GC is already in the implementation phase[1], which means, IIUC, that it's only a matter of time before it makes it into the official standard (i.e. the spec itself is mostly done now). The implementations are behind a feature flag in Chromium[2], and Firefox seems to be far along in making it available - see the current list of WASM-related issues they're working on[3].
My experience with Option 2 is that unless you have memory to spare, you'll always reach a point where you need to be GC-aware. Those objects you kept a reference to? The GC can't collect them. That hot path that does allocations? Better implement an object pool.
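Something like this minimal TypeScript sketch is what I mean by an object pool in a hot path; all names are made up for illustration.

```typescript
// Minimal object-pool sketch for a hot path that would otherwise allocate
// (and later be collected) on every call.
interface Particle { x: number; y: number; alive: boolean; }

class ParticlePool {
  private free: Particle[] = [];

  acquire(): Particle {
    // Reuse a previously released object instead of allocating a fresh one.
    const p = this.free.pop() ?? { x: 0, y: 0, alive: false };
    p.alive = true;
    return p;
  }

  release(p: Particle): void {
    p.alive = false;
    this.free.push(p); // deliberately kept reachable, so the GC leaves it alone
  }
}

const pool = new ParticlePool();
const p = pool.acquire(); // hot path: no allocation once the pool is warm
pool.release(p);
```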
So when someone comments "Or we could use a language that doesn't let you write memory leaks using the type system, like Java" then we can reply that that's not true about Java. But in this comment thread people only wrote that about Rust.
...which is also the most likely reason for a memory leak when you manage your memory through refcounting (or rather: with refcounting, memory leaks are just as unlikely as with a GC).
But even without either, these days (with memory debuggers and profilers integrated into IDEs) pretty much all memory-related problems are 'trivially debuggable'; this stuff isn't as scary anymore as it was 20 years ago (e.g. Xcode's Instruments has a leak detector which lets you click through to the line of code which caused a memory leak, and Visual Studio has similar features).
Memory leaks have never been scary with tracing GCs and they are not too scary with RC either, though I would guess that leaks from cyclic references are definitely a top-10 bug for Swift et al. Memory corruption has been/is the real issue.
I've definitely seen janky webapps with the jank persisting even after all resources are fetched. Modern JS engines are very fast, but developers are faster (in pulling in quadratically more dependencies, negating all performance improvements instantly).
Are you sure everything has been fetched and processed? Sometimes this happens dynamically. Compare that jank to an electron app where all artifacts are preloaded and preprocessed.
To balance that, I've also seen jank in native mobile apps where developers haven't properly used the background thread, or done something strange in interface builder where the constraints don't solve properly.
> in pulling in quadratically more dependencies, negating all performance improvements instantly
Adding a lot of dependencies will affect memory consumption, which may have knock-on effects due to paging if your system is low on memory, but that aside it's what and how much code is executed per frame that matters for speed.
Dynamic memory allocation and garbage collection are probably the culprits you're searching for, but improvements in JavaScript engines and framework libraries have largely negated this issue for most websites.
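For example, here's a rough TypeScript sketch of keeping per-frame work allocation-free, so neither allocation nor a GC pause lands inside the frame budget. Sizes and names are purely illustrative.

```typescript
// Sketch: reuse a preallocated buffer every frame instead of creating new
// objects inside the ~16 ms (60 fps) frame budget.
const positions = new Float32Array(10_000); // allocated once, reused every frame

function frame(): void {
  for (let i = 0; i < positions.length; i++) {
    positions[i] += 0.1; // per-frame work without creating any new objects
  }
  requestAnimationFrame(frame);
}
requestAnimationFrame(frame);
```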
If I compare electron apps like Slack/Spotify/VS Code/etc, where all the assets are preloaded and processed, the apps run smoothly. No FOUC or glitches as the page changes layout during loading.
> Slow network latency also isn't taking up gigabytes of ram either
Can't argue with that, given they are 2 different metrics.
Can’t say I’ve ever seen them jank. In my experience, compute doesn’t seem to be the limiting factor for Spotify etc., it only ever seems to be waiting for API calls to return. There really isn’t a whole lot of compute going on in a typical front end, and modern browser engines have JIT virtual machines and parallel processing with the final rendering pushed into the GPU. Even the processing power in a low end mobile is not struggling on compute for these applications
If the WASM module has no GC'able objects, then there'd be no reason for the GC to run. At least in theory. Certainly the 0 GC'able object case seems like an easy optimization.
Maybe. Everybody assumes that once WASM gets a GC framework that their favorite language, e.g. Python, will become a first-class citizen. But I suspect this won't come to pass for two related reasons:
1) There's no such thing as a universal GC. GC semantics differ across languages because the semantics matter; language developers make different choices. For example, some GCs support finalizers, others don't. Some with finalizers support resurrection, some don't; likewise, some languages specify a well-defined order for finalization (e.g. Lua defines it as the reverse order of allocation).
2) Similar to GC, there are other aspects that will prevent popular languages with complex runtimes from being compiled directly to WASM without altering language semantics or the runtime behavior relative to the standard, native environment. For example, eval.
So even with GC, the choices will likely remain the same as they are now: if you want the full experience of your favorite rapid-development language within the WASM virtual machine, you must incur runtime overhead, up to and including double virtualization. Or, alternatively, you must contend with a bifurcation in a language's ecosystem--native semantics vs WASM semantics. This will all be compounded by how programmers typically treat even the slightest differences, compromises, or concessions in behavior or performance as ritual impurities to be shunned. Ultimately, I don't expect the current status quo to change: languages other than statically compiled, strongly typed, non-GC'd ones like C, C++, and Rust won't see much more usage in WASM environments than they already do, either browser-side or server-side.
I think people don't distinguish between static languages with GC and dynamic languages. Static languages will run well; they already compile to binary or bytecode. My guess is that Java and C# could compile to WebAssembly with a JIT or AOT runtime that would be relatively small.
Dynamic languages need to read source and either interpret or JIT it. The result is a much larger runtime and worse performance. It will probably always be worse than running JavaScript. A few people will want to use Python but it won't be common.
GraalVM’s Truffle project seems to get away quite well with a universal GC implementation.
With it, one can create a competitively fast language-specific runtime simply by creating an AST-interpreter: so far the more complete ones being JS, Ruby, R, Python and even LLVM bitcode — all of them mapping to the JVM’s state-of-the-art GCs. This allows easy polyglot programs as well, where the JIT compiler can even optimize across language boundaries!
I think that languages with more niche control flow/GC semantics should just accept the tradeoffs and find some other way around - which is not unheard of in case of WASM, if I’m not mistaken things like stack pointers also have limitations in WASM vs the native C world.
That's possible partly because the Java GC featureset is nearly a superset of what other GC runtimes offer, and partly because Truffle/Graal integrate deeply with the JVM compilers for things like barriers.
Would this mean that we might end up with a new language that compiles to WebAssembly with GC? Logically, it would be statically typed, since anyone wanting a dynamically typed language could just use Javascript.
Existing compilers for GC'd languages that target WebAssembly today have to use inefficient schemes to make their runtimes work, making the website slower than they would be otherwise, which is pretty much the entire point of OP.
And anyway, regarding "lack of GC in WASM made it appealing" -- support for high level languages with GC semantics was always a long-term goal for WebAssembly and GC was thought about by the relevant parties long before the 1.0 spec was even ratified (the initial placeholders were added as early as 2017, 2 years before 1.0 ratification); it was just not within scope for the earlier versions because it's a pretty big topic, including many things that aren't even wholly GC related e.g. value types. But this isn't surprising either; most of the earliest implementations and concerned parties were browsers, and the interactions between WebAssembly and JavaScript's memory models, along with the popularity of Javascript-targeting compilers (which suffer from many similar contortions), meant that GC was always a pretty obvious "This is a thing we need to think about in the long run" feature.
You are correct, but I disagree with the premise. Javascript and its entire ecosystem are horrible and only rose to prominence because there was no other option.
For the first time ever we have a shot at making something better, and our goal is to adapt it to the lowest common denominator? We most certainly don't make it easy for ourselves...
It shouldn't. Its goal is to keep running in browsers and to interact properly with browser objects like the DOM. And those are garbage-collected by the browser. So you need at least some GC support in wasm to tell it that an object has been released, and to check whether an object is still held by the wasm runtime.
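To illustrate, here's a rough TypeScript sketch of the reference-table pattern that interop glue commonly uses today to keep browser objects alive across the boundary. The names are illustrative, not any particular toolkit's API.

```typescript
// Browser objects stay on the JS side, the module only sees integer handles,
// and it must explicitly drop a handle or the object stays pinned forever.
const heap = new Map<number, object>();
let nextHandle = 1;

function retain(obj: object): number {
  const handle = nextHandle++;
  heap.set(handle, obj); // pinned: the browser's GC can't collect it now
  return handle;         // this plain integer is what crosses into the module
}

function release(handle: number): void {
  heap.delete(handle);   // called from wasm when it is done with the object
}

// Example: handing a DOM node to the module as a number.
const h = retain(document.createElement("canvas"));
// ... the wasm side stores `h` and later calls back into release(h) ...
release(h);
```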
> and our goal is to adapt it to the lowest common denominator
How is garbage collection the lowest common denominator?
Isn't it more likely that if there never was first-party GC, everyone who wants GC is just going to bundle a bad one instead, making those sites _slower_?
I don't understand this logic. If you take gc away the people who would leak using gc will leak using malloc instead. Unless you're actually proposing that those people shouldn't be allowed to ship software?
As opposed to now, where they might be using even more memory because bring-your-own-GC solutions are that much more inefficient, as outlined in the article?
Frequently the software running inside WASM sandboxes has its own garbage collector anyway, and it's not going to perform as well as a native host collector in some cases. Especially if you start doing cross-language GC in order to talk to the host (web browser) and manipulate its objects. Of course, exposing a WASM GC API that can actually meet the needs of real software is tough, which is why it didn't happen...
If you think WASM software is going to be efficient with its memory usage, you should look more closely at the memory model and take note of details like 'you can't ever shrink the heap' and 'you can't do read-only shareable mappings'.
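The growth-only part is visible right from the JS-facing API; a small sketch, where the page counts are just an example:

```typescript
// Wasm linear memory grows in 64 KiB pages and there is no corresponding
// shrink call, so a peak in usage is kept for the lifetime of the instance.
const memory = new WebAssembly.Memory({ initial: 2, maximum: 100 });
console.log(memory.buffer.byteLength); // 131072 bytes (2 pages)

memory.grow(8);                        // ok: now 10 pages
console.log(memory.buffer.byteLength); // 655360 bytes

// There is no memory.shrink(); freeing inside the module returns space to the
// module's own allocator, not to the browser.
```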
Force people to think a lot more about memory than they want to, and they'll just keep using whatever they currently use, which is probably not optimal for performance.
It's not like there is any chance of a WASM-only future any time soon, and as long as it's competing with JavaScript (and Emscripten targeting it), there's a case to be made for it to be more accommodating.
>Lack of GC was one of the appealing parts of WASM to me.
Same for me, especially paired with Zig, where memory management is far more central to development. It's a shame the JS interfacing overhead is so high, and I think energy would be better spent improving that rather than adding bloat to WASM.
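For example, just getting a string across the boundary means encoding it and copying it into the module's linear memory. A rough TypeScript sketch, where `alloc` is a hypothetical export of a Zig/C-style module (the real export name varies by project):

```typescript
// Only numbers cross the JS <-> WASM boundary, so even a string has to be
// encoded and copied into linear memory before the module can see it.
function passString(instance: WebAssembly.Instance, s: string): [number, number] {
  const { memory, alloc } = instance.exports as {
    memory: WebAssembly.Memory;
    alloc: (len: number) => number;
  };
  const bytes = new TextEncoder().encode(s);        // copy #1: JS string -> bytes
  const ptr = alloc(bytes.length);                  // ask the module for a buffer
  new Uint8Array(memory.buffer, ptr, bytes.length)  // view into linear memory
    .set(bytes);                                    // copy #2: bytes -> wasm heap
  return [ptr, bytes.length];                       // what the wasm side receives
}
```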