Firefox’s new streaming and tiering compiler

markdog12 · on Jan 17, 2018

https://lukewagner.github.io/test-tanks-compile-time/

Firefox Nightly: WebAssembly.instantiate took 227.6ms (54.4mb/s)

Chrome Canary: WebAssembly.instantiate took 8576ms (1.4mb/s)

Wow.

(Edit: And I believe that's not even using the streaming compilation mentioned in the article, it's just the new baseline compiler in action)

callahad · on Jan 17, 2018

> I believe that's not even using the streaming compilation mentioned in the article

That's correct. Streaming compilation would finish earlier, but might actually benchmark more slowly because you'd be adding in the time that the compiler is idle and waiting for the network to catch up.

Preloading the .wasm file in the test lets us measure just the speed of the compiler, independent of the network.

derefr · on Jan 17, 2018

An interesting benchmark would be the total energy usage of the CPU while loading a page in each browser. The compiler managing to idle waiting for network packets should theoretically allow some of the CPU cores to enter sleep states. (Not all, since others are still busy rendering the page, and at least one is busy doing whatever non-DMA kernel bits are involved with receiving the network packets.)

etep · on Jan 17, 2018

It should be the case that if the same amount of work is done, then the energy used will be the same. If it takes less work to compile the web assembly, then less energy (holding all other parameters the same). If you have to idle a CPU, then you probably use more energy (holding all other parameters the same) i.e. because you will spend more time and accomplish the same amount of real work (but waste energy on the idled core, albeit waste very little, accomplishing extra, but non-productive, work). Cannot let some other CPU parameter changes as a result of cores being idled (e.g. frequency gets boosted on non-idle cores as a result of dynamic frequency scaling with idle cores) to run this experiment. Thinking about CPU energy use is interesting :)

yjftsjthsd-h · on Jan 17, 2018

I don't think that's true; isn't the relationship between clock speed and power non-linear?

chongli · on Jan 18, 2018

The real killer with energy usage in browsers is idle wake ups per second. If you've got a lot of tabs open and they're all running timers, waking up, hitting the network, etc. then they're keeping the CPU from going into low power state and thus wasting a lot of energy.

Even though I'd prefer to use Firefox I tend to stick with Safari due to the battery life advantage which really shows when you open a lot of tabs.

lloeki · on Jan 18, 2018

Indeed the most dramatic improvement I saw in battery life was when macOS and Safari started cooperating around timer coalescing using App Nap.

ethbro · on Jan 17, 2018

That's for max frequency on a given process node. Power scales with voltage squared. But that doesn't say anything about wasted power. And dynamic scaling screws that up in modern chips.

I believe I could summarize things by saying the only way you can really save energy* doing the same work+ is by using a different semiconductor process (either power/leakage-reduction-focused or smaller).

* For serious values of "energy"

+ Where the same work is not always true for a given task, if one optimizes an algorithm

naikrovek · on Jan 18, 2018

Can you explain the voltage squared thing? To me, power = voltage * current.

mmwelt · on Jan 18, 2018

Even for purely resistive loads, power is proportional to the square of the voltage:

  P = V * I

But,

  I = V / R

So,

  P = V * (V / R)
    = V^2 / R

etep · on Jan 18, 2018

For chips, power scales with voltage squared. Is also true that P=IV (since both are true, these observations cannot be in contradiction). Apparently, for chips, the current must be proportional to voltage also. Glossing over some details, turning on (off) a transistor is the same as charging (discharging) a capacitor. The energy stored on a capacitor is 1/2 C V^2. If you turn on and off the transistor periodically (say with frequency f) you use 1/2 C V^2 energy f times per second (energy per unit time is power). Normally the capacitance is ignored when discussing how power changes because for a given design the capacitance is a fixed quantity.

Filligree · on Jan 18, 2018

Running a processor at higher frequency also requires increasing voltage, which increases effective capacitance by that same formula.

That's not the primary cause of the power = frequency^2 rule, but actually adds a factor on top of that.

etep · on Jan 18, 2018

I think it is linear for frequency and non-linear for voltage, i.e. P~fCV^2. But in many current CPUS, the feature that adjusts frequency also adjusts voltage. That's why I stipulated that, for my comments to be true, such shenanigans as dynamic frequency (and voltage) scaling must be "turned off." I think the OP was asking, what happens to CPU energy if you load the web page with and without the optimized compilation. The OP was interested in core sleep states, but I think that dynamic frequency scaling is a confounding factor. It would be interesting to see the measurements w/ and w/out that feature perhaps.

tspiteri · on Jan 18, 2018

To increase the frequency, you also have to increase the voltage so that the transistors charge faster, otherwise they won't be able to switch in the shorter time.

depressedpanda · on Jan 17, 2018

I ran the tests on my Nexus 5X running stock Android 8.1.

Chrome: WebAssembly.instantiate took 12935.5 ms (1 MB/s)

Firefox Nightly: WebAssembly.instantiate took 1223.1 ms (10.1 MB/s)

Yikes, one order of magnitude in difference.

alexeldeib · on Jan 17, 2018

Similar for me, a touch over 10x

FF: WebAssembly.instantiate took 280.3 ms (44.2 MB/s)

Chrome: WebAssembly.instantiate took 3022.4 ms (4.1 MB/s)

weaksauce · on Jan 17, 2018

mine was even a bit more of a change.

Chrome: WebAssembly.instantiate took 13692.8 ms (0.9 MB/s)

Firefox: WebAssembly.instantiate took 330.8 ms (37.4 MB/s)

flobbers · on Jan 18, 2018

Safari 11.0.2: 4.7mb/s Chrome: 1.2mb/s

This is on MacOS 10.13.2. I'd love to run Firefox but the battery savings and reduced heat from using Safari makes it too hard to pass up in this regard.

schance · on Jan 18, 2018

Chrome Dev: WebAssembly.instantiate took 2254 ms (5.5 MB/s)

Firefox Nightly: WebAssembly.instantiate took 158.2 ms (78.3 MB/s)

Edge 41.16299.15.0: WebAssembly.instantiate took 99.2 ms (124.8 MB/s)

I did not expect Edge to be even faster.

markdog12 · on Jan 18, 2018

As stated elsewhere in this thread, Edge compiles WebAssembly lazily, so it's basically skipping the test: https://news.ycombinator.com/item?id=16170496

justinjlynn · on Jan 18, 2018

Is the runtime performance similar to the startup performance measurement?

mort96 · on Jan 17, 2018

I'm running Firefox 57, and even here, Firefox is significantly faster; 3182.3ms (3.9mb/s) in Firefox, 7575ms (1.6mb/s) in Chrome 63.

SubiculumCode · on Jan 17, 2018

Chrome: WebAssembly.instantiate took 22861.1ms (0.5mb/s)

Nightly:WebAssembly.instantiate took 1825.3ms (6.8mb/s)

Mine was a laptop in low power mode. Yours was much faster all around.

kbumsik · on Jan 17, 2018

My old i5 Thinkpad L450:

Firefox 57: 5053.8 ms (2.4 MB/s)

Firefox Nightly 59: 454.6 ms (27.2 MB/s)

Chrome 63: 9034.9 ms (1.4 MB/s)

Wow, it's over x10 faster...

therockhead · on Jan 17, 2018

My times are similar. I also ran on Safari which is faster than Chrome but slower than Firefox.

styfle · on Jan 17, 2018

Where did you even find that repo? Was that mentioned in the article or is this your code?

callahad · on Jan 17, 2018

Linked in the article under "give it a try" in the sixth paragraph.

The repo itself is at https://github.com/lukewagner/test-tanks-compile-time

clouddrover · on Jan 18, 2018

In Firefox 58 beta I get: WebAssembly.instantiate took 200.9 ms (61.6 MB/s)

In Safari 11.0.2 I get: WebAssembly.instantiate took 2885.9 ms (4.3 MB/s)

In Vivaldi 1.13.1008.40 I get: WebAssembly.instantiate took 7719 ms (1.6 MB/s)

vitorgrs · on Jan 18, 2018

I tried Chrome Canary, Firefox Nightly and Edge. Edge was the fastest here! But I believe they are using streaming compilation already.

MaxBarraclough · on Jan 17, 2018

A two-tier JIT. Interesting to see tiered JIT compilation catch on the way it has. I seem to remember a few years ago reading that the Java HotSpot team had given up on tiered JIT compilation as being not worthwhile.

How far we've come. A whirlwind tour of todays JITs (apologies for the million links):

.Net Core seems not to use tiered compilation. It never interprets the IR; everything is run through the same JIT compiler. https://github.com/dotnet/coreclr/issues/4331

HotSpot uses three tiers these days (counting direct interpretation as a tier) - https://docs.oracle.com/javase/8/docs/technotes/guides/vm/pe...

JavaScriptCore/Nitro seems to use four - https://webkit.org/blog/3362/introducing-the-webkit-ftl-jit/

Edge's Chakra engine has two - https://blogs.msdn.microsoft.com/ie/2014/10/09/announcing-ke...

V8 seems to use two - https://v8project.blogspot.co.uk/2017/05/launching-ignition-...

Firefox's SpiderMonkey JS engine uses two - https://developer.mozilla.org/en-US/docs/Mozilla/Projects/Sp...

hyperpape · on Jan 17, 2018

I'm not aware of any effort to retire tiered compilation. It was even promoted to be the default in Java 8 (2014). http://www.oracle.com/technetwork/articles/java/architect-ev...

The only downside I'm aware of is that it increases the pressure on the code cache. If your code cache is not large enough, it will thrash as methods are discarded then recompiled. We had significant performance problems with a server and it took quite awhile until we realized that was the cause. A cache of 256 mb was more than enough for us running a 2 million LOC monolith under Tomcat, so the absolute memory use isn't that significant. (Reference we found while researching: http://engineering.indeedblog.com/blog/2016/09/job-search-we...).

Once you know this is an issue, it's easy to monitor, but it is one more thing that can go wrong in the JVM.

MaxBarraclough · on Jan 17, 2018

Oops, I wasn't clear. I'd meant that, if I recall correctly, the HotSpot team initially experimented with combining the 'client' and 'server' JITs for tiered compilation, but decided it was a lot of complexity for little gain, and didn't commit.

Only a couple years later did they re-attempt it and stick with it.

I could be mistaken here, and I wasn't able to find anything online to support me.

pjmlp · on Jan 17, 2018

.NET focus always was native code, either AOT with NGEN or JIT on load.

The only variants of .NET with interpreter support were from 3rd party implementations, and the .NET Micro Framework, used in NETduino.

And now their focus seems to be to improve their AOT story.

Another interesting evolution was Android, with Dalvik and its basic JIT, ART with AOT on installation, to ART reboot with an interpreter in Assembly, followed by JIT and AOT code cache with PGO.

bad_user · on Jan 17, 2018

On .NET Core's focus, that's not actually true:

http://mattwarren.org/2017/12/15/How-does-.NET-JIT-a-method-...

Android optimizes for battery life, but it's also worth noting that Dalvik was a really rudimentary JIT, having no benefits from JIT compilation, only drawbacks, the ART with AOT being a good upgrade.

But tiered compilation is in a different league, being about speculating what's going to happen depending on what the process has witnessed thus far. The point of tiered compilation is to profile/guard stuff at runtime and recompile pieces of code based on changing conditions, which is how you can optimize virtual call sites or other dynamic pieces, which you can't do ahead of time because the missing part is the decompiler which can revert optimizations based on invalidated conditions.

It's really interesting actually, because you can profile a C++ app and use that to optimize your AOT compilation, but the compiler is still limited by the things it can prove ahead of time, or otherwise it would be memory unsafe.

pjmlp · on Jan 17, 2018

I wrote "And now their focus seems to be to improve their AOT story.", I didn't say anything about .NET Core.

Should have been more explicit, as I was referring to CoreRT and .NET Native.

> But tiered compilation is in a different league, being about speculating what's going to happen depending on what the process has witnessed thus far.

Just as ART was refactored on Android 7 and 8. ART with pure AOT is only for Android 5 and 6.

https://source.android.com/devices/tech/dalvik/jit-compiler

AlphaSite · on Jan 18, 2018

I think tiered and speculative optimisation are independent concepts.

Tiered is specifically that you have a fast compiler and a slow compiler (or further tiers). Speculative is as you describe.

rcthompson · on Jan 17, 2018

Is it actually a JIT? It's just compiling everything unconditionally. I guess the fact that the second tier replaces previously compiled functions with more optimized versions makes it a JIT? Or does the definition of JIT require recompiling in response to information about which code would benefit most?

MaxBarraclough · on Jan 17, 2018

> Is it actually a JIT? It's just compiling everything unconditionally.

Still counts as JIT in my book, but you're right that it's a bit subtle.

Unix-style configure/build/install isn't considered JIT.

Installing a .Net application is pretty similar, but we don't consider it JIT.

In the usual .Net model, what's distributed is IR rather than source-code. Compilation to native code happens at install time. The build-and-install process is less explicit than the Unix way, and it's less error-prone (fewer dependency issues and issues with the compiler not liking your source code).

Really it's a very similar model to the Unix one, but we call one JIT and not the other.

Oracle Java, of course, only ever compiles to native code at runtime, and never caches native code. 'Proper' JIT. (This may be set to change in the near future though.)

Interestingly, .Net seems to be moving in the direction of full static compilation, or they wouldn't be asking devs to rebuild UWP apps to incorporate framework fixes - https://aka.ms/sqfj4h/

caf · on Jan 18, 2018

It might be fun to make a source-based distribution where every binary in /usr/bin started off as a link to a script that built and installed the requested executable (over the top of the link), before executing it.

MaxBarraclough · on Jan 18, 2018

Source-based distros essentially do that, they just cache the binaries.

Various research OSs are JIT-based, of course. It looks like JX (a Java operating system) caches its native code, so it's not 'pure JIT' https://github.com/mczero80/jx/blob/5fbeae79/libs/compiler_e...

It looks like Cosmos (a C# operating system) does the same https://en.wikipedia.org/wiki/IL2CPU

uep · on Jan 18, 2018

Or how about a fuser filesystem on Linux to do the same? That sounds like an interesting idea. Just don't make the mistake of accidentally typing some obscenely large binary like firefox, chrome, or clang...

I think it would need to be integrated into the package management system pretty tightly (or have one of its own) to get all of the shared library dependencies.

_lqaf · on Jan 18, 2018

Wouldn't be difficult to modify FreeBSD to do that. /usr/ports is just a little more than one indirection away.

chrisseaton · on Jan 17, 2018

Do you think for something to be a JIT it must only compile code immediately before it's used?

In that case the only real JIT I know of is basic-block-versioning. I think almost all JITs will compile branches or methods to some extent before they are actually needed.

Yours is probably not a reasonable definition therefore. I think a JIT is just a compiler that can compile as the program is running.

rcthompson · on Jan 17, 2018

> Do you think for something to be a JIT it must only compile code immediately before it's used?

I mean, that's more or less what the name "just-in-time compiler" implies. I'm aware that the name is not necessarily a precise definition, but I'm not sure how far the definition stretches. Does JIT have a precise agreed-upon definition, or is it somewhat more vaguely defined?

chrisseaton · on Jan 17, 2018

No these terms never have precise meanings, and trying to debate them too much doesn't achieve much. But if your definition doesn't actually work for any examples of the thing you're defining at all except one then it's probably wrong.

rcthompson · on Jan 17, 2018

Ok, fair enough. I was worried that there was some precise definition I had missed, but if that's not the case, I agree there's no point in debating it.

emn13 · on Jan 17, 2018

Well, there's at least one definition that's pretty noncontroversial, if not terribly satisfying or precise: it's not a JIT if you compile well in advance of any indication the program needs to be run.

Whether that lazy-compilation strategy is fine-grained or not isn't clearcut, I believe. I think if you distribute a C program with a bash bootstrapper calling plain old gcc to compile and run the C code only when needed, even gcc might be considered a (coarse-grained, rather rudimentary) JIT in that context.

kybernetikos · on Jan 17, 2018

Like TCCBoot? https://bellard.org/tcc/tccboot.html

didibus · on Jan 18, 2018

To me it's a JIT if the compiler is needed to run the code.

If it means compile on start, it still requires the compiler to be used at load time.

Non JIT would mean you can distribute the code without the compiler. If you can't do that, its JITTED or interpreted, if instead of requiring a compiler to be present you require an interpreter.

MaxBarraclough · on Jan 18, 2018

I like this definition.

bzbarsky · on Jan 17, 2018

SpiderMonkey has an interpreter too, in addition to the two JIT tiers.

hsivonen · on Jan 17, 2018

Unsurprising, considering that browsers load code on demand and despite the original vision for Java, the JVM and CRL tend to be used for apps for which it's acceptable to have slow startup time.

hthh · on Jan 18, 2018

What is "Nitro"?

ygra · on Jan 18, 2018

Safari's JS engine, as far as I know.

sjrd · on Jan 17, 2018

Nice article.

Although, as always with articles on WebAssembly, it keeps repeating that wasm is faster than JavaScript, without ever mentioning the limitations of wasm wrt. JS (no GC, no interaction with the DOM or with JS libraries besides numbers, etc.). And that means there are zillions of developers who keep being misled in thinking stuff like "Why don't you compile to wasm to make your stuff faster?". That includes absurdities like "We should write a compiler from JavaScript to wasm to make all our JS faster!"

yzmtf2008 · on Jan 17, 2018

No one is saying you should do your whole application in WASM though, it's really just like native extensions in any dynamic language: of course, you're not going to get GC or interaction with dynamic parts of the language in your extension, but the reason might be:

- libraries written in another language, such as SQL.js

- hot spots of an application that can benefit from fast number crunching (e.g., gaming, visualization)

- truly cross-platform at native performance

etc.

I don't think anyone serious enough to use WASM in their application is making the assumption of using wasm will make all your stuff faster. It won't. It's just another performance tool, with its benefits subject to performance methodologies.

Sidebar: Google Doc is an interesting application in this perspective, given they render the entire application in a canvas, and the application itself is probably not written in JS. I'm excited what the future holds for tools like Google Doc.

sjrd · on Jan 17, 2018

I understand all that. I have been writing compilers for more than 10 years, including the compiler of Scala.js. I am not criticizing wasm nor its benefits.

What I profoundly dislike is that such good articles about wasm, written by excellent technical people, all silently ignore that. I am absolutely certain that the authors know all about it, but they don't mention that to their audience, which, for the majority, doesn't know. Therefore, that very silence (not just glossing over, but actual silence) brings misinformation to the masses.

stevemk14ebr · on Jan 17, 2018

I don't think that's absurd at all. WebASM really should be the ASM of the web, everything should be compiled to it. A little pre-compilation of JS to WebASM makes sense to me

white-flame · on Jan 17, 2018

As far as not making sense goes, sorry but your post doesn't make any to me, either.

Why would we possibly return to a manual memory management, raw pointer oriented, assembly language level of abstraction from the much richer and safer abstraction that JS already has? Wasm doesn't even have any notion of Characters or Strings! You really want to return to the days of each project having their own String libraries, because it's all built on top of raw asm?

Webasm is not for JS-style code! You can't have "a little pre-compilation of JS to WebASM", that makes zero sense. We already have the incredibly complex JIT compilation of JS to x64/ARM/etc, which necessarily interacts with the garbage collector, type system, permissions & security, browser debugging/profiling tools, etc, all of which wasm does not have any notion.

Wasm is a raw C-ish sandbox environment for cross-compilation of static languages to expose their behavior to javascript, for straight-line CPU performance in number crunching as in games, software rendering, numerical analysis, compression/decompression, etc.

felipellrocha · on Jan 17, 2018

You don't really write wasm by hand. You use a higher level language that compiles to it. I wouldn't mind using go for the web, for example, if I wanted more performance, and it takes care of all the concerns you mentioned above for you...

rmrfrmrf · on Jan 18, 2018

> Why would we possibly return to a manual memory management, raw pointer oriented, assembly language level of abstraction from the much richer and safer abstraction that JS already has?

Just wait until the JVM and Flash Runtime are ported to wasm. Downloaded and compiled on every page load :).

krapp · on Jan 18, 2018

>Just wait until the JVM and Flash Runtime are ported to wasm. Downloaded and compiled on every page load :).

Why that, instead of caching common components like any other web asset, or even having the browser act as a dependency manager?

There are bound to better solutions than "download and compile on every page load."

adrianN · on Jan 18, 2018

Users could for example install plugins into their browser.

pjmlp · on Jan 18, 2018

Slow and steady. :)

http://www.teavm.org/

https://github.com/JasonHuang3D/AJC-Flash-WebAssembly-Exampl...

http://www.mono-project.com/news/2018/01/16/mono-static-weba...

xg15 · on Jan 18, 2018

> You really want to return to the days of each project having their own String libraries, because it's all built on top of raw asm?

If C++-to-WASM via clang catches on, I think this is exactly what will happen.

hateduser2 · on Jan 17, 2018

They’re adding all those things to wasm. I think you’re mistaken about the future.

testvox · on Jan 19, 2018

Can you point me to plans around standardized string encodings in webassembly that is an interesting feature.

sjrd · on Jan 17, 2018

See my answer here: https://news.ycombinator.com/item?id=16169705

flavio81 · on Jan 17, 2018

>JavaScript, without ever mentioning the limitations of wasm wrt. JS (no GC, no interaction with the DOM or with JS libraries besides numbers, etc.).

DOM will die as soon as the industry moves to one or two good GUI toolkits that run under Webassembly and are way faster to use than the cumbersome present combination of HTML+CSS+CSS preprocessor+JS libs.

Mark my words.

pcwalton · on Jan 17, 2018

I'm nearly certain that this will not be the case. Once you reinvent everything that the DOM does, it's highly unlikely you'll end up faster than the DOM.

Everyone thinks that the rendering engines in browsers are easy to beat in terms of performance. I thought that too, until I implemented one. They are definitely beatable, but not easily, and certainly not with an architecture like that of Qt or GTK.

xg15 · on Jan 18, 2018

I'm not so sure. You don't need to reinvent everything that the DOM does as the DOM is is burdened down with all kinds of backwards compatibility concerns and conflicting design philosophies.

E.g., I don't think any sane design of a UI toolkit would include the ability to read and modify the string representation of the UI code at runtime - yet it's a critical feature for the DOM.

Likewise, you wouldn't necessarily need the ability to access and mutate arbitrary nodes of the document tree at any time. (including mutations that might change which CSS selectors apply to a node) E.g., you could only expose higher-level widgets instead or only expose variables that feed into a template. That would allow optimisations which aren't possible with CSS and DOM.

Finally, a WASM toolkit would be shipped with a particular website anyway, so it wouldn't need to be general-purpose.

On the other hand, there is a great incentive for website operators to make their site into a single unparseable blob: Ad-blockers. If every site had it's own internal data representation and internal rendering engine, that would make it almost impossible for ad-blockers to modify certain parts of the site while leaving others intact.

pcwalton · on Jan 18, 2018

> You don't need to reinvent everything that the DOM does as the DOM is is burdened down with all kinds of backwards compatibility concerns and conflicting design philosophies.

Those can largely be avoided, and they typically don't cause global performance impacts.

> E.g., I don't think any sane design of a UI toolkit would include the ability to read and modify the string representation of the UI code at runtime - yet it's a critical feature for the DOM.

That isn't a problem. innerHTML is lazily computed from the tree structure: if you don't use it, you don't pay for it.

> Likewise, you wouldn't necessarily need the ability to access and mutate arbitrary nodes of the document tree at any time. (including mutations that might change which CSS selectors apply to a node) E.g., you could only expose higher-level widgets instead or only expose variables that feed into a template.

The main benefit of this would be to eliminate restyling, but cascading is really useful from a design point of view. That's why we've seen native frameworks such as Qt and GTK+ move to style sheets. And if you reinvent restyling, it'll be a ton of work to do better—remember that Servo and Firefox Quantum have a parallel work-stealing implementation of it. I've never seen any native toolkit that even comes close to that amount of performance effort.

xg15 · on Jan 18, 2018

> That isn't a problem. innerHTML is lazily computed from the tree structure: if you don't use it, you don't pay for it.

I'm not paying for it, the DOM implementation is - with increased complexity. (E.g., HTML parsing suddenly becomes a time-critical operation because some wiseguy decided to implement animations for his website using setTimeout and innerHTML.)

And they can't drop it because a lot of sites rely on it - however, if you wrote a new, limited-purpose renderer on top of WASM, you could decide to drop it and simplify the implementation without losing much utility.

> And if you reinvent restyling, it'll be a ton of work to do better

But that's kind of my point - if you can control which parts of the tree are exposed and which mutations are valid, you might not need to implement restyling at all. (Or in reduced scope)

I'm not talking about cascading in general, but about how you can make arbitrary changes to the DOM after initial load, which the restyler has to fully support.

pcwalton · on Jan 18, 2018

> I'm not paying for it, the DOM implementation is - with increased complexity. (E.g., HTML parsing suddenly becomes a time-critical operation because some wiseguy decided to implement animations for his website using setTimeout and innerHTML.)

We're talking about performance here, not implementation complexity. Besides, it's not a win in terms of complexity if sites ship a limited subset of the Web stack to run on top of the full implementation of the Web stack that's already there.

> But that's kind of my point - if you can control which parts of the tree are exposed and which mutations are valid, you might not need to implement restyling at all. (Or in reduced scope)

Sure, you can improve performance by removing useful features. But I think it'll be a hard sell to front-end developers. Qt and GTK+ didn't add style sheets and restyling for no reason. They added those features because developers demanded them.

xg15 · on Jan 19, 2018

I think we're talking past each other.

My point is that writing custom UI renderers using canvas and WASM might become a reasonable thing to do. For that you don't need to stick to the web stack at all, you can invent whatever language, API and data model fits your needs. Those can be a lot simpler than the DOM and therefore easier to implement with good performance.

gpm · on Jan 18, 2018

Are there any resources that describe the sort of UI rendering engine you think you would need to beat browsers?

pcwalton · on Jan 18, 2018

I don't think it's really possible to do much better than the next-gen browser architecture (Servo, fully-fleshed-out Quantum) if you support the entire feature set of browsers (once typed CSSOM is a thing, anyway). You can certainly do better in constrained environments, though. For example, Leo Meyerovich is doing really neat things with data viz, where all the elements to be laid out have the same shape and you can take advantage of that to do things like GPU layout.

swsieber · on Jan 18, 2018

But if you're making your own UI kit, couldn't you just eschew CSS and the like? I was under the impression that part of the reason browser rendering is such a gnarly process is because of the reflow issues that CSS and HTML layout quirks/changes can cause. I would assume that you can implement a couple of layouts that prevent those sort of pitfalls and thus speed up rendering...

Please correct me (you know a lot, and I'm betting some of my assumptions are wrong).

pcwalton · on Jan 18, 2018

Sure, you could get rid of CSS, but in favor of what? You probably need something just like flexbox, and it's not easy to beat an optimized implementation of CSS flexbox in terms of layout performance (especially if parallelized). You could eliminate the restyling step by not having cascading and selector matching, but that hurts productivity and maintainability, which is why you see frameworks like GTK+ moving toward CSS-like styling. There's no free lunch here...

tmzt · on Jan 18, 2018

Is there a good way to experiment with exposing native calls from Firefox/Quantum to WASM?

I'm looking into building an extension to build Quantum Display Lists from a WASM vdom.

tiborsaas · on Jan 17, 2018

No, the DOM will stay for a long time, since CSS is actually a really great way to build UI-s.

Last time I checked C/C++ based UI libraries even text selection was a problem. If there were a cross platform way to build UI-s as good and feature rich as a modern browser is now, then it will slowly die.

That's the reason we have so many Electron based apps, because it makes UI building really simple.

pjmlp · on Jan 18, 2018

There isn't anything about Electron that I feel it is simple over Delphi, WinForms, JavaFX, Android, Cocoa, Qt, XAML, other than being easier for those that grew with HTML/CSS.

ghettoimp · on Jan 18, 2018

Caveat: It's been a long time since I've used GUI toolkits like Gtk+ and QT, so this may be an out-of-date perspective.

What I think of GUI toolkits, I think of lots of imperative code to build out an interface, e.g., "Create a window. Add a vertical box layout. Create button1. Change button1.font to xxx. Change button1.style to bold. Set the minimum height of button1 to 20px. Add button1 to the box. Create button2. Add button2 to the box. Tell the box to grow button2 when it is resized. Create button 3..."

The declarative style of HTML/CSS seems so much better. The grouping of elements becomes apparent just by looking at how they are nested, with no need to keep track of what gets added to what. And CSS gives you a really rich ability to select groups of elements, style them, try out new styles, reuse styles across pages, and so on.

CSS has definitely gotten really complicated. But then, I could never build anything in a GUI toolkit without constantly referencing the API docs to figure out how to do this or that, either...

switchbak · on Jan 18, 2018

That certainly used to be the case, but look at something like QML for Qt, and it's typically far more declarative and succinct than (most) HTML, eg: http://doc.qt.io/qt-5/qmlfirststeps.html

I'm not a frontend whiz by any means, but I've always found the widget-centric (GUI) approach fit my mental model better than the HTML centric one.

SPA architectures help, but I find most HTML designers tend to prefer raw HTML to any composed/widget approaches.

CSS (as a concept) is actually quite great, which is why you've seen the older GUI approaches adopt it. Qt itself is also leading more towards a reactive approach where you interact with an abstract data model, and the UI reflects the updates.

pjmlp · on Jan 18, 2018

Gtk+ is quite basic compared to RAD UI tooling, even today I wouldn't pick Glade over VB 6 or Delphi.

Qt always had a nice UI designer, with layout managers for responsive UIs, no need for imperative code to build out an interface.

Since version 5 they have QML, a JavaScript like declarative language, quite similar to J3 the first version of JavaFX.

Usually imperative UI code tends to be a thing only among developers that dislike RAD tooling, or game devs using immediate mode UIs.

protomikron · on Jan 18, 2018

> Usually imperative UI code tends to be a thing only among developers that dislike RAD tooling, or game devs using immediate mode UIs.

Actually I think that most game UIs (the menus, settings, inventories - not the actual game) are done quite well - maybe immediate mode GUIs have a place outside of gamedev?

rickycook · on Jan 17, 2018

drawing the parallel between electron and gui toolkits probably isn’t right. i think there are lots of things at play here. just for starters:

- abstracts lots of file handling

- js is much easier than most languages to pick up and learn

- js has a large dev base because of the web

- people are well versed in coding for the web, and adding a “native” layer on top of that is actually quite easy

- cross platform c/c++ in general is not so simple even without the GUI

chrismorgan · on Jan 18, 2018

We’re nowhere near the point where that would be even vaguely feasible on one crucial point: accessibility.

If you’re talking about rendering everything on a canvas, well, there’s been the occasional discussion about making it a11y-friendly, exposing content in it to screen readers and so forth, but nothing has really happened with it.

Your WebAssembly GUI toolkit is going to be completely invisible to screen readers.

dragonwriter · on Jan 17, 2018

Declarative UI systems that look a lot like HTML+CSS+JS, keep cropping up in other domains besides the web; I think, but for increasing freedom to replace the JS part with some other languages to specify behavior, the basic model will be around for a long time.

kabes · on Jan 18, 2018

You can already replace dom with a canvas renderer. Flipchart does it with a react canvas renderer, but yet it doesn't take off, so why would wasm change that?

mr_toad · on Jan 17, 2018

The DOM will be around as long as the web has documents.

eyko · on Jan 18, 2018

You seem to ignore that even in desktop and mobile, HTML(ish)+CSS(ish)+JS is taking over.

pjmlp · on Jan 18, 2018

On mobile surely not.

avaer · on Jan 17, 2018

I'll do you another. If 2D screens ever cease being the main way of interacting with computers, something like a DOM-less WASM will take over consumer computing, and the DOM will get washed away in the process.

dfox · on Jan 17, 2018

Moving from 2D to 3D does not imply that applications will all use immediate mode. And having some standard way of doing retained mode 3D (which is the major feature provided typical 3D game engines and there is no reason why DOM could not be at least partially used as the underlying data model) seems like one of the requirements for that shift to actually happen.

rictic · on Jan 17, 2018

For an example of this model today, take a look at the wonderful A-Frame library for doing 3D scenes, including web VR.

dragonwriter · on Jan 17, 2018

Or something like X3dom will take center stage.

xg15 · on Jan 18, 2018

So in the end we really do end up in the alternate future where every website is just a single giant SWF file - except instead of Flash, it's WASM. Hooray! /s

ake1 · on Jan 17, 2018

do you know of any ongoing work?

gigatexal · on Jan 17, 2018

Yeah I read the article too and thought: but what can I do with wasm?

kbumsik · on Jan 17, 2018

You can import C (and many others) libraries without re-implementing it in JS for browsers. In my company's case, we compile C-written libopus almost directly to encode audio streams into Opus on browsers, not on servers. In this way we benefit much less data traffic and server CPU loads.

QAkICoU7IDNkpFu · on Jan 18, 2018

>You can import C (and many others) libraries

This means compiling the lib to wasm, right? At first I was thinking of running JS clientside that imports some C lib somehow, which confused me.

kbumsik · on Jan 18, 2018

Yes. The C lib will be compiled into wasm. You can use the wasm-compiled code in a browser through JS wrappers. So you can say importing C lib on JS client side anyway.

Note that wasm's main objective is to run non-JS code on browsers, not for a faster JS.

Here is the best overview and tutorial I've ever seen in HN , if you are interested: https://news.ycombinator.com/item?id=15958827

gigatexal · on Jan 20, 2018

So you could do the same with Cython then, too? Interesting.

kbumsik · on Jan 22, 2018

I've never experienced with Cython. But if Cython is to compile anPython file into a C file then probably.

icen · on Jan 17, 2018

> That includes absurdities like "We should write a compiler from JavaScript to wasm to make all our JS faster!"

Is this this absurd? Given you can compile WASM whilst streaming, you should really precompile your JS into WASM if you can.

sjrd · on Jan 17, 2018

I can't tell whether you're being sarcastic or not, so I will answer as if you are not.

Yes, it is absurd. Because your wasm that was compiled from JS needs to embed an entire implementation of the dynamic nature of JS. And to make all those dynamic features remotely fast, you cannot just compile them as is. You need to use a JIT to be able to perform speculative optimizations. But then, where's the JIT? Oh, it's built inside your wasm code. And basically you end up shipping a JS interpreter+compiler+JIT as part of your wasm, instead of just the .js code. Parsing and compiling all of that will be much, much worse than parsing the .js code and feed it to the already existing JS interpreter+compiler+JIT that is in the browser.

Yoric · on Jan 17, 2018

On the other hand, you could compile well-behaved subsets of JavaScript into wasm. Kind of what asm.js was doing.

Whether that's useful in practice is an open question, but it's plausible.

jahewson · on Jan 17, 2018

It’s really inaccurate to call that “JavaScript” any more. You’re talking about a subset of JS that would never need to perform a GC! Really just “the subset of JavaScript which also happens to be C”.

sjrd · on Jan 17, 2018

asm.js was not a well-behaved subset of JavaScript. It was a well-behaved subset of assembly that happened to be encoded in JavaScript.

There is virtually no human-written JS code that is amenable to compilation to wasm in a meaningful way. At the very least, you need a (mostly) sound type system to be able to compile to wasm with a positive expected ROI.

The speed of wasm comes in a large part from the fact that it is entirely statically typed, which means we don't need the speculative optimizations (and their deoptimization guards) all over the place.

Zekio · on Jan 18, 2018

Wonder if there will ever be a version of typescript that compiles to wasm

jahewson · on Jan 17, 2018

Exactly, plus in a dynamic environment a tracing JIT can outperform AOT compiled code. This is true for both the JVM and .NET CLR, both mature platforms.

hateduser2 · on Jan 17, 2018

They’re adding gc and dom support to wasm

kroltan · on Jan 17, 2018

JS is a garbage-collected language. It would require reimplementing most of the runtime, or be stuck with code that does marshalling for everything and would possibly be slower than plain JS.

pjmlp · on Jan 17, 2018

I don't see lack of GC as limitation, just bring your alongside.

Modern CPUs also don't have hardware GC support. Intel i432 was the last attempt at it.

Interaction with DOM can be achieved with a few imported functions.

cwzwarich · on Jan 17, 2018

> Modern CPUs also don't have hardware GC support. Intel i432 was the last attempt at it.

Intel i432 was far from the last attempt. Besides all of the Lisp HW developed after it, Azul made CPUs in the 2000s with hardware support for GC. Acceleration of concurrent copying collection requires a surprisingly low amount of CPU support.

pjmlp · on Jan 17, 2018

I had my history reversed regarding i432 vs Lisp HW.

You are right I also forgot about Azul, but eventually they dropped it, because it wasn't worthwhile anymore, just like it happened with all other specialized hardware implementations.

dfox · on Jan 17, 2018

Azul (and AFAIK most lisp machines) does not do garbage collection in hardware. In these contexts the "GC HW" involves hardware acceleration of read/write barriers required by concurrent and incremental GCs (which otherwise requires the compiler/JIT to inline implementation of barriers into user code). The reason why Azul can now use stock amd64 CPUs is that they found that you can abuse the MMU to provide exactly this kind of HW accelerated barriers.

Twirrim · on Jan 17, 2018

Out of curiosity, using just released versions of browsers on this 2015 mac pro:

Firefox 57: WebAssembly.instantiate took 2990.2ms (4.1mb/s) Chrome 63: WebAssembly.instantiate took 8736.9ms (1.4mb/s) Safari 11.0.2: WebAssembly.instantiate took 10341ms (1.2mb/s)

If more speed is about to arrive, wow.

I'm curious what optimisations are needed / valuable for wasm files to improve streaming performance. I'm assuming if, e.g.:

def foo(baz): bar(baz)

...

def bar(baz): baz = baz +1

Then compilation would start and get stuck until it had a definition for bar? If so, presumably the next build time optimisations for a website will be to shuffle the code around in to as optimal an order as possible so as to improve streaming compilation speed?

kllrnohj · on Jan 17, 2018

Function declarations are independent from function bodies. So think C/C++ headers/source file splits. You don't need to know what code is in bar if you know it takes 1 argument of type int and returns an int. That's all you needed to know how to call it successfully, so you can compile foo in your example perfectly fine. You just need to patch up the call location later when bar is resolved to actual address (this is the "link" step in a typical AOT compilation chain, or done by the loader if it's a dynamic dependency)

jnordwick · on Jan 17, 2018

Considering the major optimization in compiling is inlining, knowing the function body is very important to compilation, but I guess that can be pushed off until the next tier.

kllrnohj · on Jan 17, 2018

WebAsm is an intermediate not a source language. Initial in-lining & other optimization have already been performed long before it hit your browser. There could potentially be a JIT or similar to do a secondary optimization pass in the browser if something is hot, but it's probably going to be largely considered a codegen issue rather than a runtime issue.

callahad · on Jan 17, 2018

Don't forget about the local compilation phase from C/C++/Rust to WebAssembly before you ever hit the browser. At that point, LLVM is free to optimize and inline just like with any other binary target.

rcthompson · on Jan 17, 2018

I imagine the first pass could inline functions that are already compiled and skip functions that have yet to be compiled. Maybe tools that generate wasm will start reordering the functions they send to allow optimal first-pass inlining.

rplnt · on Jan 18, 2018

I have similar, just slightly higher numbers on 2017 Mac Pro. What's baffling to me is that they are something like 30% lower than they are on my over 6 year old low end desktop with i3.

_pfxa · on Jan 17, 2018

I'm not familiar with Web Assembly, but the recent trend is that as the downloads become faster, web performance in a vanilla browser becomes slower, because websites just send more stuff to you. Pages grow toward infinity. Also, if, like @sjrd mentioned, this code can't manipulate DOM or can use only a restricted set of JS objects, the where will the gain be? Is this intended to be used for number crunching code in the browser runtime? Help bitcoin miner scripts? What's the purpose then?

yzmtf2008 · on Jan 17, 2018

Examples of applications that require a lot of number crunching: gaming, visualization, graphics processing.

The list goes on, but the idea is that certain hot spots in an application will be able to benefit from having a fast number crunching engine.

sudhirj · on Jan 17, 2018

DOM manipulation is already on the cards, should be out soon. This is likely a separate sub-team of people just working on making it as fast as possible.

pas · on Jan 18, 2018

Could you give details on "soon"?

All I was able to find is this issue: https://github.com/WebAssembly/design/issues/1079 with not activity for a long time

linclark · on Jan 18, 2018

With the more recent host bindings proposal[1], direct DOM access was decoupled from GC. This is expected to move much more quickly.

[1] https://github.com/WebAssembly/host-bindings/blob/master/pro...

seren · on Jan 17, 2018

Running a native app compiled to wasm into the browser ? So the opposite of running an Electron app?

wlesieutre · on Jan 17, 2018

Calling it now:

1) Write a Windows app

2) Run it in the browser with wasm

3) Stuff that into Electron and distribute to Mac/Linux/Windows

Why distribute the electron wrapped wasm on Windows instead of using the real native Windows app? It's more consistent this way! Single codebase! Developer efficiencies!

swsieber · on Jan 17, 2018

Small deviation - write a native linux or mac app (instead of targeting Windows for the initial app)... mostly because I feel like developing on those platforms is so much more enjoyable.

wlesieutre · on Jan 17, 2018

Good point. Linux is probably the easier one because you'd need to build your UI toolkit into it and we don't have the source for Cocoa or whatever this year's Windows UI toolkit is called.

pjmlp · on Jan 17, 2018

Start here,

http://www.mono-project.com/news/2018/01/16/mono-static-weba...

krapp · on Jan 17, 2018

I realize this is likely to be the most likely way native WASM apps are implemented because it's the most obvious to web developers, but I also think it's the wrong approach.

Webassembly.org's own docs mention that it's intended to be agnostic about its runtime environment[0]. Electron is for packaging HTML, CSS and JS into a "native" application, but WASM doesn't actually need that if it's running outside the web.

Why not a native runtime on top of a cross-platform library like SDL? Just because it's "Web Assembly" doesn't mean it has to be limited to webdev paradigms.

[0]http://webassembly.org/docs/non-web/

viraptor · on Jan 17, 2018

Not wasm yet, but you can definitely stick it into electron: https://bellard.org/jslinux/

logn · on Jan 17, 2018

@sjrd is talking about limitations for JS not other languages.

C, C++, Rust and others already have their own DOM/JS support.

WASM is exciting for statically typed languages especially. JS is not the target. It might eventually benefit from faster parsing but that's not the motive now.

pas · on Jan 18, 2018

> C, C++, Rust and others already have their own DOM/JS support.

What does that mean? Could you expand on that?

Can Rust code compiled to wasm manipulate the DOM?

logn · on Jan 18, 2018

Rust -- https://github.com/koute/stdweb

C++ -- https://github.com/mbasso/asm-dom

I don't know exactly how these work but Emscripten allows interop both ways (embedding JS in native code and calling native code from JS) -- https://kripken.github.io/emscripten-site/docs/porting/conne...

steveklabnik · on Jan 18, 2018

Via calling through JavaScript, yes. Wasm itself can't directly manipulate the DOM.

I'm not sure what your parent means.

axelfontaine · on Jan 17, 2018

Using https://lukewagner.github.io/test-tanks-compile-time/

Chrome 63: 3143.7ms (3.9mb/s)

Firefox 57: 1499ms (8.3mb/s)

Edge 41: 97.3ms (127.2mb/s) !!!

luke_wagner · on Jan 17, 2018

To wit, as described in their blog post: https://blogs.windows.com/msedgedev/2017/04/20/improved- Edge validates and compiles wasm code lazily. Thus, this simplistic benchmark isn't really measuring compile time on Edge. In contrast, Firefox, Chrome and Safari are doing some amount of AOT compilation before WebAssembly.instantiate() resolves.

ehsankia · on Jan 17, 2018

Here's a stupid question, but is the result of the Firefox and Chrome "instantiate" the exact same? Is the compilation doing the same job, or one could be performing more optimizations? Aka faster compilation but slower execution.

computerphage · on Jan 18, 2018

They're doing slightly different things, but not to nearly the extent of Edge which is doing something very different.

isaacaggrey · on Jan 17, 2018

Could someone explain how Edge is performing so well or any references to what they have done in this regard? Has the Edge team already implemented this streaming and tiering compiler technique?

DiThi · on Jan 17, 2018

You tell Edge, "compile this", Edge replies "done!" when what it has done is just verifying it's valid WASM. When you then call a function, it's compiled.

qznc · on Jan 17, 2018

So it is actually sad. It means you just downloaded a lot of stuff and only used a fraction of it.

MrMid · on Jan 17, 2018

Wow.

Firefox 59: 474 ms

Edge 41: 164 ms

The difference is smaller with newer FF, but that is amazing from Edge!

computerphage · on Jan 18, 2018

Edge is doing something very different, see the sibling comments for details.

pimeys · on Jan 17, 2018

Firefox nightly on Linux: 165.6 ms (74.8 MB/s)

Chromium: 1835 ms (6.7 MB/s)

JepZ · on Jan 17, 2018

Quoting Yehuda Katz, the co-creator of Ember.js, when it comes to JS-sizes is kinda hilarious (random google result):

https://gist.github.com/Restuta/cda69e50a853aa64912d

No offense to Yehuda in general (he is doing great work), but Ember.js so ignorant of any js-size recommendations, that it seems weird to quote Yehuda in that context.

bsimpson · on Jan 17, 2018

Would be nice if it mentioned WebAssembly in the title - I presumed this was a new feature for JS in Firefox Quantum.

markdog12 · on Jan 17, 2018

I originally put WASM in the title, not sure what happened.

nothrabannosir · on Jan 17, 2018

does someone here, familiar with webassembly semantics, know if it’s theoretically possible to start streaming execution of code? I.e. as soon as the “main” (?) function is in, and block on every function call which is not yet compiled, recursively? Or could the last block of webassembly bytecode potentially change the semantics of the first?

Sooner or later, that’s an avenue people will want to explore, I assume?

tsavola · on Jan 18, 2018

Yes, it's possible. No, the last block won't change the semantics of the first.

zach43 · on Jan 17, 2018

Interesting article...I did not realize that the WASM needs to be compiled into machine code on the client system, I just assumed it would be directly interpreted by the JS engine.

As a side note, it is interesting to see that multithreaded compilation of a single page provides significant performance benefits here...this is usually not done with C/C++ code compilation from what I understand about it

Yoric · on Jan 17, 2018

Well, the difference between "interpreted" and "compiled" has become very blurry during the last 20 years. These days, most "interpreted" programming languages are actually compiled to machine code on the client system.

This includes the JVML, of course, but also JavaScript, Python (with PyPy), etc. PHP isn't quite there yet, but it's coming.

> As a side note, it is interesting to see that multithreaded compilation of a single page provides significant performance benefits here...this is usually not done with C/C++ code compilation from what I understand about it

It's slightly different, but native code is typically compiled concurrently, too. The meat of it is often handled by the build system rather than the compiler itself, but that's not so different.

pjmlp · on Jan 17, 2018

It was always like that on the mainframe world.

Assembly was actually bytecode, with a micro-coded CPU doing the actual execution.

All Xerox computers were like that. The first boot step was to load the right kind of micro-code for the environment being started.

The AS/400 native environment (nowadays known as IBM i), is based on bytecode TIMI, which gets AOT compiled via a kernel level JIT.

lclarkmichalek · on Jan 17, 2018

PHP is there with HHVM

noir_lord · on Jan 17, 2018

Which ironically isn't really any faster than PHP7 in the real world outside of benchmarks.

7 was a phenomenal release, I saw 50% reductions in processing time across the board and on old array heavy systems 5-10x memory reduction.

solidr53 · on Jan 17, 2018

Thanks to pressure from HHVM I assume. Nothing was happening in the PHP language for freaking three years.

To be fair the benchmarks usually take a wordpress or drupal installation and do a requests per second measurement, which IMO is a real world benchmark.

No hate, I just don't get why hhvm doesn't get any love for what they did. Maybe because from HPHPc to HHVM they seriously gave the PHPc a competition and people kind of got mad.

https://kinsta.com/blog/the-definitive-php-7-final-version-h...

littlestymaar · on Jan 18, 2018

> Nothing was happening in the PHP language for freaking three years.

To be fair that's not because they where sleeping, but because they attempted to do something that proved too hard (unicode support) and they had to abandon it. That's why PHP skipped version 6.

krapp · on Jan 17, 2018

>No hate, I just don't get why hhvm doesn't get any love for what they did.

I don't know - I expected to see a ton of Hack projects show up here but it's like no one cared about the language except as a wake-up call to PHP. Maybe the involvement of Facebook put people off.

Yoric · on Jan 17, 2018

My bad, I hadn't looked at that project in some time and I hadn't realized they had advanced that far.

nabla9 · on Jan 17, 2018

And then modern microprocessors receive x86-64 instructions and decode them into underlying microcode.

pcwalton · on Jan 17, 2018

> As a side note, it is interesting to see that multithreaded compilation of a single page provides significant performance benefits here...this is usually not done with C/C++ code compilation from what I understand about it

Well, that's because typically all cores are maxed out during a parallel build of large-scale C++ software, so there's no need to go any further.

With link-time optimization it's a different story…hence the work some compilers (like rustc for Rust) are doing to parallelize builds of single compilation units.

callahad · on Jan 17, 2018

There's a simple-but-useful WebAssembly Explorer at https://mbebenita.github.io/WasmExplorer/ that interactively shows the C/C++ -> WASM Text -> x86 ASM path that WebAssembly takes.

I wrote up a short article and video demonstrating it last year at https://hacks.mozilla.org/2017/03/previewing-the-webassembly...

chrisseaton · on Jan 17, 2018

> I did not realize that the WASM needs to be compiled into machine code on the client system

It doesn't need to be. This is a choice they've made. Other implementations of WASM could interpret it they wanted.

The Church-Turing thesis tells us that any program you can compile to machine code can also be interpreted, so it is not possible that any language needs to be compiled into machine code.

Rusky · on Jan 17, 2018

You can interpret wasm if you want, but considering the motivation for wasm is performance that kind of defeats the point.

filereaper · on Jan 17, 2018

Total Aside. As a compiler and runtimes guy, I'm super excited for streaming compilation. I think stuff like this and ethereum for distributed computation is really cool stuff! :D

dfox · on Jan 17, 2018

Streaming compilation is the way it was always historically done. One reason is that computers used to not have enough RAM to store whole non-trivial programs in it in AST or another intermediate form.

Second reason is that this approach matches how the underlying theory of languages and automatons works. One can view modern AST producing compiler frontend as compiler that compiles it's input into program that builds the resulting AST.

On the other hand many modern optimalization passes simply cannot be done in streaming manner or even by any pushdown automaton.

caf · on Jan 18, 2018

Yes, which is likely why in older languages like C you must declare symbols before you use them etc.

sehugg · on Jan 17, 2018

That's great news. On http://8bitworkshop.com/ I'd like to offer some additional WASM modules on-demand but they take 15+ seconds to load. (It seems 50% of the time is parsing and 50% module instantiation)

Dolores12 · on Jan 17, 2018

Guess what, downloading compiled executable code is even faster. Is that where we are heading to? Flash 2.0? Wouldn't it be great to save all the electric power that was used to compile very same code on millions of computers every day?

__s · on Jan 17, 2018

Sigh... let's have this thread again. Needs to be effectively sandboxed. Modern JS as a compiler target is just as opaque. JIT compilation is a win on energy relative to the energy spent in a slow interpreter. JITing WASM will take less energy than JITing emscripten-js (nevermind the energy to send over the wire)

littlestymaar · on Jan 17, 2018

Downloading a bigger executable wouldn't necessarily be faster actually, it depends on the size difference and the client bandwidth.[1]

Additionnally, it wouldn't be portable (executable compiled for desktop wouldn't run on mobile).

[1] See this comment : https://news.ycombinator.com/item?id=16171133

Dolores12 · on Jan 18, 2018

What you are saying is that source code is less in size than compiled native code which is nonsense.

littlestymaar · on Jan 19, 2018

First of all I wasn't talking about source code, I was comparing the output of a C/Rust->wasm compiler to the one of a C/Rust->x86 compiler. Since the wasm virtual machine has a JIT, I believe the compilation to wasm isn't too aggressive with optimizations. And since those make the binary bigger, I assume a wasm output would be lighter than an x86 one. I didn't benchmark it though.

And if you compare the size of the binary output with the size of the source code, the binary is bigger in many cases because of optimizations (and runtime size, for small programs). Additionally, the source code can be gziped with a good compression factor whereas the binary cannot. Then 99% of the time, the source code is lighter to send over the internet than the compiled binary.

breatheoften · on Jan 17, 2018

Does wasm do runtime code specialization? I wonder if there will end up being a way to do to timing attacks against the optimizing wasm compiler/linker step ... Is it possible to setup code such that the optimization time depends on the runtime inferred type of an 'x' that you aren't supposed to have access to ...?

yoklov · on Jan 18, 2018

The term you’re looking for is speculation, not specialization, but no, I don’t think it does either. C++ and other languages targeting WASM often do type specialization, but it’s entirely done before the browser sees WASM, and has nothing to do with what you’re describing. (Which would be speculative compilation).

I’d imagine that nobody does speculative compilation since the benefit is too low given how fast the network is. Also, yes, there would be security concerns.

stevemk14ebr · on Jan 17, 2018

Caching of compiled code! As i read it they want to cache the wasm bytecode at the client level. What if servers did the caching instead? Group clients by the architectures they use and serve the cached bytecode to the right 'groups' of clients.

sjrd · on Jan 17, 2018

That would assume you trust the server not to give you malicious machine code (which you of course cannot!). wasm is specified in such a way that it is still sandboxed by the VM that compiles it. If you fetch arbitrary machine code, you cannot verify it and that leads to huge security holes!

chrisseaton · on Jan 17, 2018

> which you of course cannot!

Didn't Google's NaCL implement verification of sandboxed machine code?

sjrd · on Jan 17, 2018

Maybe, but at what cost? I wouldn't be surprised if the cost of verifying the machine code was higher than the cost of compiling wasm to machine code.

hateduser2 · on Jan 17, 2018

Why can’t you just cache it with a hash?

icebraining · on Jan 17, 2018

How would the client know the hash is valid?

Zpalmtree · on Jan 17, 2018

Perhaps he means you compile the code locally, hash it, and then next time you can fetch the compiled code from a server, and check the hash matches?

icebraining · on Jan 17, 2018

That's possible; a kind of second-level cache. That assumes the compilation is reproducible, though.

TD-Linux · on Jan 17, 2018

Compiling wasm also enforces security rules. Offloading that would require full trust of the server, which is probably not desirable.

smcleod · on Jan 17, 2018

Firefox Beta (58) macOS 10.13.2: WebAssembly.instantiate took 151.9 ms (81.5 MB/s)

rmrfrmrf · on Jan 18, 2018

> But there’s no good reason to keep the compiler waiting. It’s technically possible to compile WebAssembly line by line. This means you should be able to start as soon as the first chunk comes in.

Maybe they can optimize further by speculating what the next line will be...

StreamBright · on Jan 17, 2018

What are the security implications of wasm?

markdog12 · on Jan 17, 2018

Good writeup here: http://webassembly.org/docs/security/

It runs in existing browser VMs, which have been pretty battle tested.

Another interesting note is that threads are now on hold for WebAssembly due to Spectre, that is, SharedBufferArray has been disabled. Hopefully it can be enabled in the future.

daveheq · on Jan 18, 2018

Where do I buy the Firefox coin?

bobsc123 · on Jan 17, 2018

This cracks me up. Modern web browsers really started to evolve in the 90's when security problems really ramped up. You used to just download excecutables and run them on your computer because the functionality wasn't there otherwise. Flash and Java applets were the initial answer to that before Javascript and HTML evolved. We've come almost full circle to browsers basically being little VM's that can do anything again, the main reason they were developed in the first place. Most people's entire computer experience is now in the browser and here come executables again which will require another internal layer to mitigate problems.

ajkjk · on Jan 17, 2018

The end state ends up looking a lot like (the user-facing side of) an operating system, except that:

* the filesystem is cloud storage (Drive/Dropbox/what have you -- the Unhosted (https://unhosted.org/) architecture)

* the apps are insecure but open-source by requirement (interpreted jS)

* ... running in a controlled sandbox (the browser)

* ... using a standard UI language (HTML/CSS)

* with functionality modifiable/overridable by user preference (extensions)

It's pretty much the ecosystem you would want if you were building this from scratch! Except you'd want Html/CSS/JS to be much more intelligently designed from the start (I'm waiting so eagerly for the day that browsers natively run more scripting languages than just JS...)

It never could be done in the 90s because everything ran too slowly, but it's feasible now.

pjmlp · on Jan 18, 2018

> It never could be done in the 90s because everything ran too slowly, but it's feasible now.

It used to be called Lisp Machines, Smalltalk, Oberon Juice, Java Jini, Inferno.

NanoWar · on Jan 17, 2018

Actually, browsers are designed from the ground up to handle insecure code! That's pretty awesome, but comes at a cost: The mentioned additional layer, battery, speed? And it's platform independent!

hathawsh · on Jan 18, 2018

I'm often reminded of the XKCD quip that web pages are in fact the easily installable executables that so much of the marketplace was looking for over the past few decades.

https://xkcd.com/1367/