The Emterpreter: Run asm.js code before it can be parsed (blog.mozilla.org)
181 points by Rauchg on Feb 23, 2015 | hide | past | favorite | 79 comments



I actually really like the sound of this, not because of the stated benefits (though those are nice), but because it sounds like this would actually create a really good upgrade path to implementing a proper bytecode into browsers.

I mean, with asm.js and the copious amounts of compile-to-Javascript languages available these days, Javascript is already becoming a de facto bytecode for the web. But it's always been a weird and uncomfortable hack done for the sake of backwards compatibility: a higher-level language hijacked to work as a compile target, simply because it's the only thing supported by browsers.

If projects like this Emterpreter catch on, though, it allows for a smooth path to proper bytecode: for backwards compatibility, you have the Emterpreter read and execute the bytecode, but in other, more modern browsers you have the browser execute the bytecode directly. I think this would be an overall better approach than what we have now.


> create a really good upgrade path to implementing a proper bytecode into browsers.

I think that's very unlikely to happen. Instead, what will happen will be exactly "emterpreter"ing: there will be multiple bytecodes, and they will all ship with their own interpreters/compilers.

If the emterpreter, instead of executing the bytecode, would generate JS code for it (and feed it into the JIT compiler if there is one), you'll get the best of all worlds -- bytecode format, top performance, perfect backwards compatibility. I suspect that's kripken's next move.

Given that this is the case, why would someone want to shackle themselves to a specific bytecode format, which is practically impossible to get universally accepted? (This argument is supported by PNaCl; The technical problem is small; the political problem is huge).

Work on JS optimization by all vendors is already extremely impressive and is not going to stop even if everyone agreed on some bytecode. Why not capitalize on it? What does a specific bytecode buy you beyond slightly shorter load times (which the emterpreter already gives a way to greatly reduce), and not having to pull the (essentially universally cached) emterpreter code?

Because these two things, while nice, are not enough support for the revolution that a proper bytecode is.


As with asm.js, it wouldn't need to be universally accepted: it can "sneak in the back door" by continuing to work fine in browsers that "just" support standard JavaScript.

If code using this method clearly labels the bytecode, the interpreter, and the part that background-loads the "real thing", then implementations can opt to add whatever optimisations they like to speed it up as it stabilises. It doesn't matter if the bytecode changes, as long as it's labelled properly, so that an optimised implementation falls back to just interpreting the JS if it comes across a version (of the interpreter/bytecode as a whole, or just a single opcode) it doesn't understand (or that the implementer hasn't seen a need to optimise).

If the interpreter is guaranteed to retain a certain structure, it could be very easy to just "unroll" the interpreter loop and selectively JIT portions of the bytecode based on hotspots. You can optimise that a lot in a non-bytecode-specific way by annotating the interpreter loop with assertions that grant extra guarantees (immutable bytecode; markers to indicate which code is only interpreter scaffolding; decoding hints). If you also tack on "labels" for each instruction, implementations can special-case individual instructions that "settle", while still handling new instructions/changes by inlining the interpreter code.
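A minimal sketch of the kind of interpreter loop being discussed (the opcodes and encoding here are invented for illustration, not the emterpreter's actual format). The point is that the dispatch loop has a fixed, recognisable shape an engine could special-case:

```javascript
// Invented opcodes for a toy stack machine.
const OP_PUSH = 0, OP_ADD = 1, OP_MUL = 2, OP_HALT = 3;

function run(bytecode) {
  const stack = [];
  let pc = 0;
  for (;;) {
    switch (bytecode[pc++]) {                    // one dispatch per opcode
      case OP_PUSH: stack.push(bytecode[pc++]); break;
      case OP_ADD:  stack.push(stack.pop() + stack.pop()); break;
      case OP_MUL:  stack.push(stack.pop() * stack.pop()); break;
      case OP_HALT: return stack.pop();
    }
  }
}

// (2 + 3) * 4
console.log(run([OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_PUSH, 4, OP_MUL, OP_HALT])); // 20
```

An engine that recognised this loop (via the labels/annotations described above) could "unroll" it per instruction and JIT the hot portions directly.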

> What does a specific bytecode buy you beyond slightly shorter load times (which the emterpreter already gives a way to greatly reduce)

The full speed from the start; note the substantially lower speed for the first little part. And the example codebase is small compared to some of the things people want to run.

I think we sort-of agree. I don't necessarily think there's a reason to specify a standard bytecode, exactly because this approach could conceivably be extended to effectively give us a "mostly standard" bytecode with the freedom to continue to change the format without a lengthy committee approach, because there's a demonstrably viable fallback.


> I actually really like the sound of this, not because of the stated benefits (though those are nice), but because it sounds like this would actually create a really good upgrade path to implementing a proper bytecode into browsers.

We already have a proper bytecode with several high-performance implementations that's supported on virtually every platform, which works as an excellent compilation target for both dynamic and static languages.

It's called JavaScript.

Now you might say JS is bloated! But that's only if it's not gzipped or minified.

Make a "proper bytecode" and you gain absolutely nothing except losing the wide support JS enjoys.


Except you gain better startup speed which was kind of the premise of this article.


True, that's maybe the only case where there's an improvement. But it's a one-time cost (the browser can cache the compiled version), and switching to bytecode is not necessarily the only way to improve performance.

For example, a JS parser specially optimised for asm.js.


Note that caching is not always a solution. The cache is a terrible answer for first-time user experience. A lot of stuff on the web is loaded outside of the cache for various reasons, too. Caches get flushed.


My suspicion is that it will never be possible to parse asm.js as quickly as bytecode simply because it is structurally different. However, a quick search does suggest there are still some improvements possible in Firefox to speed things up somewhat.


With a proper bytecode you would have to do bytecode verification, so no.


Javascript is as much of a bytecode for the web as Java is a bytecode for the JVM.

It is not. Where would we be if we could only execute Java on the JVM? Clojure and Scala transpiled into Java? Can you imagine how terrible that would be?


> Can you imagine how terrible that would be?

I can, and it wouldn't be terrible at all. Compile times might suffer a little (but that's likely to be negligible with a good Java compiler - I think Jikes is now abandoned, but back in 2002 it was pretty much instantaneous even for huge files, unlike Sun's). I don't think you'd notice it with Scala, and likely also not with other JVM languages.

I've worked with Python extensions that compile indirectly through C, and I have no experience with Nim but it seems to do that very well and very quickly.

What exactly do you believe the problem would have been if Scala or Clojure generated Java rather than JVM bytecode?


Yea I'm crossing my fingers that asm.js eventually leads to a real bytecode VM in browsers and Javascript simply becomes a language that targets it.

For large applications compiling to asm.js (such as Unity3D) this experiment could provide significant gains in load time. Considering games usually spend their initial time presenting a menu, background loading of the fast-path makes a ton of sense.


Other than load time (which the emterpreter already makes great advances on), what other advantages does a bytecode carry?

I don't see "load time" as a good enough reason for a revolution of this magnitude. Do you?


Binary sizes, which would even further improve startup times, along with reducing network usage (and thereby hosting costs). GZip works just as well on binaries as it does on JavaScript.


There are two things here:

1. [citation needed] and a [specific bytecode] needed. Details are everything: an uncompressed .dex file is comparable to a gzipped compiled JVM class file. Also, with a specific bytecode in mind, if you compared minified+gzipped JS to gzipped bytecode, I suspect the difference would be small.

2. Regardless, the benefit of binary size is already provided, yesterday, with the emterpreter, without requiring any buy in from any browser vendor.

So, again - what's the benefit, other than startup time (which might be solved in other ways) of a universal browser bytecode?


If you read the article, it downloads the full-sized binary later. So the total download size with the emterpreter is actually higher.


I commented on another thread that I believe a future version of the emterpreter will generate the full JS back from the bytecode, thus eliminating that problem - it will increase the emterpreter size by a few tens of Ks, but that will probably be cached everywhere through CDNs.

And even if it never does - it's politically close to impossible to agree on a universal bytecode, so it is extremely unlikely to happen. (Google already tried with PNaCl - if they can't pull it off, I doubt anyone else can.)


The thing for now is that the asm.js "family" is riding on LLVM's back. More and more, it's LLVM that is becoming the VM people have longed for. It looks like LLVM is inching towards the JavaScript VM with every year that goes by. I read that Apple is using it for further JavaScript optimizations in its browser. LLVM already powers WebKit underneath, right?

So as things stand right now we have 2 popular "VMs" people really use. One is JavaScript since it's going nowhere. And the other is LLVM that is "free as in beer" for companies all over. JavaScript was secluded to the client. And LLVM was secluded to the backend. Now they are going to be marrying and having lots of children. :-)


LLVM is not a VM by any stretch of the imagination. It is an intermediate language for compilers whose primary utility is to provide a common target for code generation and optimization. LLVM's name is a misnomer.


This is true, but the comment you were replying to was pretty clearly treating it as a language. With "VM" in scare quotes because it's a semantic model.


You're thinking of that email that someone sent on a mailing list a while back about how LLVM isn't really a bytecode because it includes architecture-specific codes.

It was a stupid email - you can make LLVM into an architecture-agnostic bytecode by disallowing those codes.

Don't believe me?... https://developer.chrome.com/native-client/reference/pnacl-b...


"You're thinking of that email that someone sent on a mailing list a while back about how LLVM isn't really a bytecode because it includes architecture-specific codes."

No, I'm not. In fact, I have no idea what email you're referencing.

LLVM as an intermediate language (I guess you could call it bytecode if you really wanted to) for compilers of arbitrary languages (expressly not a virtual machine!) is the only point I intended to make.


You are probably thinking of this email titled "LLVM IR is a compiler IR".

http://lists.cs.uiuc.edu/pipermail/llvmdev/2011-October/0437...

It was not a stupid email. The arguments there are still pretty much correct today.


> Now they are going to be marrying and having lots of children. :-)

Here's to hybrid vigor!

So free software/open source is analogous to an open society without any arbitrary marriage restrictions, whereas closed source proprietary software tends to cause aristocratic inbreeding?

I suspect that bastards also fit into the analogy somewhere.


asm.js code doesn't have to be shipped in pure JavaScript form. It could come as byte code that's "uncompiled" into asm.js code before being thrown at the JavaScript runtime.

A smart compiler could recognize the ordering priority and load in chunks sequentially, with hot code rolled in first, less frequently exercised methods last.
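A toy sketch of that "uncompile" step, assuming an invented two-byte (opcode, immediate) instruction format; nothing here reflects any real emscripten output. The decoder emits ordinary asm.js-style source text and hands it to the JS engine:

```javascript
// Invented one-byte "opcodes", each mapping to an asm.js-style template.
const TEMPLATES = {
  0x01: (imm) => `r = (r + ${imm})|0;`,        // add-immediate
  0x02: (imm) => `r = Math.imul(r, ${imm})|0;` // multiply-immediate
};

function uncompile(bytes) {
  let body = 'var r = 0;';
  for (let i = 0; i < bytes.length; i += 2) {
    body += TEMPLATES[bytes[i]](bytes[i + 1]); // decode (opcode, immediate)
  }
  body += 'return r|0;';
  return new Function(body);  // the engine parses/JITs it like any other JS
}

const f = uncompile([0x01, 5, 0x02, 3]); // r = 0 + 5, then r * 3
console.log(f()); // 15
```

The chunked loading described above would just mean calling `uncompile` on hot chunks first and on cold ones later.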


I think this is the first emscripten feature which I don't "get", i.e. I don't fully understand the motivation behind it :) Firefox's AOT compile time is great, and nothing at all compared to the time it takes to download the code.

The real art lies in getting the actual emscripten-compiled executable small, not because of emscripten, but because of C++'s tendency to bloat the executable size if one isn't really careful. Also, clang's and emscripten's code generation passes have very aggressive dead code elimination, but there can be wrong decisions at the game-engine-architecture level which cause the inclusion of code that might never be called. A native executable that's several dozen MBytes big is a bad thing on any platform; it just doesn't appear as much of a problem when it is part of a 50 GByte game download.

To come to an end: I think an emscripten client of up to 5 MByte (gzipped) doesn't really require a fix like the emterpreter for its startup time(?), and it's perfectly possible to fit a full 3D game client into such a size with a little care about size optimization.

[edit: fixed a formulation which sounded like code bloat can be blamed on emscripten instead of user-code]


I agree that a moderate or small codebase doesn't need anything like the Emterpreter. At least not for startup speed - see the wiki page for other uses https://github.com/kripken/emscripten/wiki/Emterpreter

But there are some huge multimillion line codebases that you can't really reduce in size to 5MB. That's the startup time problem that the Emterpreter aims to help with.


I see. Guess it's mainly for UE4 then ;) I was surprised how small the Unity demos have been (the biggest was a bit over 3 MBytes IIRC, for the zombie FPS); guess they have their inter-module dependencies better under control.

I actually find the bytecode part more interesting than the interpreter part. Are you seeing drastically better compression for a bytecode module compared to compressed ASCII asm.js?


Actually, no - while the bytecode is smaller than asm.js, when you compress them both, they end up around the same. The issue is that the bytecode is designed for fast execution, not compressibility, so it contains absolute offsets for jumps, etc. - this lets you just start running the code immediately, but also means it looks like random noise that compresses poorly. JavaScript, on the other hand, is fairly compressible, as it has less such noise - it's very regular.

However, it does seem likely that a binary format that is designed for compressibility could be smaller. But, it would not be as fast to execute.


Here's a binary format that is designed for compressibility and fast code generation [1]. It's a bit of a pet of mine (I just realised one of the Wikipedia links is to my blog) - it was the PhD dissertation of now-professor Michael Franz at UC Irvine, who incidentally had Andreas Gal (now of course at Mozilla...) as a PhD student.

SDE effectively compresses a low-level syntax tree by building a dictionary that acts as a sort of sliding window over the tree: it adds specialised nodes as well as more general nodes, then encodes new variations of specialised sub-trees using the new elements in the dictionary, subsequently adding more specialised "instructions".

As a result you both get compression (think a tree variation of Huffman coding) and "hints" to let you re-use partially generated code as templates (e.g. one dictionary entry maps to "assign x to y"; you also add "assign x to _", and later you find a reference to the entry for "assign x to _" + z; now you copy the code for "assign x to <something>" into place and fix it up with the address of z).
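A toy, LZ78-flavoured illustration of the growing-dictionary principle (SDE works over syntax trees; this flat token version is invented just to show the idea that repeated sub-sequences shrink to single references as the dictionary learns them):

```javascript
function encode(tokens) {
  const dict = new Map();
  let next = 0;
  const out = [];
  let prefix = '';
  for (const t of tokens) {
    const cand = prefix ? prefix + ' ' + t : t;
    if (dict.has(cand)) {
      prefix = cand;                      // keep extending a known sequence
    } else {
      out.push([prefix ? dict.get(prefix) : -1, t]); // emit (ref, new token)
      dict.set(cand, next++);             // learn the longer sequence
      prefix = '';
    }
  }
  if (prefix) out.push([dict.get(prefix), null]);
  return out;
}

const toks = 'assign x y assign x z assign x y'.split(' ');
// Repeats collapse: the output has fewer entries than the input has tokens.
console.log(encode(toks).length < toks.length); // true
```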

The approach has always fascinated me, but it fell in the shadow of the JVM.

[1] https://en.wikipedia.org/wiki/Semantic_dictionary_encoding


Interesting, thanks!


I believe the asm.js effort has been trying to counter concerns people might have when considering using it. Also, it could be that some companies are joining efforts to further asm.js adoption, like Microsoft adding more support for asm.js. So it could be that they are planning ahead much more now. The asm.js effort could be seen as moving on from a prototype to an actual feature companies may depend on in the near future.

Also, what makes people more cautious about performance concerns is that mobile networks and hardware are still catching up to what people have on the desktop. And since about 8 years ago, mobile has been a big opportunity for many companies, which means that the desktop has taken a back seat to mobile devices, and that is not going to stop.


Imo a stack-based bytecode has far better code density, which is probably pretty good for browsers.


.. at a significant cost to execution speed, experience shows. The fastest interpreted "general purpose" language is apparently Lua, which switched from stack to registers to get that speed a while ago.


Erm, it's a big tradeoff. You get better code density, but it takes longer to compile well and is slower to interpret.
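A sketch of the tradeoff, using an invented register encoding: each instruction names its operand registers explicitly, so the code is wider (worse density) but dispatch does less operand shuffling per instruction:

```javascript
// Invented register-machine opcodes.
const LOADI = 0, ADD = 1, MUL = 2, RET = 3;

function runReg(code) {
  const r = new Array(8).fill(0);   // register file
  let pc = 0;
  for (;;) {
    switch (code[pc]) {
      case LOADI: r[code[pc + 1]] = code[pc + 2]; pc += 3; break;
      case ADD:   r[code[pc + 1]] = r[code[pc + 2]] + r[code[pc + 3]]; pc += 4; break;
      case MUL:   r[code[pc + 1]] = r[code[pc + 2]] * r[code[pc + 3]]; pc += 4; break;
      case RET:   return r[code[pc + 1]];
    }
  }
}

// (2 + 3) * 4, spelled out with explicit registers: 13 code words,
// versus 9 for a typical stack encoding of the same expression.
console.log(runReg([
  LOADI, 0, 2,   // r0 = 2
  LOADI, 1, 3,   // r1 = 3
  ADD,   0, 0, 1,// r0 = r0 + r1
  LOADI, 1, 4,   // r1 = 4
  MUL,   0, 0, 1,// r0 = r0 * r1
  RET,   0
])); // 20
```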


This is very cool, and great to read more interesting research from Mozilla. However in this article I notice something curious: normal asm.js kicks off and shortly after reaches top speed by around 700ms. emterpreter gets going by 200ms, but takes another 1200ms to reach top speed by 1400ms. Why is it not another 700ms? Isn't it just doing the same work as asm.js but in the background while running with emterpreter?

Non-blacklisted emterpreter looks slow enough (5fps ish on that graph?) to simply not be useful for some use cases, like a game engine - it's not going to be remotely playable like that. Therefore emterpret => asm.js actually significantly increases the startup time. Playable by 1400ms is worse than playable by 700ms. But I guess this is all preliminary and improvable though!


There are a few reasons why full speed is reached later when starting up in the emterpreter and swapping in asm.js later:

* Compiling asm.js can use multiple CPU cores, so doing just that is faster than doing it while the emterpreter is running on (at least) one core.

* I believe, but am not sure, that compiling on a background thread is done at lower priority than stuff on the main thread.

* Swapping asm.js code in can only be done in between frames. At 10fps for example, that means around a 100ms delay just for that, and possibly more depending on the state of the browser's event queue.


As I sometimes see wmf say, "as predicted by prophecy:" https://news.ycombinator.com/item?id=6923758



I actually acknowledged that in the version of this I posted 4 days ago: https://news.ycombinator.com/item?id=9071064

However, he has JS hanging on for longer than I bet on... once asm.js gets a good DOM binding I expect the explosion of language diversity to take about two years, tops, and for it to rapidly become clear that JS is now just another way of accessing the DOM. I think there's more pressure built up there than people realize, because right now there's no point in thinking about it, but once it's possible, kablooie. Node's value proposition, IMHO, is in some sense correct, but backwards; it's not that we want to write in Javascript on the server, it's that we want "client language = server language"... and once there's no longer a technical handcuff pinning the client side of that equation to Javascript, it will not take that long for it to no longer be Javascript. It is not an impressive language, even within its own 1990s-style dynamic language niche.

(I think this is not because it's "bad", but because it has been developed in this really terrible multiple-vendors-that-actively-don't-want-to-cooperate way for most of its lifetime. It's gotten past that, I think, but during those decades all the other scripting languages were marching right along. None of the other languages could have survived such a process and gotten to where they are today, either.)


GC-ed languages are going to have to include the GC which I doubt can compete with JavaScript. No one wants to write CRUD apps and manually manage memory. I don't see asm.js being used outside of games.


Bear in mind that Unreal and Unity both have some form of internal garbage collection systems that are compiled to asm.js. In the case of Unity, C# is transpiled into asm.js code. You could write in potentially any GC language, it's just that the GC needs to be included.


Unless someone invents a language where non-GC memory management is easy and that language can compile to both the client and server...


Rust makes it not quite "easy" but at least it makes it automatic and not error-prone.


"GC-ed languages are going to have to include the GC which I doubt can compete with JavaScript."

There's no particular reason why not. It's all just bits and bytes in the end, and asm.js gives a pretty low-level view of the world. And if you're starting from a baseline of a language that can easily be 5-10x faster than browser-based JS you can afford a bit extra on the GC side.
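As a sketch of the "just bits and bytes" point: compiled-to-asm.js code manages its own memory inside one flat typed array, so a source language's allocator or GC is ordinary code over that heap too. This is an invented bump allocator for illustration, not emscripten's actual malloc:

```javascript
// The asm.js-style linear heap: one flat typed array.
const HEAP = new Int32Array(1 << 16);
let top = 0; // next free word

// Bump-allocate `words` 32-bit slots; a real runtime would add free/GC.
function alloc(words) {
  const ptr = top;
  top += words;
  if (top > HEAP.length) throw new Error('out of memory');
  return ptr;
}

// "Objects" are just offsets into the heap.
const p = alloc(2);
HEAP[p] = 42;
HEAP[p + 1] = 43;
console.log(HEAP[p] + HEAP[p + 1]); // 85
```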

Javascript isn't magic. It's just a language. It isn't even a particularly special one, once you ignore its browser support, and it certainly isn't one focused on performance (I stopped buying the "languages don't have performance characteristics" line a while ago). It gets to run the same assembly instructions everybody else does. It isn't as fast as a lot of people here suppose, and it isn't that hard to beat out its performance even now.


But why, what is the upside vs. just transpiling like the dozens of languages that already do?


Why are you asking as if it's some sort of theoretical question when asm.js is in hand, right now, and it performs wildly better than raw Javascript? Of course we'd rather compile to something that's faster than Javascript than compile to Javascript.

(Sorry, I can't condone the word "transpile". Usage of it just reveals someone who doesn't understand compilation technology and thinks there's somehow something "special" about compiling to one intermediate language ("javascript") vs. another ("assembler").)

I can't believe how many people seem to believe that Javascript is a C-speed level language, and downmod anyone who observes it's not. Well, it's still not. It's easy to see that it's not. It's not even close. If it were asm.js wouldn't exist. (I mean, if you're having trouble with my claim here, stop and think about that for a moment... if Javascript is so fast, why does asm.js even exist?)


I wonder how much React-style UI libraries can make up for the poor DOM access. Do all the calculation in asm.js and just dump out the diff for the plain js dom updater to deal with.


Hot damn, that's exciting. I prefer the asm.js approach over PNaCl because it lets browsers gradually move over to the standard rather than forcing a flag day. This solves the big issue, namely parsing that big, hot mess of raw JS.

I hope we'll eventually see a proper bytecode spec with bidirectional assembly/disassembly w.r.t. JS (ie: more a transformation than an assembly spec) evolve from this effort, but it's obviously something that needs to happen after asm.js has had its time to bake.

This, of course, assumes that browsers don't get AOT/background compilation to the point where it's no longer necessary to consider a bytecode spec.


After we have evicted Java from our browsers and turned Javascript into the new Java, what have we gained?


It isn't the new Java, it's the new JVM, and the answer is, "choice".


Choice? In what sense? The JVM is a ridiculously good VM, with tons of languages that compile to it.

Why are we reinventing the wheel?


The JVM as a plugin is almost dead on desktop already, due to very bad past malware vector experiences.

Anyway, plugins are dead code walking, not only because of security worries, but because "mobile" in full trumps desktop, and the de-facto "no plugins on mobile" OS-set standard.

Plugins alas are still "facultative not obligate" on the desktop, so browsers support them. Even on desktop, their days are numbered (see Chrome killing NPAPI plugin support; see also Shumway).

The JVM isn't going to be integrated directly (i.e., not as a plugin) into browsers, either.

For all its virtues on the server side, and I'm thinking of multiple JVMs here going back to the CLDC VM that Macromedia tried to license from Sun for Flash (rejected: they did Tamarin instead), "the" JVM is nowhere near a clean fit on the client side.

This leaves the already-obligate-not-facultative JS VMs, which are evolving fairly rapidly due to browser competition, serving both hand-coded and compile-to-JS workloads.

All as predicted! I think Dave Herman first spoke about this at Web Rebels in 2012, and a bunch of us have laid bets well before then, including on HN. Yeah, I've been hard on PNaCl as a would-be better plugin for safe native code, not due to the tech itself as because of the opportunity cost.

Now that it's 2015, everyone seems to be on convergent paths toward some kind of LLVM-compiled, cross-browser, safe-native intermediate code that's evolved from JS (starting from asm.js, but not restricted to the current subset, e.g., shared memory threads could be added only to the intermediate code and its runtime).

Such a "WebAsm" or js.bin format would then co-evolve with JS until source code download goes away -- if JS source download ever does (I'm skeptical, but given enough time, it could happen).

Lots of risk still, but this remains by far the shortest-distance evolutionary path from where we are.

/be


The JVM is indeed great in many ways. The main issues with it today, I would say, are

* Patent and copyright issues - the lawsuit with Google is still going on, last I heard.

* There used to be serious technical issues with startup speed. People saw websites with Java applets and saw how slow they were to load. If that hasn't been fixed, it's a serious problem, as websites do need to load fast, unlike typical Java applications.

edit:

* A big use case is compiled C++ code. I am aware of lots of languages compiling to the JVM, but I actually don't think I heard of C and C++. Is there such an option? If such an option doesn't exist, or exists but runs more slowly than asm.js currently does (which is pretty close to native already), it would be a problem.


Could Java applets modify the DOM? Were they nicely integrated with the Browser interface? Did they come bundled within the browser for easier installation and upgrades by their users?

Sure, there are some parallels, but don't pretend industry ditched Java applets due to ignorance. JavaScript in the browser, for 99% of web applications, provides a much better user experience. The developer experience we could argue about, but I'd say nowadays developing client-side JS (or TypeScript, CoffeeScript, Haxe, etc.) isn't that bad. The tooling is great and there are, if anything, too many frameworks to choose from (competition is good!).

The same arguments apply for Flash, really. Flash integrated slightly better with browsers at the expense of being a resource hog.


Choice of language to compile into the VM.

As for why not the Java VM, my guess is that the browsers are staggeringly enormous piles of C++ code and trying to integrate a Java VM into it would probably be insanely difficult, and anything other than pure 100% integration, too slow to use. It is probably literally easier to continue with the already-integrated JS VM and improve it up to JVM-esque quality than to try to graft the JVM into the browsers that exist today.

Or somebody would already have tried, since nothing would have prevented JS from running on the JVM before now, if that were feasible; asm.js is actually independent of this question when it comes down to it.


You know tons of language already compile and run on the JVM, right?

And you know the JVM ran in the browser almost 20 years ago, on hardware with a fraction of the CPU and memory resources we have today?


The JVM does not run in the browser. Its windows can run physically "in" the browser window, inasmuch as it would appear that the Java app was "in" the browser, but the Java app itself was a "plugin", and was a separate OS process. That's not the right kind of "in".

The JVM has never run in the same process as the browser, to the best of my knowledge, with the exception of the ill-fated "HotJava" browser: http://en.wikipedia.org/wiki/HotJava which ran Java applets "in" the browser by virtue of being written in Java, instead of C++. At the time this was too sluggish for general use, though.

So allow me to repeat myself: We can't run the JVM in the browser right now. The impedance mismatch between the JVM and the C++ world is just too great to have sufficient performance right now. All the current browsers are enormous piles of C++ code. There is no way to "just" integrate a JVM directly into them, and no bridge between the two is going to have enough performance for the demands we're putting on browsers. It doesn't matter how awesome Java may or may not be when the browsers aren't in Java. It's not an option today, short of replicating Sun's feat and rewriting your own browser in Java. But the only thing stopping you from doing that is the sheer size of the task, rather than any technical problem. (But make no mistake, it is an enormous undertaking now to write an engine capable of replacing any of the existing ones for even a single well-chosen use case.)


Oh, I see. I misunderstood your use of the word "in".


> You know tons of language already compile and run on the JVM, right?

Yes. So?

> And you know the JVM ran in the browser almost 20 years ago,

And it did such a good job that everyone has been preferring Java applets to HTML/JS/Flash for those 20 years.

Oh, actually, they didn't. In fact, with few exceptions, they have been rejected by users for most of those 20 years. The implementation was horrible. I don't know if it's better today - maybe it is for some Windows browsers. But no browser on my Mac or Linux supports it without an external install -- and last time I actually used it (~2 years ago), it still took forever to start applets.

It might have worked if the execution had been acceptable; it's not impossible - Flash had reasonable execution, which led to widespread adoption that is still nontrivial, even though a big part of the web - that of mobile - made it unusable. And yet, 18 years later (I'm up-to-date as of 2 years ago), Java applets are still a mess.

So, the bottom line is: who cares the JVM ran in the browser?


The applet user experience is horrible, you're right. It's pathetic that this hasn't been fixed.

Instead, we're developing apps on a platform intended for delivering documents, and targeting code to a "bytecode" (asm.js) built on a half-baked language originally intended for doing form validation, pop-up ads, and animating dancing bears.


I can play that game too:

Java and the JVM form a 3/4-baked language environment originally designed to run on washing machines, freezers and TV set-top boxes - one that after 20 years still can't get applets and GUIs working properly.

See how easy it is?

Yes, I completely dislike JavaScript. I also completely dislike Java. But both are here to stay, regardless of their (lack of) merits compared to some ideals. Java missed the browser bus because it was horrible, and JavaScript drives the bus now because it was there.



Hah. I totally agree. I personally dislike both Java and Javascript, also. Javascript was born brain damaged. Java had a chance, but it jumped the shark sometime in the early 2000's when J2EE took off.


There is a tendency for history to repeat itself in programming, since, as Alan Kay put it, programming is "not quite a field" as it lacks a proper sense of its own history. So going by this, I predict that VM vulnerabilities are going to creep into the JS ecosystem, put there in the name of optimization, much like the "Use After Free" vulnerabilities in the JVM for Android.

I sincerely hope I will be wrong!


We're reinventing the wheel because the Java plugin didn't reach critical mass. There's never been a time that you could require Java in a web app and expect a trouble free experience for your users.

Perhaps the browser/OS vendors and Oracle could have worked together to make this happen, but they didn't, and the world moved on.


Yeah, maybe if we could compile to some byte code and simply download a VM to compile once and run everywhere ... hmmm where have I seen this before?


"Oh wow, wouldn't it be great if browsers had a bytecode interpreter in them?"

Two words: "Java applets".


A bit OT, but one thing I always wondered is: can't we extract inferred type information from the JIT after running plain JS code through 1M cycles, and then compile that to an asm.js variant?


This is effectively what JavaScript VMs already do and have done for years. They compile to unoptimized machine code that has a bunch of hooks to monitor the types that flow through the code.

After a while, if a chunk of code is identified as "hot", a second-stage compilation kicks in. The code is recompiled to optimized machine code that takes advantage of the types that it previously saw the code use (with fallbacks in case those assumptions later fail).
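A toy sketch of that monitor-then-specialize pattern. Real engines do this in machine code with inline caches; the threshold and structure here are invented for illustration:

```javascript
function makeAdd() {
  let calls = 0, onlyNumbers = true;

  function monitor(a, b) {          // tier 0: record observed argument types
    if (typeof a !== 'number' || typeof b !== 'number') onlyNumbers = false;
    if (++calls >= 1000 && onlyNumbers) impl = fast; // hot + monomorphic: tier up
    return a + b;
  }

  function fast(a, b) {             // tier 1: assumes numbers, keeps a guard
    if (typeof a !== 'number' || typeof b !== 'number') impl = monitor; // "deopt"
    return a + b;
  }

  let impl = monitor;
  return (a, b) => impl(a, b);
}

const add = makeAdd();
for (let i = 0; i < 2000; i++) add(i, 1); // warm up with numbers only
console.log(add(2, 3)); // 5, now via the specialized path
```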


Yes, I understand that. My question is: can that inferred type info be used to output the optimized bytecode as asm.js, so that it can be included right away, like statically compressed .gz assets (pre-optimized)?

It would effectively make a JS -> asm.js compiler.


Possibly, but the types observed by a VM are rarely guaranteed to be completely accurate (some traps will be included in optimised code to capture the cases where types may not be as expected). They may only hold for the particular input data, or the other libraries used, or stuff like that.

Using previous runs to inform the JIT of expected types is entirely reasonable though, and I think various JS implementations already do this.


> some traps will be included in optimised code to capture the cases where types may not be as expected

None of that should matter. If the resulting JIT'd code is faster with the traps than the plain JS without them, so be it.


Erm, so you could possibly do some stuff like this (probably not for JS). It's called profile-guided optimisation - you compile, run, take measurements, and the compiler uses those measurements to pick the right optimisations to use.

I don't know how desirable it would be - you'd be shifting the burden from the compiler to the network, basically.

In particular, you'd still have to compile it, which means that all you're gaining is the time before JIT decides that the code's super hot.

A nicer approach might be like Oracle hints - you have a small comment you place above a function that tells the JIT how you want it compiled. You test your software using an instrumented browser, and that adds the comments between you finishing and it running.

It kinda ups the complexity by loads, though - and I bet the benefits would be relatively small. Stuff like minifying and compressed assets has really tangible benefits, but here you have something that can easily be done wrong, and really only improves the execution of a very small overall percentage of functions.


Is it straightforward to reference a separate compiled library from an Emscripten binary?

Seems that caching commonly shared dependencies could be a good way to cut down on size and parse/compile time.


Quite similar to tiered compilation on the JVM


So, how long until someone gets the JVM or CLR running inside the browser using asm.js - and then reopens the world of running Java/C# in the browser. Only this time without plugins.


Why waste time and bytes on an extra VM, when you can just compile Java or CLR bytecode to JavaScript?


Because the behaviour of the JS VM and the CLR VM is very different?



