As someone with two Unity WebGL games in production, I wouldn't recommend this route. A fairly small game (~20kloc, so not a toy by any stretch, but small nonetheless) exports 26MB of JavaScript. This kills the browser.
Only good desktop or laptop machines can run it well; it runs like shit on average machines (let's not even mention mobile). This is after several months of dedicated optimization time to reduce resource usage and minimize build size.
Overall, you are faaaar better off writing it in JavaScript with one of the pure js engines if you are going to go this route.
Can't upvote this comment enough. I'm also currently in the process of creating a full-featured WebGL game, and nothing is faster than simply using pure JavaScript there. Most people won't accept this, but when you ask, it turns out they've never actually tried creating a complex WebGL game.
These are totally fine most of the time. TypeScript in particular looks like a good choice for game development.
Sometimes the language is less of a good choice. E.g. I did a small toy project in CoffeeScript about two years ago and found, to my horror, that if your function ended in a loop, it would return an array. Always. This isn't something you can do accidentally in the host language, so I'd say it's a bad fit (for game development, where you care about such things).
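For anyone who hasn't hit this: CoffeeScript implicitly returns the last expression, and a trailing loop is treated as a comprehension, so the compiler collects every iteration's value into an array. A rough JavaScript rendering of what it emits (function and field names here are made up for illustration):

```javascript
// Roughly what CoffeeScript emits for a function whose body ends in a
// loop: the loop becomes a comprehension, and its per-iteration values
// are collected and returned whether you wanted them or not.
function updateAll(entities) {
  var results = [];
  for (var i = 0; i < entities.length; i++) {
    results.push(entities[i].x += 1); // every iteration's value is kept
  }
  return results; // a fresh array allocated on every single call
}
```

In a per-frame update function, that's one garbage array per call, which is exactly the kind of hidden allocation you care about in a game. The CoffeeScript-side fix is to end the function with an explicit `return` so the comprehension result is discarded.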
I believe the point is that in JS, every function call returns something. Many users of JS just don't realize it. (I know I didn't till I started drinking the coffee-flavored kool-aid.)
This is probably as a result of CoffeeScript's Ruby-esque syntax and semantics. It does mean you can write some very clear and succinct code at times, but the tradeoff is that it can have unforeseen performance implications.
They don't add much overhead during runtime; most of the time there is zero overhead, or sometimes they can (theoretically, not sure if it's done in practice) produce more optimal code than a "literal translation".
It depends on the language. TypeScript is basically JavaScript with added bits. Unity compiles to JavaScript, but it does so through a crazy chain: C# -> CIL (via Mono) -> C++ (via il2cpp) -> JavaScript (via Emscripten). And then remember that your browser converts that JavaScript to assembly. Holy convoluted toolchain, Batman.
There's also the alternative of using C/C++ for portability and performance, but without a huge engine runtime to drag along.
There is no inherent size advantage to plain JavaScript vs. code cross-compiled to asm.js. For instance, three.js is 100kByte minified and compressed, which is in the same ballpark as the (admittedly simple) demos here: http://floooh.github.io/oryol/
Once you are over the static overhead for the C runtime and web-API-wrappers of roughly 30..50kByte, emscripten demos don't grow much faster than minified and compressed Javascript code.
If you write in C you are probably fine for code size. Use of the C++ stdlib can easily cause a lot of code bloat if you are not careful, though I don't actually know how bad this gets (it used to be quite bad for native apps).
There are other issues as well, the biggest offender being the inflexible heap size: when you launch an asm.js module, you pass in an ArrayBuffer as the heap, and it cannot be resized. If you need more memory than that, you are SOL.
This wouldn't be that bad, except that in practice the heap needs to be ridiculously small if you want to run on most machines. 256MB, which is not much memory for a game at all, will fail to allocate or kill the tab on about 33% of machines (being generous; for us it's probably more like 50%, but we make educational games that need to run on university machines, which are, uh, on the lower end).
Either way, as a result, a great deal of the optimization time I mentioned was spent getting all of our games to run with a heap under 128MB.
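For context on why this hurts: the asm.js heap is one ArrayBuffer handed to the module at startup, and whatever size you manage to allocate is the hard ceiling for the module's entire lifetime. A minimal sketch of the fallback dance this forces, with a hypothetical `allocateHeap` helper (asm.js constrains the size to a power of two, or a multiple of 16MB above that, so halving stays valid):

```javascript
// Hypothetical helper: grab the biggest heap we can, halving on failure.
// Whatever size succeeds here can never grow once the module is running.
function allocateHeap(preferredBytes) {
  var size = preferredBytes;
  var floor = 16 * 1024 * 1024; // don't bother below 16MB
  while (size >= floor) {
    try {
      return new ArrayBuffer(size); // may throw on low-memory machines
    } catch (e) {
      size = size / 2; // halve and retry with a smaller heap
    }
  }
  throw new Error("couldn't allocate even a minimal asm.js heap");
}

var heap = allocateHeap(256 * 1024 * 1024);
// then something like: var game = GameModule(stdlib, foreign, heap);
```

The catch, as described above, is that the game then has to actually fit inside whatever size you got away with, which is why targeting 128MB up front is the safer bet.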
That said, it's definitely possible if you're careful (I'm not sure this is quite as true for a voxel game, though; I could be wrong). My complaint wasn't about the idea of using asm.js, it was the idea that Unity would produce a good result. It just doesn't.
Are there ready made engines suitable for this?
Unity has taken 11 years and three.js 6 years to get to their current established status, so let's keep our fingers crossed that something gets started soon on the asm.js/emscripten front.
I think that's a bit of a chicken-and-egg problem: the large game engines need to be everything to everybody, and trimming the executable size down probably wasn't a priority in the past, since you'll mostly have a big upfront download anyway.
However, I remember that the first Unity HTML5/WebGL demos a few years ago were much smaller, a complete zombie FPS had a 3.5MB upfront download which is actually pretty decent.
How much of the 26MB of JS is actually invoked at run-time?
I tested several real-world web sites, like Google search, GitHub, and Google Docs: only around 30% of the JavaScript loaded into the browser is actually invoked at run-time. Maybe in your case the usage ratio is much lower?
Emscripten-compiled code already gets very aggressive dead-code elimination, so this wouldn't help, I guess. It is most likely the cross-compiled Mono runtime plus required Unity modules that produce the code bloat on top of the actual gameplay code.
Emscripten's dead-code elimination uses static analysis, which is limited to removing code that cannot possibly be reached during execution, even in theory. POCL, by contrast, keeps only code that is actually invoked at run time (leaving all other code on the server in "load at first invocation" form). The difference may be very significant.
Google products (Google Docs, for example) use Closure Compiler for dead-code elimination, but even so, after clicking almost every menu item in the document editor, only around 45% of the code was invoked.
If that 26MB web site were available online, I would be interested to measure its code usage ratio.
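For anyone who wants to reproduce the measurement: Chrome DevTools' Coverage panel can export a JSON array of `{url, ranges, text}` entries, where `ranges` holds the byte offsets that actually executed. A sketch of computing the ratio from that export (assuming that export shape; `usageRatio` is a made-up helper name):

```javascript
// Compute the fraction of loaded JavaScript that actually ran, given a
// DevTools coverage export: [{url, ranges: [{start, end}], text}, ...].
function usageRatio(coverageEntries) {
  var totalBytes = 0;
  var usedBytes = 0;
  for (var entry of coverageEntries) {
    totalBytes += entry.text.length; // full source that was shipped
    for (var range of entry.ranges) {
      usedBytes += range.end - range.start; // bytes that executed
    }
  }
  return totalBytes === 0 ? 0 : usedBytes / totalBytes;
}
```

Note this only counts code executed during the session you recorded; clicking through every feature (as with the Docs experiment above) is what makes the number meaningful.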
BTW, the idea is not radically new. Java loads classes only when they are first accessed.
Operating systems load native applications by mapping executable file into virtual memory. The pages are loaded into physical memory only when CPU tries to access them.
OK, yes, if there's a mechanism that dynamically loads the missing parts, then it could work (emscripten actually supports dynamically loading scripts, similar to DLLs). But "naively" deleting functions from an asm.js blob would result in an "unknown function" exception if the analysis was wrong and a removed function is still called somewhere.