Unity3D also has WebGL support, and is surprisingly quick given that it ships a full physics engine and a high-quality rendering engine and lets you build in C#. That would allow you to make a Minecraft clone that runs natively in the browser, and because it's Unity (a fully grown game dev environment) you will not have to reinvent any wheels or worry about many edge cases.
The really hard part about Minecraft isn't rendering, though; it's multiplayer.
As someone with two Unity WebGL games in production, I wouldn't recommend this route. A fairly small game (~20kloc, so not a toy by any stretch, but small nonetheless) exports 26 MB of JavaScript code. This kills the browser.
Only good desktop or laptop machines can run it well. It runs like shit on average machines (let's not even mention mobile). This is after several months of dedicated optimization time to reduce resource usage and minimize build size.
Overall, you are faaaar better off writing it in JavaScript with one of the pure js engines if you are going to go this route.
Can't upvote this comment enough. I'm also currently in the process of creating a full-featured WebGL game, and nothing is faster than simply using pure JavaScript there. Most people won't accept this, but when you ask them, it turns out they have never actually tried creating a complex WebGL game.
These are totally fine most of the time. TypeScript in particular looks like a good choice for game development.
Sometimes the language is less of a good choice. E.g. I did a small toy project in CoffeeScript about two years ago, and found to my horror that if your function ended in a loop, it would return an array. Always. This isn't possible to do accidentally in the host language, so I'd say it is a bad fit (for game development, where you care about such things).
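For anyone who hasn't hit this, here is a rough TypeScript rendering of the problem. CoffeeScript returns the last expression of a function, and a loop is an expression that evaluates to an array of its body's values. The function names are made up for illustration, and the second function is a hand-written approximation of the compiled output, not its literal text.

```typescript
// What you meant to write: update every value in place, return nothing.
function stepParticles(ys: number[], dt: number): void {
  for (let i = 0; i < ys.length; i++) {
    ys[i] += dt;
  }
}

// Roughly what a CoffeeScript function ending in a `for` loop behaves like:
// the value of each iteration is collected into a fresh array, which is
// then returned even if no caller ever looks at it.
function stepParticlesImplicit(ys: number[], dt: number): number[] {
  const results: number[] = [];
  for (let i = 0; i < ys.length; i++) {
    results.push((ys[i] += dt));
  }
  return results; // a new allocation on every call
}
```

In CoffeeScript the usual way around it is to end the function with a bare `return`.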
I believe the point is that JS always returns something from every function. Many users of JS just don't realize it. (I know I didn't till I started drinking the coffee flavored kool-aid)
This is probably a result of CoffeeScript's Ruby-esque syntax and semantics. It does mean you can write some very clear and succinct code at times, but the tradeoff is that it can have unforeseen performance implications.
They don't add much overhead during runtime; most of the time there is zero overhead, or sometimes they can (theoretically, not sure if it's done in practice) produce more optimal code than a "literal translation".
It depends on the language. TypeScript is basically JavaScript with added bits. Unity compiles to JavaScript, but it does so through a crazy chain of C# -> CIL (via Mono) -> C++ (via il2cpp) -> JavaScript (via Emscripten). And then remember that your browser then converts that JavaScript to assembly. Holy convoluted toolchain, Batman.
There's also the alternative of using C/C++ for portability and performance, without a huge engine runtime that you have to drag along.
There is no inherent size advantage to doing things in JavaScript vs. cross-compiling to asm.js. For instance, three.js is 100kByte minified and compressed, which is in the same ballpark as the (admittedly simple) demos here: http://floooh.github.io/oryol/
Once you are over the static overhead for the C runtime and web-API-wrappers of roughly 30..50kByte, emscripten demos don't grow much faster than minified and compressed JavaScript code.
If you write in C you are probably fine for code size. Usage of the C++ stdlib can easily cause a lot of code bloat if you are not careful, though I don't actually know how bad this gets (it used to be quite bad for native apps).
There are other issues as well. The biggest offender is the inflexible heap size: when you launch an asm.js module, you pass in an array buffer as the heap, and it cannot be resized. If you need more memory than this, you are SOL.
This wouldn't be that bad, except in practice it needs to be ridiculously small if you want to run on most machines. 256MB, which is not much memory for a game at all, will fail to allocate or kill the tab on about 33% of machines. (And that's being generous. For us it's probably more like 50% of machines, but we make educational games that need to run on university machines, which are uh, on the lower end.)
Either way, as a result, a great deal of the optimization time I mentioned was spent getting all of our games to run with under 128MB heap.
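To make the constraint concrete, here is a minimal sketch of what "one fixed buffer as the heap" means for allocation. The `FixedHeap` class and its bump allocator are hypothetical and much simpler than what emscripten's runtime actually does; the only point is that once the buffer is full there is nowhere to grow.

```typescript
// Hypothetical bump allocator over a fixed-size heap, mimicking how an
// asm.js module is handed a single ArrayBuffer up front.
class FixedHeap {
  private readonly bytes: Uint8Array;
  private top = 0;

  constructor(sizeInBytes: number) {
    // Allocated once; a plain ArrayBuffer created this way keeps its length.
    this.bytes = new Uint8Array(new ArrayBuffer(sizeInBytes));
  }

  // Returns the byte offset of the new block, or -1 if it does not fit.
  alloc(size: number): number {
    const aligned = (size + 7) & ~7; // round up to a multiple of 8 bytes
    if (this.top + aligned > this.bytes.length) return -1;
    const offset = this.top;
    this.top += aligned;
    return offset;
  }

  get used(): number {
    return this.top;
  }

  get capacity(): number {
    return this.bytes.length;
  }
}
```

With a 128MB budget you would construct it as `new FixedHeap(128 * 1024 * 1024)`, and every level, texture and mesh has to fit inside that for the lifetime of the page.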
That said, it's definitely possible if you're careful (I'm not sure this is quite as true for a voxel game, but I could be wrong). My complaint wasn't about the idea of using asm.js, it was about the idea that Unity would produce a good result. It just doesn't.
Are there ready made engines suitable for this?
Unity has taken 11 years and Three.js 6 years to get to their current established status, so let's keep fingers crossed that something gets started soon on the asm.js/emscripten front.
I think that's a bit of a chicken-and-egg problem. The large game engines need to be everything to everybody, and trimming the executable size down probably wasn't a priority in the past, since mostly you'll have a big upfront download anyway.
However, I remember that the first Unity HTML5/WebGL demos a few years ago were much smaller: a complete zombie FPS had a 3.5MB upfront download, which is actually pretty decent.
How much of the 26 MB of JS is actually invoked at run-time?
I tested several real-world web sites, like Google Search, GitHub and Google Docs: only around 30% of the JavaScript loaded into the browser is actually invoked at run-time. Maybe in your case the usage ratio is much lower?
emscripten-compiled code already gets very aggressive dead-code elimination, so this wouldn't help I guess. It is most likely the cross-compiled Mono runtime plus the required Unity modules that produce the code bloat on top of the actual gameplay code.
The emscripten dead code elimination uses static analysis, which can only remove code that is impossible to reach during execution, even in theory. But POCL leaves only code which is actually invoked at run time (keeping all other code on the server in the "load at first invocation" form). The difference may be very significant.
Google products (Google Docs for example) use Closure Compiler for dead code elimination, but even so, after clicking almost every menu item in the document editor, only around 45% of the code was invoked.
If that 26 MB web site was available online, I would be interested to measure its code usage ratio.
BTW, the idea is not radically new. Java loads classes only when they are accessed for the first time.
Operating systems load native applications by mapping the executable file into virtual memory. The pages are loaded into physical memory only when the CPU tries to access them.
Ok yes, if there's a mechanism that dynamically loads the missing parts then it could work (actually emscripten supports dynamically loading scripts similar to DLLs). But 'naively' deleting functions from an asm.js blob would result in some 'unknown function' exception if the analysis was wrong and a removed function is still called somewhere.
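A minimal sketch of the "load at first invocation" idea, to show why the missing-function failure mode goes away when there is a stub in place of every removed body. Everything here is hypothetical: `lazy`, `serverSide` and `load` are illustration names, and the synchronous lookup stands in for what would really be a network request (which is asynchronous, so a real implementation is considerably harder than this).

```typescript
type Fn = (...args: number[]) => number;

// Wraps a function name in a stub that fetches the real body on first call.
function lazy(name: string, loadSource: (name: string) => Fn, log: string[]): Fn {
  let impl: Fn | null = null;
  return (...args: number[]): number => {
    if (impl === null) {
      log.push(name); // record that this function had to be loaded
      impl = loadSource(name);
    }
    return impl(...args);
  };
}

// Stand-in for the code kept on the server.
const serverSide: Record<string, Fn> = {
  add: (a, b) => a + b,
  mul: (a, b) => a * b,
};

const loaded: string[] = [];
const load = (name: string): Fn => {
  const fn = serverSide[name];
  if (fn === undefined) throw new Error(`unknown function: ${name}`);
  return fn;
};

const add = lazy("add", load, loaded);
const mul = lazy("mul", load, loaded);
```

Calling `add(2, 3)` pulls in only `add`; `mul` stays on the "server" until something actually calls it, which is the usage ratio argument in a nutshell.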
Three.js is the main high-level WebGL lib most people seem to be using, including for many Minecraft clones. It's not as web-hostile as Unity. Besides WebGL, it has renderers for CSS and 2D canvas too.
There are other actively developed Web-friendly WebGL frameworks, like Goo Create, PlayCanvas, BabylonJS and many more.
In most arena-shooter-type games, you don't need 32x memory for 32x players, because the per-player state is fairly constrained and known. (Often limited to how many projectiles they can keep in the air at once.)
I suspect Minecraft has a much harsher profile, since it's possible for every player to be loading a different piece of the world with minimal overlap, and world-pieces involve a lot more state that needs to be preserved/calculated/saved.
Much of the persistent world-state in Minecraft consists of blocks laid out on a 3D grid, which are probably fairly easy to optimise. Compare this to other 3D games where there are lots of entities with dynamic physics that can move to just about any world position, and I think Minecraft would be the easier of the two.
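As a rough illustration of why grid state is easy to pack: a chunk of blocks can live in one flat typed array with one byte per block. The 16x16x16 chunk size, the `Chunk` class and the one-byte block IDs are assumptions made for this sketch, not how Minecraft actually stores its world.

```typescript
// Hypothetical chunk: 16 x 16 x 16 blocks, one byte per block ID.
const CHUNK = 16;

class Chunk {
  // 4096 bytes for the whole chunk, zero-initialised (0 = air here).
  readonly blocks = new Uint8Array(CHUNK * CHUNK * CHUNK);

  // Maps a 3D position inside the chunk to an index in the flat array.
  private index(x: number, y: number, z: number): number {
    return x + CHUNK * (z + CHUNK * y);
  }

  get(x: number, y: number, z: number): number {
    return this.blocks[this.index(x, y, z)];
  }

  set(x: number, y: number, z: number, id: number): void {
    this.blocks[this.index(x, y, z)] = id;
  }
}
```

Compare that with a free-moving physics entity, which needs at least a position, velocity and orientation in floats plus collision bookkeeping.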
Also I'm not sure how Minecraft handles it (I assume it pretty much ignores it, or does something very naive), but in FPS games you need to take into account latency and do predictions on the client side to minimise it.
Think of two players running towards each other, and shooting at each other. By the time the player (or even the server) receives the data saying the other player has fired their gun, they've both moved to completely different positions.
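One common way to deal with this on the server is to keep a short history of each player's position and rewind to where the target was when the shooter actually saw it. Below is a minimal one-dimensional sketch of that idea; `PositionHistory`, `hitTest` and all parameters are hypothetical names for illustration, and real engines do this in 3D with hitboxes and more careful time handling.

```typescript
interface Sample {
  t: number; // server time in milliseconds
  x: number; // position along one axis
}

class PositionHistory {
  private samples: Sample[] = [];

  // Assumes samples are recorded with strictly increasing timestamps.
  record(t: number, x: number): void {
    this.samples.push({ t, x });
  }

  // Linearly interpolated position at time t, clamped to the recorded range.
  at(t: number): number {
    const s = this.samples;
    if (s.length === 0) throw new Error("no samples recorded");
    if (t <= s[0].t) return s[0].x;
    for (let i = 1; i < s.length; i++) {
      const a = s[i - 1];
      const b = s[i];
      if (t <= b.t) {
        const k = (t - a.t) / (b.t - a.t);
        return a.x + k * (b.x - a.x);
      }
    }
    return s[s.length - 1].x;
  }
}

// Checks the shot against where the target was `latencyMs` ago, i.e. what
// the shooter was looking at when they pulled the trigger.
function hitTest(
  target: PositionHistory,
  serverNow: number,
  latencyMs: number,
  aimX: number,
  radius: number
): boolean {
  const seenX = target.at(serverNow - latencyMs);
  return Math.abs(seenX - aimX) <= radius;
}
```

The tradeoff is that the target can get hit after, from their own point of view, they had already moved out of the way.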
(There is a good set of articles on this by a game developer, I can't find them now)
There was a bunch of Source-engine stuff Valve published, which is usually my go-to citation when arguing with players who don't understand "netcode" for a game but criticize it for not giving them perfect instantaneous communication anyway. :p
Most of that state only has to exist on the server, which in turn doesn't really have to figure out visibility and doesn't necessarily even need textures.
The world pieces are a very specific set of permutations, which could easily be loaded into a graphics card's memory at the start of the game. All you are really tracking is which pieces are where, and by the game's nature, there aren't that many pieces or positions they can be in...