Hacker News new | past | comments | ask | show | jobs | submit login

Actually, no - while the bytecode is smaller than asm.js, when you compress them both, they end up around the same. The issue is that the bytecode is designed for fast execution, not compressibility, so it contains absolute offsets for jumps, etc. - this lets you just start running the code immediately, but also means it looks like random noise that compresses poorly. And, JavaScript is fairly compressible, as it has less such noise, it's very regular.

However, it does seem likely that a binary format that is designed for compressibility could be smaller. But, it would not be as fast to execute.




Here's a binary format that is designed for compressibility and fast code generation [1]. It's a bit of a pet of mine (I just realised one of the Wikipedia links are to my blog) - it was the PhD dissertation of now professor Michael Franz at UC Irvine, who incidentally had Andreas Gal (now of course at Mozilla...) as PhD student.

SDE effectively compresses a low level syntax tree by building a dictionary that as a sort of sliding window over the tree by adding specialised nodes as well as more general nodes, and the encoding new variations of specialised sub-trees using the new elements in the dictionary and subsequently adding more specialised "instructions".

As a result you both get compression (think a tree variation of Huffman coding) and "hints" to let you re-use partially generated code as templates (e.g. one dictionary entry maps to "assign x to y"; you also add "assign x to _", and later you find a reference to the entry for "assign x to _" + z; now you copy the code for "assign x to <something>" into place and fix it up with the address of z).

The approach has always fascinated me, but it fell in the shadow of the JVM.

[1] https://en.wikipedia.org/wiki/Semantic_dictionary_encoding


Interesting, thanks!




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: