The combined man-hours spent on producing various compilers targeting JS would be enough to:
* Build the next generation browser that supports a proper programming language.
* Write migration instructions.
* Convince major web-app providers that they need to migrate.
* Help with migration.
Browsers don't need a programming language. Browsers need a standardized bytecode. The problem is, bytecode can have a significant impact on potential performance. So, as long as browsers are still competing on JS performance, they are actively disincentivized from pursuing standardization of a bytecode...
In my book, some kind of bytecode or other intermediate language is a much better idea than distributing the source code of web apps. There are a lot of disadvantages to delivering programs (for immediate execution) as source code. Note: this is orthogonal to open source and licensing.
So what would be needed is an intermediate program representation that:
- Is well defined
- Is a good compilation target from various source languages
- Is easy to validate for correctness and security
- Is fast to interpret and compile
- Is distributed in a fast-to-read binary format
- Suits a dynamic language
In my opinion LLVM IR or CLR bytecode are the most technically suitable alternatives out of existing ones.
The web has quite a long legacy of Javascript and a tradition of supporting deprecated browsers so this kind of revolution is unlikely to happen any time soon.
> Besides: I can't think of language-neutral bytecode projects that ever worked. Remember Parrot?
JVM, CLR and LLVM are all popular compilation targets that are used with dozens of different source languages each. Bytecode and program representation were never the hard part; the hard part is the frameworks of the OS/platform underneath.
I strenuously disagree that distributing the source for web apps is orthogonal to open source.
That the web is open by default creates social dynamics quite different from those of systems like Java applets, which are closed by default.
Open source can exist regardless of whether the web is open or not, but you are kidding yourself if you think whether or not source is distributed with a web page doesn't affect the open source landscape.
Today, JavaScript is distributed in minified and/or obfuscated form. And the JavaScript code might have originally been generated by a compiler from a different source language. So calling the stuff your web browser downloads and executes the "source" is not really a valid argument.
Instead, JavaScript is used as a program representation with small size and immediate execution in mind, and it really sucks in that role.
Even if the code is distributed in source code form with white space, identifier names and comments intact (bandwidth isn't cheap, you know), licensing is still the factor that legally determines if it is free software or just open source by default.
The ability to figure out how a piece of code works is different than being able to take that code and reuse it elsewhere.
View source is important both for the sake of debugging in the case of 3rd party javascript, and also for the sake of being able to teach others.
Regardless of the minification issue, it's possible to use a code formatter to expand a minified file and at least identify where declarations are being made and what functional behavior is being specified. Minification makes viewing source inconvenient, but that's not equivalent to closed.
Minified files, more often than not, keep the original identifiers and the original code structure (e.g. you can tell a for from a while). With a decompiler, you lose all of that.
Google Closure is a compiler (and thus loses a lot of this data), but most minifiers are not semantically destructive.
Many other minifiers also munge the symbols (e.g. YUI compressor). And why should this process retain anything anyways? If you want your source-code available for download then... put up a download link.
> And why should this process retain anything anyways?
Because JavaScript has some reflection, and also because of weird scoping rules (with, eval, etc.), you can never be sure where a symbol comes from unless you have access to the exact version of every other script on the page (e.g. jQuery, etc.).
> If you want your source-code available for download then... put up a download link.
I agree - but there is still a huge difference between de-minified and decompiled source readability - which is all I was saying.
js minifiers not only remove whitespace, they reorder execution paths, eliminate useless declarations, identify unreachable code [1]. Reformatting a minified js file is not going to produce a much better result than disassembling a distributed binary file would.
Google's Closure is a compiler, not a minifier (the distinction being: if it does semantic analysis it is a compiler; although I've never seen a minifier that does even real syntactic analysis -- they all do lexical).
LLVM's bitcode is an intermediate representation, not a bytecode (last time I looked at it, you couldn't even move it from one architecture to the next and be sure it worked), and developers of JVM-based languages have almost as much work bridging the differences between their language's semantics and Java's (for which the JVM's bytecode exists first and foremost) as they'd need to create a new runtime.
> Bytecode and program representation was never the hard part
Bytecode is very much a hard part if you're trying to support more than a single language; many bytecodes have both language and VM semantics leaking in, which essentially makes them even less useful than cross-language compilation.
> LLVM's bitcode is an intermediate representation, not a bytecode (last time I looked at it, you couldn't even move it from one architecture to the next and be sure it worked)
Most of LLVM IR is portable, but naturally there are target-specific parts. If LLVM IR were to be used on the web, these target-specific parts would be defined for the "web" target. LLVM IR is already used in a similar manner in NaCl and OpenCL, where it serves as a portable binary format.
> Bytecode is very much a hard part if you're trying to support more than a single language
It's not easy but it's mostly a solved problem. JVM was designed with a single language in mind, yet it's used by dozens of languages today. Similarly LLVM and CLR are widely used.
> It's not easy but it's mostly a solved problem. JVM was designed with a single language in mind, yet it's used by dozens of languages today. Similarly LLVM and CLR are widely used.
How is this statement qualitatively different from "JavaScript was designed with human writers in mind, yet it's used by dozens of languages today"?
It's quite difficult to write a VM target that is fast for both dynamic and statically typed languages. The CLR probably comes the closest but is far from perfect.
Since we have to have legacy support for javascript approximately forever anyway, I'd say the best way forward is to embed a vm from the statically typed world as a target for static languages and leave js as a cross compilation target for dynamic languages.
To avoid nasty cross vm memory leaks you'd probably want to limit interoperability -- at least at first.
> It's quite difficult to write a VM target that is fast for both dynamic and statically typed languages.
It's also nigh-impossible to design a bytecode which does not significantly restrict (and dictate) the semantics of the VM running it. It mandates a bytecode-based VM, to start with (V8 does not use bytecode).
Only if you assume that view source was ever a good idea.
If the writer wants you to see the source, you can download it elsewhere (GitHub, etc.); if not, there is no difference between not getting the source to Office 98 and not getting the source to Google Docs.
When the Web was young and there was no GitHub, viewing a page's source made it so that you could see by example how every page worked. It exposed you to a huge number of possibilities. The Web partially owes its success to this feature because without it far fewer people would have made it to the stage where they could produce their own pages. Every web developer/designer has used this to help them -- guaranteed.
Besides, the feature would have sprung up anyway whether the browser vendors wanted it included or not. Third parties would have stepped in. It's plain text transmitted over the wire. Unlike the source of Office 98 which is not retrievable from the Office 98 binary, a page's source is readily available to the machine that has to cobble it all together when viewing that page.
The presence of the View Source item in the context menu of all web browsers is where many people first interacted with source code. Long may it remain there.
I think minifiers and obfuscators already destroyed any hope of learning from websites. But on the other hand, GitHub and the like are a much better learning source than we ever had.
With the advent of pretty print in Chrome's web development console (and add-ons available for Firefox), minifiers are less of an issue for learning how a site works.
I suspect that minifier arguments are no-win situations: either someone doesn't like that they can't easily read the JavaScript, or someone complains it takes too long to load the scripts for a given page.
You're right. I should have explicitly specified "aimed at dynamic languages". Compiling dynamic languages to the CLR or JVM sort of works, but is less than ideal. And "ideal" (or more suitable, at least) language-neutral bytecode projects have never taken off.
What does that have to do with anything? The post I replied to said that "language-neutral bytecode projects never worked" and brought up Parrot as an example. He was not specific about them not working for JavaScript.
Now, besides being out of context, your comment is wrong in four ways:
1) We are not discussing a bytecode to implement JavaScript, but a bytecode to use in browsers for implementing all sorts of languages.
2) There are other languages that have been implemented on the CLR/JVM that are faster than JavaScript.
3) V8 has been optimized by a special team, for millions of dollars, with speed as the #1 target. With that amount of money and effort, a JS running on the JVM/CLR could be fast too.
4) A bytecode does not preclude a JIT. In fact it would be trivial to have V8 run a bytecode instead of raw JS, just as Python runs .pyc files.
JavaScript is very relevant because there is tons of legacy code that has to run just as fast in this new bytecode VM as it runs today under the various engines. The choice of VM comes down to how well it works with dynamic languages. JVM and CLR are both modeled after statically typed languages. Microsoft spent quite a bit of money trying to put JavaScript on top of .NET but finally gave up, and today it places its IE platform on equal footing with .NET.
I think the problem with Java applets and Flash was not the bytecode but the slow startup (Java) and the closed source software (Flash) as well as the poor integration with HTML. There is no reason why browsers couldn't implement the same functionality we have now with Javascript using a bytecode based VM.
It just isn't standardized, or made available to website authors directly (there would likely need to be significant extra security audits before you could contemplate something like that...).
Since you would need backwards compatibility, just add an argument to script elements that specifies the bytecode to use instead of the JavaScript. Then older browsers could continue to function and the rest of us don't have to wait.
Couldn't agree more. perfunctory is ignorant to believe that building a new programming language would solve the problem. Google is trying just that with Dart, and it will take years before it will be adopted by the other browser vendors, if ever.
These micro-languages (that compile to JS) try to solve the same fundamental problems that exist in JavaScript, which are mostly syntactic or semantic in nature. But there are some solutions and implementations that will likely help further the development of ECMAScript, so the man-hours spent on creating them aren't all lost.
And the combined person-hours spent on many non-essentials (from gardening to sun bathing to gaming) would be enough to solve world hunger, reduce human casualties in natural disaster zones to approximately nothing with notifications and infrastructure reinforcements, detect and stem disease outbreaks in days or hours, provide good public transports in all urban areas, and probably much much more.
Problem is that there is no proper programming language - you will never find one language that even comes close to suiting everyone. Even a VM would be hard to optimize for all the different kinds of languages people would want to run on it, or at least people seem to think so.
Compilers targeting JS exist because it is possible and because people prefer various other languages. Replacing JS with another language would just mean people would write various compilers targeting this new language.
The fact that you refer to "man-hours" makes me suspect you underestimate the magnitude of the problem.
The correct measuring unit for the things you're talking about is man-centuries (and not one, but several), as far as I can see, just for the first item in your list.
People have been making compilers targeting JS for several years now. Do you really think there are 200-300 people doing this full-time?
> It produces readable javascript that is reasonably debuggable
The examples on the page don't seem readable and reasonably debuggable to me. The simple square definition:
square x = x * x
Compiles into:
var square = function($_a) {
  return new $(function() {
    var x = $_a;
    return _(x) * _(x);
  });
};
Which, in turn, has all the "$" and "_" indirections.
This is basically why I think that trans-compilation to JS from a semantically very different language is not a good idea, unless a way of debugging _in the source language_ is provided. The success of CoffeeScript, I think, comes from it not differing too much from JS semantically; yes, it adds things like classes or everything-is-an-expression semantics, but the step of trans-compiling those things into JS is pretty trivial.
I hope the advent of source maps will help in the ease of use of these languages.
I could not agree more with all your points. Source maps are the key, and I look forward to them very much.
I am a coffeescripter, and I already have mixed emotions about the extra layer of complexity it puts on my code for others to get up to speed. Taking it further from the actual language (JavaScript) just adds a layer of obfuscation that does not really help JavaScript and its community at all (and, many times, your peers). It may make you feel better about what you are doing at the time (e.g. I must have typed vars, or whatever you people say), but it's just making you feel better; it still is JavaScript, and you still have to debug JavaScript across browsers - that's just that. You still need to know all the JavaScript; you can't just use X-to-JS and only know X. The only reason I get away with CoffeeScript is when I know I'm in modern-browser world, not old IE and the greater world of browser bugs. Otherwise I'm writing JavaScript.
A lot of people seem to be questioning whether we need a new programming language. Nobody needs a new anything most of the time. At least for me, I enjoy it when people create all sorts of new programming languages. It gives me a new way to try things and interesting insights into something I would have never thought about before. I think all of these languages that compile to Javascript are great.
I don't know very much about Haskell, but this looks really great from a quick glance. I wonder how performance compares to plain JS – surely all the laziness and resulting JS closures have a cost.
What do you mean? The performance will be exactly the same as if you implemented it in plain JS. The laziness and closures sure have an impact, but they serve their purposes, and the resulting application wouldn't work the same without them.
If you always implemented it the most general way, sure, but most people don't write lazy, curried functions when using JS (and often needn't when writing Haskell even though it makes more sense in that case).
As always, it's a matter of preference. Fay is for devs who prefer the syntax and semantics of Haskell over those of JS, and it will not solve anything else beyond that. The overhead is likely negligible in most cases, and is probably worth it if it lowers development time.
Not sure if I understand your suggestion... I suppose if one wants to write server-side code in a Haskell-ish language then one would simply use standard Haskell plus Yesod or another web framework.
It is always better to have alternatives. Being self-hosting and compiling to a widely-used language like javascript is definitely a plus. Another useful case is that an interactive "Try Fay" page can be set up (like the "Try CoffeeScript" page).
If the average IQ of the world population were higher, I think Haskell would become a widely-used language. Haskell is much more than its syntax, and it takes a lot of effort to learn.
I do have some plans to make a JSON Fay-in-JS-out service. Possibly with some “export compressed as .tar.gz” feature to give you a production-ready export.
It would even be quite cool to make a REPL and development environment, but that's a little far off.
Instead of comparing to Roy/Elm, which I have never heard of (has anyone?), he should compare it to LiveScript, a fork of CoffeeScript that has quite a few syntactic similarities to Haskell (it's inspired by it).
I think this is targeted at Haskell people looking to compile to JavaScript rather than JavaScript people looking to use a new language.
Roy and Elm are new languages that people who follow Haskell are probably familiar with. Comparing to them makes more sense than comparing to LiveScript if your audience is Haskell programmers.
LiveScript is not really anything like Haskell at all except in some entirely superficial ways. Even the syntax isn't all that similar. Essentially, it's a slightly more functional CoffeeScript. While certainly interesting to the same people and for the same reasons as CoffeeScript, it's not very interesting to people who primarily use Haskell.
Spot on summary. Fay and Elm are indeed both well known among Haskellers interested in language developments in the browser.
LiveScript is a good comparison only in the way it transforms and is part of the wave of "compiling to JavaScript". For that reason it might be nice for me to mention it on the page.
LiveScript is really only like Haskell in some syntax and naming of some functions.
The defining characteristics of Haskell (to me, bear in mind I'm a Haskell noob) are strong, static, inferred types, and an emphasis on pure or near-pure functional programming (limiting side effects as much as possible). The defining characteristics are not a standard library function named "fold" or "takeWhile".
LiveScript has JavaScript semantics, like CoffeeScript. These semantics differ from Haskell in deep and fundamental ways. Yes, it might also encourage functional programming, but so do Scheme and Clojure, both of which are really nothing like Haskell either. I don't think this is controversial.
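A tiny sketch of what those characteristics look like in practice (just an illustration, nothing Fay- or LiveScript-specific):

-- No type annotation needed: GHC infers the most general type,
-- roughly  pairs :: [a] -> [(a, a)]
pairs xs = [(x, y) | x <- xs, y <- xs]

-- Purity: this function can only depend on its argument. It cannot
-- read a global or touch the DOM; side effects would have to show
-- up in its type (e.g. IO ()).
double :: Int -> Int
double n = n * 2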
It's far enough away in terms of semantics that it's not particularly relevant. I haven't written much Haskell, but from what I can tell, Haskell is really all about the types. And the laziness. And the purity (which is all about the types). It's missing most of the things that make Haskell Haskell, and not another functional language.
It borrowed some syntax, but from what I can tell, it's much closer semantically to Scheme.
Which parts of the syntax are like Haskell? As far as I can tell, the syntax for calling functions is similar to Haskell, but that's about it. Maybe some of the operators are vaguely like Haskell's, except they're baked into the language instead of being definable by the user.
The vast majority of the syntax seems based on CoffeeScript.
LiveScript has many features over CoffeeScript that may not exactly be as in Haskell, but are inspired by it.
You can define curried functions, use partially applied operators (addTwo = (+ 2)), use operators as functions (sum = fold1 (+)), use infix functions, ("hi" `startsWith` 'h'), compose functions (h = f . g), have proper list comprehensions, and its standard library, prelude.ls, is based off of Haskell's Prelude module (inclusion is optional though if you want to use underscore.js or something else). For more information check out http://gkz.github.com/LiveScript/blog/functional-programming...
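For comparison, the rough plain-Haskell counterparts of those examples would be something like this (startsWith here is just a helper defined for the sake of the example, not a Prelude function):

addTwo :: Int -> Int
addTwo = (+ 2)                        -- partially applied operator (a section)

sumAll :: [Int] -> Int
sumAll = foldr1 (+)                   -- operator passed as an ordinary function

startsWith :: String -> Char -> Bool  -- illustrative helper, not in the Prelude
startsWith s c = take 1 s == [c]

greeting :: Bool
greeting = "hi" `startsWith` 'h'      -- any binary function can be used infix

h :: Int -> Int
h = addTwo . (* 3)                    -- function composition with (.)

squares :: [Int]
squares = [x * x | x <- [1 .. 5]]     -- list comprehension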
I've heard of Roy, and even though I know very little about it or about LiveScript, comparing Haskell to Roy immediately makes more sense than comparing it to a language that "has a relatively straightforward mapping to JavaScript."
I noticed the use of $, _, and globals like enumFromTo in the compiled JS. It says $ means "thunk" and _ means "force". Does this mean this compiled JS is further compiled so that $, _, and the globals are replaced with something like Fay$$...?
Lovely article, but I'd like to ask that we call things that "compile to javascript" transpilers. JS is not "web assembly", and if we keep communicating that message in the community, it will come to be believed.
This is great. Now Haskellers can not only develop in a single language on the browser and server, but it looks like this could make it possible to leverage existing Node.js code on the server side. chrisdone, is there anything more that needs to be done to support using Node modules?
That's a pattern match. Math is a data constructor, it makes a Math object. It takes three arguments. So (Math 1 2 3) is a Math object.
A pattern is a way of deconstructing an object into its constituent parts and bringing some of those into scope. So (Math x y z) is a valid pattern, which would bring these values into scope: x=1, y=2, z=3. In this case I'm bringing only the second argument into scope. In a pattern, _ means "ignore this argument".
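To spell that out as code (the field types here are made up just for illustration):

-- A data type with a single constructor taking three arguments.
data Math = Math Double Double Double

-- The pattern (Math _ y _) deconstructs the object, binds the second
-- argument to y, and ignores the first and third with _.
middle :: Math -> Double
middle (Math _ y _) = y

-- middle (Math 1 2 3) evaluates to 2.0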
Hearing the words "compiling to Javascript" here, from Google, and elsewhere drives me nuts. Generating code in a dynamic and uncompiled language is NOT compilation! It is just a type of translation. If you want to make up a word, call it "relanguifying"- I don't care- just don't call it compilation.
Actually Wikipedia on the disambiguation page says that compilation is: "In computer programming, the translation of source code into object code by a compiler."
On the main wikipedia page, you cut off the full definition: "A compiler is a computer program (or set of programs) that transforms source code written in a programming language (the source language) into another computer language (the target language, often having a binary form known as object code). The most common reason for wanting to transform source code is to create an executable program."
Note how it says "the target language, often having a binary form known as object code." and "The most common reason for wanting to transform source code is to create an executable program."
If you then go down into the description, you'll see: "The front end checks whether the program is correctly written in terms of the programming language syntax and semantics. Here legal and illegal programs are recognized. Errors are reported, if any, in a useful way. Type checking is also performed by collecting type information. The frontend then generates an intermediate representation or IR of the source code for processing by the middle-end.
The middle end is where optimization takes place. Typical transformations for optimization are removal of useless or unreachable code, discovery and propagation of constant values, relocation of computation to a less frequently executed place (e.g., out of a loop), or specialization of computation based on the context. The middle-end generates another IR for the following backend. Most optimization efforts are focused on this part.
The back end is responsible for translating the IR from the middle-end into assembly code. The target instruction(s) are chosen for each IR instruction. Register allocation assigns processor registers for the program variables where possible. The backend utilizes the hardware by figuring out how to keep parallel execution units busy, filling delay slots, and so on. Although most algorithms for optimization are in NP, heuristic techniques are well-developed."
So, just stating that it is "a computer program that transforms source code written in a programming language into another computer language" is inadequate. There is more to it than that, and unfortunately so many just don't get it.
Very cool, except it has no NodeJS example. By the way, JavaScript does not suck. It has module and package systems - check out NPM, and OneJS for using all NodeJS utilities client-side: http://github.com/azer/onejs
Fay will never be used for Node.js. If you're going to write Haskell to run in a multithreaded code environment, you'd simply run it in the superior-in-every-way GHC runtime, which has proper 21st century support for a wide variety of modern concurrency constructs, instead of warmed over event-loop ideas from the 1980s. (Or 1970s.) Haskell : Node :: Node : raw select loop code.
Of course it sucks, in lots of ways. That doesn't mean it isn't also good in lots of other ways. Don't get so attached to your tools, recognise that most things suck in at least a few ways and life will be easier :)
Once you recognise the suckiness you can start thinking about ways it might be reduced. You can also start reasoning about whether things that reduce some dimensions of suck are worth the cost of the other dimensions of suck that they increase.
> The core of how the language works (imperative, prototypical) is great.
Some might disagree with this assertion. I don't think I'd ever put imperative in the pros column for JS, and in any case its imperative nature doesn't really distinguish it from most popular languages.
Well, given that 'sucks' is subjective, and every language 'sucks' under different scrutiny, this statement is as useful as "Javascript is a programming language": useful in almost no contexts, but completely inane and irrelevant when having a discussion on web programming.