What worries me about this and similar efforts (like https://github.com/takeoutweight/clojure-scheme) is that clojure's standard library design assumes that the underlying runtime will do some kind of polymorphic method inlining.
For example: the sequence library is all defined in terms of ISeq, which basically requires a "first" and "rest" to be defined for the data structure in question. These are polymorphic: there are different implementations for different data structures (list, vector, map, etc.), so a dispatch step is required to choose the right one. In clojure-jvm, this is implemented using a java interface; this means the jvm will inline calls to those methods when they're used in a tight loop. And if you use the standard library, calls to 'first' and 'rest' are going to be inside nearly all of your inner loops.
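To make that dispatch step concrete, here's a rough sketch of what a naive C backend might emit for ISeq (all names invented; this is not ClojureC's actual runtime):

    /* One vtable per concrete type: list, vector, map, ... */
    typedef struct Obj Obj;

    typedef struct {
        Obj *(*first)(Obj *self);
        Obj *(*rest)(Obj *self);
    } SeqVTable;

    struct Obj {
        const SeqVTable *vt;
        /* payload follows */
    };

    /* Every 'first' in a compiled inner loop becomes an indirect
       call through the vtable, which a plain C compiler can't
       devirtualize the way the JVM's JIT can. */
    static inline Obj *obj_first(Obj *s) { return s->vt->first(s); }
    static inline Obj *obj_rest(Obj *s)  { return s->vt->rest(s); }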
Compare this to a normal lisp or scheme: 'first' and 'rest' (or 'car' and 'cdr', whatever) are monomorphic. They only work on the linked-list data structure. So compiling these directly down to C functions makes perfect sense and incurs no performance penalty.
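Whereas monomorphic accessors over a single cons-cell representation are direct calls that any C compiler can inline (again, just an illustrative sketch):

    typedef struct Cons {
        void *car;
        struct Cons *cdr;
    } Cons;

    static inline void *car(Cons *c) { return c->car; }
    static inline Cons *cdr(Cons *c) { return c->cdr; }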
So in summary: clojure assumes there's a really smart JIT which is helping things along. This means it's not as suitable for alternate compilation targets as you might want it to be.
I wonder if there's something clever you could do here. Vtables could be reordered based on expected usage, certainly. Clojure can already do some measure of type inference, so this could be used for AOT inlining when it's available. Even if it's not, perhaps several versions of a call could be speculatively generated based on what the compiler does know already. The normal polymorphic inline caching technique could perhaps be abused to apply here. But it's hard to see how any of this can work in the absence of a profile or heavy hinting.
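For what it's worth, an inline cache at a call site could look something like this in emitted C, reusing the Obj/SeqVTable types from the sketch above (monomorphic here for brevity; a polymorphic cache would keep a small array of (vtable, target) pairs):

    /* Per-call-site cache: remember the last vtable seen here. */
    static const SeqVTable *site_cached_vt;
    static Obj *(*site_cached_first)(Obj *);

    Obj *site_first(Obj *s) {
        if (s->vt == site_cached_vt)        /* hit: predicted branch */
            return site_cached_first(s);
        site_cached_vt = s->vt;             /* miss: refill the cache */
        site_cached_first = s->vt->first;
        return s->vt->first(s);
    }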
(not a compiler writer, just interested in the problem)
You can do polymorphic method caching/inlining with a hybrid ahead-of-time / JIT compiler targeting C reasonably easily. The code fragments required for caching at least will be small, and code to generate them at runtime is not a big deal. I'm playing with a Ruby compiler, and Ruby badly needs these types of optimizations to get fast, so I've spent a fair amount of time looking at it.
In a fair number of cases you can do static analysis to get good guesses at likely types, even where you can't be sure. E.g. even speculatively looking near call sites by method name, to guess the types of the objects that will get passed in, gives you a reasonable chance of identifying the top contenders, so you can speculatively generate inlined versions without creating too much junk. But to get the most performance out of this you're likely to need to be prepared to do some very basic JIT.
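A hedged sketch of the speculative-inlining idea (every name below is invented for illustration, not taken from my actual compiler):

    typedef struct RClass RClass;
    typedef struct { RClass *klass; /* ... */ } VALUE_t;

    extern RClass String_class;
    long string_length_inlined(VALUE_t *s);  /* inlined fast path */
    long generic_send_length(VALUE_t *obj);  /* full dynamic send */

    /* Static analysis guessed the receiver is usually a String, so
       emit the inlined body behind a cheap class check and keep
       generic dispatch as the slow path. */
    long speculative_length(VALUE_t *obj) {
        if (obj->klass == &String_class)
            return string_length_inlined(obj);
        return generic_send_length(obj);
    }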
When you spend a few years speculating about what it would take to efficiently compile Ruby as statically as possible (I love Ruby, but I hate moving parts), devious becomes second nature...
The idea of speculatively looking at method names comes from testing that to create vtables ahead of time for Ruby classes, to avoid hash tables in the common case.
As it turns out, most methods on most Ruby classes are the ones inherited from Object or other standard classes, and the number of classes is usually fairly constrained, so again, speculatively looking at method names in the compile-time-available source and allocating sparse vtables for the most common names results in relatively little waste.
And it reduces typical method lookup to a vtable lookup for common methods, with expensive method dispatch becoming much rarer. There's a tradeoff between the theoretical horrible blowup in vtable waste from apps dynamically adding tons of methods and tons of classes, with a unique vtable slot required for each method name across all classes, vs. falling back to doing hash-table lookups all the way up the inheritance chain for "unusual" method names once you reach certain thresholds for waste.
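The lookup path ends up looking roughly like this (illustrative C, not my actual code):

    #define VTABLE_SLOTS 256  /* budget for "common" method names */

    typedef struct RubyClass RubyClass;
    struct RubyClass {
        RubyClass *super;
        void *vtable[VTABLE_SLOTS];  /* sparse: most slots stay NULL */
    };

    /* fallback table for names that didn't earn a slot (not shown) */
    void *hash_lookup(RubyClass *c, const char *name);

    void *method_lookup(RubyClass *c, int slot, const char *name) {
        if (slot >= 0 && c->vtable[slot])  /* common case: one index */
            return c->vtable[slot];
        for (; c; c = c->super) {          /* rare case: walk chain */
            void *m = hash_lookup(c, name);
            if (m) return m;
        }
        return NULL;                       /* method_missing land */
    }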
You do incur the cost of propagating vtable changes down the inheritance tree when methods are dynamically redefined in places other than the leaves, but it is fairly rare to see apps where this happens at a very high rate, and the number of subclasses is usually fairly small, so it is likely to be quite cheap. Doing it that way is something I first saw in a technical report by (now) prof. Michael Franz from '93 or '94 on "Protocol Extension" for Oberon.
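The propagation itself can be a simple recursive walk over the subclass tree, skipping subclasses that have overridden the slot themselves (again an invented structure, with VTABLE_SLOTS as in the sketch above):

    typedef struct ClassNode ClassNode;
    struct ClassNode {
        ClassNode *first_child, *next_sibling;
        void *vtable[VTABLE_SLOTS];
        unsigned char owns_slot[VTABLE_SLOTS];  /* defined here? */
    };

    void propagate(ClassNode *c, int slot, void *impl) {
        for (ClassNode *k = c->first_child; k; k = k->next_sibling) {
            if (!k->owns_slot[slot]) {   /* don't clobber overrides */
                k->vtable[slot] = impl;
                propagate(k, slot, impl);
            }
        }
    }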
You can probably also get some decent gains by adding heuristics that give preference to names that appear to be used in loops when picking names for the vtables, to reduce the need for any JIT'ing.
That's interesting; Apple's Objective-C runtime has a fast vtable for its most common methods and uses a more expensive lookup for the rest. Doing a static analysis to find other commonly-used methods is like the next step up from there.
No, I've been off-and-on toying with writing an "as static as possible" Ruby compiler (see http://www.hokstad.com/compiler) - it's been about two years since I last posted an update, but I have one new part complete and another one mostly complete. Just holding off posting for a bit longer because I want to have a bit of a buffer (3-4 complete parts) before I get people's hopes of regular posts up again...
What is there uses vtables exclusively - I effectively punted on the slow path (and thus on adding methods at runtime) completely, but keep track of how much of the vtable allocation is wasted space. If/when I get there, the goal is to use various mechanisms like this to determine when to fall back on a slow path, and to couple both with polymorphic inline caching when suitable.
EDIT:
I don't see MRI as very interesting to work on, largely because interpreters aren't much fun, and ironically, given the amount of time I spend using Ruby, I prefer compilers to be as static as possible. I also prefer my compilers to be bootstrapped in their target language. Hence my "ideal" Ruby compiler would be written in pure Ruby, do a ton of static analysis, with minimal fallback to JIT when users use features that are too dynamic to analyse fully ahead of time.
E.g. there's a ton of annoying uses of eval() in Ruby code where a more complete meta-programming API would make it trivial for a compiler to do full ahead-of-time static analysis, so one big thing an AOT Ruby compiler really needs to do is provide a library of compiler-specific meta-programming facilities with a fallback that uses eval() as needed, and either convince people to use it, or provide monkey-patches for a number of popular projects. Some of these uses don't even need eval() in the first place, but use it just as a quick shortcut because it's simpler...
Just to make clear, I'm not sure when or even if my compiler project will ever get to a state where it's even remotely usable to compile Ruby. I started it without even having decided to compile Ruby, mostly to write about various parts of the process of writing a compiler that I find interesting. I find compiling Ruby incredibly fascinating from a theoretical point of view because of the complexity involved, but unfortunately working on it takes a lot more time and effort than thinking about the problems.
Sure, MRI is boring, but it's the best implementation right now (though some may argue that JRuby is better), and it's in desperate need of VM innovations.
Any new compiler/VM starting from scratch will be years away from being available for use in production environments. By the time it's finished, we'll all be using Go. (Sigh.)
I think we need both. MRI can keep getting better, as can JRuby (which is an amazing feat, but to me, running on top of the JVM makes it a non-starter) or Rubinius, but they're fundamentally side-stepping the really hard problems.
E.g. nothing will stop MRI from having to interpret thousands of lines of code on every run, because it can't draw a line between runtime and compile time, while for an ahead-of-time compiler for Ruby, finding a pragmatic line between what needs to be executed at runtime vs. compile time is essential (consider for example the tendency to do stuff like getting the list of files in a directory and requiring all of them in turn).
We do need both. But some of the things you mention (like constructing vtables) could be applied to MRI's VM model without writing a compiler from scratch. Frankly, I would much rather have large performance improvements now than in 2-3 years.
(I agree about JRuby. I also wonder why Rubinius, which showed so much promise at the beginning, has stagnated. Is it simply the lack of developers?)
I agree the performance increase would be great, but I think it needs to come gradually to MRI. E.g. trying to do anything fancy with the old AST-based interpreter would've been pretty pointless. After YARV, it is probably starting to get more attractive, but at the same time they've added method caching, which gives a decent amount of the benefits. A vtable will still be faster, but it might not be the most immediately expedient way of speeding things up vs. e.g. Sasada's latest project of adding a generational GC.
Regarding Rubinius, writing compilers for dynamic languages is hard. Most textbooks you'll find cover techniques most suitable for statically typed languages (the best resource I know of for starting to catch up on compiling dynamic languages is actually the Self papers). So you need more than an unusual level of interest in writing compilers to be likely to tackle a language like Ruby, which is tricky even among dynamic languages (e.g. my favorite pet problem to meditate on: what constitutes 'compile time' vs. 'runtime' for ahead-of-time compiled Ruby?), and even more to actually persevere until you start getting proper results, when a simpler language would give you decent results in days.
It's made worse because of Ruby's horrendous grammar. And I mean that from a compiler writer's perspective - as a developer I love to use Ruby, to a large extent because the complexities of the grammar mean it reads and writes better 95% of the time. But MRI's bison-based parser was 6k-7k lines, with ugly parser/lexer interplay, last time I checked... There are full compilers substantially smaller than that for other languages...
To me, these complexities are part of what makes it fascinating. I firmly believe you can parse Ruby fully with a much, much simpler parser, for example. A lot of the ugliness can be abstracted away, and C parser code is rarely a good example of succinct code.
I did start playing with MRI years ago, specifically the parser, actually, and started chopping out redundant pieces, but got frustrated and bored with it. That's part of the problem - it's one thing to play around with a toy compiler like I've done, and another entirely to put in the effort to push a major change to MRI through to production quality given the number of years of accumulated history encapsulated in it. Doing the latter as a hobby is a daunting task.
Just a note on Rubinius: the PyPy guys seem to have done pretty well at this. I don't know how similar they are to Rubinius; PyPy reduces to RPython as an initial step, whereas I believe Rubinius compiles to LLVM IR.
Telling me to "just use" it seems a bit glib considering that it's apparently nowhere near ready for production (the current code even has failing tests), let alone available as a drop-in replacement for Ruby.
That said: it looks like a really interesting project. I do wish they would phrase the project description as a "Ruby VM" instead of a separate programming language, though. There should not be any need to fork the entire language just to provide better performance.
I would love to have a higher-level-than-C language to work with on embedded systems, and Clojure seems good.
However, my applications run on systems with very limited resources (and a Harvard-ish architecture, e.g. separate data/program memories), and I wonder how far tools like ClojureC could go with regard to these constraints.
Re: JIT on iOS, I've heard of people compiling their own JavaScriptCore (to use Ejecta); is this still an issue? Embedded systems are another story; I'd be far more concerned about Clojure's assumptions about GC.
You can work around it by making use of PGO (Profile-Guided Optimization).
This requires you to execute the application with profiling turned on.
Then you compile the application a second time with the additional input of the profile results; this way the optimizer gets extra information that helps it make decisions similar to a JIT.
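With GCC, for example, the round trip looks like this (Clang's flags are similar; the workload flag is just a placeholder for whatever exercises your hot paths):

    gcc -O2 -fprofile-generate -o app app.c   # instrumented build
    ./app --typical-workload                  # run to collect a profile
    gcc -O2 -fprofile-use -o app app.c        # rebuild using the profile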
Perhaps you could overcome some of these problems with whole-program optimization, e.g. the "Stalin" Scheme compiler.
It might be interesting to translate Clojure into a subset of Scheme that is widely supported -- and use one of the mature Scheme -> C compilers (Gambit, Chicken, Bigloo) to generate the target executable.
Could the use of Reducers help alleviate this problem? I'm not completely familiar with the internals, but my naive understanding of how they work is that the reducer function would dispatch once on the sequence type, and then iteration stays entirely within that sequence's reducer method.
So if I wanted to write CLI applications in Clojure, is this my best bet? Because the JVM is about the least suitable platform I have ever worked with for CLI apps... which is most of what I do. I'm constantly in this pickle of wanting to use Clojure but defaulting to Ruby because the JVM is so terrible at it.
The problem is that the slowness is not limited to running command-line programs: the REPL is slow to start, compilation is slow, and so is running any leiningen task. As suggested in another comment I gave a quick try at ClojureScript, but the compilation process is also slow. Clojure is an appealing language, but to me the general sluggishness of the environment is too irritating. The existence of things like Nailgun and Drip seems to confirm that there's a real problem there.
I suppose if you're coming from a C-based runtime like Ruby or Python, you have to adapt your workflow. I imagine that a lot of Clojure people come from Java and have already shaped their workflow around the JVM.
Nailgun is not a good solution to the problem. If the JVM startup speed is a problem, your best option is to use something not on the JVM and not use hacks that make it feel faster.
That's not true. Yes, it's worth considering options outside the JVM if being on the JVM means your application isn't interactive enough (startup speed doesn't matter for long-running, fire-and-forget tasks).
At the same time, splitting your application into a client/server architecture is not a hack but an engineering decision. There are times when this decision is natural e.g. Music Player Daemon (MPD)[1]. For most CLI applications, there's no clear benefit (but the general approach has no clear downside either - the code overhead of this approach can be brought very low).
Certainly, in a production application you would want to secure the messaging channel (Nailgun doesn't).
The readme for ClojureC says it's still experimental at this point. I'm not sure how far along it is. You might poke around its tests to see what it does so far.
I'm not familiar with the compromises of clojurescript, nor with running node.js as a CLI platform. Got any links that can help explain? Can it be compiled to a static executable?
For those interested in Lisp implementations which compile to C, providing cross-platform benefits and a nice FFI, gambit-c and chicken are performant, mature implementations of the Scheme programming language.
Similarly, structured control flow (C's if, for, while, switch, etc.) is better than unstructured: LLVM IR just has the equivalent of gotos (yes, LLVM IR has switch too, but it can go anywhere, so it still isn't inherently structured). So it goes both ways.
Some other advantages of C over LLVM IR:
ABI compatibility with C automatically. In LLVM IR, being ABI-compatible with C is often a considerable headache and you have to do different things on each platform.
You can use C libraries by #including their header files (the way many C libraries were designed to be used), instead of hardwiring all that information (macro expansions, enum values, typedefs, inline functions, struct definitions, etc.) from their header files in your code generator.
You have the option of switching C compilers; you're not as locked into a single backend.
With a bit of care, you can make your output much more readable. LLVM IR demands either SSA form (most front-ends don't want to do this) or herds of allocas, loads, and stores everywhere. In C, you just say "int x;" to declare an int, and just "x" to refer to it.
You can use C features like bitfields, designated initializers, short-circuit operators (&&, ||), compound assignment (+=, -=, etc.), and so on. You can do all these things in LLVM IR, but you have to lower them yourself.
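For instance, all of these compile as-is when you emit C, whereas an LLVM IR backend has to lower each one itself:

    #include <stdio.h>

    struct flags { unsigned ready : 1, error : 1; };  /* bitfields */

    int main(void) {
        int counts[8] = { [3] = 7, [5] = 9 };  /* designated initializers */
        struct flags f = { .ready = 1 };
        int x = 0;
        if (f.ready && !f.error)               /* short-circuit operators */
            x += counts[3];                    /* compound assignment */
        printf("%d\n", x);                     /* prints 7 */
        return 0;
    }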
I can't stress the C++ API part of this enough. At the moment I'm building a Pascal compiler in Clojure for coursework, and when the time came to do bytecode generation, the first thing I looked to was the LLVM infrastructure, both for ease of use and because it would trivially provide me with the ability to target non-x86 platforms.
The reality of the matter is that interacting with non-Java linked libraries is a real pain from every JVM language I know of. Two years ago this was posted here [https://github.com/jasonjckn/llvm-clojure-bindings], but since then LLVM has gone through two major restructurings, so it didn't work out of the box, and I estimated that it would take less effort to build my own naive infrastructure than to patch this one & integrate it.
As a result I'm generating code in terms of lists of newline-terminated assembly statement strings that I can just print or write to a file when I'm done. While I agree that C is a sub-optimal output format in that you have to compile the output, it is also the clear lingua franca for systems programming and assembly generation these days. Generating C gives you interesting options like linking to other C codebases or your own C code, the same way that cljs gives you the option of interacting with "native" javascript libraries as well as clojurescript toolkits.
Bah, just as I was finishing my register allocator...
My Firefox history indicates that I have read the Mjolnir page before, but I don't recall why I didn't use it at the time. Taking another look :-P. Thanks for the link!
Author of Mjolnir here. There's some major updates going on in the "datomic" branch. If you're thinking of using Mjolnir in a project, you'll want to keep an eye on that branch over the next few months.
With Datomic, the inference engine is completely re-written in datalog. This allows for a massive code clean-up, and the code in that branch is much cleaner.
People have made this argument for ClojureScript as well. In the case of ClojureScript the benefits are considerably less clear as we rely on the ability of Google Closure to do fairly amazing optimizations on the generated JavaScript source.
That said I'd love to see a serious Clojure-in-Clojure targeting LLVM.
Generating C is much simpler than LLVM IR (and you can do structured code generation for C: generate and serialize C AST); it's infinitely easier to read and debug; it allows switching compiler; it provides C stdlib to piggyback on with very little overhead.
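A minimal sketch of what "structured code generation for C" can mean: build a small AST and serialize it, rather than concatenating strings ad hoc (node shapes invented for illustration):

    #include <stdio.h>

    typedef enum { INT_LIT, ADD } Kind;
    typedef struct Expr {
        Kind kind;
        int value;               /* INT_LIT */
        struct Expr *lhs, *rhs;  /* ADD */
    } Expr;

    void emit(const Expr *e, FILE *out) {
        switch (e->kind) {
        case INT_LIT:
            fprintf(out, "%d", e->value);
            break;
        case ADD:
            fprintf(out, "(");
            emit(e->lhs, out);
            fprintf(out, " + ");
            emit(e->rhs, out);
            fprintf(out, ")");
            break;
        }
    }

    int main(void) {
        Expr one = { INT_LIT, 1, NULL, NULL };
        Expr two = { INT_LIT, 2, NULL, NULL };
        Expr sum = { ADD, 0, &one, &two };
        emit(&sum, stdout);      /* prints: (1 + 2) */
        return 0;
    }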
It's a fine strategy to start with; there are more important things than fucking around with LLVM IR at this point, and they can switch to LLVM or a native generator once they get the thing effective and off the ground.
I'm not sure if this is what the previous poster was talking about, but clang has some APIs that let you get access to the AST pretty easily. It's been a really long time since I've looked at it, and at the time I don't think it was exposed as a library API for general consumption. But for example: https://github.com/bratsche/clang/blob/gtkrewriter/tools/cla...
What is his definition of better? It sounds like he thinks it's entirely objective, so he should be able to express it clearly and logically.
Does he think that it's technically more powerful? Again he should be able to prove that if that's the case.
Otherwise he's just giving a shitty opinion, and should say that.
I think the claim is nonsense because with inline assembler there is nothing that you cannot express in C that you can with LLVM. So the decision between the two is opinion.
>Does he think that it's technically more powerful? Again he should be able to prove that if that's the case. Otherwise he's just giving a shitty opinion, and should say that.
It's not like what he said is some controversial opinion -- it's both self-evident and commonplace. It's you who offers the more controversial opinion (and in a rude way, to boot).
>I think the claim is nonsense because with inline assembler there is nothing that you cannot express in C that you can with LLVM. So the decision between the two is opinion.
It's not about "expression", and nobody argued that you can express more in LLVM.
This is missing the point by miles!
It's about having more structure and less of an ad-hoc pipeline, which helps with better tooling, error prevention, etc.
(Not only is what you wrote wrong, but even if the original argument was about expression, your opinion would still be wrong. Two things offering equivalent expressive power does not mean that they are just as good to use in practice. Might as well ask "why invent new languages, when assembly can express everything?")
The only benefit to using C for something like this is portability, which is something else altogether.
> It's not like it's some controversial opinion what he said -- it's both self evident and common place.
As someone who has written more than one compiler, I don't see how it is self-evident at all. It's also not at all that commonplace compared to generating C or asm output textually.
> It's about having more structure and less of an ad-hoc pipeline, which helps with better tooling, error prevention, etc.
Those provide some benefits, sure. At the cost of massive amounts of complexity in the case of LLVM.
> The only benefit to using C for something like this is portability, which is something else altogether.
Now it is you who are wrong. Other people have already pointed out, for example, that C provides an easy-to-read intermediate format, and is simple to generate, as other benefits. Not having to deal with a massive C++ codebase is another.
You may disagree that these other benefits are worth it, but for me at least they are (just taking a break from a compiler that generates textual asm because I find even that preferable to dealing with LLVM).
Sure, I agree about this: "C provides an easy-to-read intermediate format, and is simple to generate, as other benefits. Not having to deal with a massive C++ codebase is another.".
So portability and fewer dependencies, plus ease of use.
I don't know what his definition of anything is. I happen to agree with you. The example I pasted above was just from my only experience working with a C AST. In that case I wanted to input C code and then write out a transformed version of it.
What's with this childish "can you prove this, can you prove that" thing? Are you 12 years old?
Not everything can (or should) be proved at the drop of a hat in a discussion list -- that doesn't make everything without a formal proof "baseless" opinion.
If you cannot see the self-evidentness (sic) of how a STRUCTURED API for producing an AST makes it easier to avoid mistakes compared to spitting out text to compile as a C program, then I'm not sure any proof would help anyway.
It's like asking me to prove why using an XML processor to create and save a DOM tree would produce more error-free results than manually composing tags as strings.
Or why parsing a JSON file and working on the nodes is less error prone than using regular expressions to extract values from the JSON as a big string.
    Not everything can (or should) be proved at the drop of a hat in a discussion list
If you say something that doesn't make sense, you're saying that one shouldn't have to back that up?
    It's like asking me to prove why using an XML processor to create and save a DOM tree would produce more error-free results than manually composing tags as strings.
It is in fact the equivalent of asking you to prove that an XML parser after serializing a DOM tree and then parsing that same document produces a non-equivalent DOM tree to the original.
I think the reason pat_punnu is balking at what you have said is because of this: You can think of C as a serialized form of an AST. In order for parsing to produce a non-equivalent AST to the one used to serialize it, the C grammar must be non-deterministic. The C grammar is not non-deterministic, therefore what you said does not make sense and pat_punnu (somewhat rudely) asked you to back up what you were saying.
Asking people to back up what they have claimed is part of intelligent discourse. This happens often on HN and is one of the reasons I like this community because when the person who makes a non-intuitive or seemingly wrong remark turns out to be correct, I learn something.
UPDATE: I should point out that I'm responding specifically to this sub-thread of the tree and not arguing about whether or not this should target the LLVM IR. I think that would be nice.
>> Not everything can (or should) be proved at the drop of a hat in a discussion list
>If you say something that doesn't make sense, you're saying that one shouldn't have to back that up?
Read what you quoted from me, and what you ask. Where do I say that people should not back up things they say that "don't make sense"?
Where do I even say they do not have to "back up" the things they say? I merely say that they do not have to PROVE everything. You can back stuff up with some arguments and counterarguments; you don't need to provide some "proof".
>I think the reason pat_punnu is balking at what you have said is because of this: You can think of C as a serialized form of an AST. In order for parsing to produce a non-equivalent AST to the one used to serialize it, the C grammar must be non-deterministic. The C grammar is not non-deterministic, therefore what you said does not make sense and pat_punnu (somewhat rudely) asked you to back up what you were saying.
And the reason I'm balking at this is that you examine the case AFTER the C has been generated. I'm not talking about that stage (reading the C back in to generate an AST). I (and pat_punnu) am talking about the previous stage of spitting out the C code to disk in the first place. I'm saying that a structured way to do that (the LLVM API) is safer than merely creating strings yourself.
So your "It is in fact the equivalent of asking you to prove that an XML parser after serializing a DOM tree and then parsing that same document produces a non-equivalent DOM tree to the original" picks this up several steps too late. I (and pat_punnu) were concerned with the generation of the document in the first place.
We work in a technical field. We call ourselves engineers. The academic version is computer SCIENCE. Unsupportable opinions and opinions presented as facts are inexcusable.
This argument can just as easily be used to support a claim that generating C via string manipulation can very well be less error prone than relying on a huge, complex API like LLVM.
It is not a given that there are "more moving parts" in generating C output from a compiler than in using LLVM.
The LLVM moving parts, yes, but you need to write code to use them too, and it's not a given that there are more moving parts in generating C output than in that.
> Your moving parts in your own solution, you'd have to write yourself.
That's not always a bad thing for error rates, if the alternative is figuring out to use a massive library correctly.
It's easier to get Lein as standalone. The Ubuntu package depends on half the build infrastructure of java, using really minimal parts of it here and there. If you just get the shell script and stuff it in your path, it can self-install a 10MB jar that has all it needs.
Wait, are you trying to install it as an APT package? It's probably easier to just download the script and install it that way. If you do that, it will install all the libraries to a folder in your home directory.
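Something like this, assuming the script URL from the Leiningen README hasn't moved:

    wget https://raw.githubusercontent.com/technomancy/leiningen/stable/bin/lein
    chmod +x lein
    mv lein ~/bin/   # anywhere on your PATH
    lein             # first run self-installs the jar under ~/.lein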
Sadly, this doesn't seem to support any sort of multithreading. Even something as simple as swap! isn't thread-safe in this implementation. So that kills one of the main reasons to use Clojure in the first place.
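For reference, a thread-safe swap! in a C runtime needs something like a compare-and-swap retry loop; a rough sketch with C11 atomics (types invented):

    #include <stdatomic.h>

    typedef struct Obj Obj;
    typedef Obj *(*SwapFn)(Obj *old);

    typedef struct { _Atomic(Obj *) value; } Atom;

    Obj *atom_swap(Atom *a, SwapFn f) {
        Obj *old, *next;
        do {
            old = atomic_load(&a->value);
            next = f(old);  /* f must be pure: it may be retried */
        } while (!atomic_compare_exchange_weak(&a->value, &old, next));
        return next;
    }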
Lack of multithreading seems like a result of the implementation being heavily based on ClojureScript ;) (it's actually pretty cool to see how reusable core.cljs is IMO). I imagine ClojureC will have its uses like ClojureScript does when the JVM is not an option.