Ruby 1.9.3 preview1 released

carbonica · on July 31, 2011

Just to toot my own patches, this version of Ruby has a huge number of improvements to Ripper, the built-in Ruby parser in 1.9. There are some critical bug fixes, and local variable references have a different node type than bareword method calls. I dare say it's close to being ready to power a full Ruby implementation at this point.
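A minimal sketch of the distinction between local variable references and bareword method calls, assuming Ruby 1.9.3 or later and only the standard `ripper` library. The names `bare` and `local` are just for illustration:

```ruby
require 'ripper'

# `foo` has never been assigned here, so it can only be a method call.
bare = Ripper.sexp("foo")

# Here `foo` is assigned first, so the second `foo` reads a local variable.
local = Ripper.sexp("foo = 1; foo")

# Ripper.sexp returns [:program, [statement, ...]]; look at the last statement.
bare_node  = bare[1].last
local_node = local[1].last

puts bare_node.first   # vcall
puts local_node.first  # var_ref
```

Before this change both cases had to be told apart by tracking assignments yourself.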

neilc · on Aug 1, 2011

Cool -- I'd never heard of ripper before (Googling for it doesn't produce much information either).

Can you comment on the differences/advantages of ripper, compared with RubyParser? For my use case I just want to get a sensible AST from some Ruby source code, a SAX-style interface is not necessary.
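For anyone else who hasn't met Ripper: a rough sketch of its two interfaces, assuming a recent Ruby with the standard `ripper` library. `DefNames` is a made-up class for illustration, not part of Ripper:

```ruby
require 'ripper'

source = "def greet(name)\n  puts name\nend\n"

# Tree interface: one call returns the whole parse as nested arrays.
tree = Ripper.sexp(source)
puts tree.first        # program
puts tree[1][0].first  # def

# Event (SAX-style) interface: subclass Ripper and override on_* handlers.
class DefNames < Ripper
  attr_reader :names

  def initialize(*args)
    super
    @names = []
  end

  # Called once per `def`. With the default scanner handlers, the name
  # arrives as the identifier's source text.
  def on_def(name, _params, _body)
    @names << name
    nil
  end
end

collector = DefNames.new(source)
collector.parse
p collector.names  # ["greet"]
```

If all you want is a tree, `Ripper.sexp` is the short path. The event interface only matters when you want to build your own node objects.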

carbonica · on Aug 1, 2011

> Can you comment on the differences/advantages of ripper, compared with RubyParser?

Absolutely! To provide context, what got me working on Ripper was my undergraduate thesis: Laser, a Ruby static analyzer, written in Ruby. It targets Ruby 1.9 only. To find out more, check the github: https://github.com/michaeledgar/laser/

My comparisons will be to the latest version of Ripper (the one in 1.9.3).

1. RubyParser only parses Ruby 1.8.x code - this was a dealbreaker for me, but won't be for some.

2. Ripper is C, primarily - it's actually just a separate set of action routines brutally hacked into the normal Ruby grammar file. This also means the code for it is pretty inscrutable. But, it's very fast and is integrated into the actual parser. It should grow with the language by design. However, bugs have shown this isn't as reliable as it could be.

3. Ripper does not provide comment nodes for any nodes - RubyParser provides them for def, singleton def, class, and module nodes. I had to reconstruct this manually using the lexer stream for my purposes, though I get the added benefit that I can attach comments to many nodes (but not all). See the code here: https://github.com/michaeledgar/laser/blob/master/lib/laser/...

4. Ripper's output is a true AST. In fact it is sometimes closer to a concrete syntax tree, which makes working with it a bit harder. RubyParser has more friendly output, but goes too far in the other direction - it infers semantic information for you, sometimes (imo) quite egregiously.

4a. Constant literals - from numbers to regexes - are returned as objects, with no AST information. The ruby parser does this internally to save some execution time, but I don't believe that's appropriate for a general-purpose parser.

4b. It adds :scope nodes inside class, module, and def nodes, to indicate closed scopes. They don't need to be there - they aren't part of the syntax, they're a property of how you interpret the syntax as a Ruby program.

4c. The worst offender is if you parse "begin; rescue Foo => x; end", it actually inserts an assignment "x = $!" into the rescue block. This is well beyond an AST.

5. Sometimes RP's output is a bit inconsistent in the number of child nodes for a given node. For example, the :rescue node in "begin; foo; rescue; end" has two child nodes. In "begin; rescue; end", it has 1 child node: just the :resbody node. A proper tree would have at least a nil node for the begin body, but RP elides it. This means if you see a :rescue node, you have to always check the first child node's type before you can do anything. That's why consistency is important.

6. RubyParser doesn't pick up errors as often, and as far as I can tell, doesn't report them at all. For example, parsing "def foo(x, x); end" in Ripper will give you a "param_error" node. I'm not too happy with the exact reporting style, and I've blogged about it (http://carboni.ca/blog/p/ripper-plus-How-Ripper-Must-Change), but it at least notes the error so you don't have to, as a parser should in this case (as it's a parse-level error). RP doesn't note it. If you do an invalid global alias: "class A; alias $foo $1; end", RP just ignores the node, whereas Ripper will report it with an :alias_error node.
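To illustrate point 3, here is a minimal sketch of recovering comments from the lexer stream. This is not Laser's actual code: the pairing rule (a comment on the line directly above a `def`) is an assumption chosen for brevity, and the variable names are illustrative.

```ruby
require 'ripper'

source = <<~RUBY
  # Adds two numbers.
  def add(a, b)
    a + b # inline note
  end
RUBY

# Each lexer token starts with [[line, column], type, text].
tokens = Ripper.lex(source)

# Collect every comment with the line it appears on.
comments = tokens.select { |_pos, type, _text| type == :on_comment }
                 .map { |(line, _col), _type, text| [line, text.strip] }

# Find the lines on which a `def` keyword appears.
def_lines = tokens.select { |_pos, type, text| type == :on_kw && text == "def" }
                  .map { |(line, _col), _type, _text| line }

# Assumed rule: a comment on the line directly above a def documents it.
comment_on_line = Hash[comments]
docs = {}
def_lines.each do |line|
  docs[line] = comment_on_line[line - 1] if comment_on_line.key?(line - 1)
end

p comments  # [[1, "# Adds two numbers."], [3, "# inline note"]]
p docs      # {2=>"# Adds two numbers."} (inspect format varies by Ruby version)
```

The same line bookkeeping lets you attach the inline comment on line 3 to whatever node covers that line, which RubyParser's def/class/module-only comments can't give you.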

All told, RubyParser is a nice library and it's important to have an all-Ruby option out there. But for me, Ripper is far more appropriate from a theoretical and practical standpoint for a large-scale project.

Edit: grr, my formatting wasn't saved twice now.

neilc · on Aug 1, 2011

Awesome! Thanks for the information.

stephth · on July 31, 2011

To install with rvm:

    rvm get latest
    rvm reload
    rvm install 1.9.3-preview1
    rvm use 1.9.3
    ruby -v

Compiling took two or three minutes on my machine.

toisanji · on July 31, 2011

Does this contain the fixes with require to make loading rails and other libraries startup faster? I didn't see anything relating to it in the change logs.

riffraff · on July 31, 2011

it seems so https://github.com/ruby/ruby/blob/v1_9_3_preview1/ChangeLog#...

Notice this does not solve rails becoming slower every release (I recently recovered a 1.2 project and it runs the tests something like five times faster than a fresh one with 3.0)

wycats · on July 31, 2011

I know, right?

stephth · on July 31, 2011

Good question. I'm guessing we should at least see some improvement from the pathname library C rewrite. But I can't tell if the require fix made it into 1.9.3.

For context:

http://www.rubyinside.com/ruby-1-9-3-faster-loading-times-re...

petercooper · on July 31, 2011

I can confirm that yes, the load.c patch as outlined in the Ruby Inside post IS in Ruby 1.9.3 preview 1 :-)

tenderlove · on July 31, 2011

Yes. That was fixed in r31875 which was committed before the 1.9.3 branch was made.

  http://svn.ruby-lang.org/cgi-bin/viewvc.cgi?view=revision&revision=31875

petercooper · on July 31, 2011

While the load.c patch made it in this time, is that a good indicator of things making it into 1.9.3 from head? Just thinking about Module#mix here..

wycats · on July 31, 2011

Why do you want Module#mix in favor of Module#prepend? The former introduces a whole new system for mixins alongside the old one, while the latter simply makes the existing system work for extending existing classes (a first-class, no caveats alias_method_chain)
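A small sketch of the "first-class alias_method_chain" idea. Note that Module#prepend is not in 1.9.3 (it shipped later, in Ruby 2.0), and `Greeter` and `Loud` are made-up names for illustration:

```ruby
# An existing class we want to extend without editing its method body.
class Greeter
  def greet(name)
    "hello, #{name}"
  end
end

# The wrapper: `super` reaches the original Greeter#greet.
module Loud
  def greet(name)
    super.upcase + "!"
  end
end

# Prepending puts Loud *in front of* Greeter in the method lookup chain,
# so no aliasing of the original method is needed.
class Greeter
  prepend Loud
end

puts Greeter.new.greet("matz")  # HELLO, MATZ!
p Greeter.ancestors.first(2)    # [Loud, Greeter]
```

With a plain `include`, Greeter#greet would win the lookup and the module's version would never run.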

petercooper · on Aug 1, 2011

I wasn't really flying the flag for Module#mix, just intrigued by the pattern of things making it into head and then either making it into tagged releases or not.

That said, I confess I was a fan of mix, but it might be in my favor for things to get more complicated sometimes.. ;-)

tenderlove · on Aug 1, 2011

Only bug fixes (not new features like Module#mix) will be merged from trunk to the 1.9.3 branch.

tvon · on July 31, 2011

FWIW, the improvements are not that significant (based on the last time I tried the patch a month or so ago).

petercooper · on July 31, 2011

I had a 36% improvement in a mid-sized Rails 3 app start time using the core load.c patch. Will be trying again on 1.9.3preview1 soon but in theory it should remain similar. DHH also tweeted noting a significant improvement at the time with one of his apps.

It may not be significant to you or in your use case but it seems in some cases there are definitely reasonable gains to be had, which can't be a bad thing.

dgregd · on July 31, 2011

"... and date library were reimplemented in C" Is this a merge of https://github.com/jeremyevans/home_run ?

jeremyevans · on July 31, 2011

Nope. It was a new implementation by Tadayoshi Funaba (the ruby Date maintainer), but it uses a similar data structure to home_run, and performs about as well.

Fluxx · on July 31, 2011

Not sure, but it's a lot faster either way:

https://gist.github.com/1117138

wycats · on July 31, 2011

I'd be interested to see how a better-implemented, properly optimized pure-Ruby version would perform.

stephth · on July 31, 2011

That's an impressive speed boost.

How much of the Ruby stdlib is implemented in C?

SlyShy · on July 31, 2011

I'm excited for the IO/Console library. Even though messing with curses is a rite of passage for anyone writing a roguelike, I'm looking to get out of that business.
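A rough sketch of what io/console gives you for that kind of program, assuming Ruby 1.9.3 or later. The helper names and the 24x80 fallback are assumptions for illustration, not part of the library:

```ruby
require 'io/console'

# Returns [rows, columns] of the controlling terminal.
# IO.console is nil when no terminal is attached (e.g. under cron or CI),
# so fall back to an assumed 24x80 screen.
def terminal_size
  console = IO.console
  console ? console.winsize : [24, 80]
end

# Reads one keypress without waiting for Enter and without echoing it.
# Only meaningful when $stdin is a real terminal, so it is not called here.
def read_key
  $stdin.getch
end

rows, cols = terminal_size
puts "#{rows} rows x #{cols} columns"
```

In a game loop you would call `read_key` once per turn and redraw using `rows` and `cols`. That covers the basic input and sizing parts of what curses usually gets pulled in for.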

smcj · on Aug 1, 2011

It is still painfully slow, right?