The problem is that RubySpec isn't used to TDD cross-implementation features. It isn't a "specification" so much as a reflection of what MRI does.
I had no idea the main Ruby implementation uses plain old bison instead of a hand-written parser, but what exactly is wrong with that? If anything, having all the grammar rules already spelled out means another implementation can just reuse them; there are bison-like tools for almost any language.
Over half of those lines are a hand-written lexer. The grammar is basically useless without the lexer, which is very complex, full of weird corner cases, and changes from version to version.
And much of it, in both the lexer and the grammar rules, is very heavily tied into MRI internals.
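To give a feel for the kind of corner case I mean, here is a small sketch (my own illustration, not taken from parse.y; the names `value`, `as_method_call` and `as_subtraction` are made up). The same characters `value -1` are read differently depending on whether a local variable called `value` is already in scope, which is state the lexer has to track and which a bare grammar file can't express on its own:

```ruby
# Hypothetical helper for illustration: returns its argument, defaulting to 10.
def value(n = 10)
  n
end

# No local variable named `value` is in scope here, so `value -1` is read
# as a method call with the single argument -1.
def as_method_call
  value -1
end

# Here `value` is a local variable, so the very same text `value -1` is
# read as the subtraction `value - 1`.
def as_subtraction
  value = 10
  value -1
end

puts as_method_call  # method call with argument -1
puts as_subtraction  # local variable minus one
```

Run with warnings enabled and Ruby will flag the first form as ambiguous, which is about as close as the language gets to admitting the problem. Anyone reusing the grammar rules would have to reproduce this sort of bookkeeping to get the same results.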
The Rubinius team's complaints (though hyperbolic) still stand. Developing features in MRI and then testing other implementations against that is not a good design process. It makes sense for historical reasons, but there is significant work to be done cleaning out the cruft of Ruby.
For a sample of such cruft, please refer to the 11k+ line "parse.y": https://github.com/ruby/ruby/blob/trunk/parse.y
Done this way, all other implementations are held back by the early mistakes of MRI, and there is little incentive for MRI to change that.