This is more a function of Ruby than of tree-sitter. The tree-sitter grammars for other languages are hopefully less inscrutable. For Ruby, we basically just ported whitequark's parser [1] over to tree-sitter's grammar DSL and scanner API.
I didn't mean the tree-sitter grammar was not understandable - it's very understandable - I just can't work out how you managed to find such a concise way to express grammars. Even compared to Whitequark it's 1/3 the size. What's the unique thing you do that makes it so concise?
It also seems somehow to be completely declarative? How have you managed to transform Ruby parsing to be context-free? For example where's the set of what's currently a local variable so you can distinguish from method calls?
But, for example, how do you parse the difference between `x = 14; x` and `y = 14; x`? In the latter case `x` is a method call, and in the former it's a local variable read. I can't see where the parser maintains a set of local variables and where it queries this set. Is it somehow done declaratively? If so, that's a huge achievement; I don't think that's really been done before in a parser generator.
I really want to try tree-sitter for use in an actual Ruby implementation because it's so beautiful!
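To make the distinction concrete, here is a small sketch that asks Ruby itself how it resolved a bare `x`, using `defined?`. The method names `read_with_assignment` and `read_without_assignment` are made up for illustration; each method body gets its own local scope, which stands in for the two one-liners above.

```ruby
# A method named x, so that a bare `x` has something to call.
def x
  :from_method
end

# Mirrors `x = 14; x`: the assignment makes x a local variable.
def read_with_assignment
  x = 14
  [defined?(x), x]
end

# Mirrors `y = 14; x`: no assignment to x is in scope, so x is a method call.
def read_without_assignment
  y = 14
  [defined?(x), x]
end

p read_with_assignment     # => ["local-variable", 14]
p read_without_assignment  # => ["method", :from_method]
```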
In both cases the bit after the semicolon just parses as (identifier).
For some use cases (e.g. syntax highlighting, depending on your colorization rules) it doesn't matter, and so we don't want to pay the cost. If it does matter (like in an actual implementation), then you'd have to implement this yourself and drive it by the parse tree you get from tree-sitter.
Right, you could just have a phase to fix it up after parsing. Much better than trying to shoehorn an imperative action into a nice, more-pure parser. Great idea!
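A minimal sketch of what such a fix-up phase might look like, under some loud assumptions: the tree here is a plain nested array that I made up for illustration (it is not the tree-sitter binding API), the node names `:lvar` and `:send` are borrowed loosely from parser-style ASTs, and there is only one flat scope, so `def`, `class`, and block scoping are ignored entirely.

```ruby
require "set"

# Hypothetical tree shape (NOT the tree-sitter API):
#   [:program, stmt, ...]
#   [:assignment, [:identifier, name], value]
#   [:identifier, name]
# Walks the tree in source order and rewrites each bare identifier
# as either a local variable read (:lvar) or a method call (:send).
def resolve_identifiers(node, locals = Set.new)
  return node unless node.is_a?(Array)

  type, *rest = node
  case type
  when :assignment
    target, value = rest
    # Ruby declares the local as soon as it sees the left-hand side,
    # so in `x = x` the right-hand x is a (nil) local, not a method call.
    locals << target[1]
    [:assignment, [:lvar, target[1]], resolve_identifiers(value, locals)]
  when :identifier
    name = rest.first
    locals.include?(name) ? [:lvar, name] : [:send, name]
  else
    [type, *rest.map { |child| resolve_identifiers(child, locals) }]
  end
end

# `x = 14; x`
with_x = [:program, [:assignment, [:identifier, "x"], [:integer, "14"]], [:identifier, "x"]]
# `y = 14; x`
with_y = [:program, [:assignment, [:identifier, "y"], [:integer, "14"]], [:identifier, "x"]]

p resolve_identifiers(with_x).last  # => [:lvar, "x"]
p resolve_identifiers(with_y).last  # => [:send, "x"]
```

A real implementation would need a stack of scopes rather than one set, but the shape stays the same: the parser stays declarative and the set of locals lives in this later pass.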
The code is obviously much simpler than its syntax - most importantly, its syntactical simplicity makes it way easier to deal with. So when you write the code to parse it you don't have to try to parse it in one fell swoop like you do in Whitequark.
So you can't read anything from a method call!
I can make it so that, if you're doing a class method (of any kind), you have to invoke the constructor, as described in "What is a method?" There are also a few new techniques like "new_class_method", which requires creating an object (of some kind) for that class... but what about that?
It's not "I've just fixed Tree-sitter's problem"; it's that Tree-sitter hasn't resolved the problem yet. There are other parsing problems in Ruby itself besides Tree-sitter, like those of classes (and classes are not part of Tree-sitter) and things that are known as "type-traits" and so on, so since it's not quite enough, it can be done by other things. The reason for using an LR grammar is that when it comes to this - what do I want from that grammar?
The point I'm making here is that LR doesn't give a reason for what you're doing. As a programmer you are trying to write code that is portable because - if it works in a domain you don't understand (such as Ruby) - then you don't know what you're doing is wrong. There can be a domain (as in any language) that's a lot more complex than this - but since we've got that, how can I be sure it won't mess up the code I'm writing?
[1] https://github.com/whitequark/parser