
Haven't used it myself, but have heard many rave about Perl's Parse::RecDescent



I think Damian Conway (Perl author, keynote speaker, compsci professor, and seemingly a super nice guy too) wrote the recursive descent parser module and its successor module, although someone else maintains it now on CPAN.


If his code is as interesting as his talks, then I want to write parsers in Perl.


I barely even write Perl, but I still read part of his OO book and learned a lot.


Perl OO is underrated. The bless() function and Perl's packages are straightforward and make OO feel less magical.

You even get to pick the underlying representation. Most people go with hash-based objects, but you can bless array or even scalar references instead. It's a really good language for seeing how OO works under the covers.

https://perldoc.perl.org/perlootut.html

https://perldoc.perl.org/perlobj.html
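To make that concrete, here's a minimal sketch of hash-based objects (the Point class and its fields are invented for illustration, not taken from the docs above): a constructor is just a sub that blesses a hash reference into a package, and a method is just a sub that receives the object as its first argument.

    package Point;

    # Constructor: bless a hash reference into the Point package.
    sub new {
        my ($class, %args) = @_;
        my $self = { x => $args{x} // 0, y => $args{y} // 0 };
        return bless $self, $class;
    }

    # Accessors: plain subs; the object arrives as $_[0].
    sub x { $_[0]{x} }
    sub y { $_[0]{y} }

    sub move {
        my ($self, $dx, $dy) = @_;
        $self->{x} += $dx;
        $self->{y} += $dy;
        return $self;
    }

    package main;

    my $p = Point->new(x => 1, y => 2);
    $p->move(3, 4);
    print $p->x, ',', $p->y, "\n";    # prints 4,6

Swap the hash reference for an array or scalar reference and nothing else changes, which is the "pick your representation" point above.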


Agreed. I found it a bit of a pain to write, though, since most languages have so much more syntactic sugar for OO.


Recursive descent is very different (if I understand correctly). Some languages that can be parsed with a recursive ascent parser cannot be parsed by a recursive descent parser.


Yes; left recursion. However, in practice left recursion can usually be eliminated by adjusting the grammar. Other cases can be handled using varying degrees of cleverness in the grammar actions. And finally, recursive descent lends itself naturally to parsing a subset of the language using a different scheme; top-down operator precedence for expressions, to avoid lots of redundant recursion, is a fine example.
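As a hedged sketch of the first point (token representation and helper names are invented for illustration): the left-recursive rule expr -> expr '+' term | term would recurse forever in naive recursive descent, but rewriting it as expr -> term ('+' term)* turns the left recursion into a loop while keeping left associativity.

    # Left-recursive grammar:    expr -> expr '+' term | term
    # Adjusted for descent:      expr -> term ('+' term)*
    sub parse_expr {
        my ($toks) = @_;                 # $toks is an arrayref of tokens
        my $node = parse_term($toks);
        while (@$toks && $toks->[0] eq '+') {
            shift @$toks;                # consume '+'
            my $rhs = parse_term($toks);
            # Build the tree left-leaning, preserving left associativity.
            $node = { op => '+', left => $node, right => $rhs };
        }
        return $node;
    }

    sub parse_term {
        my ($toks) = @_;
        return { num => shift @$toks };  # assume a term is one number token
    }

    # parse_expr([1, '+', 2, '+', 3]) yields ((1 + 2) + 3) as a tree.

Generalizing that loop over a table of operator binding powers is essentially the top-down operator precedence (Pratt) approach mentioned above.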


> can usually be eliminated by adjusting the grammar

Never understood this backwards way of thinking. You first write/imagine the grammar that you want, whatever is most intuitive for you.

Then you figure out what kind of parser your grammar needs.

Why would one contort a grammar? I mean, even "contorting it to not be context-dependent" is already enough of a limitation...


Making parsing easily incrementalizable is a valuable property, because grammars which incrementalize easily also localize parse errors well, and having both of these properties lets you offer convenient APIs for IDEs and text editors much more easily.

For this, you want (as much as possible) for local edits to result in local modifications to the syntax tree, and ensuring that can require thinking carefully both about lexical syntax and abstract syntax. At the lexical level, if the literal syntax forbids newlines in string literals, then the changes to the token stream on a single-character edit are limited to the tokens on the same line. Likewise, in the abstract syntax, if top-level declarations have a keyword, then the effect of a single-token change (e.g., adding a mismatched paren) is limited to the declaration it occurs in.
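A hedged sketch of the lexical-locality point (tokenize_line and the per-line token storage are assumptions for illustration, not a real API): when no token may span a newline, relexing after a single-character edit reduces to retokenizing the edited line and splicing the result in.

    # $tokens_by_line->[$i] holds the token list for line $i.
    # Because tokens cannot cross newlines, an edit to line $i
    # can only invalidate that one entry.
    sub relex_after_edit {
        my ($tokens_by_line, $lines, $i) = @_;
        $tokens_by_line->[$i] = [ tokenize_line($lines->[$i]) ];
        return $tokens_by_line;
    }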

There are a lot of tricks like this (e.g., setting things up so you can find and parse the type declarations without having to parse a whole expression), but it's reasonable not to want to learn them. In that case, a really good heuristic is to make your language LL(1)-parseable. It's not perfect, but it will rule out the vast majority of things that make incrementalization hard.


Why would you let the grammar dictate the architecture of your compiler?

Left recursion in things like expressions is completely trivial to eliminate. If you want to, e.g., embed a DSL in a language, something that's really easy to do with recursive descent, it's worth spending the 30 seconds or so it takes to remove left recursion from your expression rules. Etc.

> You first write/imagine the grammar that you want, whatever is most intuitive for you.

IMO this is the road to hard-to-parse languages; if you genuinely advocate this path, you'll end up needing a GLR parser before you know it. I think it's much better to ensure your grammar is parseable with LL(1); not only is the tooling simpler, but the language will be less ambiguous for human users too. Follow the example of Pascal, not C++. This is what I'd advocate even for people doing hand-written recursive descent on a new language: validate your grammar with an LL(k) tool even if you don't use the generated parser.

(You've ended up sounding a bit like Ira Baxter, who's often seen around the net selling his DMS toolkit. He has a strong self-interest in people writing complicated grammars, so that he can sell them his services...)


> but the language will be less ambiguous for human users too

I'd bet against this. The human brain works very differently; we always keep something like "trees of contexts" in our heads. And why would you think left recursion is less intuitive to a human user than other kinds of recursion?

> ended up sounding a bit like Ira Baxter

Now I gotta take time and google who this guy is :P


> You've ended up sounding a bit like Ira Baxter

Haha, I was thinking about him too. He does have a point, though, that on 2018 machines we might be able to afford parsing technology a bit more expensive than the 1970s approaches commonly used today.


For the most part you can apply some simple rules to transform a BNF grammar from left recursive to right recursive, or vice versa. The left-to-right direction is what recursive descent needs; the reverse is most useful when your natural grammar is right recursive but you're using yacc or bison, which handle left recursion more efficiently.
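The classic transformation, sketched in generic BNF (A' is a fresh nonterminal): replace each left-recursive alternative with a right-recursive "tail" rule.

    Left recursive:
        A -> A a | b

    Right recursive equivalent:
        A  -> b A'
        A' -> a A' | <empty>

    Applied to expressions:
        expr  -> expr '+' term | term        (before)
        expr  -> term expr'                  (after)
        expr' -> '+' term expr' | <empty>

One caveat: the transformed grammar accepts the same strings but naturally yields right-leaning parse trees, so a left-associative operator needs its associativity restored in the semantic actions.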


Right. Was responding to "But recursive-descent, if you're going to use it, seems to cry out for tooling"


As I mentioned in an uncle comment, recursive descent permits localised hackery to switch parsing scheme or dynamically fix tricky context-specific grammar constructs, which argues against tooling.

Tooling is still very useful for grammar validation though, and it's the way to go for ad hoc parsing of a simple or new language.


That was a typo: he meant to say recursive ascent needs tooling. Recursive-descent parsers can be hand-written quite easily.



