It can generate elegant and efficient parsers for LR(1) grammars. > I tend to pr...

hardwaregeek · on Aug 7, 2023

What makes them more elegant than the average parser? How’s the error recovery? Can you parse into high fidelity syntax trees efficiently?

I don’t know of many production compilers that use parser generators

klodolph · on Aug 8, 2023

You use the generated parser as a platform for experimentation.

If you know the language, and you have a bunch of users, and you are writing a parser for it, by all means, write a parser by hand and give it the best error recovery that you can muster. If you are developing a language and want to do a bunch of experiments, it pays dividends to use a parser generator. And then there is the whole space of DSLs and mini-languages you encounter, where beautiful error messages are a nice-to-have, but you would rather ship a generated parser and move on to more important work.

It’s easy to focus on compilers for familiar languages like Rust or OCaml, but you may end up writing a compiler that gets used by a much smaller number of people, for smaller tasks.

hardwaregeek · on Aug 8, 2023

That's great, and I agree that for these use cases a parser generator makes sense. But that doesn't answer my question about what makes these parsers particularly elegant. Nor does it seem to be a benefit specific to Menhir, since any parser generator gives you the quick iteration speed.

I don't mean to blame you for that; you are answering a question about something that you did not say. But I find it frustrating that Menhir seems to be cited as this fantastic cornerstone of the OCaml ecosystem when I haven't been presented with a good example of how it's better than any other parser generator. Not just my parent comment (who also decided to cast aspersions on my knowledge), but others in the OCaml community too.

brmgb · on Aug 8, 2023

> with a good example of how it's better than any other parser generator.

I already told you that it has full support for disambiguating LR(1) grammars while still generating a parser that is easy to read. How do you expect me to paste a full parser into an HN comment?

Most generators only support LALR(1) grammars, which is limiting, and they don’t deal with corner cases as gracefully.

I get that you are hell-bent on wanting Rust to prevail here, but Rust will always be a subpar experience for writing anything which doesn’t strongly benefit from its low-level primitives. Rust has annoying semantics and a convoluted syntax. I can bear with that when the performance is needed, but writing a compiler in it is just unnecessary pain. It’s also one of the only things for which I would actually use OCaml.

hardwaregeek · on Aug 8, 2023

Okay…so it accepts more grammars than other parser generators. That doesn’t seem massive if I’m being honest. If you had said Menhir works with your IDE (navigating C code inside bison drove me nuts), and had good support for error recovery and idk, gave you syntax highlighting for free, I’d agree. But a minor upgrade in grammars? Not exactly Christmas here.

I have a question for you then: why is it that so many projects that are not performance bound, that are not low level systems projects, why do they use Rust and not OCaml? OCaml had a 22 year head start after all.

klodolph · on Aug 8, 2023

I didn’t intend to answer your question—your comment had multiple aspects to it and I am responding to one part of it, rather than every single part of it.

My main take on the “which language do I write my compiler in” conversation is that some of the various code transformation passes are just more convenient to write in a language which makes it easy to use both mutation and garbage collection, which puts OCaml above both Rust and Haskell. I think parsing is an easier problem to solve in the first place.

derriz · on Aug 8, 2023

I don’t think so. I’ve written parsers using flex/lex, yacc/bison, ANTLR, a bunch of functional combinator libraries, and maybe others I’ve forgotten, but now I would never consider anything except hand-written recursive descent with an embedded Pratt parser for expressions and precedence.

Simple to write, debug, recover from errors, provide decent error messages, unit test, integrate into build systems, IDEs etc.

I also believe that nearly all the popular compilers these days do something similar - gcc's parser was rewritten a few years ago in this fashion because of the technical benefits I’ve listed above.
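
For readers who have not seen the "recursive descent with an embedded Pratt parser" approach mentioned above, here is a minimal sketch in Python. It is not taken from any compiler named in this thread; the `BINDING_POWER` table, the tuple-based tree shape, and the function names are all assumptions made for illustration. It only handles integers, the four arithmetic operators, and parentheses.

```python
import re

# Hypothetical binding powers for illustration: a higher number binds tighter.
BINDING_POWER = {"+": 10, "-": 10, "*": 20, "/": 20}


def tokenize(text):
    # Integers, the four operators, and parentheses.
    # Simplification: any other character is silently skipped.
    return re.findall(r"\d+|[-+*/()]", text)


def parse(text):
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = peek()
        pos += 1
        return tok

    def parse_atom():
        # Plain recursive descent for the non-operator parts of the grammar.
        tok = advance()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        if tok == "(":
            node = parse_expr(0)
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        if tok.isdigit():
            return int(tok)
        raise SyntaxError(f"unexpected token {tok!r}")

    def parse_expr(min_bp):
        # Pratt loop: keep absorbing operators that bind tighter than min_bp.
        left = parse_atom()
        while True:
            op = peek()
            bp = BINDING_POWER.get(op)
            if bp is None or bp <= min_bp:
                break
            advance()
            # Passing bp (not bp - 1) makes every operator left-associative.
            right = parse_expr(bp)
            left = (op, left, right)
        return left

    tree = parse_expr(0)
    if peek() is not None:
        raise SyntaxError(f"unexpected trailing token {peek()!r}")
    return tree
```

Precedence and associativity live entirely in the table and the `bp <= min_bp` comparison, so adding an operator is a one-line change, and each `SyntaxError` is raised at a point where the parser knows exactly which token it was looking at.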

moomin · on Aug 7, 2023

I mean, it’s also common with people who value great feedback in their tools.

59nadir · on Aug 8, 2023

> That’s common with people used to languages which provide poor parser generators.

The vast majority of real languages have hand-rolled parsers and there is no real reason they shouldn't.

foderking · on Aug 9, 2023

>That’s common with people used to languages which provide poor parser generators.

lmao