
Funnily enough it's also where a bunch of mature compilers eventually ended up, because of speed, heuristics and so forth. Like GCC, which used to have bison-based parsers IIRC, but is now hand-rolled rec dec...



It's not so much for speed and heuristics but for error reporting and/or recovery.

As it turns out (and I've tried many times myself too), designing a parser generator that will retain a compact representation without losing good error reporting is very hard.

E.g. consider a BNF type grammar. Almost every symbol represents a potential error. But often deriving that error message automatically in a way that is helpful to a human is incredibly hard because the grammar lacks information about which mistakes are more likely for humans to make.

So you end up with bad error reporting, or you end up having to write code to analyse errors anyway, and then a lot of the benefits disappear. The actual parsing is usually easy to do compactly with a recursive descent parser and a tiny library of helpers anyway.
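To make the "tiny library of helpers" point concrete, here is a rough sketch of what I mean, in Python. Everything in it (the arithmetic grammar, `ParseError`, the `accept`/`expect` helper names, the hint texts) is made up for illustration and is not taken from any real compiler. The thing to notice is that each `expect` call is a place where a human-written hint can be attached, which is exactly the information a bare BNF grammar doesn't carry.

```python
import re


class ParseError(Exception):
    """Raised with a message aimed at a human, not at the grammar."""


TOKEN = re.compile(r"\d+|[-+*/()]")


def tokenize(text):
    """Split the input into number and operator tokens, skipping whitespace."""
    tokens, pos = [], 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        m = TOKEN.match(text, pos)
        if not m:
            raise ParseError(f"unexpected character {text[pos]!r} at offset {pos}")
        tokens.append(m.group(0))
        pos = m.end()
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.i = 0

    # --- the "tiny library of helpers" ---
    def peek(self):
        return self.tokens[self.i] if self.i < len(self.tokens) else None

    def accept(self, *kinds):
        """Consume and return the next token if it is one of `kinds`."""
        tok = self.peek()
        if tok is not None and tok in kinds:
            self.i += 1
            return tok
        return None

    def expect(self, kind, hint):
        """Like accept, but fail with a hand-written hint if it is missing."""
        if self.accept(kind) is None:
            found = self.peek() or "end of input"
            raise ParseError(f"expected {kind!r} but found {found!r}: {hint}")

    # --- one method per grammar rule ---
    def expr(self):  # expr := term (('+' | '-') term)*
        value = self.term()
        while (op := self.accept("+", "-")):
            rhs = self.term()
            value = value + rhs if op == "+" else value - rhs
        return value

    def term(self):  # term := factor (('*' | '/') factor)*
        value = self.factor()
        while (op := self.accept("*", "/")):
            rhs = self.factor()
            value = value * rhs if op == "*" else value / rhs
        return value

    def factor(self):  # factor := NUMBER | '(' expr ')'
        tok = self.peek()
        if tok is not None and tok.isdigit():
            self.i += 1
            return int(tok)
        if self.accept("("):
            value = self.expr()
            self.expect(")", "this parenthesis was never closed")
            return value
        found = tok or "end of input"
        raise ParseError(f"expected a number or '(' but found {found!r}")


def parse(text):
    """Parse and evaluate an arithmetic expression."""
    p = Parser(tokenize(text))
    value = p.expr()
    if p.peek() is not None:
        raise ParseError(f"unexpected {p.peek()!r} after a complete expression")
    return value
```

The grammar rules still map one-to-one onto methods, so it stays about as compact as the BNF, but the error text is under your control at every call site.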


Yea, I was imprecisely using "heuristics" to mean primarily error reporting. You essentially have to partially accept a wider grammar than the language actually allows, to be able to throw more intelligent errors.
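A toy illustration of the "wider grammar" idea, in Python. The mini-grammar (`NAME == NUMBER`), the function name `parse_condition` and the messages are all invented for this sketch. The valid language only has `==`, but the parser deliberately recognises `=` as well, purely so it can name the likely mistake instead of reporting a generic unexpected token.

```python
class ParseError(Exception):
    pass


def parse_condition(tokens):
    """Parse a condition of the form NAME == NUMBER from a list of tokens."""
    if len(tokens) != 3:
        raise ParseError("expected a condition of the form NAME == NUMBER")
    name, op, number = tokens
    if not name.isidentifier():
        raise ParseError(f"expected a variable name but found {name!r}")
    # '=' is NOT part of the valid grammar. It is recognised here only so
    # that the common slip gets a targeted message.
    if op == "=":
        raise ParseError("found '=' in a condition; did you mean '=='?")
    if op != "==":
        raise ParseError(f"expected '==' but found {op!r}")
    if not number.isdigit():
        raise ParseError(f"expected a number but found {number!r}")
    return (name, int(number))
```

For example, `parse_condition("x = 1".split())` fails with the "did you mean '=='?" message, while `parse_condition("x == 1".split())` returns `("x", 1)`. Knowing that `=` is the mistake worth special-casing is the part a generator can't easily derive from the grammar alone.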

But speed, IME, is also a major issue. GCC sped up a good bit when switching, and we are seeing speed issues with bison-generated parsers.

Ironically one of the benefits of bison is that you get feedback about whether your grammar is conflict free. Can be a significant advantage over a hand built recdec parser for a complex language like SQL.


Speed certainly can be a nice bonus, but you can also generate recursive descent parsers fairly easily if you have a suitable representation. My own parser generator attempts did that.

The feedback certainly is useful, and I'm all for people using such tools to validate grammars for testing. Though people manage to abuse that by writing extra code too (Ruby being a prime example of using a parser generator and then abusing the hell out of it in ways that complicate the grammar).

I really wish more work would go into research on automating error reporting, though. As much as I handwrite most of my parsers now, I wish I didn't have a reason to.


> I really wish more work would go into research on automating error reporting, though. As much as I handwrite most of my parsers now, I wish I didn't have a reason to.

Indeed.



