Hacker News new | past | comments | ask | show | jobs | submit login

> Odd that you put ambiguity in quotes. I guess it's not real ambiguity then.

Because a parser is never really ambiguous is it? Nobody writes a parser that has a random decision on which rule to take.




In parser theory there is such a thing as an ambiguous grammar. Parser generators detect such a grammar and never produce an actual parser for it. So in a sense most parser that you actually use don't have ambiguous grammars by definition. That said, you could have a manually written parser that thinks it implements an "ambiguous grammar", but in reality chose an deterministic (even if possibly arbitrary l) resolution of the ambiguity.

It's still ok talk about the grammar as being ambiguous no matter what a purported implementation of such grammar is deterministic (because it ultimately does not implement the actual grammar)


I guess I don't understand why we use languages for writing grammars that let you express ambiguity in the first place.

I think parser generators were a mistake.


for someone who is the founder of TruffleRuby (I've checked, and you are <https://docs.oracle.com/en/graalvm/enterprise/21/docs/refere...>), you sure have made some very odd comments and I don't know what to make of them.


> you sure have made some very odd comments and I don't know what to make of them

Well they’re from practical experience aren’t they.


Well, first you don't seem to understand what 'ambiguity' means in the context of parsing "Nobody writes a parser that has a random decision..." which is weird. It's a very establish meaning here.

Then you say "I guess I don't understand why we use languages for writing grammars that let you express ambiguity in the first place" without any suggestions, or even if this is possible, or if it is, whether the resulting language might be too constrained to be useful (interesting question though. Edit: even context-free grammars have ambiguity, eg. the regexp "aa" is legal but ambiguous for sentence "aaa").

Then "I think parser generators were a mistake" which is surreal. If you've ever had the tedious misfortune to write one manually, you know how much faster it is to have the computer do that work.

So I am thrown a bit here.


If you choose to specify your language using a class of grammar which permits ambiguity, then you have to resolve that ambiguity.

If instead you choose to specify your language using a class of grammar which does not permit ambiguity, then you don't have to resolve it because it never existed.

That's the point.

The suggestion is to either specify your language using a formal grammar which does not permit ambiguity, or to specify your language imperatively, using a reference parser.

I've written many parsers, both using parser generators and manually. I'd choose to write one manually. In fact, I'm currently looking at a project at work right now to take a generated parser and to re-write it manually because it's easier to work with.

That's a pretty mainstream opinion amongst professionals in the industry - not sure why you think it's surreal or why it's throwing you.


> If you choose to specify your language using a class of grammar which permits ambiguity, then you have to resolve that ambiguity.

I added an edit after which you may not have seen:

   even context-free grammars have ambiguity, eg. the regexp "aa" is legal but ambiguous for sentence "aaa"
So what grammar can you suggest that's even weaker than context-free (to make expressions of ambiguity impossible) and still useful?

> I'd choose to write one manually

Matter of taste I guess.


> So what grammar can you suggest that's even weaker than context-free (to make expressions of ambiguity impossible) and still useful?

It doesn’t need to be weak. For example use a Parsing Expression Grammar.


PEGs are ambiguous, just that that's resolved by rule ordering. From the wiki page (https://en.wikipedia.org/wiki/Parsing_expression_grammar#Amb...)

"A PEG parser generator will resolve unintended ambiguities earliest-match-first, which may be arbitrary and lead to surprising parses"


> PEGs are ambiguous

You're misunderstanding the rhetoric there - they're unambiguous by construction because they deterministically resolve earliest-match-first - a well-defined rule.

It's impossible to write an ambiguous PEG. Any grammar you write in a PEG is never ambiguous. A grammar you write using a CFG can be ambiguous.

As your own link says "Unlike CFGs, PEGs cannot be ambiguous".

Or read Ford's original paper https://bford.info/pub/lang/peg.pdf - they solve "the ambiguity problem by not introducing ambiguity in the first place".

(I wrote a much-cited thesis on PEGs and issues such as ambiguity in precedence parsing - I'm not just talking off the top of my head here.)


You're the expert, thanks for the pointers!


I agree it's a pain. Could you share some alternatives?


Imperative parsing - PEGs or just plain old recursive descent.


Ok that's about the "parser generator" part. But what about the "formal grammar" part?

Do you really want your language to be defined by a particular imperative parser implementation?

I'm asking because I do have a practical problem: my team has a hand crafted parser for a language. It should follow a spec written in EBNF. That spec could be ambiguous it's not because an automated tool checks that that EBNF grammar is not ambiguous.

I find the ability to document the grammar using something that everybody understands (BNF and variants) to be very useful. Yet, your comment seems to imply that since it allows ambiguous grammars we should be using it.

EDIT: our grammar is implemented by three different parsers, written in two programming languages.


> Do you really want your language to be defined by a particular imperative parser implementation?

Yes please. The code defines completely and unambiguously how the language is parsed.


If the code only handled parsing the grammar, I'd be with you, but more than half of a parser deals with what to do with what you just parsed which may be quite different between the various implementations of the grammar




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: