> Odd that you put ambiguity in quotes. I guess it's not real ambiguity then. Be...

ithkuil · on July 31, 2022

In parser theory there is such a thing as an ambiguous grammar. Parser generators detect such a grammar and never produce an actual parser for it. So in a sense most parsers that you actually use don't have ambiguous grammars by definition. That said, you could have a manually written parser that thinks it implements an "ambiguous grammar", but in reality chose a deterministic (even if possibly arbitrary) resolution of the ambiguity.

It's still OK to talk about the grammar as being ambiguous even if a purported implementation of that grammar is deterministic (because it ultimately does not implement the actual grammar).

chrisseaton · on July 31, 2022

I guess I don't understand why we use languages for writing grammars that let you express ambiguity in the first place.

I think parser generators were a mistake.

zasdffaa · on July 31, 2022

For someone who is the founder of TruffleRuby (I've checked, and you are <https://docs.oracle.com/en/graalvm/enterprise/21/docs/refere...>), you sure have made some very odd comments and I don't know what to make of them.

chrisseaton · on July 31, 2022

> you sure have made some very odd comments and I don't know what to make of them

Well they’re from practical experience aren’t they.

zasdffaa · on July 31, 2022

Well, first you don't seem to understand what 'ambiguity' means in the context of parsing ("Nobody writes a parser that has a random decision..."), which is weird. It has a very well-established meaning here.

Then you say "I guess I don't understand why we use languages for writing grammars that let you express ambiguity in the first place" without offering any suggestions, or saying whether this is even possible, or, if it is, whether the resulting language might be too constrained to be useful (interesting question though. Edit: even context-free grammars have ambiguity, eg. the regexp "aa" is legal but ambiguous for sentence "aaa").
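To make "ambiguous" concrete, here is a minimal sketch that counts parse trees. The grammar `S -> S S | 'a'` and the helper name `count_parse_trees` are my own illustrative assumptions, not the grammar the comment had in mind; the point is only that a single context-free grammar can assign more than one parse tree to the same sentence.

```python
from functools import lru_cache


def count_parse_trees(sentence: str) -> int:
    """Count distinct parse trees for `sentence` under S -> S S | 'a'.

    The grammar is a hypothetical example chosen for illustration.
    """

    @lru_cache(maxsize=None)
    def count(i: int, j: int) -> int:
        # Number of ways S can derive the slice sentence[i:j].
        total = 0
        if j - i == 1 and sentence[i] == "a":
            total += 1  # rule S -> 'a'
        # Rule S -> S S: try every split point strictly between i and j.
        for k in range(i + 1, j):
            total += count(i, k) * count(k, j)
        return total

    return count(0, len(sentence))


if __name__ == "__main__":
    for s in ["a", "aa", "aaa", "aaaa"]:
        print(s, count_parse_trees(s))
```

Under this grammar "aaa" has two parse trees, one grouping as (aa)a and one as a(aa), which is what makes the grammar ambiguous; a parser for it has to pick one somehow.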

Then "I think parser generators were a mistake", which is surreal. If you've ever had the tedious misfortune to write a parser manually, you know how much faster it is to have the computer do that work.

So I am thrown a bit here.

chrisseaton · on July 31, 2022

If you choose to specify your language using a class of grammar which permits ambiguity, then you have to resolve that ambiguity.

If instead you choose to specify your language using a class of grammar which does not permit ambiguity, then you don't have to resolve it because it never existed.

That's the point.

The suggestion is to either specify your language using a formal grammar which does not permit ambiguity, or to specify your language imperatively, using a reference parser.

I've written many parsers, both using parser generators and manually. I'd choose to write one manually. In fact, I'm looking at a project at work right now to take a generated parser and rewrite it manually because it's easier to work with.

That's a pretty mainstream opinion amongst professionals in the industry - not sure why you think it's surreal or why it's throwing you.

zasdffaa · on July 31, 2022

> If you choose to specify your language using a class of grammar which permits ambiguity, then you have to resolve that ambiguity.

I added an edit afterwards, which you may not have seen:

   even context-free grammars have ambiguity, eg. the regexp "aa" is legal but ambiguous for sentence "aaa"

So what grammar can you suggest that's even weaker than context-free (to make expressions of ambiguity impossible) and still useful?

> I'd choose to write one manually

Matter of taste I guess.

chrisseaton · on July 31, 2022

> So what grammar can you suggest that's even weaker than context-free (to make expressions of ambiguity impossible) and still useful?

It doesn’t need to be weak. For example use a Parsing Expression Grammar.

zasdffaa · on July 31, 2022

PEGs are ambiguous, just that that's resolved by rule ordering. From the wiki page (https://en.wikipedia.org/wiki/Parsing_expression_grammar#Amb...)

"A PEG parser generator will resolve unintended ambiguities earliest-match-first, which may be arbitrary and lead to surprising parses"

chrisseaton · on July 31, 2022

> PEGs are ambiguous

You're misunderstanding the rhetoric there - they're unambiguous by construction because they deterministically resolve earliest-match-first - a well-defined rule.

It's impossible to write an ambiguous PEG. Any grammar you write in a PEG is never ambiguous. A grammar you write using a CFG can be ambiguous.
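A minimal sketch of that earliest-match-first rule, assuming a toy combinator style (the names `literal`, `sequence`, `choice` and `matches` are hypothetical, not from any PEG library). Ordered choice commits to the first alternative that succeeds, so every input has at most one parse, though the result depends on how the alternatives are ordered.

```python
from typing import Callable, Optional

# A parser takes (text, position) and returns the new position, or None on failure.
Parser = Callable[[str, int], Optional[int]]


def literal(s: str) -> Parser:
    def parse(text: str, pos: int) -> Optional[int]:
        return pos + len(s) if text.startswith(s, pos) else None
    return parse


def sequence(*parsers: Parser) -> Parser:
    def parse(text: str, pos: int) -> Optional[int]:
        current: Optional[int] = pos
        for p in parsers:
            current = p(text, current)
            if current is None:
                return None  # one element failed, so the whole sequence fails
        return current
    return parse


def choice(*parsers: Parser) -> Parser:
    def parse(text: str, pos: int) -> Optional[int]:
        # Ordered choice: the first alternative that succeeds wins,
        # and later alternatives are never tried at this position.
        for p in parsers:
            result = p(text, pos)
            if result is not None:
                return result
        return None
    return parse


def matches(parser: Parser, text: str) -> bool:
    """True if the parser consumes the entire text."""
    return parser(text, 0) == len(text)


# Roughly:  A <- ("a" / "aa") "a"    and    B <- ("aa" / "a") "a"
short_first = sequence(choice(literal("a"), literal("aa")), literal("a"))
long_first = sequence(choice(literal("aa"), literal("a")), literal("a"))

if __name__ == "__main__":
    for text in ["aa", "aaa"]:
        print(text, matches(short_first, text), matches(long_first, text))
```

With "a" listed first, the rule accepts "aa" but rejects "aaa"; with "aa" listed first it is the other way round. Neither case is ambiguous, but the ordering silently decides the language, which is the "surprising parses" caveat quoted earlier in the thread.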

As your own link says "Unlike CFGs, PEGs cannot be ambiguous".

Or read Ford's original paper https://bford.info/pub/lang/peg.pdf - they solve "the ambiguity problem by not introducing ambiguity in the first place".

(I wrote a much-cited thesis on PEGs and issues such as ambiguity in precedence parsing - I'm not just talking off the top of my head here.)

zasdffaa · on July 31, 2022

You're the expert, thanks for the pointers!

ithkuil · on July 31, 2022

I agree it's a pain. Could you share some alternatives?

chrisseaton · on July 31, 2022

Imperative parsing - PEGs or just plain old recursive descent.
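As a rough sketch of what "plain old recursive descent" looks like, here is a tiny hand-written parser for `+`, `*`, integers and parentheses. The expression language and all names (`tokenize`, `ExprParser`, `parse_expr`) are assumptions made up for illustration. Precedence and associativity are fixed by which method calls which, so there is nothing left to disambiguate.

```python
import re

# One token: optional leading whitespace, then a number or a single symbol.
TOKEN = re.compile(r"\s*(\d+|[+*()])")


def tokenize(src: str) -> list:
    tokens, pos = [], 0
    src = src.rstrip()
    while pos < len(src):
        m = TOKEN.match(src, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


class ExprParser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def parse(self):
        tree = self.expr()
        if self.peek() is not None:
            raise SyntaxError(f"unexpected token {self.peek()!r}")
        return tree

    def expr(self):
        # expr := term ('+' term)*   lowest precedence, left-associative
        node = self.term()
        while self.peek() == "+":
            self.advance()
            node = ("+", node, self.term())
        return node

    def term(self):
        # term := atom ('*' atom)*   binds tighter than '+'
        node = self.atom()
        while self.peek() == "*":
            self.advance()
            node = ("*", node, self.atom())
        return node

    def atom(self):
        # atom := NUMBER | '(' expr ')'
        tok = self.advance()
        if tok is not None and tok.isdigit():
            return int(tok)
        if tok == "(":
            node = self.expr()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {tok!r}")


def parse_expr(src: str):
    return ExprParser(tokenize(src)).parse()


if __name__ == "__main__":
    print(parse_expr("1 + 2 * 3"))
    print(parse_expr("(1 + 2) * 3"))
```

The trees come back as nested tuples, e.g. `parse_expr("1 + 2 * 3")` groups the multiplication first because `expr` delegates to `term` before looking for `+`.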

ithkuil · on July 31, 2022

Ok that's about the "parser generator" part. But what about the "formal grammar" part?

Do you really want your language to be defined by a particular imperative parser implementation?

I'm asking because I do have a practical problem: my team has a hand-crafted parser for a language. It should follow a spec written in EBNF. That spec could be ambiguous, but it isn't, because an automated tool checks that the EBNF grammar is not ambiguous.

I find the ability to document the grammar using something that everybody understands (BNF and variants) to be very useful. Yet your comment seems to imply that, since it allows ambiguous grammars, we shouldn't be using it.

EDIT: our grammar is implemented by three different parsers, written in two programming languages.

chrisseaton · on July 31, 2022

> Do you really want your language to be defined by a particular imperative parser implementation?

Yes please. The code defines completely and unambiguously how the language is parsed.

ithkuil · on July 31, 2022

If the code only handled parsing the grammar, I'd be with you, but more than half of a parser deals with what to do with what you just parsed, which may be quite different between the various implementations of the grammar.