Hacker News new | past | comments | ask | show | jobs | submit login

I played around with tree grammars. I think there's a formal language reason for them being incapable of parsing expressions properly. I tried anyway, to find a subset or workaround that would work well enough, but failed. (IIRC) At best, I needed parentheses around one of the operator operations e.g. always around +, like (a+b)c or (a+bc)

I'm interested that you have this too, but it seems to have the same problems I found: https://news.ycombinator.com/item?id=17650770

It would be very exciting if you have found a way around this!




Yeah, this is a fundamental problem. Luckily, arithmetic expressions are always left- or right-recursive, never middle-recursive. That means they're recognizable by a regular grammar, even if the parse tree doesn't match what you'd expect:

    expr = number (('*' | '+') number)*
You can replace number by (number | [ '(' expr ')' ]) to add support for parentheses as well.

My solution was to add special syntax which makes rules like this automatically:

    expr = number : num
     .operators infix left '*' : times
     .operators infix left '+' : plus
While the parse tree is built, the operators and operands are rearranged into the expected parse tree using https://en.wikipedia.org/wiki/Shunting-yard_algorithm. Here's an interactive example with some more operators: https://ianh.github.io/owl/try/#expr


Thanks, the eg I tried before works now, I must have misinterpreted the output.

Nice, a language can be recognized, even if a different parse tree.

I vaguely recall the shunting yard algorithm... am I right in characterising your solution as a second level of parsing? It's not "pure", but from a practical point of view, it's no worse than back-references in PCRE. The practical issue is just of ui usability.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: