import sys
from functools import reduce
from operator import add, sub, mul, truediv as div
def calc(s, ops=[(add, '+'), (sub, '-'), (mul, '*'), (div, '/')]):
if not ops:
return float(s)
(func, token), *remaining_ops = ops
return reduce(func, (calc(part, remaining_ops) for part in s.split(token)))
if __name__ == '__main__':
print(calc(''.join(sys.argv[1:])))
Does support order of operations, in spite of appearance. Extending to support parentheses is left as an exercise for the reader :)
This is pretty clever. There's a video by Jonathan Blow, creator of Braid and currently making a programming language called Jai, that talks about parsing expressions with order of operations. He gives an simple approach that apparently was discovered recently (though he didn't remember the paper name) that does this in a single pass and hardly any more complicated than what you wrote. You can see it here [1]. He also trashes on using yacc/bison before that and on most academic parsing theory.
I like Jonathan Blow and all, but he should be careful with trashing academic parsing theory if he thinks that Pratt parsing was only discovered "in the last few years".
> Luckily, one can make the grammar LL(1)-parser-friendly by rewriting all the left recursions in the grammar rules as right recursions.
The problem with this is that it changes all the left-associative operators to be right-associative. For example, `python compute.py '3-2-1'` gives the result 2, but it should be 0.
You are correct, the result 2 from the submitted parser.py is erroneous. Downvoters should be ashamed of their abusive behaviour.
It is not well known amongst programmer who would like to use parsers that most algorithms as implemented in countless software libraries are wrong and broken beyond repair. We should strive to make it a well-known fact and promote those parsers that work properly, so that less programmers and end-users need to suffer.
This is horrible to read and to maintain. Real grammars use names like "additive_expr", "multiplicative_expr" etc., or maybe "term" and "factor", though I can never remember which is which.
For me it's actually the opposite, I find using the priority makes it actually a lot more readable, especially once you have a lot of operators. It's also quite close to a generalized (and potentially dynamic) version that is just a function that takes in a priority and returns the correct parser.
There are reasons why we usually give variables more meaningful names than var1, var2, var3.
(Obviously I agree that if the parser can be expanded dynamically with operators with numeric priorities, then you need numeric priorities. But that is a very different case.)
If you can't remember which is which, this is a good indication that the names don't matter, and in fact they don't, because it's an artifact of the way you need to write the grammar.
labels definitely matter. Naming is one of the two hardest things [0].
The point in this case is that the names are hard because they're making up for limitations in left-recursive grammars. Personally, I still find "Addable_expr" to be more intuitive than "additive_expr": where it means "an expression that can be added". Both are more intuitive (to me) than "Expr2". But that's personal preference, and part of why naming is hard :).
I am not disagreeing with you, I just think that the "grammar" still works even with terrible naming. It is harder to read, it is harder to maintain, but it is still real. Perhaps I am missing some subtler point here...
"It won't work" is a hilarious claim seeing how actual language standards do exactly this, and it works well. See the C standard for example: https://web.archive.org/web/20161223125339/http://flash-gord... . Search for "additive-expression" to see the relevant grammar rules.
Really nicely-written project. The documentation is clear and sufficiently comprehensive; the project structure is immediately obvious and navigable; and each source file is both intuitively named and very readable.
Where would I read up on what's currently considered best practice for implementing a parser? We learned recursive descent in school, but I have heard several remarks that this is no longer considered en-vogue.
I don't know about best practice, but having implemented a few different parsing algorithms for the same language, I found that "Pratt parsers" to have unique advantages, keeping the logic simple, clear, and flexible.
At least for the particular needs of that language's syntax, and my preference for hand-rolled parsers (rather than generated from grammar), it was the best solution. It allowed me to elegantly solve some syntactic groupings that were difficult/complex to parse with other approaches.
Here's an excellent overview of the algorithm, with examples in Python:
Recursive descent is what I use today. En-vogue or not, it's the most robust, flexible and - most importantly - maintainable approach in the long term.
It's also easy to explain to someone who has not studied parsing, and easy for them to implement. That was why I chose recursive decent when I needed to trick someone into writing a C compiler.
This was back in the early '80s, back when there was much more variety in personal computers both in CPU architecture and operating system, and C was fairly rare on such small machines.
A friend who was a really good programmer but who had not taken the compiler course in college (he was a math major, not a CS major) had started a company aiming to produce a C compiler for a new small computer that was coming out in in a few months, with the compiler meant to be available as soon as the computer came out. He'd hired another friend to write the compiler.
This was well before the internet, when advertising was a lot slower than it is now. The company had to buy its ads in the relevant computer magazines well ahead of time.
That guy writing the compiler was a genius, and was on track to produce an amazing compiler especially given the limited resources of the personal computers back then...but it turned out he was not a fast genius. The compiler would not be finished until long after the new computer launched.
The first friend was getting rather depressed, knowing he had to get someone faster to write the first version of the compiler but not knowing anyone else who could do it and was available.
So one evening I decided to convince him that he could do it himself. I sat him down in front of a whiteboard, and told him he needed three things: a lexical analyzer to tokenize the input, a parser to parse the token stream, and a code generator to turn the parse tree into assembly.
I then outlined recursive decent parsing on the whiteboard, giving examples of using it on C. I was able to convince him that parsing was not hard, and that he could do. I then moved on to code generation, and showed him that he could generate code for a simple stack-based virtual machine, and that could easily then be mapped to the real, register-based CPU--it would be inefficient, but I also went over how later he could add a simple optimizer that could remove a lot of the inefficiency.
Once I had him convinced that he could handle parsing and code generation and some optimization by their deadline, I pretended we were done and started to leave. He interrupted and asked about lexical analysis.
I acted embarrassed that I had forgotten that, and "admitted" that it was the hardest part, and said I'd try to think of something and went home.
Of course that was a lie. When I got home, I quickly wrote a lexical analyzer for C, and then the next morning went back and handed it to him.
He now actually believed that the hardest part was done, and was now sure that he could do the remaining easier parts himself. And he did. We both got a good laugh later when I finally told him that lexical analysis is the easiest part.
I just tested a recursive descent parser and it fails a lot. Since I could only find one working-off-the-shelf software on short notice, I do not know whether the failures are the fault of the algorithm or the implementation. Though due to prior experience, I suspect it's the fault of the algorithm, and if that's the case, we should not use it to implement parsers. I would gladly test more recursive descent software if you could show me some and share a link.