I once asked my college professor about operator precedence in C. He had been writing C code in industry for decades.
"I have no idea" he told me.
He said many programmers try to make their code as short and pretty as possible, as if there were some kind of character limit. Instead of falling into that trap, he just used parentheses. And the order of operations was never an issue.
I agree with your prof, and have a peeve with linters that complain about unnecessary parens. They may be unnecessary, but they can sure be helpful for those of us who don't have the precedence tables fully embedded in our neural code scanners.
Even when I have no issues with the precedence order, I often add parens anyway because it helps with mental grouping at a glance, instead of having to scan and mentally parse the line.
As a young egotist I would often omit parens in complicated C expressions. I did this intentionally and in a very self-satisfied way - writing multi-line conditionals and lining them up neatly without parens with a metaphorical flourish of my pen.
Then one day, chasing a hard-to-find bug, I realised it had happened because I'd mixed up the precedence of && and || in a long conditional. I was an idiot. Since then I've made a point of reminding myself that I know nothing and that there's nothing to be gained from pretending I do, and putting parens in everywhere.
Sometimes, even now, I get those grandiose moments when I think the code I'm writing cannot possibly go wrong. Those are the moments that call for a bit of fresh air and an extra unit test or two.
That's a great observation. Passing on that expertise is what seasoned veterans can do to move our capabilities along and not let learning go to waste. I've heard that a big reason software engineering quality struggles in some countries and companies is that the only way to grow is to become a manager - you need people to stay on technical paths so they can pass on that learned knowledge.
Exactly this. I add parentheses not for the computer but for myself, mostly because it is easier to comprehend. I think the reason is that I learned the precedence order something like 25 years ago in fifth grade maths, and I still lean on parentheses to this day.
If I were writing a linter I'd do the exact opposite: complain about a lack of parens if it might be confusing.
I don't know if a / b * c is (a/b) * c or a / (b*c). Don't tell me, no matter how many times I'm told I'll forget 5 minutes later.
Exactly. Linters often complain about the amount of whitespace as if readability were important, but then complain about making operator precedence "too readable". Except one can cause a programming error while the other can't.
I have a peeve with linters in general. The useEffect dependencies lint rule for React is one of the worst I’ve seen - “fixing” it changes the behaviour of your program. They are horribly opinionated and often leave you with less readable code. Bah
“Instead of falling into that trap, he just used parentheses.”
That’s my conclusion too. Especially when I deal with several languages at the same time I don’t want to spend time on thinking about these details. Makes my code a little more verbose but I think it adds clarity.
Some languages may have boolean types, where others evaluate non-boolean types as true/false. So do you say "if myflag" or "if myflag=true" to make sure it's valid?
And some languages don't have short circuit operators, which I find annoying, but have learned to work around. Should I then write C or whatever that way, when I do have short circuit and/or?
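For what it's worth, in C specifically the == true comparison can itself bite, since true is just 1. A minimal sketch (the value 2 is contrived, but any nonzero value other than 1 behaves the same):

    #include <stdbool.h>
    #include <stdio.h>

    int main(void) {
        int myflag = 2;                          /* nonzero, so "truthy" */
        if (myflag)
            puts("if (myflag): taken");          /* printed: any nonzero value passes */
        if (myflag == true)
            puts("if (myflag == true): taken");  /* never printed: true is 1, and 2 != 1 */
        return 0;
    }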
I try to be reasonable. There is probably no hard rule (a lot of people seem to want one), but you have to see where things are causing problems and then address them.
The moment you catch yourself adding parentheses to other people's code to be able to understand it, and then having to "git checkout --patch" to flip it back the way it was.
I've often wanted to apply the same to contracts, and would love to see it in legal texts as well. Parentheses (and lists) would make legalese so much more readable and remove ambiguities. I'm still wondering why they don't use these tools.
I'm completely ignorant in the legal texts/contracts area, but they don't use parentheses and lists? That seems hard to believe - or is it that they just don't use them to indicate precedence and order?
Canons of legal construction exist because of ambiguity in human language.
Here are a few that illustrate common imprecision in language.[0]
Conjunctive/Disjunctive Canon. And joins a conjunctive list; or joins a disjunctive list—but with negatives, plurals, and various specific wordings there are nuances.
Last-Antecedent Canon. A pronoun, relative pronoun, or demonstrative adjective generally refers to the nearest reasonable antecedent.
Series-Qualifier Canon. When there is a straightforward, parallel construction that involves all nouns or verbs in a series, a prepositive or postpositive modifier normally applies to the entire series.
Nearest-Reasonable-Referent Canon. When the syntax involves something other than a parallel series of nouns or verbs, a prepositive or postpositive modifier normally applies only to the nearest reasonable referent.
Proviso Canon. A proviso conditions the principal matter that it qualifies—almost always the matter immediately preceding.
General/Specific Canon. If there is a conflict between a general provision and a specific provision, the specific provision prevails (generalia specialibus non derogant).
How would one parse "if A and B or C"? You'd want to add parentheses "(A and B) or C"; or "A and (B or C)". For a simple case, a comma might suffice, but more conditions can get difficult to express unambiguously in plain language.
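In C at least the bare expression has a fixed answer: && binds tighter than ||. A minimal sketch with values chosen so the two readings disagree (A, B, C are placeholder variables):

    #include <stdio.h>

    int main(void) {
        int A = 0, B = 0, C = 1;
        printf("%d\n", A && B || C);    /* 1: parsed as (A && B) || C */
        printf("%d\n", A && (B || C));  /* 0: the other reading gives a different result */
        return 0;
    }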
Indeed, this is the pattern most industry programmers follow: never rely on operator precedence, and always use parentheses to disambiguate where things aren't glaringly obvious.
That makes your code both more robust and maintainable.
You should write your code with consideration for the next person tasked with maintaining it. If for some reason you can't manage that (?!?), I suggest coding with consideration for _yourself_, six months from now, still slightly drunk at 2am when the page comes through...
Joke aside, I use parentheses liberally. Even if I know operator precedence it saves me from errors when I edit the code and another person reading it might not know precedence rules perfectly.
My rough mental model orders operators from tightest to loosest binding:
* Unary suffix operators (C doesn't have these, but Rust's ? operator applies)
* Unary prefix operators
* Arithmetic operators, following normal mathematical precedence rules (i.e., a + b / c is a + (b / c), not (a + b) / c). Note that I don't have any mental model of how <<, &, |, ^ compare to each other or to the normal {*,/,%}; {+,-} ranking.
* Comparison operators
* Short-circuit operators (&&, ||)
* Ternary operator (?:)--and this one is right-associative.
* Assignment operators
This list I think is fairly objective, although C and some of its children "erroneously" place bitwise {&,|,^} below comparison operators instead of above them. The difference between suffix and prefix unary operators is somewhat debatable, but it actually does make sense if you think of array access and function call expressions as unary suffix operators instead of binary operators.
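One concrete instance of that fuzziness: the shift operators bind looser than arithmetic, which regularly surprises people. A minimal sketch:

    #include <stdio.h>

    int main(void) {
        printf("%d\n", 1 << 2 + 3);    /* 32: parsed as 1 << (2 + 3) */
        printf("%d\n", (1 << 2) + 3);  /*  7: the reading many expect */
        return 0;
    }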
Those show why I called it a heuristic. With some code similar to yours, I would probably use parentheses unless they are frequent enough to deserve some brain cells.
I certainly always always always used parentheses in situations like these back in the day. More because it was the idiom I learned early, rather than a conscious decision to be risk averse or professional.
(...Over the reals. Your mileage may vary in less exact types. The Surgeon General recommends avoiding division in production code. Regulations vary by state.)
That's kind of one problem with it: it's less PEMDAS and more P, E, MD, AS. Multiplication and division don't have precedence over each other, and neither do addition and subtraction; both groups simply evaluate from left to right.
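A minimal sketch of that left-to-right grouping:

    #include <stdio.h>

    int main(void) {
        printf("%d\n", 8 / 4 * 2);    /* 4: parsed as (8 / 4) * 2 */
        printf("%d\n", 8 / (4 * 2));  /* 1: the right-to-left reading */
        printf("%d\n", 8 - 4 + 2);    /* 6: parsed as (8 - 4) + 2 */
        return 0;
    }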
Obviously yes, but that is question-begging. How does the novice programmer know that it is good practice to use parentheses? x + y == z groups the way you'd expect, so it seems reasonable to conclude that x & y == z does too, particularly when the compiler does not complain about it.
How does the novice programmer know about order of operations? Maybe they are blissfully unaware of both features, but if someone is being taught one feature, they should be taught both. In my case, I'm self-taught and just applied what I learned in basic algebra about parens and boolean logic.
With an answer like that he would no doubt flunk the modern interviewing process:
"We had a guy come in, tons of experience, aced all of our coding tests... but when we asked him about operator precedence in C, he just shrugged and said 'I have no idea'. So we had a to let him go, for his lack of strong CS fundamentals."
Said one 25 year-old SSE to another in the follow-up.
Interviewing skills are a thing like everything else. A good candidate should jump at the opportunity to explain why she can't trust even herself to do it right (let alone other people, including her future self!), bringing examples from experience (or making up plausible ones), etc.
Yes, sometimes it's silly, but there is the harsh reality of how interviewing is executed, especially by those companies who want to base the assessment mostly on the judgement of peers as you said.
If you're really senior, in most cases, you can't expect that the majority of people in the company you're applying to is going to be as senior as you. You'll need to do a lot of convincing and explaining of things that might be obvious to you even after you join, on a daily basis. As with everything, the interview can be a good place to show you can do it.
Knowing the precedence and knowing it well is necessary so that you can read code quickly and accurately. Not so that you can write code using the minimum number of parentheses.
Sometimes, even when you have the power to add parentheses to existing code and merge the commit, you still have to know what the unmodified code is doing: just so that you're sure your readability improvement is not changing the behavior, for one thing!
The alternative is looking up precedence tables all the time, or adding prints to test things empirically at run time.
> The moral of the story is: The best time to make a breaking change that involves updating existing code is now, [...] It’s fifty years since this mistake was made
Another possible moral is: The languages that stay in use for fifty years are the ones that avoid making breaking changes. :)
Though I get the humour and agree with the sentiment, the history of C tells us that a language can massively succeed even with constant "breaking changes". One of the things I learned working at Coverity is that there is no such language as C or C++; rather, there are hundreds of mutually incompatible languages with those names. Almost no compilers implement any of the standards exactly, and there are so many standards to choose from. The vast majority of real-world line-of-business C programs are designed to work with a single compiler and are therefore never evaluated in terms of their correctness when ported to another compiler.
That said, the C# compiler team was and continues to be extremely concerned about breaking changes because we very clearly perceived the cost to customers and the barriers to upgrading entailed by breaking changes. I introduced a handful of deliberate breaking changes in my years on the C# design and compiler team, and every one was agonized over for many hours by members of the design team who were experts on the likely customer impacts.
Similar story: Stuart Feldman, the author of `make`, on why Makefiles require tabs by default:
> After getting myself snarled up with my first stab at Lex, I just did something simple with the pattern newline-tab. It worked, it stayed. And then a few weeks later I had a user population of about a dozen, most of them friends, and I didn't want to screw up my embedded base. The rest, sadly, is history.
    <source>:5:11: warning: & has lower precedence than ==; == will be evaluated first [-Wparentheses]
    int t = x & y == z; // ?
              ^~~~~~~~
    <source>:5:11: note: place parentheses around the '==' expression to silence this warning
    int t = x & y == z; // ?
              ^
                (      )
    <source>:5:11: note: place parentheses around the & expression to evaluate it first
    int t = x & y == z; // ?
              ^
            (    )
The problem is "mostly solved" by adding a warning to a single compiler?
Sure, there are plenty of ways to mitigate the problem. That's not the point. The point is that the problem should not have arisen in the first place to require ongoing mitigation fifty years later!
My blog is about the design and implementation of programming languages; by understanding the causes of past mistakes we can learn to recognize them again today. The best mistakes to learn from are other people's!
There’s a similar problem that comes up in business often enough, where a change in your software leaves behind “legacy” code or data. When developing you almost always have to write things such that the new model and old model can live side by side, and customers using the old model can (at least for a time) just keep going while new users go on the new model.
Whenever a project like this is done, it’s very tempting from a business point of view to just leave the “legacy” users / data alone. But, technically, all your code continues to have to support two different models. Yes, with the right abstractions you can make this work without being terribly painful, but in practice there are always rough edges.
The time and effort it would take to migrate the old stuff into the new format is high, and the perceived business value is low. In many cases the migration never happens, and the old code is never removed.
But the problem is, you are now paying a tax to deal with that old code for all time. Every time a decision like this is made, the tax goes up. The tax slows you down, and it makes all future work you do more complicated.
So, there is a game theory problem here: for each individual decision, it arguably makes more sense to leave the legacy case alone and move on. But if you choose that path every time, it is a mistake, because the costs compound.
So far, the only thing I’ve seen that prevents this is very strong technical leadership that ensures the migration is baked into the project and made non-optional to the business teams. That’s really hard to do though, and arguably not always the right decision for the business, which may need to move fast now just to survive.
The operator precedence rules discussed here are a bit tricky, and programmers can be excused for having to think a few seconds before remembering how it goes. It is therefore better to use parentheses in these tricky cases. On the other hand, everybody should know without thinking that * has precedence over +, so a * b + c * d should not have parentheses if it is intended to mean what is written. Parentheses would just be needless clutter in this case.
I know * and + precedence without thinking about it, but when I see it without parentheses, I can't help but pause to be sure I'm not having a brain fart, and then also question why it wasn't just put in parentheses. So I just put in the parentheses or split up the steps, every time.
* has precedence over + because it's something we all learned as kids. Everything else is quite arbitrary and I wouldn't expect everyone to know it off the top of their head. I'm in the camp that believes there's nothing wrong with a few redundant parentheses.
Same thing with frame rates derived from NTSC, where e.g. 60 fps is not really 60 fps but 60/1.001 fps. Those extremely inconvenient frame rates are ubiquitous in systems that have nothing to do with analog broadcast television (e.g. on YouTube).
No, I really meant what I said. If you see a video that's labeled "60 fps", quite likely it's actually 60/1.001 fps. It applies to the entire 24/30/60/120 range.
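(Worked out: 60/1.001 = 60000/1001 ≈ 59.94006 fps, and the same 1000/1001 factor turns the rest of the range into 23.976, 29.97, and 119.88 fps.)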
I re-read, and your claim was "derived from NTSC", not that NTSC itself was 60/1.001 fps. So, yeah. NTSC itself was (30 frames or 60 fields) / 1.001. "Derived from NTSC"... yeah, that could be "60 frames".
However it is perfectly possible to skip the half scanline between fields and produce a non-interlaced display with half the vertical resolution and twice the frame rate.
I think the author discounts the cost of "there are only a few users, so it's not a big deal to have breaking changes". I think stability and the backwards compatibility of a tool are important considerations to make in "is this worth using?".
If a tool has a large community of usage, breaking compatibility affects a large number of users.
If the tool doesn't have a large community, then a lack of stability means there's a risk the tool won't support current use cases in the future, or that I'll have to spend much more time maintaining my use of it than with 'boring' tools.
It's fine for a tool in v0.x to have breaking changes. I think not supporting stable versions of a tool would make it hard to retain users in the long run, though.
Here's a counterargument: if a tool starts developing userbase, it's likely to have even more users in the future. Up to a point, the more users it has, the more it will have. So a breaking change while the userbase is still relatively small can be justified on the grounds of making things better for a larger number of soon-to-be users.
I agree. People complain that Swift changed too much in the earlier versions but I'm glad it did and I'm kind of sad that it's slowed down now and there are things I wish could be changed but can't. Yes it was a bit of a pain but there were migration tools that helped. And that pain is long since forgotten.
I think elaborate syntax is in itself a "hundred year mistake". I've been messing about with a language that uses Reverse Polish Notation (like Forth) for a bit, so of course there's no operator precedence at all. Also, the distinction between short-circuiting Boolean combinators and "mathematical" Boolean operators is obvious: the former require a quoted function as one of their inputs (to conditionally execute), while the latter just take two ints (or bools or whatever; primitive values rather than quoted functions).
That's actually how I did it at first, but then I realized that it was silly because you always run the first quoted function right off anyway so it's cleaner to just require a Boolean value.
(There's a way using Category Theory to firm up that assertion into proper math, but I'm not good enough at CT to do that.)
> You might say “just search all the source code for that pattern” but this was two years before grep was invented! It was as primitive as can be.
If you control the compiler, you can make the compiler diagnose all the cases which are affected by the precedence of &, emitting a file name and line number.
That was the thing to do with those several hundred kilobytes of code. Get the precedence right, and then warn about any code actually relying on the precedence. The compiler then "greps it out" accurately, even instances that are obfuscated by preprocessing.
Or, add the warning first before rolling out the change to the precedence:
    foo.c:35: warning: obsolescent & precedence: use parentheses
Let the programmers go through a period of obsolescence whereby they rid their code of these warnings by using parentheses. Then, the compiler can be changed, while maintaining a diagnostic in that area of the language for some time. Eventually, when everyone has forgotten the old dialect with the bad precedence, the diagnostic can be removed.
When the new precedence is rolled out, a compatibility switch can be included for compiling code with the obsolescent precedence. Actually, that option can even be rolled out first (so it initially does nothing).
When the option is eventually removed, the compiler will refuse to run if it is specified.
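Summarized as a timeline (the version numbers and the option name here are hypothetical):

    v1: warn "obsolescent & precedence: use parentheses";
        also accept a no-op --old-amp-precedence switch
    v2: switch to the corrected precedence by default;
        --old-amp-precedence restores the old parse during the transition
    v3: drop --old-amp-precedence; the compiler refuses to run if it is given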
So anyway, Ritchie had a few fairly easy and cheap alternatives to sticking with bad precedence; he perhaps wasn't aware of them.
So... yeah, I totally agree with the moral of the article. Unless you are actually planning for how to handle a user uprising due to incompatible changes, fix your stuff now. But let's talk about operator precedence instead, because that's more interesting:
The boolean operations are the biggest gotcha, but these are everywhere. Similarly the "*" indirection operator binds more strongly than many programmers think it does (and the same symbol in type expressions too, which is why those parentheses in function pointer declarations that no one really understands are needed). The shift operators got picked up by C++ to do I/O of all things, where they get to wreak havoc in a whole new regime.
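To make the function-pointer parenthetical concrete, a minimal sketch (twice is just a placeholder function):

    #include <stdio.h>

    static int twice(int x) { return 2 * x; }

    int main(void) {
        /* The parens are needed because () binds tighter than *:
         *   int *fp(int);    declares a function returning int*
         *   int (*fp)(int);  declares a pointer to a function returning int
         */
        int (*fp)(int) = twice;
        printf("%d\n", fp(21));   /* prints 42 */
        return 0;
    }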
Frankly, the only precedence rules most programmers understand (because we were taught them in grade school!) are the two-level, left-to-right grammar for the four basic arithmetic operations, along with a vague sense that these should bind more tightly than any operator with "=" in it, because that separates the "other side of the equation".
Given that, why don't new programming languages just require fully parenthesized expressions for expressions involving other operators?
“Given that, why don't new programming languages just require fully parenthesized expressions for expressions involving other operators?”
That would be excellent, and in line with the trend towards “safe” languages. I have countless examples where I read code with several operators in one expression, and when I asked the devs whether they really understood the precedence rules of the language, they usually didn't. They just wrote something they thought should be OK.
> C++, Java, JavaScript, C#, PHP and who knows how many other languages largely copied the operator precedence rules of C
Ok, that's understandable for C++, but for all the other languages I really wonder what the reasoning behind this (if any) was - "our language looks similar to C, so we have to copy as many of its warts as possible so C developers feel at home" ?
The operator precedence in the article is just an example of a principle that does hold: A series of choices, each individually plausibly the right call, lead to a bizarre situation that doesn't feel right.
However, often (and I submit this very article as exhibit A), the logic is reversed: One observes a bizarre scenario, points at it, and goes: I don't know what or how, but clearly somebody somewhere messed up.
That's not true; sometimes seemingly simple things simply aren't simple, and you need to take into account all users of a feature and not just your more limited view.
Take operator precedence. *Operator precedence is an intractable problem.*
Some languages try to make it real simple for you and say that all binary operators are resolved left to right and have the same precedence level. This makes it easy to explain and makes it much simpler to treat anything as an operator, a feature many languages have. Smalltalk works like this. The Smalltalk language spec fits on a business card; obviously, you must use such principles, as C's operator precedence table would otherwise occupy most of your card!
But that does mean that `1 + 2 * 3` is 9 and not 7, and that is extremely weird in other contexts.
So you're in a damned-if-you-do, damned-if-you-don't situation. There is no singular answer. No operator precedence rule is inherently sensible regardless of the context within which you are attempting to figure out how operator precedence works: it is an intractable problem.
One way out is to opt out entirely and make parentheses mandatory. If you then also make it practical and say that the same operator associates left to right regardless of what operator it is, you can still write, say, `1 + 2 + 3 + 4`, but you simply aren't allowed to write `1 + 2 * 3` – you are forced to write `1 + (2 * 3)` or `(1 + 2) * 3`. But now anybody who wants to copy problems straight from math domains, or from technical specifications such as a crypto algorithm description, gets tripped up. You may disagree (I certainly do), but a significant chunk of programmers just prefer shorter code. I would certainly prefer my languages to work this way, and have configured my linters like this as well, but it's still not a universally superior solution regardless of point of view.
In other words, had Ritchie chosen to go with this 'when in doubt, do not compile it' strategy, this article could not be written. And we'd still have programmers unhappy with how the operator precedence rules work.
> No operator precedence rule is inherently sensible regardless of the context within which you are attempting to figure out how operator precedence works: It is an intractable problem.
I disagree. FORTH got it right. Postfix notation is inherently sensible, as any RPN calculator user will tell you. It eliminates the need for both precedence rules and parentheses, unambiguously.
1 2 + 3 * is 9. 1 2 3 * + is 7.
It's infix notation that's the problem, precedence confusion is a consequence of it. (Prefix notation can work too, but requires more parentheses).
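A minimal sketch of how little machinery postfix needs: a token stream and a stack, with no precedence table anywhere (this is illustrative, not how FORTH is actually implemented):

    #include <stdio.h>

    int main(void) {
        const char *tokens[] = { "1", "2", "+", "3", "*" };  /* i.e. (1 + 2) * 3 */
        int stack[16], top = 0;

        for (int i = 0; i < 5; i++) {
            const char *t = tokens[i];
            if (t[0] == '+')      { top--; stack[top - 1] += stack[top]; }
            else if (t[0] == '*') { top--; stack[top - 1] *= stack[top]; }
            else                  stack[top++] = t[0] - '0';  /* single-digit operands */
        }
        printf("%d\n", stack[0]);  /* 9 */
        return 0;
    }

Swap the stream to { "1", "2", "3", "*", "+" } and the same loop prints 7.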
Avoiding precedence issues is great. Absolutely. However, it's not crazy to want to write it like most humans are used to. I have no problem with the idea that it's more trouble than it sounds like, and '1 2 +' is the best answer to the dilemma. That still doesn't make it a slam dunk. Or should we start saying that just about every programming language out there has a major design flaw, in that it has infix and not postfix operators?
Perfect is the enemy of good. The fact that it's intractable doesn't mean you shouldn't try to find a reasonable local optimum.
For instance, if I have five minutes to code a traveling salesman implementation, I'll probably do a greedy algorithm that just does the closest city next until I've visited all cities. If I have more time, I might check that paths don't cross, and if they do, swap the order of the cities in that closed loop. That sorta thing.
Is it perfect? Of course not. Is it better than throwing up my hands and saying the problem is intractable, therefore I should just return the list of cities in whatever order they were given to me? Absolutely.
In the case of bitwise operators, Boolean algebra teaches us that & behaves like * and | behaves like +. Therefore & should have the precedence of * and | should have the precedence of +. Is this perfect, or as good as RPN? No. Is it better than a world where 'if (x & 0xff == y & 0xff) [...]' almost certainly does the wrong thing? Absolutely.
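Spelled out, with values contrived so the two parses visibly disagree (a minimal sketch):

    #include <stdio.h>

    int main(void) {
        int x = 0x1ff, y = 0x2ff;        /* same low byte, different high bits */
        if (x & 0xff == y & 0xff)        /* parses as ((x & (0xff == y)) & 0xff) */
            puts("unparenthesized: taken");   /* never printed: 0xff == y is 0 */
        if ((x & 0xff) == (y & 0xff))    /* what was actually meant */
            puts("parenthesized: taken");     /* printed: both low bytes are 0xff */
        return 0;
    }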
If Ritchie had designed C to use postfix notation instead of infix notation for expressions, it simply wouldn't have caught on. It would be as dead as FORTH. If he had designed it to have better precedence rules for bitwise operations, the world would be a better place.
See also null terminated strings vs length prefixed strings. Length prefixed strings are not perfect. They are merely a lot better.
The hundred year mistake is ambiguous syntax with hidden precedence rules itself, rather than any specific choice of precedence.
At least, particularly poor choices of precedence can be diagnosed. Whenever the compiler applies a precedence rule between two operators that has been identified as awkward, it can emit a diagnostic mentioning those operators: "warning: quirky precedence: use parentheses when & is combined with ==".
I guess it's my turn to be the dipstick who complains about the website's design instead of discussing the contents, but I'm finding the font extremely hard to read in Chrome. In Safari it's better, but it still doesn't look great in terms of kerning. Anyone else seeing this?
It's rendered using the default WordPress theme ("Twenty Eleven") with the sole customization being that the body text colour is changed to purple. If it is rendering poorly, take it up with either WordPress or your browser provider; there's not much I can do about it.
Operator precedence is arbitrary to a language. There is no universal "right" or "wrong", and the LISP prefix people know this: there's no operator precedence there, just a sequence of operators with the operands that follow them, which explicitly declares the order of evaluation. It does have parentheses though :-)
So the mistake here seems to be one of expectation, which is arbitrary. Everyone knows the +-*/() precedence of basic arithmetic, but boolean operations and the others? Those aren't at the cultural level of basic schooling, so the author's contention that the precedence is wrong doesn't hold a priori.
The article rests on the notion that this is confusing, but fails to prove it:
    int x = 0, y = 1, z = 0;
    int r = (x & y) == z; // 1
    int s = x & (y == z); // 0
    int t = x & y == z; // 0
None of that surprises me, but moreover I would never expect myself or anyone else to be sure about it; I would refer to the operator table to check. The article also supposes that this line has some natural meaning that everyone expects, but I'm not seeing it:

    if(x() & a() == y)
I think the author is projecting their own cognitive patterns onto the rest of us without justification.
I'm curious to know if you knew from the moment of your birth that you should look at the operator table in this case, or if you learned that mitigation on a particular day. If the latter, what might you have done before you learned that?
I don't know. I feel like I've always needed to think about any line of code with several operators on the same line, but I've been in this business for a long time, so I might have learned the habit from negative experience.
I'm glad you mentioned it though, because now I'm curious if there's a difference among programmers who score well or poorly on a cognitive reflection test. Maybe people with a tendency to suppress their intuitive response are also less likely to think that a line of code has an obvious meaning?
I only know not to write things this complex because either I made the mistake myself or someone in the chain of people I've learned from made it. It's not a logic problem, where the less complicated, better understood, and more general rules of precedence allow writing expressions that are terse, objective, and easily parsed by humans; this one was born with C's invented, abstraction-breaking rules of precedence. You only learn it with a C-like language. What's a bit to a mathematician?
I'd argue the practices that are most popularly known as 'good practices' are to avoid mistakes that are common.
This article is really funny because it is true. I started programming C in 1987 and got my first programming job in 1992 (writing drivers for flash memory for Windows 3.1). I was lucky to have a really good mentor. He beat me over the head to always use parentheses whenever I used more than one binary operator, and I never stopped. Sure, my macros looked more like LISP than C, but he burned the fear of god into me.
"I have no idea" he told me.
He said many programmers try to make their code as short and pretty as possible, as if there was some kind of character limit. Instead of falling into that trap, he just used parentheses. And the order of operations never was an issue.