Why does `True == False is False` evaluate to False in Python? (2013) (stackoverflow.com)
268 points by 2arrs2ells on May 2, 2020 | 217 comments



One of the comments lays out why I probably never use things like this: And this is exactly why I tend to wrap compound logical comparisons in parens, I simply can't remember the syntax across all the languages I use, and it's not worth the risk of leaving them off and having unexpected behavior

Apart from unexpected behavior it is also about readability: with parentheses in place you can just read left-to-right and the parsing overhead is minimal. Without them, though, you have to read almost everything first, parse what is there, figure out what comes first (and hope you remembered it correctly), then read again to get a mental image of what is actually going on.


Whether it's computer languages or human ones, as soon as you get into a discussion about the correct parsing of a statement, you've lost and need to rewrite in a way that's unambiguous. Too many people pride themselves on knowing more or less obscure rules and, honestly, no one else cares.


This. I got asked an interview question that involved writing two list comprehensions in Python. The guy then asked if I could do the same without the intermediate variable, basically nesting them. I had a mental block for a bit before I realised how simple the task was. It's just that I would never think to nest list comprehensions because I find it horribly unreadable.
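For illustration, here is a minimal sketch of the two styles being contrasted. The data and the names (`rows`, `flattened`, and so on) are made up for the example; the actual interview question is not given in the comment.

```python
# Hypothetical input: a list of rows, some of them empty.
rows = [[1, 2, 3], [], [4, 5], [6]]

# Two comprehensions with a named intermediate variable.
flattened = [n for row in rows for n in row]
even_squares_two_step = [n * n for n in flattened if n % 2 == 0]

# The same result with the first comprehension nested inside the second.
even_squares_nested = [
    n * n
    for n in [m for row in rows for m in row]
    if n % 2 == 0
]

print(even_squares_two_step)
print(even_squares_nested)
```

Both forms produce the same list; the first one gives the reader a name to hold on to.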


> if I could do the same without the intermediate variable

I'm glad I never had to go through such ridiculous hoops to get a job.

If I had I might have been inclined to say "Yes I could but I wouldn't do it because it would be unmaintainable."

Real application programming isn't about being concise, it's about being cost effective. Sometimes conciseness achieves that end; often it does not. Having a well named intermediate variable can help document what is happening so that the poor sod who has to fix a bug in or extend your code two years later doesn't have to spend an hour deciphering your code to apply a two minute fix. That poor sod might be the very same person who wrote it.


I was a Java developer for more than 4 years at the beginning of my career. I was more of an application developer than a language expert, so I did not know the nooks and crannies of the language. But that is exactly what was being asked in interviews, and I was never able to crack those interviews.

I liked Go and used to work with it in my personal time. Go usually has one way of doing things, so there are not many things you have to remember about the language. I applied for a Go job and landed it. Of course, at that time there were very few Go developers on the market, so that could be a good reason for me landing that job as well.


It's about elegance, you see.

num%3 == 0 ? num%5 == 0 ? "fizzbuzz" : "fizz" : num%5 == 0 ? "buzz" : num


This starts in elementary school, when the cargo cult that is math pedagogy teaches the terrible "order of operations" instead of teaching students to use clear unambiguous syntax.


Math in general is incredibly sloppy with this. I believe it's similar to why churches only read the Bible in Latin: there's a sort of perverse incentive in keeping arcane symbols around that make the life of the practitioner easier at the expense of the layperson.

Most formulas students need to learn are actually quite intuitive when you replace the symbol with a one or two word description. But they'd rather force you to memorize the symbol.

It's like forcing you to read obfuscated code. I find it draining with zero benefit to most learners.


Math is first and foremost a language, whose symbols are generally well-defined, although they differ between contexts in some cases. In many of those differing contexts, the symbols are re-used to give an intuitive sense. For example, re-using "x" for cross product.

Reducing math to words means English math is different from Chinese math. If I, a native English speaker, were to look at such math, I would have to more or less take it on faith that the translation I'm reading is accurate. And god forbid I'm trying to read a translation of a work done jointly by (say) a Romanian and a Chinese.

The symbols generally remove ambiguity. Consider the English word "bi-weekly." It's so muddied that it is almost useless. It means either twice a week or once every two weeks. It means, literally, both multiplication and division; and context is rarely helpful for this particular word.

You can certainly use the symbols poorly and ambiguously, as many viral "Solve this math problem!" memes exemplify. But used properly they're generally clear, concise, and well-defined.

I do, however, have one nitpick about symbols, and that's with the use of the ellipsis in math. You'll often see things like {1, 2, ..., n} which is generally intended to mean (say) the natural numbers up to and including n. But does it? Couldn't it also represent powers of 2? Or the set of Fibonacci numbers? And since it is a set, order doesn't matter. Almost anything at all could be in that set. We can only be sure it's not something like "all the odd numbers" because 2 is in the set. Frustratingly, it's not even necessary in most cases, since we have set-builder notation and other tools.


Sure, but this is also exactly why Latin was used for as long as it was: it was a language that the educated elites from many different and diverse countries used as a common ground.

I'm not saying that math shouldn't be using symbols, I'm saying it's near intentionally hostile to those first picking up the subject. If the goal of math education is to provide the groundwork of understanding to the general populace... Cater to the general populace. The vast majority of those folks are not going to be discussing complex math across several languages.

They're going to be using it for accounting, taxes, construction, cooking, etc. They don't need to memorize an entire language to do that effectively, and we shouldn't be wasting their time in school trying to make them.

Those that choose to specialize are welcome to. In my opinion that doesn't justify the use of specialized language in basic education settings.


Can you give an example of which symbols you object to and what you anticipate as a replacement? The most difficult things I can think of that someone doing accounting or cooking might have to do is compound interest or converting units; so we're talking about ()+-x÷ and exponents. I'm guessing you have something else in mind or I'm not getting it.

I do have criticisms of symbols insofar as how math is taught, which roughly fall in line with "A Mathematician's Lament" by Paul Lockhart[0]. Loosely, that much math is taught as symbol manipulation rather than... actual math. But that's not a problem with the symbols themselves so much as with their use to obfuscate what's actually being done mathematically.

You might be interested in the book Burn Math Class by Jason Wilkes [1]. I don't really recall what his arguments against symbols were. He ends up inventing his own notation as he goes along (which is what ultimately turned me off the book about halfway through -- it just became an exercise in translating notation for me).

[0] - https://www.maa.org/external_archive/devlin/LockhartsLament....

[1] - https://www.amazon.com/Burn-Math-Class-Reinvent-Mathematics-...

Edit: Fixed list of links.


I think you're vastly overestimating the capability of many students (particularly young students without previous exposure).

It's easy to fall into a trap where you forget what it's like to be completely new to a subject. It's particularly hard when you tend to surround yourself in educated communities where this "in-knowledge" becomes assumed and standard (eg hacker news).

Take just less than ( < ) and greater than ( > ). Think about how many stupid rhymes or memorization techniques you see in classrooms to help learners memorize which is which.

Here are three separate sites, with an entire page dedicated to helping students remember which is which:

https://math.wonderhowto.com/how-to/remember-greater-than-le...

https://numberock.com/lessons/comparing-numbers-to-100/

https://myhomeworkdone.com/blog/greater-than-less-than-sign/

One of them includes a whole damned song for the purpose. All to avoid writing out smaller/bigger.

Like any language, once you learn it, it's hard to remember all the places you struggled.

Why make "change" Δ

Why make "square root of -1" i

In how many classes do we see rote memorization of the quadratic formula, with no context around why you should even bother to learn it? (I've seen quite a few).

Now, not all of those are really the fault of the language (math), but using the language for each of those problems facilitates lazy teaching, and it changes the goal from "understand how math relates to the world" to "memorize this language construct". One is much more helpful than the other.


Because once you learn it, notation gets out of the way rather than in the way. I can't imagine doing physics and having to write out words rather than symbols to express rates of change.

It makes sense that specific knowledge uses specific language to be more easily used and manipulated. Imagine writing (or even proving) Euler's equation without using i or pi or e. Imagine what mess math would be if we never used Greek symbols.

Sure, it would be easier for middle schoolers. But if you're arguing that we should make it easier for them since they're not gonna need ease of manipulation for math that they're not gonna use... Then just don't teach it to them. Maybe middle schoolers don't need < as a concept any more than they need it as a symbol.

But if you're going to solve equations and inequalities, then yeah, you need = and <. Everything else would be needlessly verbose and get in the way of actually manipulating concepts you know.


I've tutored many college students (of various ages) in basic algebra, almost all of whom were convinced they couldn't ever possibly learn it, and only one of whom didn't end up getting an A in their course. I'm pretty aware of the struggles some students can have with it and none of them really had trouble memorizing the handful of symbols that are actually used at that level. Some struggled with the concept of a variable, but many just struggled with understanding the relationship between the actual concepts and the manipulation of symbols. Off the top of my head, equations being balanced can be a difficult one, but certainly not the only one.

As for younger students, I have much less experience, but some; and it's interesting you mentioned less than and greater than; since I actually remember learning those symbols. We learned that the "alligator" always eats the "bigger" number. It's not surprising there are countless ways of learning it, including song. That's true of almost any abstract concept. The idea is to link a metaphor the person understands to the abstract concept. Not every metaphor will work for every person; and this is true of all abstract concepts, not just math symbols.

Of course, it's not actually true that the alligator is eating the "bigger" number, and it actually demonstrates why we need the symbols. ">" and "<" actually refer to "greater than" or "less than" which we much later learned is a way of saying "which number is further right on the number line"; which, of course, requires the abstract concept of the number line and accepting the more-or-less arbitrary decision of a left-to-right number line. "Bigger" means "has a greater distance from zero on the number line in either direction" which we'd represent as a comparison of absolute values. I don't recall when I learned about absolute values, but it was definitely years after learning about < and >. Using the proper symbols lets us be explicit, concise, and precise and avoid issues like using English synonyms (bigger, greater) or whatever pitfalls exist in other languages.

The choice of symbols < and > is, of course, largely arbitrary other than the symmetry between them. (We could have, for instance, always put the greater number underneath the smaller number so the structure is more stable in an imaginary gravity; but that, too, would be arbitrary.) But so is the letter S, or the number 9. They're all arbitrary symbols that have particular meanings in particular languages. "9" is interesting, because it's a number, versus the Roman numeral system. The Roman numeral system could arguably be called non-arbitrary. "I" clearly represents a single thing, "II", two things, etc. That works until you get up to "IV". What? "IV"? So if a lesser value is in front of a greater value, you subtract it? And how does "V" represent five anyway? It's arbitrary! But the Romans found it much more useful to be able to write VII + VI = XIII rather than IIIIIII + IIIIII = IIIIIIIIIIIII, which can pretty quickly get unruly. Turns out, memorizing digits 0-9 makes it (and more complex math) even easier: 7 + 6 = 13; which is why the entire world uses numbers instead of numerals.

(We could also have a side-discussion on why base 10 and not something like base 12, binary, a mixed radix system that uses the prime numbers or the sexagesimal system used by the Sumerians. The answer is basically that it's mostly arbitrary, simpler than some systems, and we have ten fingers/thumbs.)

> Why make "change" Δ

> Why make "square root of -1" i

Largely historical reasons, expediency, and lack of better alternatives. Why represent the sound "ssss" with the symbol "s"?

You could swap i for √-1 and people would understand you, but you'd very quickly wish there was a shorthand that you could use to represent this rather special value.

> In how many classes do we see rote memorization of the quadratic formula, with no context around why you should even bother to learn it? (I've seen quite a few).

You won't see me objecting to this; but this is not an issue with mathematics. It is an issue with teaching mathematics and is part of the "lament" I linked to above. This is quite a different issue than the issue of symbols. The symbols, while arbitrary and arcane, actually make the mathematics more manageable and precise. Saying that "facilitates laziness" is like saying a clothes washer facilitates laziness since it removes the need to manually provide friction and agitation. It's true, in a sense, but I'll keep my washer.

Mathematics is taught very poorly in many places; but making it hopelessly complex and less precise by removing the symbols of the language is not going to help. I learned algebra, officially, my freshman year of high school. Yet there are many high school graduates who come out of high school not even having a rudimentary understanding of algebra (and, actually, even basic mathematics - tutored a few of those as well). Many of them learn it in college, so they're obviously capable of learning it. Those high schools failed those students. Many university professors equally fail their students.

But blaming this on the symbols of the language is too far of a stretch for me. Blame the teachers.

Edit: Fixed display of symbols.


> Couldn't it also represent powers of 2?

You'd use a different notation then. `{1, 2, 4, ..., 2^n}` or `{2^0, 2^1, ..., n}` or something similar which indicates what you mean. There's nothing wrong with the ellipsis, if you spend a second thinking about what you're writing.

If someone wanted to be obnoxious and confusing then they could say: Haha, `n` is not a symbol either; I'm using base-50 and it's digit 23(10). So you can't really stop bad / intentionally misleading communication.


Like I said, it’s a nitpick. I can know what a person means most of the time, but it’s imprecise and technically the ellipsis can be almost anything depending on the explicit values/symbols. In the cases where the pattern is clear it’s clear. Where it’s not, using proper set builder notation in the first place would have made it clear. I’ve also seen it used in contexts other than sets, including for summations. In those cases often just having the summations in proper form can make consequences and manipulations plainer and easier.


In general, at least through high school, probably way too much teaching involves basically memorization of trivia.

Yes, some foundations of facts are needed. For example, there's some very basic operator precedence that students should internalize. And history inevitably does involve names and dates. But too much attention is probably devoted to remembering whether a battle was fought in 1746 or 1750.

It's somewhat understandable because testing for those things is easy and unambiguous. But it's unfortunate anyway.


Pedantic point: The whole "reading the Bible in Latin" is a very Catholic thing (and maybe Orthodox?). The Protestant churches I've been in have read from the Bible in the native tongue, and it was indeed one of Luther's criticisms of the Catholic church.


I'm not Catholic, but I think they stopped requiring the reading of scripture in Latin in the 1960s with the Second Vatican Council, AKA Vatican II.

I think Latin is still used as part of the traditional service, but it's mostly ritualistic. What I remember from attending Catholic services years ago is a lot of standing up and sitting down while chanting in Latin, followed by a sermon delivered in English.


It's not universal either. I grew up in a Catholic-obsessed country and not a single mass I've been to had Latin elements or chanting. Everything was in the native language in the mid-80s.


> Most formulas students need to learn are actually quite intuitive when you replace the symbol with a one or two word description. But they'd rather force you to memorize the symbol.

Can you give some examples of this?


Literally the only thing you need to know is that * binds tighter than + and - and that * can be omitted. Everything else is obvious - I never used / but fractions instead, exponentiation is also visually indicated, the only counterexample I guess is some made up operations when doing group/ring theory but it’s all clearly explained anyways. Using “clear unambiguous syntax” as in () would actually make all formulas much longer and much less clear.


I genuinely think PEDMAS/BODMAS nonsense creates permanent barriers to understanding in math in some people because it inculcates an understanding of the symbols like + and - as being sequential operations, and they struggle to ever make the leap from math as ‘calculation’ to math as ‘reasoning’ as a result.


Concur. A long time ago I spent a full day as a professional engineer trying to figure out why a Microsoft Excel formula wasn't working. Turns out that the default MS Excel order of operations is not PEMDAS; it's PE(M or D)(A or S), where 'X or Y' is resolved by whichever operator comes first from left to right.(1) So frustrating. Always use parens.

1. https://edu.gcfglobal.org/en/excel2013/complex-formulas/1/


Isn't that the normal way to do things? Every teacher, every textbook has always told me that multiplication and division have the same priority, as do addition and subtraction.


Wait, so what do you think 1+2-4+8 is supposed to be? It's not negative.
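A quick sketch of the difference, assuming Python's rules (where, as in the convention described above, + and - share one precedence level and group left to right). The variable names are only for illustration.

```python
# + and - have equal precedence and associate left to right.
left_to_right = 1 + 2 - 4 + 8       # parsed as ((1 + 2) - 4) + 8
explicit = ((1 + 2) - 4) + 8

# A literal "all additions before any subtraction" reading groups differently.
additions_first = (1 + 2) - (4 + 8)

print(left_to_right, explicit, additions_first)
```

The first two agree at 7; the literal "A before S" reading gives -9.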


When I have very complex logic I like to give each step its own variable. It makes it read like English.

To be honest, I only actually do this when I'm working on old code and find myself trying to remember how something works.


IME the improvement to readability is worth the cost of a few extra assignments.


In the cases where they aren't worth it, you're hopefully also using a language that gets compiled and they're just optimized out.


On the other hand, Python is the only language I've used which has this special-case parsing. In all others, a < b < c would be parsed as ((a < b) < c) which may or may not be a type error, but it's consistent with the other binary operators.
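To make the two readings concrete, here is a small sketch with values chosen so that they disagree. The names `chained`, `expanded`, and `c_style` are just labels for this example.

```python
a, b, c = 1, 3, 2

# Python's chained form: equivalent to (a < b) and (b < c), with b evaluated once.
chained = a < b < c
expanded = (a < b) and (b < c)

# The left-associative reading used by C-like languages:
# (a < b) is True, and True < 2 compares as 1 < 2.
c_style = (a < b) < c

print(chained, expanded, c_style)
```

With these values the chained form is False while the left-associative grouping is True.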


That actually proves my point. I'm not really good at learning certain facts by heart, so if I see a < b < c it is going to take multiple seconds, likely followed by an internet search so add a couple of minutes to that, before I can be 100% sure what it does. I mean, I know what precedence is and I'd probably assume it would be (a < b) < c but that's not going to cut it.


Just program in assembler then.

Honestly, Python's way is the only sane way to parse this. This is how a human who never programmed before with some knowledge of math would understand it.

Everyone else's mind is just corrupted by broken legacy languages.


I sometimes teach programming to beginners who have some knowledge of math. A common mistake they make is to write compound conditionals like this:

    if a == 2 or 3:
        do stuff
because that's how they would write it in math. Does that mean Python is broken and should support that construct as well?
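For anyone wondering what that condition actually does in Python, here is a small sketch (the value of `a` is arbitrary). The expression parses as `(a == 2) or 3`, and since 3 is truthy the branch is always taken.

```python
a = 5

# Parsed as (a == 2) or 3. When a != 2 the result is the truthy value 3.
naive = a == 2 or 3
print(naive, bool(naive))

# Two ways to write what was probably intended.
intended = a in (2, 3)
spelled_out = a == 2 or a == 3
print(intended, spelled_out)
```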


I believe COBOL supports this.


Even if Python's way were the only sane way of doing it (a view which, btw, suggests narrow experience with programming languages), that's irrelevant. GP's point was that different languages use different rules so, unless you use brackets, you won't easily know which precedence rules get applied.


Python’s edge cases are just as odd as every other language

  >>> a=16
  >>> a*a is a*a
  True
  >>> a=17
  >>> a*a is a*a
  False
"is", "==", and "=" are all very different and none of them match their mathematical equivalents very well.


Not really familiar with Python. Do you know why the second statement evaluates as false?


Small integer caching: https://wsvincent.com/python-wat-integer-cache/

The objects for small integers are built-in and static, but 17^2 is too big, and so new integer objects are created when the expressions are evaluated. "is" checks if two expressions represent the same object in memory. "==" checks for equality of values.


CPython keeps an array for "small value" integers (I think in the range [-5, 256]). So when Python wants 256, it'll give you a reference to the existing object. Larger numbers require a brand new object.


`is` is an identity comparison, and in Python everything is an object. It checks whether two variables refer to the same object (location in memory). So you should not expect `a is b` to be true for `a=1.2, b=1.2`. In this case it is actually the first expression returning True that is weird; it happens because the small integers (-5 through 256 in CPython) are cached, so every instance of them refers to the same object.
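A rough sketch of how to poke at this yourself. The integers are built at run time with `int("...")` so that the compiler cannot fold or share constants. The identity results are a CPython implementation detail and may differ between versions and implementations, so no expected output is given for them.

```python
# Build the values at run time to avoid compile-time constant sharing.
small_x, small_y = int("256"), int("256")
big_x, big_y = int("257"), int("257")

# Value equality is guaranteed by the language.
print(small_x == small_y, big_x == big_y)

# Object identity is not guaranteed; it depends on the interpreter's caching.
print(small_x is small_y, big_x is big_y)
```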


Math notation is just one of many possible notations, though. And math notation is not static throughout history.


True. But the syntax of almost all other languages is derived from the same basic math syntax we have commonly used at least since programming languages became a thing.


When you use a tricky language feature to improve readability, you can just add a comment with a link to some piece of documentation for that behavior. It is still a trade-off, but it saves a bit of time and whenever I come across a place where past me did this, I am always appreciative.


Depends. For obscure behavior features, ok, for 'language features' I'm probably still going to pass. Because in this example case that would come down to

  # See stackoverflow/highlyupvoted/wtf/does/python/do/here
  # This is the same as (a < b) and (b < c).
  x = a < b < c
which is not clearer, not easier to parse, not faster to read, and harder to maintain (it rots easily when variable names change) than the non-ambiguous other choice.


I would have skipped that second line and just left the line with the documentation link. If you are going to provide the equivalent code in a comment, you might as well just use the equivalent code.


In Clojure, and I assume other lisps as well, you can use the < operator in the same fashion. By merit of prefix notation it's also unambiguous regarding precedence.

  user=> (< 1 2 3)
  true


I'll be honest: I'm not a Clojure user, and even with context I have no idea how to read that. Like every other language, you have to be trained to read it correctly, and it is not intuitively obvious to everyone.


The rules are extremely simple and easy to grasp. And once you've grasped it, you'll never have to think twice about it.

Parentheses take precedence.

The first symbol is the operator (add, subtract, sum, multiply, etc.).

Everything else is operated on from left to right.

(+ 2 5 7)

2+5+7

It's also not a Clojure thing. It applies to all Lisps or related like Scheme.


All maths functions where it makes sense take arbitrarily many arguments. (+ 1 2 3) = 6. (< 1 2 3) = true. Everything in function position (directly after an opening paren) is either a function you want to apply or a macro you want to expand, except in a select few forms, such as binding forms and conditionals. The syntax of scheme is remarkably simple once you know the basics.

I have programmed Python for about 20 years, and I still make stupid mistakes. I grokked all the R6RS Scheme syntax in 20 minutes, except for (do ...) which somehow never sticks. It is also rarely used.

I have tried to evangelize scheme enough to know some people just don't like it, but for me it instantly clicked.


+ and = are pretty obvious but it can still be confusing. Take not=. Does (not= x y z) mean adjacent numbers are disequal (x≠y≠z, by analogy to + and =)? Or that no two numbers are equal? Or that at least two numbers are disequal? Different lisps pick different meanings.
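The three readings can be spelled out as a sketch. The function names below are hypothetical and do not correspond to any particular Lisp's built-ins; they only pin down the three candidate meanings.

```python
from itertools import combinations


def adjacent_differ(*xs):
    """True if every neighbouring pair differs (x != y and y != z)."""
    return all(a != b for a, b in zip(xs, xs[1:]))


def all_distinct(*xs):
    """True if no two arguments are equal."""
    return all(a != b for a, b in combinations(xs, 2))


def any_differ(*xs):
    """True if at least two arguments are unequal."""
    return any(a != b for a, b in combinations(xs, 2))


# (1, 2, 1) separates the first two readings; (1, 1, 2) separates the last.
for args in [(1, 2, 1), (1, 1, 2)]:
    print(args, adjacent_differ(*args), all_distinct(*args), any_differ(*args))
```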


Which lisps? I think I have only ever seen that in clojure (thinking it was a bad idea), and clojure seems scared of parentheses. (not (equal? x y z)) is clearer, but suffers from the same drawback as your question.

The problem I think is that not= returns true as long as any elements are non-equal, which means it is not analogous to +. I can't speak for clojure, but this is in line with all equality predicates in scheme, negated or not. A predicate that checked if any neighbours are not equal would in true scheme spirit probably be called not-any-neighbour-equal? :)


> In all others, a < b < c would be parsed as ((a < b) < c) which may or may not be a type error, but it's consistent with the other binary operators.

In APLish languages, it's (a<(b<c)).


Julia also does this:

    julia> 2<3<4
    true


Python follows math. Other languages made up something weird.


The only correct way to parse `a < b < c` for numbers is throwing a type error.

Python comes second for source code niceness. Allowing you to compare numbers to booleans is common and shouldn't be.


But it doesn't compare a number to a boolean. It tests whether b is between a and c, in Python.


What I wrote wasn't very understandable.

In most languages, like C++, a < b < c ends up comparing a boolean to an integer. This should be a type error, but isn't.


Something I really like in k/q is that operators have no precedence and everything is executed right to left. To take a "fast" 8th power you can just do

    c*c:s*s:x*x
with no ambiguity of parsing whatsoever.
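For readers who don't know k/q, a rough Python analogue, assuming Python 3.8+ for the assignment expression operator. Python needs parentheses where k relies on right-to-left evaluation; the names `s` and `c` mirror the k snippet.

```python
x = 2

# s = x*x, then c = s*s, then the result is c*c, i.e. x to the 8th power.
eighth = (c := (s := x * x) * s) * c

print(s, c, eighth)
```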


Sounds like “is” is lower precedence than ==. Nothing to see here folks, move along?


No; it's not about precedence at all. It's about semantics of chained comparisons, which is not what you'd expect in Python.


The syntax works exactly like it's supposed to. "is" is an alias for "==". This kind of chained comparison makes code much more readable. This contrived example, which I have never encountered as a practical issue, notwithstanding.


I think you need to brush up on Python...

  >>> x = 25 * 37
  >>> y = 25 * 37
  >>> x is y
  False
  >>> x == y
  True

From my understanding, '==' checks for equality and 'is' checks that the variables reference the same address.


OK this is truly weird:

  >>> a=16
  >>> a*a is a*a
  True
  >>> a=17
  >>> a*a is a*a
  False
WTF


CPython interns the integers between -5 and 256.

See https://stackoverflow.com/questions/306313/is-operator-behav... for some discussion and further related links.


I guess it's ok if you are aware of it but this can be a source of bugs: let's say you're not aware of this "feature" and you are in the middle of writing some code and want to know whether A is A always evaluates to true for integers. So you fire up Python and type "10 is 10" which gives True. ...


Using 'is' with an integer literal emits a SyntaxWarning in 3.8+
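One way to see the warning without running any suspicious code is to compile a snippet and record what the compiler reports. This is a sketch assuming CPython 3.8 or later; the exact wording of the message varies between versions, so it is printed rather than matched.

```python
import warnings

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # Compile (but do not run) an expression that uses `is` with an int literal.
    compile("x is 10", "<demo>", "eval")

syntax_warnings = [w for w in caught if issubclass(w.category, SyntaxWarning)]
for w in syntax_warnings:
    print(w.message)
```

Note that the warning is about literals; `a is b` with two integer variables compiles silently.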


That's why you should read the spec and not guess a function's definition from one example.

Otherwise you'll end up thinking that almost everything is simply an alias for constant True or 0 or Error, by popping in 0 to any operator you check.


No. Imho, a good language designer will make it possible to learn the language incrementally without surprises. For example, a tutorial on a language should not start with "please read the spec first or you may encounter some awkward and counter-intuitive behavior in areas you thought you had already mastered".


At the Python (rather than CPython implementation) level, the explanation is “is is object identity not value equality, and there is no guarantee that equal integers share object identity.”

At the CPython level, you can explain it in terms of the particular range of small integers that are interned and thus are guaranteed within particular CPython versions to share object identity when they have value equality.

But just knowing that is identity and == is equality is mostly enough to use them correctly.


Maybe also delete or reply to your previous comment, in which you made strong assertions of wrong facts.


Maybe you should stick to assembly?


"==" is not an alias of "is"; they behave quite differently.


`x is y` is an alias for `id(x) == id(y)`.


To see that this is not about precedence, evaluate the following:

  >>> True == False is False
  False
  >>> False == True is False
  False


Or think about it for a moment and realize that it would be True with either precedence if interpreted as expected.

Or read the very first part of the linked page that shows:

  >>> (True == False) is False
  True
  >>> True == (False is False)
  True
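To spell out what happens without the parentheses: Python treats `==` and `is` as members of the same comparison chain, so the expression expands to two comparisons joined by `and`. A small sketch, using variables `t` and `f` as stand-ins for the literals:

```python
t, f = True, False

chained = t == f is f               # what the unparenthesized form means
expanded = (t == f) and (f is f)    # the chain written out

left_grouped = (t == f) is f        # first parenthesized variant
right_grouped = t == (f is f)       # second parenthesized variant

print(chained, expanded, left_grouped, right_grouped)
```

The chained form is False because its first link, `t == f`, is False; either explicit grouping gives True.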


Would I be correct in avoiding "is" for any simple data types, and only using it for objects?


Integers are not simple data types in python. https://docs.python.org/2/c-api/int.html#PyInt_FromLong

You'd be correct to use 'is' to mean 'is', and '==' to mean '==', following their definitions.


It's idiomatic to use it for comparisons with the value `None`, which is a singleton so it's always that one.

Other than that, just don't use `is`.
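A minimal sketch of that idiom; the function `describe` is made up for the example.

```python
def describe(value=None):
    # None is a singleton, so an identity check is the idiomatic test.
    # Unlike `if not value`, this does not treat 0 or "" as missing.
    if value is None:
        return "no value given"
    return f"got {value!r}"


print(describe())
print(describe(0))
print(describe(""))
```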


See the comment by lonelappde, but note that the example under discussion here is actually not about `is` vs. `==` either.


Yes. By the time I realised my second example is redundant, it was already too late to edit the comment. Thanks for clarifying.


I like chained comparisons for the `10 < x <= 100`, since it makes intuitive sense and removes duplication.

But I can't think of any case with `==` type operators, or really any other operators where it also makes sense.

So was that maybe an overgeneralized feature that should have been limited to the math operators?
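One case where chaining `==` arguably reads well is an "all equal" check. A sketch with made-up names, alongside the range check for comparison:

```python
a, b, c = 3, 3, 3
all_equal = a == b == c             # (a == b) and (b == c)

width, height, depth = 2, 2, 5
is_cube = width == height == depth  # False: depth differs

x, lo, hi = 50, 10, 100
in_range = lo < x <= hi             # the familiar range check

print(all_equal, is_cube, in_range)
```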


I think in language design there’s an expectation that if an expression or statement doesn’t make any sense, then people won’t write it that way.

I think that’s a pretty reasonable expectation, too.

In JavaScript, {} + [] evaluates to integer 0. That doesn’t make any sense, but it makes more sense after reading the ES spec for the addition operator.

There are many expressions you can write in dynamically typed languages that don’t make any sense, probably most of them actually, but they have to be considered valid because it’s a dynamically typed language. So they’re valid, they will evaluate to something.

The language designers aren’t so concerned with identifying every possible combination that makes no human-intuitive sense. The important part is that when it seems like types should be inferred and coerced in a particular way, then that’s how it should work. It should match human intuition.

I don’t have any intuition or opinion about how True == False is False should be evaluated, this kind of thing is going to receive superfluous parentheses from me every time for the benefit of the reader, and if someone else wrote it this way I’m always going to look it up or test it in a REPL...

10 < x <= 100 though, if that’s considered a valid expression and it doesn’t evaluate to true for numbers in the range (10,100], I’m going to stop using that language...


>In JavaScript, {} + [] evaluates to integer 0. That doesn’t make any sense, but it makes more sense after reading the ES spec for the addition operator.

It's the consequence of ill-thought mechanics when it comes to type coercion. It's the original sin of many scripting languages: "let's just add a bunch of shortcuts everywhere so that it doesn't get in the way of the developer trying to make a quick script". Then you end up with a bunch of ad-hoc type coercion and the language guessing what the user means all over the place, and eventually these bespoke rules interact with each other in weird ways and you end up with "{} + [] = 0".

> I think in language design there’s an expectation that if an expression or statement doesn’t make any sense, then people won’t write it that way.

That's either very idealistic or very naive. In either case I'd argue that's a terrible way to approach language design. I'd argue that many well designed languages don't make any such assumptions.

>but they have to be considered valid because it’s a dynamically typed language.

Nonsense. Try typing `{} + []` in a python REPL. You seem to be suffering from some sort of Javascript-induced Stockholm syndrome, or maybe simply lack of experience in other languages. JS does the thing it does because it was designed(?) that way, not because there's some fundamental rule that says that dynamically typed languages should just do "whatever lol" when evaluating an expression.


I don’t write JavaScript or have a strong opinion about it.

The widespread prevalence of JS transpilers and everyone’s apparent unwillingness to write pure JS makes me think it probably isn’t such a great language. It probably never had a chance given its history with browsers.

Just using it to make the point that every language has valid expressions that make no sense. You can find similar examples in every language. All you have to do to find them is start writing code in a way that no person ever would or should write it.

Especially in the case of dynamically typed languages, throwing an exception sometimes but not always based on the “human intuitiveness” of any given type inference would make the language even more unpredictable. It’s just instead of asking why expression a evaluates to x, we would all be asking why expression a evaluates to x but similar expression b throws an exception.

If you ask me, the latter is even more arbitrary.

These aren’t useful criticisms. The only reason these kind of critiques even get so much attention is because people reading the headline think: “I wonder how the hell that would be evaluated?” And so they click.

The headline is only interesting to begin with because nobody ever writes that, and nobody should ever write that.

If nobody would ever write it or have any expectations about its evaluation, then how is it even significant that the language will interpret it one way vs another?

I think these criticisms are a good springboard to have the debate about static vs dynamic typing, but arguing over whether True == False is False should be evaluated one way vs another is kind of pointless. If the result of that argument is agreement, then we might end up with people actually writing this, which should be the last thing any of us want.


Python, C, Go, and even Haskell have wtfs too where language features collide. Every complex language does, as a consequence of complexity too.


> There are many expressions you can write in dynamically typed languages that don’t make any sense, probably most of them actually, but they have to be considered valid because it’s a dynamically typed language.

Not really - a language could throw an exception in these cases, as Python does for things like “a”+1. Not every dynamically-typed language is JavaScript.
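
To make that concrete, here is a small sketch with a hypothetical helper `try_add` that reports which exception, if any, Python raises for an addition:

```python
def try_add(left, right):
    """Return the sum, or the name of the exception Python raises."""
    try:
        return left + right
    except TypeError as exc:
        return type(exc).__name__

mixed = try_add("a", 1)        # str + int is rejected at runtime
literals = try_add({}, [])     # dict + list is rejected as well
numbers = try_add(1, 2.5)      # int + float is a defined conversion
```

The first two calls return the string "TypeError"; only the numeric case produces a value.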


That is the difference between a dynamically typed language (like Python) and an untyped language (like JS). The dynamically typed language checks types at runtime, whereas the untyped language doesn't.


JS is not "untyped". It literally has a "typeof" operator. You probably mean strongly vs. weakly typed. JS is dynamically, rather weakly typed. Python is dynamically, strong-ish-ly typed. C is statically, weakly typed. Rust is statically, strongly typed.

Those two attributes are mostly orthogonal.


There's no such thing as "strongly" or "weakly" typed languages. Those terms have no definition and are meaningless. Yes Wikipedia will tell you that C and JS are weakly typed, but what attribute is it that C and JS's type systems share? They share absolutely nothing in common.

Dynamic typing refers to checking of types at runtime. JS does not do this consistently or effectively. Python does.

The typeof operator cannot be the basis of a type system since you can count all the answers it gives on your fingers. JS needs a bunch of extra functions like Array.isArray() to determine if something is an array or not. The language itself has no clue, everything that isn't a primitive is just an "object" as far as JS is concerned.


"Strongly" and "weakly" typed is not a binary attribute of a language and there's a gradient so languages can be more or less strongly or weakly typed but I think those are still useful qualifiers. C lets you add ints and doubles without explicit cast, Rust doesn't. Rust is more strongly typed than C. It's not meaningless to say that. A metric being partially relative or interpretative doesn't make it useless.

For the rest I don't understand how anybody can seriously make the argument that JS has no types. Shell scripts have no types because almost everything is a character string and that's it. JS type system is very limited but it does exist and I don't see how it can be used to justify {} + [] = 0, which is where we started (especially given that {} and [] are objects in JS, but 0 is a "number", so different core types). Adding two objects and getting a number is not a limitation of the type system, it's a conscious decision by the designers (or at least the side effect of one).


Python (and Ruby) are dynamically typed and strongly typed. Those are two different things. Strong typing doesn't mean that you can't do this

  a = "a"
  a = 1
That's dynamic typing. Strong typing means only that

  a = "a" + 1
fails.

More about that at https://en.hexlet.io/courses/intro_to_programming/lessons/ty...

JavaScript and PHP and Perl automatically cast to a reasonable value with all the advantages and pitfalls.


>Strong typing means only that

> a = "a" + 1

>fails.

String a = "a" + 1; works in Java. So Java is not strongly typed?


In some ways yes, Java is not so strongly typed; a statically-typed language can be weakly typed.

See C where ints, pointers, floats, and bools can almost all coerce into each other, such that the compiler will allow you to use arithmetic/logic operators with most different types, whether you meant to or not.


That’s because the language designers overloaded the arithmetic operator (+) to perform string concatenation when either operand is a String; the other operand is then converted to a String.

Without overloading the + operator, string concatenation which is a common operation, would have been unnecessarily verbose. Your example without the + operator would look like below:

   String a = new String("a").concat(String.valueOf(1)); // String object concat
   String b = "a".concat(String.valueOf(1));             // String literal concat


The same could be said of JS but for some reason it gets shit but Java gets off scot-free.


Java doesn't get off scot-free for anything; nobody likes it, they just use it because they must.

Java just happens to be less prominent in tech media right now since JS on the server is the new(-ish) hotness. There's still plenty of dislike for it, but really, all that needed saying about it happened long before now.


It’s really not dynamic vs untyped, but the extent to which the language design chooses to coerce a value into a given type instead of raising a runtime type error. Python is guilty of this too with its truthy values, e.g. `if []` — an empty list of type `list` is suddenly apparently inhabiting the bool type and is equivalent to False.
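
A short sketch of what actually happens in that case: the list is truth-tested rather than converted, so it behaves as false in a condition while still being a `list` and not comparing equal to `False`.

```python
empty = []

# Truth testing: `if empty:` asks for bool(empty) behind the scenes.
as_bool = bool(empty)
branch = "truthy" if empty else "falsy"

# But the list is not *equal* to False, and it is still a list.
equal_to_false = (empty == False)
still_a_list = isinstance(empty, list)
```

Here `as_bool` and `equal_to_false` are both `False`, and `branch` ends up as "falsy".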


> In JavaScript, {} + [] evaluates to integer 0.

This is not really true. Put this in a JS console and you'll see that a is the string "[object Object]":

  a = {}+[]
When you put just {}+[] in your console, it's parsed as an empty block followed by a unary plus, like:

  {/*do nothing block*/}; +[]


Huh that’s quite interesting. It behaves this way even when you explicitly make it an object literal rather than a block.

I guess JS’s parser explicitly prohibits adding things to a curly-brace enclosed entity?

I’m normally quite defensive of JS, but I’ll have to admit I don’t like that.


How did you explicitly make it into an object literal?

    ({}) + []
    -> "[object Object]"
will correctly make it "[object Object]"

Did you try to do this?

    {a: 1} + []
    -> 0
That's interpreted as the statement 1 labeled a, not as an object. This makes it obvious it's not an object:

    {a: 1, b: 2} + []
    -> Uncaught SyntaxError: Unexpected token ':'
JavaScript supports labels so you can continue/break multiple levels at once:

    {
      a: while (true)
        while (true)
          break a;
    }


Yes, I realized after the fact that I was indeed making a label. I’m now back to not having an issue with JS


It's just ASI (automatic semicolon insertion) and at the top-level {} declares a block not an object.

You can also do like ({} + []) and it will actually be "adding" them; the confusion solely comes from people typing into the JS console something that wouldn't make any sense to put into an actual script.


You’re right! I would never have guessed that. Learn something new every day.


I could see this for something like

    foo == bar == baz
But chaining the == with other operators feels weird, especially "is". Similarly

    foo >= bar <= baz
feels very off to me, and I'm not sure it should be chainable. If the intuition is to emulate human notation in math, there are many chained expressions that we would not allow to happen.


> But chaining the == with other operators feels weird

Chaining anything with `is` feels weird to me, but:

foo >= bar == baz >= fooo

feels perfectly ok.


(please ignore, I'm an idiot) (deleted)


Not OP but I’d expect no precedence as I expect this to parse into a series of number comparisons and-chained.


Damn, you're right, was being stupid. Duh.


Maybe it should be even more limited than that. For example, chains of `==`, mixed `<` and `<=`, or mixed `>` and `>=` only. That would disallow unintuitive things like:

    if a <= b > c:
But then, what to do in those disallowed cases? Is the parser powerful enough to make them SyntaxErrors?

If not, I’d much rather have the above compiled into the unintuitive `a <= b and b > c` than the completely wrong `(a <= b) > c`.


A context-free grammar can do it; you'd have three productions "ascending comparison chain," "descending comparison chain" and "equality chain" to cover the valid possibilities.

But, yeah, if there are contradictory comparisons, that should be deprecated and raise an error immediately because it indicates a logic error. If you're doing something awful like using custom operators for side-effects, just write out (a <= b) and (b > c).

Most likely, though, no one is using custom operators for side-effects in chaining because the implicit `and` coerces arguments to booleans. So, with numpy, you can compare two ndarrays with a < b and it returns a new array of booleans, but you can't chain compare three ndarrays because `and` coerces the result to a single boolean.


I can think of a few cases where I want:

    if x == y == z:


I always found prefix notation in Clojure to be really elegant for a lot of these cases that are odd in other languages:

(= x y z)


This is a bit like something you can do in C, almost by accident as a bonus language feature.

  x = y = z = 4;
But I think this is assignment expressions which were controversial in Python or something.


Yeah in Python you can do, say, a = b = []

But then the two lists are the same, and appending to one appends to the other. Not particularly useful.
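
A quick sketch of that aliasing, next to the tuple-unpacking form that creates two separate lists:

```python
a = b = []          # one list object, two names
a.append(1)
shared = (b == [1], a is b)

c, d = [], []       # two separate list objects
c.append(1)
separate = (d == [], c is d)
```

`shared` comes out as `(True, True)` and `separate` as `(True, False)`.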


It's useful when the objects being assigned to those variables are literals instead of reference types, which in my experience is most of the time when you want to do the simultaneous assignment. When you do want to do reference types, you can do a,b = [], [].


Not anymore, check out the walrus operator.


I agree, "is" should probably be excluded from comparison chaining.


I guess if you say "x == y == z" you may also say "x is y is z". But certainly the mixing of the two chaining types is confusing.


If anyone's looking for some more good WTFs in Python: https://github.com/satwikkansal/wtfpython


I must be a crusty Python programmer, because none of the first few examples seemed like WTFs to me at all. I think a WTF is when you think you know a language and then see something that behaves completely unlike how you expect.


I was annoyed by the number of RTFMs they call WTFs. They even claim some WTFs where Python behaves exactly as a naive user would expect, e.g. if I put 5 and 5.0 in a dict they are treated as the same key.

I saw two valid ones: "Lossy zip of iterators" is because Python conflates iterables and iterators, "The disappearing variable from outer scope" where deleting the exception is contrary to how scope works everywhere else.

They also miss the classic newbie head-scratcher:

    x = []
    for i in range(10):
        x.append(lambda: i)
    x[0]()
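
For anyone who hasn't hit this one: each lambda closes over the variable `i`, not the value it had when the lambda was created. A sketch with hypothetical helper names (`make_late`, `make_early`), including the usual default-argument workaround:

```python
def make_late():
    """Each lambda closes over the variable i, not its value at append time."""
    funcs = []
    for i in range(10):
        funcs.append(lambda: i)
    return funcs

def make_early():
    """Workaround: freeze the current value as a default argument."""
    funcs = []
    for i in range(10):
        funcs.append(lambda i=i: i)
    return funcs

late_results = [f() for f in make_late()]
early_results = [f() for f in make_early()]
```

`late_results` is ten copies of 9, while `early_results` is 0 through 9.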


I'd also classify WTFs as something that behaves completely unlike how you'd expect based on knowing many other similar/related languages, also.


After seeing years / decade of such things about PHP, welcome to our world.


Wow.

(a) I don't know Python as well as I thought I did. (b) I suddenly never want to use it again.

All these edge cases! All these behaviors, which I'm sure were added with the noble intention of increasing developer convenience, but which I'm equally sure have cost a larger amount of developer sanity!


I've been writing python for about a year and change now (~7 years of engineering generally) and I never want to use it again.

It feels to me like javascript in that it's "popular" because people already use/know it. So that huge existing codebase is the equivalent to the web; if you want to build on it, you're stuck with this.

But my lord. Whitespace sensitivity is a terrible choice and there are piles of kludges trying to work around that.

It means no multi-line lambdas so you end up with these unreadable list comprehensions `[ sub_item.value for sub_item in item.sub_items in items if item.is_the_best ]`. Lines copied into the console care about indentation which is definitely not a fun DX.

Not to mention these random global functions everywhere. Whew.

Mistakes were made.


> Whitespace sensitivity is a terrible choice

It's my favourite feature of Python and I actively seek out other languages that make this terrible choice.

I regularly use Python and a couple of curly brace languages and coming back to Python always feels like a breath of fresh air.


> Python always feels like a breath of fresh air.

That’s a good way to put it. Significant indentation makes the syntax so much more lightweight.

This image [0] is supposed to be a joke, but to me it clearly demonstrates how braces and semicolons are both ugly and redundant.

[0] https://www.reddit.com/r/ProgrammerHumor/comments/2wrxyt/


> > Whitespace sensitivity is a terrible choice

> It's my favourite feature of Python

It's like the speed bump: It's for people who can't follow simple rules. It inconveniences you, but is rationalized with that it ultimately makes the world a little safer for everyone, including you.

Me, I unindent temporary code like debug-print statements and literals overriding actual data. This makes it impossible to commit by accident.

But my argument against whitespace sensitivity would be that bad and unreadable code is bad and unreadable regardless of its whitespace. In fact, force-formatting it just hides the evidence.


> It means no multi-line lambdas so you end up with these unreadable list comprehensions `[ sub_item.value for sub_item in item.sub_items in items if item.is_the_best ]`

How would you rather write that code? Is it lack of multi-line lambdas that stops you from writing it as you would like?


    items
      .select(&:is_the_best)
      .flat_map { |i| i.sub_items.map(&:value) }
Not having functional constructs really hurts readability, and I don't want to think about what would happen to the list comprehension with more complicated logic


> It means no multi-line lambdas so you end up with these unreadable list comprehensions

You can have multi-line lambdas, just use parens:

    (lambda: look.ma.im
        .on_two_lines)
And, yeah, if it's complex enough that it should have control flow, you just write a nested function.

And your example is more clearly expressed as a plain loop:

    new_list = []
    for item in items:
        for sub_item in item.sub_items:
            if item.is_the_best:
                new_list.append(sub_item.value)
I don't like the comprehension syntax. The correct comprehension is:

    [sub_item.value
     for item in items
         for sub_item in item.sub_items
             if item.is_the_best]
Hopefully that makes plain what they were going for, but in practice, list comprehensions are often a contradiction in terms.

> Not to mention these random global functions everywhere.

We ought to be able to have it both ways, it should be possible to lift a closure and treat the variables it closes over as arguments, but I wind up doing that manually just so I can test them directly.

> Lines copied into the console care about indentation which is definitely not a fun DX.

I think the issue with breaking copy and paste is the real valid complaint against indentation-sensitive languages. The tooling just isn't there, and this continues to be the case after 20 years.


Not allowing multi-line lambdas is beneficial for code readability. If your lambda needs multiple lines, you should write a function.


Chained promises are much easier to read when the steps are inlined as multi-line lambdas rather than having to read the code backwards for every function definition. And as much as I love list comprehensions, map() and filter() with 2-3 line lambdas also make a lot of sense.

Sure, you "should" use asyncio instead of promises but honestly it has its own problems, mostly in that it requires the rest of your legacy code base to also be async.


If you don't want to use a language because in theoretical scenarios it's possible to construct unintuitive operations with it, then what language are you left with?


I was going to bring up Haskell as a language with clear semantics, but even that has its quirks, like

    Prelude> [1, 3 .. 10] :: [Float]
    [1.0,3.0,5.0,7.0,9.0,11.0]
In Haskell, this is syntactic sugar for the function enumFromThenTo (in typeclass Enum), which in my view should not have a special case implementation for Float.


LISP?


    >>> row = [""] * 3  # row is ['', '', '']
    >>> board = [row] * 3
    >>> board[0][0] = "X"
    >>> board
    [['X', '', ''], ['X', '', ''], ['X', '', '']]

I almost felt helpless for an hour when I used this list initialization [0] the first time in my code and couldn't find the reason why my unit tests were failing.

[0] https://github.com/satwikkansal/wtfpython#-a-tic-tac-toe-whe...
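
A sketch of the cause and the usual fix: `[row] * 3` repeats a reference to one inner list, whereas a comprehension builds a fresh inner list per row.

```python
# Shared rows: [row] * 3 repeats a reference to the same inner list.
row = [""] * 3
shared_board = [row] * 3
shared_board[0][0] = "X"

# Independent rows: build a fresh inner list on each iteration.
board = [[""] * 3 for _ in range(3)]
board[0][0] = "X"
```

After this, only the first row of `board` contains the "X", while every row of `shared_board` does.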


Whooaaa! This is a gem and should be a handy reference to look up.

I knew about the hash function, and how it generates same hash value for objects that have same numerical value. But it's so easy to forget!
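
As a reminder of how that plays out with dict keys, a minimal sketch:

```python
same_hash = hash(5) == hash(5.0)     # equal numbers hash equally
same_value = 5 == 5.0

d = {}
d[5] = "int"
d[5.0] = "float"      # same key as 5, so this overwrites the value
```

The dict ends up with a single entry: the original key `5` is kept and only the value is replaced.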


Wow, some of those entries are painful to read.


That thing is good, it's just a bit of a pity the titles are so undescriptive. For example, a typical one (in other languages as well), a closed-over loop variable having the value of the last iteration in the closure, is called "the sticky output function". That means if this weren't all on one page, it would be hard if not impossible to find what you're looking for.


This seems to be another instance of the general situation where trying to be "helpful" by introducing a special-case rule (comparison chaining) that is intended to make some uses more convenient also introduces perplexing behaviour for other cases. I'm far more accustomed to the C-family languages, where parsing is usually quite uniform, so to me "these operators are binary and left-associative like all the others, except when more than one occur in an expression" seems like a trap. I wonder if the short-circuiting behaviour of the implicit && has also caused some surprises, even in the x < y < z() type of expressions that this feature was intended for.
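
The short-circuiting can be observed directly. A sketch, where `demo` and the inner `z` are hypothetical names and `z` simply records that it was called:

```python
def demo(x, y):
    """Evaluate x < y < z() and report whether z() was actually called."""
    calls = []

    def z():
        # Hypothetical expensive right-hand operand; records that it ran.
        calls.append("z")
        return 10

    result = x < y < z()
    return result, calls

skipped = demo(5, 3)    # x < y is False, so z() is never called
called = demo(1, 3)     # x < y is True, so z() runs and y < 10 is checked
```

So a `z()` with side effects only runs when the first comparison succeeds, the same as with an explicit `and`.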


Yes, I was going to comment much the same but with the addition that Joel Spolsky famously documented this phenomenon in his essay on The Law of Leaky Abstractions:

https://www.joelonsoftware.com/2002/11/11/the-law-of-leaky-a...


For an 18-year-old article, it sure is prescient --- the state of modern software seems to be all about gluing together numerous layers of abstraction and libraries without understanding them, with the result that whenever something goes wrong, as it inevitably will, it takes even longer to diagnose. The higher you are on the ladder of abstraction, the worse the fall.


But also, those high ladders of abstraction support some pretty cool software...


I can’t help but read most of the moaning about Python’s handling of chained comparisons in this thread as “I already know how Blubb handles comparisons, thank you very much, and if Python doesn’t do it exactly the same way Blubb does, Python is obviously stupid and its designers must be morons.”


I honestly don't think being able to write 'a < b < c' is worth making the language bigger and causing weirdness and gotchas like this. 'a < b && b < c' isn't that much longer and is instantly and unambiguously readable to programmers coming from hundreds of other languages.


I really agree with this. Maths <> programming, so while some notation sharing is good, don't (IMO) take it too far.

I don't like it either when languages get too helpful. I was recently doing some python and getting some very odd results. Turned out the index I was using to pull stuff from the list had gone negative (off by one error) so it was getting stuff from the end of the list rather than excepting as most languages would. That is

  y = [1,2,3]
  y[-1]  # unintentionally negative index
I'm not saying the ability to index with negatives like this is a bad thing in python, but I seriously question its value vs the footgun thing.

A while ago I was doing some SQL and used an answer off stack overflow. It was a good answer but it tried to be too flipping helpful. IIRC it was doing nontrivial date difference calculations[0]. Rather 'helpfully' if "(date2 - date1) < 0" it would kindly assume you'd given it arguments in the wrong order and silently flip them for you (to "date1 - date2") to get you a positive number, always.

This 'helpful' behaviour hid the presence of bad data (date2 should always have been chronologically later than date1 - date1 was interview, date2 was when job started).

Moral: keep programming language & library semantics simple.

[0] I remember now, it was the number of weekdays (that is, excl. sat/sun) between 2 dates.


Negative indexing (along with the related syntax for slicing) is one of the most useful python features. Numpy and pandas thrive on it.

Just learn the language, python is already one of the easiest languages out there.


I know python. I have rarely found negative indices useful except to pick out the last item of a list, and I've done a decent amount of it, though no numpy/pandas yet. In what way are negative indices particularly useful for these latter two?

I have no problem with slicing. You can't get it accidentally wrong because the syntax is visible. Negative indexing vs positive indexing can't generally be distinguished just by looking at the indexing statement, which is where I tripped up.


I also question the value and cost of syntactic sugar like chained comparison operators, negative indexes, even semantically significant whitespace/indentation.

They let the programmer take shortcuts, but unless they're always attentive to their implicit behavior, these can be a source of subtle bugs.

I say that as a one-time enthusiastic user of CoffeeScript, which felt so refreshing and elegant at first. Over time, especially working with other people's codebases, I came to a personal conclusion that it's not worth the convenience.

Syntax sugar can hide logic that would be better explicit. Even if it feels verbose, there's value in being able to see exactly what's happening.

An example of the latter that I've heard occasionally is how error handling is done in Go. Every single function call that can return an error must be explicitly handled (as far as I know). There could have been some sugar to make it simpler, but they preferred to keep it verbose. That was the right decision, in my opinion, and other aspects of the language give me the impression that it's a consistent design philosophy.


How would you feel about:

    y[end(1)]
    y[len - 1]
Either one then constructs a "reverse index" value, and the actual index is determined by the slicing operation.


Both of these are idiomatic because they're not built by composition.

  y[end(1)] 
is non-composed because AFAICS

  end(1) 
is meaningless by itself. Ditto the second although

  y[len(y) - 1]
would be closer to the mark.

If you're willing to accept a naive implementation that can be optimised, something like

  reverse(y)[1]
is compositional though inefficient as-is. You could though recognise this overall and optimise it to not reverse the lot before picking out a single item.


Allow me to justify why end(1) and len are meaningful:

Suppose indexing (and you can extend this to slicing if these are possible components of a Range type) is polymorphic.

list[Index] means to find an element at Index, starting at the beginning

list[ReverseIndex] means to find an element at ReverseIndex, starting from the end

Int is naturally a subtype of Index.

end(Int) would be a constructor for some concrete subtype of ReverseIndex that is distinct from Int. Or len would simply be shorthand for ReverseIndex(0). And ReverseIndex is also defined for the usual int arithmetic, except that it returns new values of ReverseIndex.
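
A rough Python sketch of that idea, under heavy assumptions: `FromEnd` and `StrictList` are invented names, `FromEnd(1)` is taken to mean "last element", and the arithmetic on reverse indexes is left out.

```python
class FromEnd:
    """Hypothetical reverse index: FromEnd(1) means 'last element'."""
    def __init__(self, offset):
        if offset < 1:
            raise ValueError("offset counts from 1 at the end")
        self.offset = offset

class StrictList(list):
    """A list that refuses plain negative ints and accepts FromEnd instead."""
    def __getitem__(self, index):
        if isinstance(index, FromEnd):
            position = len(self) - index.offset
            if position < 0:
                raise IndexError("reverse index out of range")
            return super().__getitem__(position)
        if isinstance(index, int) and index < 0:
            raise IndexError("negative index; use FromEnd(n) to count from the end")
        return super().__getitem__(index)

y = StrictList([1, 2, 3])
last = y[FromEnd(1)]
first = y[0]
try:
    y[-1]
    rejected = False
except IndexError:
    rejected = True
```

With this, an accidental `y[-1]` fails loudly, while counting from the end has to be spelled out at the call site.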


Interesting idea. I can see that end(1) would work though it's bringing in a lot of machinery just for a little thing (though if we looked harder that might generalise to other situations enough to make it viable).

Alternatively you could do the same with a method I think, something like

  y.end[1]
although y.fromEnd[...] might be more mnemonic. I don't see that y[len - 1] could work in this scheme, but whatever.

I like yer thinking!


Yeah this is exactly why negative indexing is a bad idea and why I'm against introducing this in V.


Agreed. It strikes me as an inelegant abuse of notation even when it's used in a maths context, albeit a convenient one. It violates the way we can usually reason about infix operators.

a < b is an expression which gives you a boolean value, true or false. Why then are we comparing whether it is less than, or greater than, some number?

Unlike with addition or multiplication:

a < b < c ≠ (a < b) < c

also:

a < b < c ≠ a < (b < c)

instead:

a < b < c = ((a < b) ∧ (b < c))

The same criticism does not apply to C's chained assignment expressions, a = b = c, but I dislike that for another reason: if the type of b is a narrower type than that of c, you may get an unexpected value assigned to a.


But in your proposed solution b would be evaluated twice. So you would have to add a new line, think of a new variable name, and so on.

  if lower < x.calculate_weight() < upper:
    ...
Suddenly turns into

  x_weight = x.calculate_weight()
  if lower < x_weight and x_weight < upper:
    ...
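The difference in evaluation count can be checked with a small sketch; `Item` is a hypothetical class that counts how often `calculate_weight()` is called.

```python
class Item:
    """Hypothetical object whose weight calculation we want to count."""
    def __init__(self, weight):
        self.weight = weight
        self.calls = 0

    def calculate_weight(self):
        self.calls += 1
        return self.weight

lower, upper = 1, 10

chained = Item(5)
in_range_chained = lower < chained.calculate_weight() < upper

spelled_out = Item(5)
in_range_spelled = (lower < spelled_out.calculate_weight()
                    and spelled_out.calculate_weight() < upper)
```

Both expressions are true, but `chained.calls` is 1 while `spelled_out.calls` is 2.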


And is that so bad? So often I did stuff like that, only to find I actually needed the result of x.calculate_weight() inside the condition, or needed to debug it or log it in some way. Then it has to be extracted anyway.


I tried this in gpython ( https://github.com/go-python/gpython ) and it works.

That isn't surprising, I suppose. What is surprising is that I wrote gpython and had no idea why it worked until I read the explanation on Stack Overflow about 5 times.

I guess that is the power of having implemented the grammar.

I always like it when my creations (programs or children) exceed me :-)


I find it cool to explore these edge cases, but putting anything like this in a real code base is a terrible idea BECAUSE there are so many different ways to interpret it. Sure a < b <= c has a clear mathematical meaning which works towards python's overall mission of being clear, but in general please good people only have one or two variables in your conditional statements!


I find something like this totally plausible and yet it is completely wrong.

    >>> def check_parity(x, expect_odd):
    ...     odds = [ 1, 3, 5 ]
    ...     if x in odds == expect_odd :
    ...             print("ok")
    ...     else:
    ...             print("error")
    ... 
    >>> check_parity(5, True)
    error
Crazy!


If I understand what you're getting at changing '==' to 'and' will make it work the way you're expecting. Both have to evaluate to 'True' in the if statement and passing 'False' to expect_odd will always print 'error'.


That wouldn't work for the even parity case.

"(x in odds) == expect_odd" gives the intended behaviour and I think is easier to read as well.


I know how to make this work, but a bug like this would be hard to spot.


Can you explain why it doesn't work?


Because chaining the operator results in:

> if x in odds == expect_odd :

> if x in odds and odds == expect_odd :

with the latter always evaluating to False (assuming expect_odd is a boolean flag).
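
Spelled out as a sketch, with the same values as the example above:

```python
odds = [1, 3, 5]
x, expect_odd = 5, True

as_written = x in odds == expect_odd                # chained comparison
desugared = (x in odds) and (odds == expect_odd)    # what Python evaluates
intended = (x in odds) == expect_odd                # explicit grouping
```

`as_written` and `desugared` are both `False` because the list never equals `True`; only `intended` is `True`.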


I'm not a Python expert so I don't know much about the internals of Python but my hunch is when Python evaluates 'x in odds' it's not storing that value internally as a boolean so when you use the '==' strict comparison Python is expecting both data types to be the same.


It's: if (x in odds) and (odds == expect_odd).

The 2nd expression is obviously false always.


This is neat. My first reaction was confusion and a bit of shock. But then it made sense. And it makes a lot of sense.

Part of the issue is that “==“ and “is” are intermixed. That emphasizes the weirdness but detracts from understanding the underlying mechanism that is at work.

If you look at

True == False == False

It makes a bit more sense that it evaluates the way it does.

If you do

1 == 2 == 2

and it evaluates to False, then it is perfectly clear.
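
One way to see the mechanism is to look at how the expression parses. A sketch using the standard `ast` module, with a hypothetical helper `comparison_ops`:

```python
import ast

def comparison_ops(source):
    """Return the operator names in the top-level comparison of `source`."""
    node = ast.parse(source, mode="eval").body
    assert isinstance(node, ast.Compare)
    return [type(op).__name__ for op in node.ops]

chained = comparison_ops("True == False is False")      # one Compare node, two operators
grouped = comparison_ops("(True == False) is False")    # outer Compare has a single operator
```

The unparenthesized form is a single comparison node carrying both `Eq` and `Is`, which is why neither operator "goes first".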


There are many kinds of chained operators (in some languages even associative arithmetic operators are chained). Contrary to popular belief, this has nothing to do with chained operators but rather with operator precedence.

It is pretty common that arithmetic comparison operators are grouped into a single precedence level, and that's not a problem. But in Python `is`, `is not`, `in` and `not in` are also in that level. In particular, the two operands of `in` and `not in` have different [1] types, unlike the others. Mixing them is, with or without chained operators, almost surely incorrect.

This kind of precedence issue can be solved by introducing non-associative pairs of operators (or precedence levels), something that---unfortunately---I don't see much in common programming languages. Ideally Python's operator precedence table should look like this (compare with the current documentation [2]):

    Operator                                Description
    --------------------------------------  ------------------------
    ...                                     ...
    
    `not x`                                 Boolean NOT
     _______________________________________________________________
    |
    | The following groups do not mix with each other.
    | Use parentheses to clarify what you mean.
    | ______________________________________________________________
    ||
    || `in`, `not in`                       Membership tests
    ||
    || `is`, `is not`                       Identity tests
    ||
    || `<`, `<=`, `>`, `>=`, `!=`, `==`     Comparisons
    ||______________________________________________________________
    |_______________________________________________________________
    
    `|`                                     Bitwise OR
    
    ...                                     ...

In fact, there is already one non-associative pair in Python: `not` and virtually every operator except boolean operators. It is understandable: the inability to parse `3 + not 4` is marginal but you don't want `3 is not 4` to be parsed as `3 is (not 4)`. My point is that if we already have such a pair, why can't we have more?

[1] With an exception of strings (`"a" in "abcdef"`). I hate that Python doesn't have a character type.

[2] https://docs.python.org/3.8/reference/expressions.html#opera...


It seems like only operators with a nice transitivity should be supported. x < y < z. x == y == z. x is y is z. That kind of thing. x != y != z doesn't work because in normal language, you'd say that to mean that they're all unique, while allowing it the python way, it doesn't imply x != z.
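A small sketch of that `!=` point, assuming hashable values so a set can be used for the "all unique" check. Both function names are hypothetical.

```python
def chained_not_equal(x, y, z):
    # Python reads this as (x != y) and (y != z); x and z are never compared.
    return x != y != z


def all_distinct(x, y, z):
    # What "x != y != z" usually means in plain language: no two are equal.
    return len({x, y, z}) == 3
```

For the inputs `1, 2, 1` the chained form is True even though the first and last values are equal, while `all_distinct` is False.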


I wrote up a post on this same kind of expression: https://banna.tech/post/chained_conditional_expressions_in_p...


You're mixing comparison operators (`==` and `is`), which is a code smell.

It doesn't matter what the result is - you know it's going to bite you eventually. If you run a linter on this it would correctly yell at you.


Don't jump to conclusions too fast about a reduced example to demonstrate a problem.

I probably originally had 2 expressions that evaluated to booleans. I may have been using `is` to check that the type of one was actually a bool rather than just falsey.


Someone who writes code like this has a sad life.


IIRC, I originally had 2 expressions that evaluated to booleans that I was comparing. I don't remember if I had parens, but I do remember being deeply confused once there were no parens.


does ruby do anything like this? i'm new to ruby and could imagine something like this biting me ...


Yes. Check out the differences between `&&` vs. `and`, along with `or` vs. `||`, which can lead to similar surprises.

Though you can skip this lesson if you've worked with Perl (any others?) in the past.


I agree with the folks at Airbnb regarding the and, or, and not keywords. "It's just not worth it." [1].

[1] https://github.com/airbnb/ruby#no-and-or


not really. there’s some weird operator precedence, but it doesn’t have any multi-operator expressions


One weird operator that comes to mind is the flip-flop operator, but the odds of encountering it are close to zero, and it would certainly stick out as something very bizarre and not confused with other syntax.

https://chrisseaton.com/truffleruby/flip-flops/


!?! I’ve been working in ruby for 15 years and I had never heard of the flip-flop operator! Amazing. Thanks for the counter-example, I have learned something.


  >>> True is (False is False)
  True
  >>> True == (False is False)
  True


Explicit is better than implicit.

   >>> (True == False) is False
   True


Oh, hi!

Yeah, that was a real head-scratcher.


170 comments here so far because of some syntactic sugar. If it causes this much discussion and confusion it's not worth it IMO. I'm glad none of the languages I use have this "feature".


tl;dr: if you know why `1 < 2 < 3` is True, then you know why `True == False is False` is False. Forget other languages for a moment. In Python, a chained comparison such as `1 < 2 < 3` means `1 < 2 and 2 < 3`, so `True == False is False` means `True == False and False is False`, which is `False and True`. So the result is False.
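The same thing as a runnable sketch, with the operands passed in as arguments. The four function names are made up here; they just label the chained form, its `and` expansion, and the two parenthesized readings.

```python
def chained(a, b, c):
    # The expression from the question, with variables instead of literals.
    return a == b is c


def expanded(a, b, c):
    # What the chain means for already-evaluated operands.
    return (a == b) and (b is c)


def left_grouped(a, b, c):
    return (a == b) is c


def right_grouped(a, b, c):
    return a == (b is c)
```

Called with `(True, False, False)`, `chained` and `expanded` both give False, and both parenthesized versions give True, which is why neither grouping explains the result.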


This can be fixed by one of my favorite Python oddities,

True = False

This is right up there with default arg instances getting cached across calls, though it's perhaps better suited for an underhanded Python competition.

Have fun with it. Redefine it to be true 90% of the time.


You'll need to find another one, the above results in SyntaxError: cannot assign to True


I could have sworn that was still a thing in Python 3, but it looks like they've formally been keywords since 3.0.

I suppose it took awhile for the default Python shipped with systems to be 3.*, because I show people this anytime Gary Bernhardt's "wat" talk is brought up.

Edit:

Here's some of the fun from Python 2.X not treating True and False as keywords:

https://stackoverflow.com/questions/13665989/in-python-how-t...


> default arg instances getting cached across calls

That is not what is happening at all, logically or semantically. Effectively this is.

    # People not understanding when the 
    # function definition including arguments is evaluated.
    mutable_instance = list()
    def func(change_me=mutable_instance):
        pass


Wouldn’t something like this have been better?

    def func(change_me=None):
        if change_me is None:
            change_me = list()
        ...
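To see the difference between the two definitions, here is a minimal sketch. The names `append_shared` and `append_fresh` are hypothetical; the point is only that a default value is evaluated once, when the `def` statement runs, and the `None` idiom builds a new list per call.

```python
def append_shared(item, bucket=[]):
    # The default list is created once, at definition time, and then
    # reused by every call that omits `bucket`.
    bucket.append(item)
    return bucket


def append_fresh(item, bucket=None):
    # A new list is created on each call that omits `bucket`.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket
```

Calling `append_shared(1)` and then `append_shared(2)` returns the same list object both times, so the second call sees `[1, 2]`; `append_fresh` returns `[1]` and then `[2]`.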


It's just an operator precedence problem. Add parentheses and it goes away.

    Python 3.6.9 (default, Apr 18 2020, 01:56:04) 
    >>> True == False is False
    False
    >>> (True == False) is False
    True
There are worse problems with Python's "is".

    >>> 1+1 is 2
    True
    >>> 1000+1000 is 2000
    False
This comes from a bad idea borrowed from LISP. Numbers are boxed, and the small integers have boxes built in for them. Larger numbers have boxes dynamically generated. In Python "is" means "in the same box". This corresponds to (eq a b) in LISP.[1] Exposing the implementation like that might have been a good idea when McCarthy came up with it in 1960.

[1] http://www.cs.cmu.edu/Groups/AI/html/cltl/clm/node74.html
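A hedged sketch of the value versus identity distinction, assuming CPython. Its documentation describes a preallocated array of small integers (-5 through 256), which is an implementation detail, not a language guarantee. The values are built at runtime with `int()` so that compile-time constant folding doesn't merge them; the helper names are made up.

```python
def same_value(a, b):
    # Equality: do the two objects represent the same number?
    return a == b


def same_object(a, b):
    # Identity: are the two names bound to the very same object?
    return a is b


big_a = int("2000")
big_b = int("1000") + int("1000")

# Equal values; whether they are the same object is up to the implementation.
values_match = same_value(big_a, big_b)
objects_match = same_object(big_a, big_b)
```

Only `values_match` is something code should rely on; `objects_match` can differ between interpreters and versions.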


It's not operator precedence, both (True == False) is False and True == (False is False) are true.

As explained in the link, it's because of chained comparisons: it expands to True == False and False is False. Chaining is more useful for expressions like 1 < x < 3.


I don't think that different equalities like "==" and "is" should be chained together; that is completely wrong.

If it must be done for consistency in syntax/parsing, then such combinations should be semantically diagnosed and rejected.

"x is y is z" -> good.

"x == y == z" -> good.

"x == y is z" -> WTF, error.

All the operators in the "relational cluster" should belong to the same equivalence family.
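A rough sketch of what such a diagnostic could look like, using the standard `ast` module. The family grouping and the name `mixed_chains` are assumptions made for illustration (ordering and equality operators are lumped together as "value" comparisons); this is not an existing linter rule.

```python
import ast

# Hypothetical grouping of comparison operators into families.
FAMILY = {
    ast.Eq: "value", ast.NotEq: "value",
    ast.Lt: "value", ast.LtE: "value",
    ast.Gt: "value", ast.GtE: "value",
    ast.Is: "identity", ast.IsNot: "identity",
    ast.In: "membership", ast.NotIn: "membership",
}


def mixed_chains(source):
    """Return (line, column) of each chained comparison that mixes families."""
    found = []
    for node in ast.walk(ast.parse(source)):
        # A chain is a single Compare node with more than one operator.
        if isinstance(node, ast.Compare) and len(node.ops) > 1:
            families = {FAMILY[type(op)] for op in node.ops}
            if len(families) > 1:
                found.append((node.lineno, node.col_offset))
    return found
```

With this grouping, `x is y is z` and `x == y == z` pass, while `x == y is z` and `x in odds == flag` are reported.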


FYI, it seems more complicated than just precedence:

    >>> True == False is False
    False
    >>> True == (False is False)
    True
    >>> (True == False) is False
    True


You beat me to it. :)

At first glance this looks like an easy problem. But after actually building the different parse trees I would expect here, in all cases it should be True.


For the unparenthesized case, it must be building a parse tree that is different from the parenthesized cases, namely a ternary parse tree node for an imaginary ==is op with screwy semantics.

So that is to say:

  a == (b is c)  -->         ==
                            /  \
                           a   is
                              /  \
                             b    c


  (a == b) is c  -->            is
                               /  \
                             ==    c
                            /  \
                           a    b


   a == b is c   -->           ==_is
                              /  |  \
                              a  b   c
Assuming that the == and is are operator tokens and a, b, c are operands, that seems to be one rational hypothesis.

Another hypothesis is that == and is are not pure functions that operate on values, but are special operators that operate on syntax, and treat a parenthesized expression differently from an unparenthesized one.
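The first hypothesis can be checked with the standard `ast` module, which represents a comparison as one `Compare` node holding a list of operators and a list of operands. A small sketch (the helper name `compare_shape` is made up, and it assumes the source is a single comparison expression):

```python
import ast


def compare_shape(source):
    """Return (operator names, operand node types) for a comparison expression."""
    node = ast.parse(source, mode="eval").body
    ops = [type(op).__name__ for op in node.ops]
    operands = [type(n).__name__ for n in [node.left] + node.comparators]
    return ops, operands


unparenthesized = compare_shape("a == b is c")
left_grouped = compare_shape("(a == b) is c")
right_grouped = compare_shape("a == (b is c)")
```

The unparenthesized form comes back as a single node with two operators and three plain operands, while each parenthesized form is a one-operator node with a nested `Compare` as one operand.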

(In what way does this sort of thing belong in a self-proclaimed newbie-friendly language? This has to be a bug.)


I'm convinced it is a newbie friendly language only because they are so vehement in claiming that.


Multi-way equivalence is more useful if it has OR semantics rather than AND.

If you're working in TXR Lisp, you can use the meq, meql and mequal functions for three or more argument equality.

The call:

  (meq a b c ...)
does not mean anything similar to:

  (and (eq a b) (eq b c) ...)
but the semantics (except for operand evaluation) is like:

  (or (eq a b) (eq a c) (eq a d) ...)
and likewise for the other two. It's testing whether the left argument is equal to at least one of the remaining arguments.

These functions can be called with only one argument, in which case they yield false, just like (or).

These functions are very useful in cond statements that include some tests that prevent conversion into caseq/caseql/casequal.

  (cond
    ((meql x 1 2 3) ... ) ;; x is one of 1 2 3
    ((> x 10) ...)        ;; x is greater than 10
    (t ...))              ;; otherwise
m stands for "multi"; or, if you like, it stands for "member" because (meql x a b c) replaces (memql x (list a b c)).
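For readers who don't know TXR Lisp, a rough Python analogue of the OR semantics described above. This is only an illustration of the idea, not how TXR implements it; `meq` here uses identity and `mequal` uses `==`.

```python
def meq(first, *rest):
    # True if `first` is the same object as at least one remaining argument.
    return any(first is other for other in rest)


def mequal(first, *rest):
    # True if `first` compares equal to at least one remaining argument.
    return any(first == other for other in rest)
```

With a single argument both return False, since `any` of an empty sequence is False. For the common equality case, Python's built-in `x in (1, 2, 3)` does the same job.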


Having a tool like that makes sense. That it is the un-asked-for default doesn't.

That is, I think I'm in violent agreement.


The fact that the expression has an evaluation strategy that does not correspond to either parenthesization is not "complicated".

Rather, there is a more fitting adjective, for which a friendly euphemism can be found: "pythonic".


eq may expose implementation (it's even called "implementation equality"), but it's very useful.

Implementation equality lets you use an object as a key in a lookup mechanism which adds external associations with an object (which could be de facto properties of that object, or links to other objects). You cannot do that with an equality that deems similar objects to be equal.

eq does not mean "in the same box". In many implementations, it means "same bit pattern". Two unboxed integer objects (fixnums) are eq if they are the same integer (bit pattern). However, the ANSI Common Lisp standard doesn't require small integers to be eq to themselves. eql is used for testing for "same number, and eq equality for everything else", which means that 95% of the time you want eql if you think you want eq. The other 5% is when you're sure you're only comparing symbols, or else objects such as structures or CLOS objects for their identity.


It's not demonstrating an operator precedence problem because you still aren't explaining why `True == False is False` returns False and that was the question.

In fact, a problem is that it looks like an operator precedence issue at first glance. Of course, TFA explains the answer.


I don't think it's that bad. "is" shouldn't be used for equality like that and most people won't think to because it's weird. Maybe they should have named it __is__ so people know it's a scary operation that might confuse beginners.


So all Python numbers are objects and the implementation automatically pre-allocates and reuses instances for small numbers? What a strange implementation. I thought it used tagged pointers like other dynamic languages. In Ruby, values can be pointers to objects but also 63 bit numbers when the least significant bit is set.

https://en.wikipedia.org/wiki/Tagged_pointer

This issue of object identity and equality isn't unique to Lisp but I agree that applying it to numbers produces unexpected results.


It applies to Ruby as well, just in different cases:

    irb(main):020:0> 10.0.__id__
    => 81064793292668930
    irb(main):021:0> 10.__id__
    => 21
    irb(main):022:0> 10==10.0
    => true
And for very large numbers:

    irb(main):031:0> 1e200.__id__
    => 340
    irb(main):032:0> 1e200.__id__
    => 360
So it's a different threshold and different cases, but as a general rule, "equality and identity are different" still holds.


Yes. Floating point numbers cannot be folded into object pointers and are always copied into newly allocated objects. Integers that need more than ((sizeof(long) * CHAR_BIT) - 1) bits will actually become arbitrary precision integer objects.
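The tagging scheme is easy to model with plain integers. This is a sketch in Python of the encoding described in this subthread (lowest bit set means "small integer", the remaining bits hold the value), not Ruby's actual implementation; the function names are made up. It is consistent with the `10.__id__ => 21` output above.

```python
def tag_fixnum(n):
    # Shift the value left one bit and set the lowest bit as the tag.
    return (n << 1) | 1


def is_fixnum(value):
    # A tagged small integer has its lowest bit set.
    return value & 1 == 1


def untag_fixnum(value):
    # An arithmetic right shift drops the tag bit and restores the value.
    return value >> 1
```

`tag_fixnum(10)` gives 21, and `untag_fixnum` reverses it, including for negative numbers because Python's `>>` is an arithmetic shift.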


> I thought it used tagged pointers like other dynamic languages.

No. IIRC tagged pointers were considered too complex to expose via Python’s C API.


In Ruby C extensions it works like this:

  int FIXNUM_P(VALUE object_pointer) { return object_pointer & 1; }
  long FIX2LONG(VALUE object_pointer) { return object_pointer >> 1; }

  VALUE ruby_value;
  long number;

  if (FIXNUM_P(ruby_value))
     number = FIX2LONG(ruby_value);
I thought this was pretty simple. What does Python's C API look like?




Java does the same thing for small numbers.


Yeah, the Integer class maintains a cache of all signed 8 bit values:

https://stackoverflow.com/a/3131208/512904

In Java, there's a clear distinction between a primitive int value and an Integer object. Not all languages have this property. I thought Python didn't.


How is that operator precedence? Putting the parens around "False is False" also makes the expression True. The reason (chained comparison) is explained in the link and I don't think it amounts to operator precedence....(?)


Had to test it out myself.. seems like the odd boxing behavior is gone in 3.7.4. But present in 3.6.6.


EQ in Lisp means 'same object'.



