Epicycles. Experience leaves me inclined to believe the code precedes the definition.

  def all(iterable):
    for element in iterable:
      if not element:
        return False
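    # an empty iterable never enters the loop, so we fall through to True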
    return True
https://github.com/python/cpython/blob/master/Python/bltinmo...



But that doesn't tell you _why_ that choice is a good idea!

My favourite explanation comes from property based testing.

In general, you have the following property: For any lists xs and ys,

  all(xs) and all(ys) == all(xs + ys)
If you want to keep that property as simple as possible, you have to define

  all([]) == True
If you want to go with the opposite, your property becomes more complicated.

Mathematical definitions are somewhat arbitrary. It's up to us to choose definitions that make sense in our contexts. A general desire to make the resulting systems simple and uniform has worked out well in many, many contexts.

Slight detour: that's also why defining 2 - 5 to have an answer is a good idea! Negative numbers were seen as a bit suspect, but they behave perfectly sensibly. Same for complex numbers.

But: still we generally leave 0 / 0 undefined, because no answer would lead to a satisfying system.


I love this. But I feel like there are people (like you and me) who find an explanation like this awesome and compelling, and then there are others who have a hard time getting on the same wavelength, let alone agreeing. Like for example, a similar sort of reasoning leads to min(x, y) breaking ties in favor of x, but max(x, y) breaking ties in favor of y, e.g. because [min(x, y), max(x, y)] should stably sort the objects. (There were discussions of how C++'s definition is broken a while back, if you Google hard enough.)
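
To make the tie-breaking point concrete, here's a quick Python sketch. Python's built-ins happen to share the problem, since both min and max return their first argument on ties; stable_min/stable_max are hypothetical names of mine:

    x, y = 1, 1.0   # compare equal under ==, but are distinct objects

    print([min(x, y), max(x, y)])   # [1, 1] -- y was silently dropped

    def stable_min(a, b):
        return b if b < a else a    # ties go to a

    def stable_max(a, b):
        return a if b < a else b    # ties go to b

    print([stable_min(x, y), stable_max(x, y)])   # [1, 1.0] -- a stable sort of (x, y)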

But whenever I've tried to explain things like this to people, I've found so many just don't seem to see the issue, or to remotely care if they do. Their intuitive responses tend to be:

- It's an obscure issue, it's never gonna come up.

- Oh, well I mean it's obviously your fault for calling it that way with that implicit assumption. Why would you think that?

- Well it's the caller's responsibility to read the documentation/look at the source/test the code before they use this function, not my problem to try to fit it into every mathematically possible use case.

I feel like the notion of putting more thought into what seems like an elementary API, so that these issues don't come up and trip up users in the first place, strikes them as a completely unjustifiable academic exercise, if not an outright foreign one. Have you found any effective way to communicate stuff like this and actually convince people?


The very concept of "breaking ties" with min/max suggests that something woolly and ill-defined is going on; surely x and y are either equal or they are not?


That's everyone's instinctive reaction, but it's completely wrong, and everyone will know this too, if you just give the right examples. Like say you compare people by their age. That's a perfectly fine operation sufficient for sorting, min, max, etc. Now you find two people have the same age. Are they the same person?

Or let's say you're sorting strings case-insensitively. "a" might be neither less than nor greater than "A", meaning they'd be unordered with respect to each other. But that doesn't mean you suddenly wouldn't care if one got duplicated and the other got deleted.
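
In Python terms (a small sketch using the built-in sort):

    words = ["b", "A", "a", "B"]
    # "A" and "a" are unordered under this key, but they're not equal;
    # the stable sort keeps their original relative order instead of
    # conflating, duplicating, or dropping either one.
    print(sorted(words, key=str.lower))   # ['A', 'a', 'b', 'B']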

Fundamentally "unordered" just means "neither comes before nor comes after". It doesn't imply "equal to". You can wrap this in a lot of fancy math terminology about partial and total orders, but it's not that complicated conceptually; the idea is just that the lack of an ordering doesn't imply equality, just like how you and your twin aren't the same person. This is one aspect of why C++ is introducing more sophisticated comparisons in C++20 (I would suggest looking it up).


> Like say you compare people by their age. That's a perfectly fine operation for sorting, min, max, etc. Now you find two people have the same age. Are they the same person?

They have the same age, which might be the minimum age of the group. I don't think it makes sense to talk about the minimum person of the group; minimum/maximum imply that it's the thing itself which is ordered.


The reaction to this point seems to have been dismissive (ironic in a thread that started with a complaint about dismissing small distinctions!), but I think the distinction you're making is an important one. In Haskell, it's the difference between `minimum` and `minimumBy`.

https://hackage.haskell.org/package/base-4.12.0.0/docs/GHC-L...

https://hackage.haskell.org/package/base-4.12.0.0/docs/Data-...

The point about strings is different. It doesn't make sense in general to speak of the minimum string in a list, or, more generally, the minimum of a partially ordered set; but it does make perfectly good sense to sort a partially ordered set, with the idea that there are multiple correct sorts when elements are incomparable. That, of course, is where the notion of a stable sort becomes important.


How about a priority heap? It's much more useful to sort things by their min priority, but just telling me the priority at the end isn't helpful. You'll want the thing as well.

(Even more concretely consider a min heap used in pathfinding, where you store the (distance_so_far, (x,y)) pairs, and they sort by distance_so_far.)
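
With Python's heapq, for instance (a minimal sketch):

    import heapq

    # (distance_so_far, (x, y)) pairs; the heap orders by distance first
    frontier = [(5, (1, 2)), (3, (0, 1)), (3, (2, 0))]
    heapq.heapify(frontier)

    dist, node = heapq.heappop(frontier)
    print(dist, node)   # 3 (0, 1) -- you get the node back, not just its distance
    # (equal distances fall through to comparing the (x, y) tuples,
    # since Python compares tuples lexicographically)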


> How about a priority heap? It's much more useful to sort things by their min priority, but just telling me the priority at the end isn't helpful. You'll want the thing as well.

Sure, but again, at that point you're not taking the minimum thing, you're taking the thing with the minimum priority. And it wouldn't be strange to say that two things have the same priority (even if they're different things).


> I don't think it makes sense to talk about the minimum person of the group; minimum/maximum imply that it's the thing itself which is ordered.

Uh, it totally makes sense, I just gave you a string example too. The comparator doesn't have to distinguish unequal elements, and a min (and argmin and sort et al.) is perfectly fine and sensible with such a comparator.

Do you code in Python at all? Do you find the fact that 1.0 and 1 are both simultaneously equal in some sense and unequal in another sense to be just complete nonsense? Do you expect min(1.0, 1) to error...? Or for sorted([1.0, 1]) to return [1, 1] just because the elements are "equal" so obviously what right do you have to care which one is returned?
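
A quick sanity check in CPython, for anyone following along:

    print(1 == 1.0)           # True  -- equal as numbers
    print(min(1.0, 1))        # 1.0   -- the first argument wins the tie
    print(min(1, 1.0))        # 1     -- ...so argument order matters
    print(sorted([1.0, 1]))   # [1.0, 1] -- stable: the original order survives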


> Uh, it totally makes sense, I just gave you a string example too. The comparator doesn't have to distinguish unequal elements, and a min (and argmin and sort et al.) is perfectly fine and sensible with such a comparator.

Sorting is not the same thing as taking the minimum.

> Do you code in Python at all? Do you find the fact that 1.0 and 1 are both simultaneously equal in some sense and unequal in another sense to be just complete nonsense? Do you expect min(1.0, 1) to error...?

Yes, and this is a large reason why I prefer to work in languages that let me manage such distinctions more carefully.


Not if you're trying to do a stable sort.

If you've got a list of people that is already sorted by name, and you want to sort them by age but still preserve the name sorting for each age (ie, if Alice and Bob are both 30 years old, I still want Alice to appear before Bob in the list), then having a defined way to break ties greatly simplifies the code.
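
Something like this (a small sketch):

    # already sorted by name
    people = [("Alice", 30), ("Bob", 30), ("Carol", 25)]

    # Python's sort is stable, so sorting by age alone keeps Alice
    # before Bob within the 30-year-old tie:
    print(sorted(people, key=lambda p: p[1]))
    # [('Carol', 25), ('Alice', 30), ('Bob', 30)]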


You wouldn't use min/max as part of sorting though.


min and max can also be comparing one part of a more complex object:

    In [1]: l = [(19, "John"), (19, "Jack"), (20, "Jim")]

    In [2]: min(l, key=lambda x: x[0])
    Out[2]: (19, 'John')


0/0 isn't the best example, because every answer leads to a satisfying system—i.e., it'd be perfectly fine to have a multiply valued quotient here; we'd just have to let it infect all the rest of our arithmetic operations. It's 1/0 and other such fractions that have no sensible answer.


I'm a bit confused.

You can make 1/0 return infinity, and preserve some properties of fields. Though that works best if your zeroes are also signed.

But for 0/0, I don't see nearly as many properties you can rescue.


Whatever value x you assign to 1/0, it had better be true that multiplying it by 0 gives 1; and the properties of a field give that 0x = (0 + 0)x = 0x + 0x, so that 0x = 0. You thus would have to give up either distributivity or additive inverses, which are pretty dear to me!

If, on the other hand, you regard y = 0/0 as a "multi-valued variable" standing for any element of the field, then it behaves perfectly fine, since 0y = 0. The only problem is that it's infectious, so that just about any arithmetic computation involving it, like y + 1 or 2y, also suddenly has to stand for every element of your field. (If you work over a ring, then y + 1 stands for any element and 2y only stands for elements of the ideal generated by 2, etc.) Computations like 0y and y + (-y) both still yield 0.

This is not very useful—at least I can't see anything useful to come of it—but, aside from the infectious multi-valuedness, I don't see what problems result.


  all(xs) and all(ys) == all(xs + ys)
If I understand you correctly, the "and" should be an "implies" instead:

  all(xs) => (all(ys) == all(xs + ys))
Otherwise this law cannot be true since there exist xs for which !all(xs).


I think they just got their operator precedences mixed up. They meant:

    (all(xs) and all(ys)) == all(xs + ys)
This is why you should always use parentheses unless it's absolutely obvious! I certainly had to check the docs to find out which way round it goes. I guess normally you're not dealing with Boolean variables so having == bind tighter makes more sense:

    y is not None and x == 7
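
And a quick check that the parenthesized law actually holds, and that it depends on all([]) being True:

    xs, ys = [True, False], [True]
    assert (all(xs) and all(ys)) == all(xs + ys)   # holds for these lists
    assert (all([]) and all(ys)) == all([] + ys)   # would fail if all([]) were False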


Yes, indeed! Sorry, I was deviating from the Python order of precedence.


> But that doesn't tell you _why_ that choice is a good idea!

That wasn't necessarily a choice, it may have just been an oversight not to deal with the "empty list" case.

The fact that this blog post exists is a testament to this behavior being surprising.

I don't think any of the arguments - philosophical or otherwise - are very good from a practical standpoint, that is: isn't this behavior more likely to cause harm?

My gut feeling says that this behavior is somewhat unsafe, but then again Python is probably not the language to use when safety is a big concern.

A hypothetical safer language might be better off defining "all/any" only for non-empty lists.
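
To make that concrete, here's a sketch of what such a strict variant could look like (strict_all is a hypothetical name, not a real built-in):

    def strict_all(iterable):
        it = iter(iterable)
        try:
            first = next(it)
        except StopIteration:
            # force the caller to handle the empty case explicitly
            raise ValueError("strict_all() is undefined for empty iterables")
        return bool(first) and all(it)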


Haskell, the prototypical careful language, defines its equivalents of 'any' and 'all' exactly as Python does. And that's not an accident.

> That wasn't necessarily a choice, it may have just been an oversight not to deal with the "empty list" case.

If your general code gives you an answer for a corner case, it's a good hint that this might be a reasonable answer. Not a guarantee, though.


Functional languages tend to elevate some mathematical ideals before concerns of practical safety, performance or just common sense.


On my cynical days, I'd be inclined to grant the latter two. But do you have an example of the first?


It boils down to the position that side-effects are actually totally fine when they allow you to write a simpler, more straightforward program, where state is easy to observe and debug.

Such code is less likely to contain obfuscated logical errors arising from an attempt to not break "functional purity". It has a chance to be practically more safe.

Another aspect is the way most functional languages treat memory. If you need a garbage collector, you effectively have to throw out hard realtime constraints. Latency and memory usage can be safety concerns. Most functional languages do not let you reason about that.


Thanks for the answer!

Garbage collectors are used in most modern languages. Most implementations of garbage collectors don't give any real time guarantees. And neither do typical free/malloc implementations.

You can do functional programming without garbage collectors. Eg via linear types / uniqueness typing.

But the more general answer here is: whether your tool is appropriate depends on your problem.

Hard real time constraints are rare for most software. Most languages in widespread use don't support those out-of-the-box. Eg C certainly doesn't.

> It boils down to the position that side-effects are actually totally fine when they allow you to write a simpler, more straightforward program, where state is easy to observe and debug.

You know that eg Haskell allows you to write imperative code just fine? It's actually one of the more pleasant languages for imperative programming.

And it's pretty easy to get proponents of FP to admit that imperative programming has its uses. Much easier at least, than to get them to say positive things about Java-style OOP.

> Such code is less likely to contain obfuscated logical errors arising from an attempt to not break "functional purity". It has a chance to be practically more safe.

As I said most functional languages support imperative programming styles just fine. But, of course, your complaint remains more valid if you are redirecting it against certain functional programmers. Languages are tools to be used by people. Some of them more flexible, some more boneheaded.


> Latency and memory usage can be safety concerns. Most functional languages do not let you reason about that.

Most languages of any kind do not let you reason about that. You're not wrong, but neither is this some specific issue with functional languages. Indeed I'd say functional languages are at the forefront of trying to bring in ways to reason about those, with things like Linear Haskell or https://www.microsoft.com/en-us/research/publication/coq-wor...


> Most languages of any kind do not let you reason about that.

I might as well generalize my statement to:

"Programming languages tend to elevate some ideals before concerns of practical safety, performance or just common sense."

It just wouldn't make as much sense in-context.


> "Programming languages tend to elevate some ideals before concerns of practical safety, performance or just common sense."

That statement would be difficult to substantiate, particularly in the context of functional languages. Practical safety, performance, and common sense are absolutely priorities; the ideals that functional language design follows are servants of those goals, not the master.


Maybe it's because I studied mathematics, but to me `all([]) == True` seems so obviously true and logical that any other behavior would be extremely surprising, and unsafe, for the opposite reason.


It's pretty obvious from any standpoint.

all([True, True] + []) is equivalent to `True and True and ?`. If you replace ? with False, then all() can never return True at all, because given any list you can create an equivalent list by appending [], which would taint the list into always returning False from all(). Obviously nobody wants all() to return False on a valid list, and the concept of tainting a list is pretty stupid, so True is the only option. If you replace ? with True, you can append as many empty lists as you like, which is the desired behavior.

all([True, True] + [] + [] + []) == True and True and True and True and True
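
Or, in runnable form:

    xs = [True, True]
    assert all(xs + []) == all(xs)   # appending [] must not change the answer
    assert all([]) == True           # so all([]) must be the identity of `and`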


> It's pretty obvious from any standpoint.

If it was obvious from any standpoint, we obviously wouldn't be having this blog post or this discussion thread.

Natural language - the natural way to think about things - does not always intuitively translate over to predicate logic. If you disagree with that, you should try teaching it to a class of high schoolers.


Lol, I used to believe that logic was an intrinsic faculty of human reason, and then I taught a bunch of freshmen.


It probably is because you studied mathematics.

If you think of "all" as "in this set, no element exists which is False" then it is "obviously" the right behavior.

However, consider this more practical example:

    x = {1,2,3}
    y = {}

    all(e in x for e in y)
    > True
This doesn't strike me as obviously right.


That's mathematically equivalent to "y is a subset of x", which is obviously right. (Except for the wrinkle that y is a dict, not a set.)
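
With an actual set, the subset reading is exact:

    x = {1, 2, 3}
    y = set()                       # an actual empty set this time
    print(all(e in x for e in y))   # True
    print(y.issubset(x))            # True -- the empty set is a subset of everything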


Replying to the sibling comment,

We invented mathematics because practical thinking leaves us without answers at some points. You are free to axiomatically choose "all([]) = false", but then you'll find that many mathematical equivalences, which follow from the ZF axioms alone, no longer hold.


I don't disagree. Try to mute your "mathematical thinking" for a second and read it as an ordinary person, with a focus on practicality. It just doesn't look right.

If it did look right to me, I wouldn't be sitting here typing this all out. I would be silently agreeing with the mathematical point of view.

That's what tells me this is a likely source of confusion and error.


I simply don’t have the intuition, at all, no matter how I think about it, that it should be false.

Is there any language that works the way you’re suggesting is more practical/intuitive?


I'm not saying it should be false. It's the law of the excluded middle that forces us to make a true/false decision, in which case I can agree that, for mathematical reasons, it should be true.

For practical reasons, it might be better for it to be false. That would require some empirical research on whether this could actually prevent harmful behavior in practice.

From a design perspective, I would argue that "all" should not be defined for empty lists, because that forces people to consider whether the often-forgotten "empty list" edge case might cause an issue.


It's the right choice for practical reasons as well as theoretical/mathematical ones. It means that you generally don't have to special-case the empty list; you can just let it happily do nothing. If you want exceptional behavior then you should probably use some other value (e.g. None)


> It means that you generally don't have to special-case the empty list; you can just let it happily do nothing.

The consequence of "if (True)" is usually to do something, not nothing. The consequence of doing something by overlooking the "empty list" special case may be harmful.

Preventing potential harm is more important than sparing the programmer a couple of extra hoops to account for the special case. Requiring those hoops makes the special case impossible to overlook.

This kind of reasoning is generally accepted when talking about implicitly nullable types vs explicitly nullable types (optional types), but somehow in this context people are more stubborn.


> This kind of reasoning is generally accepted when talking about implicitly nullable types vs explicitly nullable types (optional types), but somehow in this context people are more stubborn.

Not sure that's the same? Haskell has no implicitly nullable types. Everything not explicitly marked as optional is non-nullable.

But still, if your algorithm at hand can treat nulls the same as non-nulls, it's a good idea to do so. (Eg in a sorting algorithm where you just feed the elements to the user supplied comparison function, and let that function handle comparing of optional elements.)
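
E.g. something along these lines (cmp_optional is a made-up helper, not a library function):

    from functools import cmp_to_key

    # a user-supplied comparison that orders None before any real value,
    # so the sorting algorithm itself never special-cases nulls
    def cmp_optional(a, b):
        if a is None and b is None:
            return 0
        if a is None:
            return -1
        if b is None:
            return 1
        return (a > b) - (a < b)

    print(sorted([3, None, 1], key=cmp_to_key(cmp_optional)))   # [None, 1, 3]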


Can you give a practical example of a case where that answer would actually be wrong for an empty set, though? Like, what are you trying to check, and what would be inside the "if"?


This was definitely not accidental. Here is the commit that introduces `all` and `any`: https://github.com/python/cpython/commit/96229b19181

It includes a test for the empty case: `self.assertEqual(all([]), True)`. They knew what they were doing.


Or - and bear with me for this - they wanted to make sure that future changes stay compatible since clients might rely on this behaviour.


What do you mean by "Epicycles"? Are you suggesting the code is written like this by random?


"Epicycles" may refer to ancient astronomers who started with the assumption that Earth is stationary and objects in the sky circle around it.

When observed planetary motions were found to be incompatible with the assumption, epicycles were introduced to keep the initial assumption working.


Yeah, sorry, I do know what epicycles are. I just don't understand the connection to this issue. It is not like predicate logic was invented to explain the behavior of Python.


I obviously failed to illustrate the connection the author may have been referring to: it's an after-the-fact rationalisation, just like epicycles were.


I see many complicated explanations for "why all([]) behaves this way" that justify the behavior as good because of logical or mathematical purity. I am implying that these explanations are so long-winded because they're wrong. I think they are good reasons for keeping the behavior as-is, but the code was probably written first, written to be simple, and just happens to produce mathematically consistent behavior at an edge case. It would be interesting to see the history of the function, to see if the implementation has changed.


> I am implying that these explanations are so long-winded because they're wrong.

Are you my Intro to Operating Systems professor? Because he would take points away from answers on the exams for being more long-winded than necessary, even if the answer was correct.

Joking aside, your perspective is bizarre to me. You haven't offered any reasons to propose that all([]) should be False, but reject the given answers for it being True simply on the basis that the explanations are long-winded?


The question isn't "what should all([]) return", it's "why does all([]) return True". I think the answer is "because that's what the simplest implementation returns", not "because someone considered the mathematical implications".


Perhaps it just happens to check for any false element in the list and otherwise return true. Sometimes mathematically pure is equivalent to simple, and vice versa. The explanations are long because they try to explain the implications and draw analogies, which could go on indefinitely.



