
The sum of no numbers is 0.

The product of no numbers is 1.

The union of no areas (sets, …) is the empty area.

The intersection of no areas (sets, …) is the universal area.

The commonality here is: which element z can you add to or remove from a list without changing the result?

We call the element that makes no change to the result the “identity element”.

For y = sum of {x1, x2, …, xn, z},

we know z must be 0 if it is to have no impact on y. So the identity element for sum is 0.

So if y = sum of {}, we know we can insert the identity element without changing the result, and the sum of {0} is 0: the identity element itself.

So operations applied across lists return that operation’s identity element for an empty list.

—-

Ergo:

The identity element of “all” is “true”.

The identity element of “any” is “false”.
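
These empty-list values can all be checked directly in Python (math.prod is the standard library's n-ary product; the rest are builtins):

    import math

    assert sum([]) == 0        # identity of addition
    assert math.prod([]) == 1  # identity of multiplication
    assert all([]) is True     # identity of AND
    assert any([]) is False    # identity of OR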

——

More fun:

The identity element for the resistance sum across parallel resistors is the infinite-ohm resistor (an open circuit).

The identity element for the vector sum of displacements of a child making a sequence of jumps is the in-place jump.
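
A minimal Python sketch of the parallel-resistor case (the function name is mine, for illustration), using math.inf as the infinite-ohm identity:

    import math

    def parallel(resistances):
        # 1/R_total = sum(1/R_i); the empty sum is 0, so R_total is infinite
        conductance = sum(1 / r for r in resistances)
        return math.inf if conductance == 0 else 1 / conductance

    assert parallel([10, 10]) == 5
    assert parallel([10, 10, math.inf]) == 5  # the identity changes nothing
    assert parallel([]) == math.inf           # no resistors: an open circuit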

—-

Getting meta with, oh hell, monads:

The identity element for C, the concatenation operator over lists, is the empty list {}.

Since all lists are concatenations of lists (down to empty & unitary lists), we can view any operation Q over lists also as an operation over the concatenation of lists.

So any Q acts like a monad over list concatenation operations.

If Q = sum, then it is the monad that transforms concatenations of lists of numbers into the sum of those numbers.

If Q = all, then it is the monad that transforms concatenations of lists of truth values to the conjunction of the truth values in those lists.

With that view in place, we can see another reason why the identity of any operation Q is Q on an empty list, Q{}.

Because as a monad, Q applies over C, the concatenation of lists. The identity of C is {}, so the identity of Q must be Q{}.

Voila!

So we really shouldn’t define sum{} as 0, but define 0 as sum{}.

“0” is just a symbol. Sum{} is what it is.

0 is the sum of nothing.

So “True” IS the conjunction of nothing. Its meaning IS all{}. It is the truth of “no requirements”.




To be pedantic:

TRUE is the identity element of "and", not "all". "All" is not a binary operator; it is the (unary) aggregation of "and". That is, S.all() == S.aggregate((prev, next) => prev && next, TRUE).

FALSE is the identity element of "or". "any"/"some" is the aggregation of "or". S.any() == S.aggregate((prev, next) => prev || next, FALSE).

0 is the identity element of addition, and "sum" is the aggregation of addition. S.sum() == S.aggregate((prev, next) => prev + next, 0).

1 is the identity element of multiplication, with "product" the aggregation.

Given a monoid M<T> { combine(a: T, b: T) => T, identity: T }, you can define an aggregation operation the same way:

    S.aggregate((prev, next) => M.combine(prev, next), M.identity)
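
In Python, functools.reduce with an initial value plays the role of S.aggregate; a runnable sketch of the same idea:

    from functools import reduce

    def aggregate(xs, combine, identity):
        # fold the monoid operation over the list, starting from its identity
        return reduce(combine, xs, identity)

    assert aggregate([], lambda a, b: a + b, 0) == 0          # sum
    assert aggregate([], lambda a, b: a * b, 1) == 1          # product
    assert aggregate([], lambda a, b: a and b, True) is True  # all
    assert aggregate([True, False], lambda a, b: a and b, True) is False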


TRUE can be the identity element both for the conjunction of 2 Boolean values and for an operator that extends 2-value Boolean operations to lists of N elements.

Being an identity is a property. A given mathematical object can be an identity for more than one relation.

It can also be the identity element for general logical propositions or relations.

The reason being, the former is just a special case of the Boolean relations of AND and ALL.

So TRUE can consistently be the identity for many classes of logical operations of varying generality or complexity.

The raw identity relation and concept of identity element form the “tops” of ladders, whose specializations can hold over large classes of relations.


My goal was to add more context to your (excellent and upvoted) original post, which in my opinion is basically the fundamental "proof" of why All([]) must be TRUE, and I find beauty in this two-dimensional analogy:

    add      | sum
    multiply | product
    and      | all
    or       | any
    concat   | "flatten" (arguable name here)
I also find beauty in the fact that the existence of a monoid on the left is sufficient for producing the aggregation on the right that behaves in the way we want. I see the monoid as the simple building block from which we produce more interesting and complicated structures. I enjoy the continued, unbroken thread of rigorous logic that connects everything.

When you say that TRUE can be the identity of "an operator that extends 2-value Boolean operations to lists of N-elements", I just don't see the unbroken thread of rigorous logic behind that statement. Maybe it's there! But I don't see it. I can write down exactly what it means for some element i to be an identity of a binary operation @. It means i@x == x == x@i for all x in the set over which @ is defined. Whatever the rules of "identity" are that you're talking about, they're not this, so I'm just kinda struggling to accept it. I wonder whether maybe what you're thinking of deserves a different name than "identity".

Does any of this matter? It does to me, because I actually use Monoids all the time in programming, where you can't afford any ambiguity at all. So again: maybe you see that unbroken chain of rigorous logic and know exactly what you mean when you say TRUE is the identity of "an operator that extends 2-value Boolean operations to lists of N-elements", but I'd need to see the exact laws that an element must follow to be such an operator or such an identity of that operator. I gave an example of how to turn associative binary operations with identity into a unary operator on lists of elements, but unary operators don't really have identity elements. Some unary operators have fixed points and eigenvalues, which are similar to identity, but such operators must have the same domain and codomain, and aggregate operators like SUM and ALL do not. So there's not really anything you define to be SUM's identity, since it does not accept 0 as an input.


Ha! I agree, you are right.

TRUE, as a monoid identity, can exist independently from any aggregator defined over that monoid.

My bad :)

The fractal lines between what is mathematically true, vs. what is practical or useful to be true, in coding are worth paying attention to!

Thank you!

And perhaps these relations can always hold, “without loss of generality”.

In which case we need to formally define these monoid-aggregator relations so we all know when they are assumed to be in effect.

On the beauty side, I love how 0 = sum{} captures the semantic difference between zero and other numbers, with the syntax of nothing between two braces.


I do think there was something interesting about the idea that adding the element to the input array doesn't change the result. It's a little tricky to formalize though.

If we define an "aggregation" agg over some set S as a function from S* (the "Kleene star", AKA the free monoid over S, AKA really just simple arrays of S) to S, and we identify some element i such that

    agg([i] concat X) == agg(X) == agg(X concat [i])
for all X in S*, that does feel very "identity-ish". It's a little bit like saying [i] is an identity of concat under the equivalence relation on S* that X~Y iff agg(X) == agg(Y)? In the same way that 7 is an identity of + under the equivalence relation "mod 7" (paraphrasing).

But note that:

    sum([-2, 2] concat X) == sum(X) == sum(X concat [-2, 2])
for all X as well. You could say "well obviously, because sum([-2, 2]) == sum([0]) and [0] is our identity element" but then we're implicitly assuming that agg(X concat Y) = agg(X) @ agg(Y) for some operator @. Which in turn is basically assuming we've built agg up from a monoid to begin with, and the novelty here would be what happens when we don't do that. There may be some interesting (if not useful) stuff here.
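
A quick Python spot-check of both of those "identity-ish" lists:

    for X in ([], [1], [3, 4, 5]):
        assert sum([0] + X) == sum(X) == sum(X + [0])
        assert sum([-2, 2] + X) == sum(X) == sum(X + [-2, 2])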

if there exists some operator @: (S, S) -> S such that:

    agg(X concat Y) == agg(X) @ agg(Y)
For all X, Y in S*, and if:

    agg([i] concat X) == agg(X) == agg(X concat [i])
for all X in S*, then must agg([i]) be an identity element of @? Certainly that's true if agg is onto (i.e. we can find every element of S via agg of some S*). But it can't really be true in general, because we could make degenerate agg functions like

    ignore(X) := 0
Now every element of S meets this condition:

    ignore([s] concat X) == ignore(X) == ignore(X concat [s])
So we have every element of S as this weird kind of "identity" of the "ignore" function. And we can find functions on S* with no "identity" of this kind: "count", for instance. Although that's cheating because it only works when S happens to be the natural numbers (otherwise agg is not a function from S* -> S).


Yes, I think you hit the nail on the head.

My logic depended on the aggregator being a recursive monoid, with:

1) Special handling for a unit list, where the unitary element is returned, and

2) Special handling for an empty list, where the identity element of the monoid is returned, OR we make explicit that the empty list is simply already in its simplest form, and we choose to define that as equivalent to the monoid’s identity.

So I skipped a step by “finding” the identity from the aggregator’s behavior, instead of directly from the monoid it is based on.

——

Alternatively, I am liking the idea of defining these list-vs-pair, aggregator-vs-monoid relations more tightly.

Empty braces unify the syntax for “zero” with the semantics “nothing” between the braces.

How else can you join those semantics and syntax?

Likewise, starting with unary natural numbers, syntactically and semantically, a list of 1’s is the number.

So which comes first? A unary monoid addition of 2 lists of ones (n-way concatenations of unitary 1), or an n-way sum-concatenation that each unary number actually consists of?

I think they are more tightly bound definitions than our typical syntax suggests, when we get down to the semantics = syntax level.


To be even more pedantic! Unit elements are not limited to binary operators. In fact, in a monoid (or group) there is naturally an n-ary operation for each n. Or, if you want, a single operation from lists of elements to a single element. It is just that we usually present it using a unit and a binary operation plus associativity. But one can split that cake in other ways. Thus, in some sense any and all are even more natural or "objective" (as Lawvere would put it) than and and or.


What is your definition of an identity of a ternary function? Say i is the identity of f. Must it hold that

    f(i, a, b) == f(a, i, b) == f(a, b, i)?
For all a, b in f's domain? That seems like the obvious choice, and certainly holds for ternary functions "built up" from associative binary ones, but I've just never seen such a thing. Is there a case where it's useful to talk about the identity of a 3-valued function?

I think the thing I find weird about this is that a binary operator op2 over some set S is defined as S^2 -> S. So we can sensibly say:

    op2(a, i) == a == op2(i, a)
But the extension of this to a 3-valued function from S^3 -> S cannot have that middle condition, because it has too many "free variables". We can't say:

    op3(a, b, i) == op2(a, b)
Without establishing a relationship between op3 and op2, but must there be a sensible "decomposition" of op3 into some op2? Surely not. I guess I just don't know what it means to be an identity of a 3-valued function.

As an example, consider mult3(a, b, c) := a * b * c. Then,

    mult3(0, a, b) == mult3(a, 0, b) == mult3(a, b, 0)
for all a and b. We intuitively know that 0 is not behaving like an identity. Can we write down an identity rule that must hold for a generic op3 that excludes 0 as the identity of that function? Without relying on some oracle to provide us an appropriate decomposed op2?

Edit: I would also like to point out this function over R^3:

    f(a, b, c) := a*b + a*c + b*c
Note that every value in R meets the condition:

    f(a, x, y) == f(x, a, y) == f(x, y, a) == ax + ay + xy
For all x and y. Is every number an identity of this function? In fact, this is trivially true for any totally symmetric function. I guess I just need to see the definition of an identity of a ternary operator.
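
A numeric spot-check in Python, sampling random integers:

    import random

    def f(a, b, c):
        return a*b + a*c + b*c

    for _ in range(1000):
        a, x, y = (random.randint(-9, 9) for _ in range(3))
        # every a passes the "identity" condition, since f is totally symmetric
        assert f(a, x, y) == f(x, a, y) == f(x, y, a)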


monoid, not monad

---

a monoid is a set A with an associative binary operator @ on it: that is, for any a, b, c

a @ (b @ c) = (a @ b) @ c

and it also has an "identity" element I, that is, for any a:

a @ I = I @ a = a

common monoids: integers under addition (i.e. @ is addition, the identity is 0), nonzero real numbers under multiplication, lists under concatenation, booleans under AND (identity is True), booleans under OR (what's the identity?), ...

now, when you have a mathematical structure, you often want to study functions between two of them. so you might ask what a reasonable notion of functions between monoids is

one nice way is if, given two monoids, M with operation @ and unit I, and N with operation ^ and unit J, we had a function f for which:

f(I) = J

f(a @ b) = f(a) ^ f(b)

this is loosely described as the function "respecting the structure" of the monoids, and such functions are called monoid homomorphisms or just homomorphisms if the context is clear

--

with that background, we see that `all` should really be a homomorphism from the monoid of lists (under concatenation, with the empty list as the identity element) to the monoid of booleans under AND (with True as the identity)

this means that

all(concat(list1, list2)) = and(all(list1), all(list2))

or, in python,

all(a + b) = all(a) and all(b)

and, of course, all([]) = True!

the value here is that now you can reason about the all() of two lists by thinking about the all() of sublists, and the conditions required for a homomorphism guarantee that the empty list is not an edge case:

all([T, F]) = all([]) and all([T, F]) = all([T]) and all([F]) = ...

and that it doesn't matter how you cut up a list into sublists and what order you apply the concatenations in to reconstruct it
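
the homomorphism law is easy to spot-check in Python over small boolean lists:

    from itertools import product

    lists = [[], [True], [False], [True, False]]
    for a, b in product(lists, repeat=2):
        # all() respects concatenation: (lists, +) -> (bool, and)
        assert all(a + b) == (all(a) and all(b))
    assert all([]) is True  # the empty list maps to AND's identity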


I was referring to monad, as used in programming, as a transform that can be applied to the elements of a structure.

For instance, a transform that wraps each joining operator, in an expression tree, with a new operator that applies the original operator if all its inputs are defined, but returns a null if any input is null.

The expression tree structure hasn’t changed, but it now consistently (avoiding errors) responds to null values by propagating them.

In my example, a monad converts the list combining operation of concatenation into other list combining operations.

The result was that the identity of concatenation (the empty list) formed the basis for the new list operator’s identity.

Both concepts are indeed relevant.

A “monoid-to-aggregator” monad could convert 2-element operations into recursively applied list operations (which also return the single element for unitary lists, and the identity element for empty lists).

Then a structure of monoids processing pairs of values could be transformed into a structure that processes lists.
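
A sketch of that lift in Python (the name to_aggregator is mine, for illustration):

    def to_aggregator(combine, identity):
        # lift a monoid (combine, identity) to an operator on lists:
        # [] returns the identity, [x] returns x, longer lists recurse
        def agg(xs):
            if not xs:
                return identity
            if len(xs) == 1:
                return xs[0]
            return combine(xs[0], agg(xs[1:]))
        return agg

    my_sum = to_aggregator(lambda a, b: a + b, 0)
    assert my_sum([]) == 0 and my_sum([5]) == 5 and my_sum([1, 2, 3]) == 6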


One corollary is that a zero-dimensional array has exactly one element, whereas a 1-dimensional array can have zero elements (can have length zero). It took me a while to wrap my head around this.


Yes! Dimensions are multipliers on each other.

The number of cross points, over multiple points on segments in different dimensions, is a product.

And the product across an empty list is 1.

The number of ways you can roll a dice “no” times is also 1.

Cross combinations of all kinds are products.


Whoah...

Does this imply there's a sixth of a way to unroll a dice?


I don't understand this. Please can you point me to information about it?


I haven’t seen this discussed anywhere else in terms of arrays, so I’ll elaborate here:

The number of elements in an n-dimensional array is the product of its n dimension sizes. For example, the number of elements in a two-dimensional 2x3 array is 6, because 2 x 3 = 6.

Consequently, the number of elements in a zero-dimensional array would be the empty product, which is 1 (as the root comment mentions).

Naively, one might think that removing a dimension from an array cannot increase the number of elements. For example, dropping a dimension from a 2x3 array reduces the number of elements from 6 to 2 (or 3, depending on which dimension you drop). But in the case of a zero-length 1-dimensional array (i.e. an empty list or empty vector), dropping the one dimension causes an element to be added, because a zero-dimensional array always has 1 element.

(This effect is not restricted to 1- and 0- dimensional arrays, by the way. If you admit zero-length dimensions, then you can also think of a 2x0 two-dimensional array, which would have zero elements, but dropping the zero-length dimension would result in a 1-dimensional array of length 2, increasing the element count by 2.)

The other maybe surprising part is that for all n > 0, you can have an n-dimensional array with zero elements by having at least one of the dimensions have length zero. But for a zero-dimensional array, it’s not possible for it to be empty, because it has no dimension whose length could be zero. This is somewhat counterintuitive at first, because one might have thought that a zero-dimensional array would be the canonical empty array, like how a point in space (which is a zero-dimensional object) has no length/area/volume.

By the way, one could think of non-array values in a programming language to really be zero-dimensional array values, with their value being the singular element of the zero-dimensional array.
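
NumPy behaves exactly this way, with shapes as tuples of dimension sizes:

    import numpy as np
    from math import prod

    a = np.zeros(())                     # zero-dimensional array
    assert a.size == prod(a.shape) == 1  # empty product: exactly one element

    b = np.zeros((2, 0))                 # 2x0 array: zero elements
    assert b.size == 0
    # reducing away the zero-length axis yields a length-2 array, raising the
    # element count from 0 to 2 (each entry is an empty sum, i.e. 0)
    assert b.sum(axis=1).shape == (2,)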


> The identity element for the resistance sum across parallel resistors, is the infinite ohm resister.

I really like this example, because it has an intuitive physical analog and is more tangible than the example I typically reach for: lists.


Continuing the same line of reasoning, for integers, max([]) should be -inf and min([]) should be +inf. For positive numbers, max([]) should be 0 and min([]) should be +inf. But that's not how these operations are defined in most languages.


It's certainly how they're defined in math. In fact all is just min and any is just max over the poset {True, False} with True > False.

I think the reason we don't usually use those definitions is we like to maintain the fiction we're working with actual integers or reals, which don't have a min/max.


min and max are in fact defined this way in JavaScript.
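
Python's builtins, by contrast, raise on an empty input unless you pass the identity explicitly:

    import math

    assert max([], default=-math.inf) == -math.inf
    assert min([], default=math.inf) == math.inf
    assert max([3, 7], default=-math.inf) == 7  # default ignored when non-empty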


You could also argue that any({}) is really not defined, because first-order logic generally excludes empty domains. The reason for that exclusion is that weird things start to happen when you allow an empty domain while asking whether a formula is true in all domains. For example, "exists x, P(x) or !P(x)" is true for every domain unless you allow an empty domain, since "exists x ..." is always false there. The same is true of all({}), because you have an equivalent weirdness in the form of "!(forall x, P(x) and !P(x))". These seem innocuous, but there are many rules of inference that are excluded once you allow the empty domain.


Maybe I'm too used to Coq's higher-order theory now, but I think it's not really that bad to say that we have to be careful about whether statements have existential import.

https://en.wikipedia.org/wiki/Syllogism#Existential_import


If we are defining ANY as an n-element LIST operator, in a programming language, which recursively applies the 2-operand OR taking literal Boolean values, there are no ambiguities.

Note that this is inherently a constructible relation, which avoids many pitfalls of more general mathematics.

But if ANY & ALL are being defined as general mathematical relations over all EXISTING members of a class (whose members’ construction, consistency or completeness, may be unknown or even unattainable), then there are other issues.
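
For concreteness, a sketch of that recursive, constructible definition of ANY (the name any_list is mine):

    def any_list(xs):
        # base case: the empty list returns OR's identity, False
        if not xs:
            return False
        # recursive case: 2-operand OR of the head with ANY of the tail
        return xs[0] or any_list(xs[1:])

    assert any_list([]) is False
    assert any_list([False, False, True]) is True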



