It turns out he's claiming they're different if x^2 is interpreted as squaring each element in the interval x, while x * x is interpreted as a cross product: the interval obtained by multiplying all pairs of elements in the interval. But I haven't ever seen anyone use x^2 to mean pointwise squaring on an interval x. Is that some kind of standard notation?
"Pointwise squaring on an interval x" is just a weird way of describing the usual function f(x) = x^2 with domain restricted to an interval. It's pointwise because that's how functions f : R -> R are defined: given a point, or value, of the domain, give me a new point in the codomain.
If you think of `x` as a whole interval unto itself, and not just a single point, then I think the options become more interesting. The most natural product on two sets is indeed the cross product; but for intervals, I can imagine defining a common parameterization over both intervals and then multiplying pointwise up to that parameterization.
It makes sense if, instead of thinking about intervals, you think about the supports of random variables[1]. Given two independent random variables X and Y, X is not independent of itself, so supp(X) = supp(Y) does not imply supp(X * X) = supp(X * Y).
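To make the supports point concrete, here is a tiny Python sketch. The two-point support {-1, 2} is just an illustrative choice of mine, and Y is assumed independent of X, so every pair of values can occur together.

```python
from itertools import product

# Illustrative assumption: X and Y both take values in {-1, 2},
# and Y is independent of X, so every (x, y) pair is possible.
support = {-1, 2}

# X * X uses the same outcome twice, so only squares can occur.
supp_xx = {x * x for x in support}

# X * Y ranges over all pairs because the two draws are independent.
supp_xy = {x * y for x, y in product(support, support)}

print(sorted(supp_xx))  # [1, 4]
print(sorted(supp_xy))  # [-2, 1, 4]
```

Same support for X and Y, but the negative product -2 only shows up when the two factors are allowed to differ.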
Yes, I see. There's a desire to map intervals pointwise through functions, but also a desire to produce intervals by all-pairs calculations, and the impossibility of representing both interpretations in one notation leads to some inconsistencies.
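For what it's worth, the two readings can be sketched in a few lines of Python. This is only an illustration: intervals are assumed closed and stored as (lo, hi) tuples, and the names interval_mul and interval_square are made up here, not taken from the article or any library.

```python
def interval_mul(a, b):
    """All-pairs product: the range of s * t for s in a and t in b."""
    products = [a[0] * b[0], a[0] * b[1], a[1] * b[0], a[1] * b[1]]
    return (min(products), max(products))


def interval_square(a):
    """Pointwise square: the range of s * s for s in a."""
    lo, hi = a
    largest = max(lo * lo, hi * hi)
    if lo <= 0 <= hi:
        # The interval straddles zero, so the smallest square is 0.
        return (0, largest)
    return (min(lo * lo, hi * hi), largest)


x = (-1, 2)
print(interval_mul(x, x))   # (-2, 4)
print(interval_square(x))   # (0, 4)
```

The all-pairs version lets the two factors disagree, which is where the negative lower bound comes from.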
There's some abuse of poor notation going on in the article. I don't think the author is intending to be confusing through this imprecision, but instead is just faithfully representing the common way people discuss this kind of stuff.
But it is confusing. And it is imprecise.
(I'll use x below to mean multiplication due to HN's weird formatting rules)
Nominally, if we have two intervals A and B we might anticipate there's a difference between AxA and AxB. In normal math we expect this because we use the different letters to indicate the potential for A and B to be different. Another way of saying it is to say that AxA = AxB exactly when A = B.
The trick of language with interval math is that people often want to write things like A = (l, h). This is meaningful: the lower and upper bounds of the interval are important descriptors of the interval itself. But let's say that it's also true that B = (l, h). If A = B, then it's definitely true that their lower and upper bounds will coincide, but is the converse true? Is it possible for two intervals to have coincident bounds but still be unequal? What does equality mean now?
In probability math, the same issue arises around the concept of a random variable (rv). Two rvs might, when examined individually, appear to be the same. They might have the same distribution, but we are more cautious than that. We reserve the right to also ask things like "are the rvs A and B independent?" or, more generally, "what is the joint distribution of (A, B)?".
These questions reinforce the idea that random variables are not equivalent to their (marginal) distributions. That information is a very useful measurement of a rv, but it is still a partial measurement that throws away some information. In particular, when multiple rvs are being considered, marginal distributions fail to capture how the rvs interrelate.
We can steal the formal techniques of probability theory and apply them to give a better definition of an interval. Like an rv, we'll define an interval to be a function from some underlying source of uncertainty, i.e. A(w) and B(w). Maybe more intuitively, we'll think of A and B as "partial measurements" of that underlying uncertainty. The "underlying uncertainty" can be a stand-in for all the myriad ways that our measurements (or machining work, or particular details of IEEE rounding) go awry, like being just a fraction of a degree off perpendicular to the walls we're measuring to see if that couch will fit.
We'll define the lower and upper bounds of these intervals as the minimum and maximum values they take, l(A) = min_w A(w) and u(A) = max_w A(w).
Now, when multiplying functions on the same domain, the standard meaning of multiplication is pointwise multiplication:
(A x B)(w) = A(w) x B(w)
and so the lower and upper bounds of AxB suddenly have a very complex relationship with the lower and upper bounds of A and B on their own.
l(A x B) = min_w A(w) x B(w)
u(A x B) = max_w A(w) x B(w)
So with all this additional formal mechanism, we can recover how pointwise multiplication makes sense. We can also distinguish AxA and AxB as being potentially very different intervals even when l(A) = l(B) and u(A) = u(B).
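Here is a rough Python sketch of that construction, under assumptions of my own: the underlying uncertainty w is a point (w1, w2) on a finite grid in the unit square, and A, B, C are hypothetical intervals that all have bounds -1 and 2. A and B read different coordinates of w, while C reads the same coordinate as A but in the opposite direction.

```python
from fractions import Fraction
from itertools import product

# Assumed event space: a 13 x 13 grid of points w = (w1, w2) in [0, 1]^2.
# Fractions keep the arithmetic exact.
N = 12
grid = [Fraction(k, N) for k in range(N + 1)]
omega = list(product(grid, grid))

# Three hypothetical intervals, each ranging over [-1, 2].
def A(w): return -1 + 3 * w[0]   # depends on w1
def B(w): return -1 + 3 * w[1]   # depends on w2, unrelated to A
def C(w): return 2 - 3 * w[0]    # depends on w1, opposite to A

def bounds(f):
    """(l(f), u(f)): the min and max of f over the event space."""
    values = [f(w) for w in omega]
    return (min(values), max(values))

def times(f, g):
    """Pointwise product: (f x g)(w) = f(w) x g(w)."""
    return lambda w: f(w) * g(w)

same_bounds = bounds(A) == bounds(B) == bounds(C)  # all are -1 to 2
axa = bounds(times(A, A))  # 0 to 4
axb = bounds(times(A, B))  # -2 to 4
axc = bounds(times(A, C))  # -2 to 1/4
```

All three intervals have identical bounds, yet AxA, AxB and AxC come out different, because the product depends on how the factors move together over w and not only on their individual bounds.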
(As a final, very optional note, the thing that makes interval math different from probability theory is that the underlying space of uncertainty is not endowed with a probability measure, so we can only talk about things like min and max. It also seems like we can make the underlying event space much less abstract and just use a sufficiently high-dimensional hypercube.)
About the last remark, my intuition is that even though there are operational differences, any formalism for representing uncertainty should be roughly as useful as any other.
I mean, can you express Bayes' rule using interval arithmetic? Or something similar to it?
I think a more complete way to say it would be that probability theory is a refinement of interval theory. Per that last remark, I suspect that if you add any probability measure to intervals such that it has positive weight along the length of the interval then the upper and lower bounds will be preserved.
So in that sense, they're consistent, but interval theory intentionally conveys less information.
Bayes' Law arises from P(X, Y) = P(X | Y)P(Y). It seems to me in interval math, probability downgrades to just a binary measurement of whether or not the interval contains a particular point. So, we can translate it like (x, y) \in (X, Y) iff (y \in Y implies x \in X) and (y \in Y) which still seems meaningful.
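As a sanity check on that translation, here is a small truth-table sketch in Python. It assumes the simplest reading, where the joint interval (X, Y) is a box, so (x, y) is in it exactly when x is in X and y is in Y. The booleans in_X and in_Y are stand-ins for those two membership facts, and the function names are mine.

```python
from itertools import product

def joint(in_X, in_Y):
    """(x, y) in (X, Y), assuming the joint interval is a box."""
    return in_X and in_Y

def factored(in_X, in_Y):
    """(y in Y implies x in X) and (y in Y)."""
    conditional = (not in_Y) or in_X
    return conditional and in_Y

# Compare the two forms on all four combinations of memberships.
agree = all(
    joint(p, q) == factored(p, q)
    for p, q in product([False, True], repeat=2)
)
print(agree)  # True
```

So at least under the box assumption, the factored form says the same thing as plain joint membership.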
I don't. I've never actually seen interval theory developed like I did above. It's just me porting parts of probability theory over to solve the same problems as they appear in talking about intervals.
Yeah it sounds like something he's made up. For matrices x^2 is just x*x, not element-wise power (which if you want to be deliberately confusing is also known as Hadamard power). The latter is apparently written like this: https://math.stackexchange.com/a/2749724/60289
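For anyone who wants to see the matrix case side by side, here is a plain-Python sketch with nested lists. The helper names matmul and hadamard_power are made up for this comment, and the 2x2 matrix is an arbitrary example.

```python
def matmul(a, b):
    """Ordinary matrix product of two nested-list matrices."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

def hadamard_power(a, n):
    """Element-wise power: raise each entry to the n-th power."""
    return [[entry ** n for entry in row] for row in a]

x = [[1, 2], [3, 4]]
print(matmul(x, x))          # [[7, 10], [15, 22]]
print(hadamard_power(x, 2))  # [[1, 4], [9, 16]]
```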