I like that the article uses the concrete example of proving √2 is not rational to demonstrate the abstract priciples of induction, while also not hand-waving away the formal details of the general case.
I got a little caught up on on the proof of (no) infinite descent by well-founded induction without a base case. I thought, "something is wrong; (∀y∈X.y⊏x⟹¬Φ(y)) only implies ¬Φ(x) if you assume (∀y∈X.y⊏x⟹¬Φ(y)) in the firsts place". But that's actually fine, because the principle of well-founded induction lets you assume it out of thin air, and as long as it implies ¬Φ(x), you're good.
The intuition of this, to me at least, is that well-founded induction has an implicit base case of the empty set. (∀y∈X.y⊏x⟹Ψ(y))⟹Ψ(x) means Ψ(x) is required to be automatically true if ∄y∈X, so (∀y∈X.y⊏x⟹Ψ(y)) may be assumed out of thin air and then used inductively for all the cases where ∃y∈X.
Back to the specific example, ¬Φ(x) is automatically true if ∄y∈X because Φ(x) implies ∃y∈X. However note that this base case doesn't need to be explicitly shown, because it's included in the general case.
> well-founded induction has an implicit base case of the empty set
Yes, but that still means the reasoning that proves the induction has to be valid for the empty set--i.e., "if P is true for all y less than x, then P is true for x" has to be validly proven for the case that there are no y less than x--which of course is the case for the natural number 0 in ordinary mathematical induction.
I think it's fine. It's analogous to starting induction from 0 over the natural numbers.
It does feel a bit tricky though because there are two nested foralls instead of just one in standard induction. I think to derive it from standard induction you need to perform induction over sets of statements, which takes more power than first order logic. This additional power is generally accepted in math courses. It isn't necessary though, you can prove the irrationality of sqrt 2 using standard induction.
The article discussed how additional techniques can be made rigorous in the opening paragraph. I agree that the author didn't fully justify the assumptions in the proof system used, instead bringing this well ordered induction as an axiom. This is an article, I think a full aximization would have taken too long.
How does infinite descent work using a proof assistant? It's been quite a while, so I may be remembering incorrectly, but this is my understanding:
Coq uses an inductive type like "N = 0 : N | S : N → N", and a first-order theory with integers axiomatized this way admits nonstandard models (whose prefixes are isomorphic to the naturals but have elements not reachable by repeated application of S, defeating infinite descent). But universal quantification in system F (inherited by the calculus of inductive constructions) corresponds to a fragment of second-order logic, where there is only one model (– the theory is categorical).
Not my area at all, but surely you would just add the relevant induction schema as an axiom? Are nonstandard models relevant to proof assistants?
(to be clear - I know what nonstandard models are and how they work, and I've mucked around in coq/lean, just wondering why it matters to a proof assistant)
My line of thinking roughly was, "each type system corresponds to a particular logic system and vice versa; in a proof by induction, the different cases correspond to the type constructors, but what does the necessary axiom schema or second-order axiom of induction correspond to? (in order to avoid the problem of nonstandard elements). And proof by infinite descent seems to require something other than induction, so where does well-foundedness come from?"
I realize these questions may not even be posed sensibly; maybe the answer is as simple as, "the naturals are defined constructively here; of course they are the naturals and therefore are well-founded". As I said, it's been a while, so things are hazy in my mind.
I see, that's interesting. You inspired me to go read up about CIC, it's really cool, thanks! And in hindsight I realise my reply was totally wrong anyway haha.
> a well-founded relation ⊏ is a well-ordering if and only if it relates every distinct pair of elements, i.e. x1≠x2 implies either x1⊏x2 or x2⊏x1.
The author here explains well-founded relation prior to this. I want to add an example here. Consider the set of positive real numbers. If we use the natural ordering < then we have a total order. But that's not a well-ordering. If x were to be the least element you could always divide it by 2 to get a lesser element. So there's no least element in the set of positive reals using the conventional < ordering. However, the amazing thing is that assuming axiom of choice, such a well ordering exists.
I think maybe you're overlooking "a well-founded relation ⊏ is a well-ordering …". < on the reals is not well-founded. Remember, a well-founded relation requires all leftward chains to be finite. Together with the rest of the given definition, it implies a unique least element.
Recall that the set R of real numbers R is defined as the set of numbers that can be shown to exist, plus an uncountable collection of "numbers" that are assumed to be near the set of numbers that can be shown to exist.
It's not amazing that assuming something unnatural (Uncountable Choice) gives an equally unnatural consequence. The set R of "Real" numbers is a fantastical "object" that has fantastical properties. It can do apparently impossible things because ZFC flat out assumes that it can do apparently impossible things.
IIRC the real numbers are, roughly speaking, more-or-less isomorphic to infinite sequences of digits. If you believe real numbers are fantastical, I suppose you also believe infinite sequences are fantastical? We can't have one in the universe, because we'd run out of atoms, but we can't have circles either and that is rarely complained about.
Historically the definition of computability came quite a bit later than ZFC. Zermelo might have formulated his set theory differently if computability came first.
There are lots of missing bits here. For instance, you seem to assume that a = sign(a) * abs(a). Why?
And both sign and abs are not defined. So you say abs(a) >= 0 and sign(a) \in {-1, 1}. Why? For instance, what is sign(0)?
Are you assuming a construction of integers from the natural numbers such that for any n < 0, there is an integer abs(n) and n = -1 * abs(n)? If so, don't you need to include that (or at least reference it)?
Later you assume that all integers are even or odd. Why?
These are niggling details that don't matter when you have chalk in hand, but do matter when speaking to a system like lean4.
You're right that a lot that is "usually defined" in math has to be explicitly defined in theorem prover/proof assistant
* abs(x) is usually defined as "x if x >=0 and -x if x < 0"
* sign(x) is usually defined as "1 if x >=0 and -1 if x < 0"
* with these definitions, it follows that for any x, x = sign(x) * abs(x) (by definition applied for each cases >=0 and <0, assuming that all integer are either >=0 or <0)
Showing that any integer is either even or odd seem less obvious
You'd be surprised at the number of things that appear to be self-evidently true that turn out to depend on tacit assumptions. For example, is 7 prime? Not if you're doing modular arithmetic.
The proof is not too bad in Lean4. I'm nothing special and I've got it down to 9 lines. But, maybe that is considered pretty tedious vs. what expectations one might have. In any case, it is an exercise in Heather MacBeth's free book: The Mechanics of Proof,
I doubt you'd be able to prove that it's rational, regardless of how it's constructed, since wouldn't that imply that it's also (for example) a real number?
Still, I don't see how any such definition could even be called a "definition of √-1" in the first place, since the resulting object wouldn't actually have the properties of √-1.
I think I'm missing something... isn't there an error in the proof when applying the theorem of the principle of infinite descent?
The theorem requires proving the following premise/antecedent (to deduce the consequent):
∀x ∈ X. Φ(x) ⟹ ∃y ∈ X. y ⊏ x ∧ (...)
... but I don't see how this can be proved when x=0. By substituting `x` for `0` (and using Z as the set X and |<| as the well-founded relation) you get:
Φ(0) ⟹ ∃y ∈ Z. y |<| 0
Unfolding Φ:
(∃b ∈ Z. 0^2 = 2 b^2) ⟹ ∃y ∈ Z. y |<| 0
The antecedent of this implication is true (when b = 0), so now you have to prove:
∃y ∈ Z. y |<| 0
However, this can't be proved because no `y` satisfies the |<| well-founded relation when applied to zero.
Therefore, this article's sentence doesn't seem to be true:
> Having satisfied the premises of the principle, we use it to deduce that no `a` satisfies the property
... which in fact cannot be true, because `a=0` does satisfy the property `∃b ∈ Z. a^2 = 2 b^2`, does it not? Consider `b=0`.
So what am I missing?
Edit: in fact, if the theorem could be applied in this case, then the conclusion of the theorem would be:
∀x ∈ Z. ¬(∃b ∈ Z. x^2 = 2 b^2)
Which is equivalent to:
∀x ∀y ∈ Z. ¬(x^2 = 2 y^2)
... which is clearly false when x=0 and y=0. So it seems like the theorem of infinite descent cannot be used to reach this conclusion. Some kind of false assumption would have to exist, but no false assumption was used to instantiate this theorem.
It's not wrong, but it skips some detail. (It's ironic that an article about "insight via precision" uses such imprecise language.)
This is demonstrated in a Lean tutorial for induction, where first it has you "prove" that √2 is "rational", before adding the requirement that the denominator is not 0, which then of course requires a totally different (semantically valid) proof.
0 does satisfy the theorem on integers, but it doesn't say anything about √2, because rational numbers do not allow a 0 denominator.
The √2 theorem assumes b >= 1, and infinite descent / well-foundedness is based on this subset of Z (and to the author's pedantic point about ordering, also works with Z\{0} and the magnitude ordering on |z|.)
I would argue that technically, the proof has an error because the author says:
> Let X be the integers Z; for integers a,b, let `a ⊏ b` be `a|<|b`; and let the property Φ(a) be “a^2=2 b^2 for some integer b” (formally, `∃b ∈ Z. a^2=2 b^2`).
However, the theorem of infinite descent cannot be applied to this property with the set Z which the author specifically said he would use. Instead, it needs to be applied to the set Z\{0}, which you rightly pointed out (as well as the sibling poster).
Note that this has nothing to do with the rest of the proof about √2.
I do agree that even after this error is corrected, then the proof additionally needs to be fixed by adding a few more simple proof steps to analyze what happens when either `a` or `b` are zero, which was excluded from the set. At this point you'd probably realize that yes, there is an implicit assumption that ¬(b = 0) which should be made explicit, and then you also have to account for the `a = 0` case, which should also be easy.
But still, the error greatly confused me at first because it seemed like the theorem could not be applied in this case. I only realized the theorem could be used correctly in this case when the sibling poster suggested to use a different set.
In the step of √2 = a/b you can get away without asking for relatively prime, using GCD or this descent by asking for least a for which this holds. Then when you get a/b = 2c/2d = c/d, but c would be less than a, hence contradiction.
Couldn't you always do this instead of doing this infinite descent? If relation is well-founded then there will be a least element.
I guess I got a little confused, because a different form with some de-morganing and contraposed implications feels more natural to me in plain-language proofs.
How do you evaluate "best"? FWIW this blog uses mathjax, I believe that's a pretty conventional approach, but the latex is rendered on the browser side leading to lots of reflows. There's also mathjax-node to pre-render the latex on the server side, and serve ready made html+css.
It means that many a book will implicitly assume that there is one way to do something (like induction), when in truth there are many ways to choose from.
I got a little caught up on on the proof of (no) infinite descent by well-founded induction without a base case. I thought, "something is wrong; (∀y∈X.y⊏x⟹¬Φ(y)) only implies ¬Φ(x) if you assume (∀y∈X.y⊏x⟹¬Φ(y)) in the firsts place". But that's actually fine, because the principle of well-founded induction lets you assume it out of thin air, and as long as it implies ¬Φ(x), you're good.
The intuition of this, to me at least, is that well-founded induction has an implicit base case of the empty set. (∀y∈X.y⊏x⟹Ψ(y))⟹Ψ(x) means Ψ(x) is required to be automatically true if ∄y∈X, so (∀y∈X.y⊏x⟹Ψ(y)) may be assumed out of thin air and then used inductively for all the cases where ∃y∈X.
Back to the specific example, ¬Φ(x) is automatically true if ∄y∈X because Φ(x) implies ∃y∈X. However note that this base case doesn't need to be explicitly shown, because it's included in the general case.
reply