Inventing an Operation to Solve x^x = y (2000) (mathforum.org)
132 points by ColinWright on Nov 28, 2019 | 63 comments



It’s always nice when people independently reinvent good ideas! This poster’s more or less come up with up-arrow notation and a shot at its inverse.

edit: actually meant https://en.wikipedia.org/wiki/Tetration

https://en.wikipedia.org/wiki/Knuth%27s_up-arrow_notation


As is pointed out further down, it does effectively invent tetration, which is cool, but that's not Knuth's arrow notation.

Edited: I've removed the original and replaced it with the above because clearly my original comment wasn't being understood the way it was intended. Possibly this is no better, but it's worth a try.

To provide context, the comment by elil17 is in response to my original, in which I asked what this had to do with Knuth's arrow notation.


a@2 = a↑↑2


Except the essence of Knuth's up-arrow notation is the recursive nature. This is going one step, and while I agree that it's a first step, it doesn't actually get the recursive nature. So this is a notation for a single thing, not a general thing. It's good, but it's missing the generalisation step that is what makes Knuth's notation particularly special.

Maybe I know too much about the field, maybe I don't know enough, but I've looked over this many times and it doesn't seem to me to be making "the step."

Maybe I'm just expecting too much, but it feels like the heart of Knuth's notation is missing.

<fx: shrug /> I'll leave it at that. Feel free to disagree, perhaps it's an interesting talking point.


Maybe it's more accurate to say that the poster had invented tetration.

https://en.wikipedia.org/wiki/Tetration


I find it hard to focus on this stuff, but is there really any ultimate generalization? Like, you can always recurse once more - once you define up arrows, then why not iterate them?

↑n means n up arrows, so why not have an operator that applies up arrows a number of times defined with up arrows?

It seems like it's just dipping your toe into large numbers, no matter how far you go.


The ultimate generalization is to use an inductive definition based on recursive ordinals. For instance the fast-growing hierarchy f_alpha: https://en.wikipedia.org/wiki/Fast-growing_hierarchy

Since ordinals are the ultimate recursion tool, any such function will ultimately grow faster than whatever recursive definition you can cook up by hand.

For instance even the slow-growing hierarchy g_alpha, which grows very very slowly, catches up to the fast hierarchy at extremely large ordinals.

Some examples:

- f_0 behaves like addition, f_1 like multiplication, f_2 like exponentiation, f_3 like tetration, f_w like the Ackermann function

- Graham's number is bounded by f_{w+1}(64)

- The Goodstein function behaves like f_{epsilon_0} (this shows how much faster it grows than Ackermann). Similar for the Kirby-Paris hydra

- The n-iterated Goodstein function behaves like f_{\phi(n-2,0)} where \phi is Veblen's function

- The tree function from Kruskal's tree theorem behaves like f_{\psi(Omega^Omega^omega)}, i.e. the small Veblen ordinal. And the graph function from the Robertson-Seymour theorem behaves like f_{\psi(Omega_omega)}

By contrast the slow-growing hierarchy grows extremely slowly; for instance g_{epsilon_0} only grows like tetration. But it still catches up to the fast-growing hierarchy at very large ordinals.
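
The finite levels are easy to play with directly. Here's a minimal Python sketch of the standard definition at finite indices (the name f is mine; ordinal indices like f_w need fundamental sequences, which I'm omitting):

    def f(k, n):
        # Fast-growing hierarchy at finite levels:
        # f_0(n) = n + 1, and f_{k+1}(n) applies f_k to n, n times.
        if k == 0:
            return n + 1
        for _ in range(n):
            n = f(k - 1, n)
        return n

    print(f(1, 3))  # 6:    f_1(n) = 2n, multiplication-like growth
    print(f(2, 3))  # 24:   f_2(n) = n * 2^n, exponential growth
    print(f(3, 2))  # 2048: f_3 already grows tetrationally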


Like I said, I can't really follow this stuff, but maybe the ultimate would be the Busy Beaver function? Is there such a thing as a proof of how many functions can be between something computable and Busy Beaver, given some sort of constraints?


Yes, the Busy Beaver function grows faster than the functions I gave; basically it behaves like f_{w_1^CK} (although this does not really make sense because w_1^CK, the Church-Kleene ordinal, is not recursive).

By "ultimate generalisation" I was speaking about a family of computable functions that ultimately grow faster than other functions you could cook up by hand: you put all the recursivity in the ordinals, they are there for that, and logicians are very good at constructing very large (recursive) ordinals.

If you just want a very very fast-growing function (but one that is still computable, unlike busy beaver), you can do a light busy beaver version: Beaver(N) = the maximal output over all Turing machines T with N states for which there is a proof in ZFC of length <= N that T terminates.

This function grows much faster than the other functions I described (basically it is f_{proof-theoretic ordinal of ZFC}). If you are interested in these kinds of ideas, there is a reddit post that develops them: https://www.reddit.com/r/math/comments/283298/how_to_compute...


> why not have an operator that applies up arrows a number of times defined with up arrows?

That is exactly what up arrows do, isn’t it? Up arrows are iterated up arrows. I learned this in college but still didn’t fully understand up arrows for a long time; they grow unimaginably bigger and faster than I first thought. This video explains it pretty well: https://youtu.be/GuigptwlVHo

I don’t know about ‘ultimate’ but there are some generalizations of up arrows:

https://en.m.wikipedia.org/wiki/Hyperoperation

https://en.m.wikipedia.org/wiki/Conway_chained_arrow_notatio...
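
Concretely, the whole recursion fits in a few lines; here's a Python sketch (my own illustration, not from the video; only tiny arguments terminate in practice):

    def up(a, b, n=1):
        # Knuth's up-arrow a ↑^n b: one arrow is exponentiation,
        # and each extra arrow iterates the operator below it.
        if n == 1:
            return a ** b
        if b == 0:
            return 1
        return up(a, up(a, b - 1, n), n - 1)

    print(up(2, 3, 1))  # 8:     2^3
    print(up(2, 3, 2))  # 16:    2↑↑3 = 2^(2^2)
    print(up(2, 3, 3))  # 65536: 2↑↑↑3 = 2↑↑4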


I agree that your '@' (tetration) is one layer of 'meta' below Knuth arrows.

Tetration delves past exponentiation into hyper-operations, whereas the Knuth arrow generalizes them. It is just a special case that x↑↑2=x@2.


I wouldn't worry much when your e-rep takes a few hits from people with nothing better to do.


You're not supposed to discuss voting...but I've noticed a couple times that sometimes when I make two related comments on the same thread expressing basically the same opinion, one gets upvoted a lot and one gets downvoted a lot, and that's just weird. Maybe the exact timing determines if the upvoters see it.


Yeah, this is just one reason why I wouldn't be super concerned with your hacker news points.


This is easily solved using the Lambert W function:

Observe: W(x * e ^ x) = x

    x ^ x = y
    ln(x ^ x) = ln(y)
    ln(x) * x = ln(y)
    ln(x) * e ^ ln(x) = ln(y)
    W(ln(x) * e ^ ln(x)) = W(ln(y))
    ln(x) = W(ln(y))
    x = e^W(ln(y))
https://en.wikipedia.org/wiki/Lambert_W_function
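
If anyone wants to check this numerically: SciPy ships the Lambert W function as scipy.special.lambertw, so a quick sketch (the wrapper function is mine) looks like:

    import numpy as np
    from scipy.special import lambertw

    def solve_xx(y):
        # x^x = y  =>  x = e^W(ln y); take the real part of the principal branch.
        return np.exp(lambertw(np.log(y)).real)

    x = solve_xx(5.0)
    print(x, x ** x)  # ~2.129372, ~5.0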


That's pretty asinine since the W function is that inverse (basically). You're just saying "this is easily solved by easily solving".

Before anyone jumps on me: I'm perfectly familiar and comfortable with functions that aren't defined in closed form.


That's true for almost any nontrivial transcendental, algebraic, or differential equation. Special functions to solve special cases are the norm.

We could repeat the same discussion at a lower level. Suppose you tell a bright middle schooler about the basics of integration, and the power rule. Then they ask,

"So what's the integral of 1/x? It can't be x^0/0..."

"That's a special case. It's a function called "natural logarithm", ln(x)."

"But what's the definition of that weird function?"

"It's defined to be the integral of 1/x." [0]

"That's pretty asinine..."

[0] https://en.wikipedia.org/wiki/Natural_logarithm#Definitions


To be fair, this is much of math.

The ratio of right triangle legs is "easily solved" by tan(x). But that's magical/cheating to someone who hasn't studied trigonometry.

Lambert-W is a well known function with known approximation methods; once you reduce your problem to it (in this case, trivially), you can lean on that knowledge from others.


I asked the same question a while ago, https://math.stackexchange.com/questions/613946/xx-y-given-y...

Interesting analysis of the problem :-) it has been bugging me since high school!

Might be worth linking to the Lambert W function https://en.wikipedia.org/wiki/Lambert_W_function

This was the best comment I'd reckon:

> As you probably saw, the solution of x^x = y is x = Log(y) / W[Log(y)]. Now you take us in a highly philosophical debate. What is the limit of elementary functions ? Thanks for this question. Cheers. – Claude Leibovici


That comment references the worked examples that used to be on the Lambert W wiki page, including this exact problem, x^x=y. Here's the last version before it was deleted if anyone wants to see it:

https://en.wikipedia.org/w/index.php?title=Lambert_W_functio...


I'm glad to see that this problem bugged other high schoolers. I spent a lot of time learning about the lambert W function and real analysis in general due to this problem in high school.


One thing that has got me thinking in the past was another kind of formula, `f(f(x)) = g(x)`, where `g` is given. It bothered me that, far from a general algorithm being known, individual cases were stumpers (e.g., `f(f(x)) = e^x`). Any insights/references from HN folks about this?


I think you are looking for the functional square root operation: https://en.wikipedia.org/wiki/Functional_square_root

  In mathematics, a functional square root (sometimes called a
  half iterate) is a square root of a function with respect to
  the operation of function composition. In other words, a
  functional square root of a function g is a function f
  satisfying f(f(x)) = g(x) for all x.
For your example of f(f(x)) = e^x, that would be the https://en.wikipedia.org/wiki/Half-exponential_function with a=1 and b=e. Unfortunately there's no simple arithmetic representation of f(x).


Thanks so much!


Oh wow, thanks!


Problems like these (typically restricted to integers/natural numbers) come up in competitions from time to time. Sometimes there are some tricks required that are very hard to find on the spot. Here is a related problem you may enjoy:

  Find all increasing functions f : N -> N such that f(f(x)) = 3x


Is this problem correctly stated? Substituting in x=f(y) you get

  f(f(f(y))) = 3f(y)
so

  f(3y) = 3f(y)
This suggests a linear solution but then you get f(x)=sqrt(3)x which doesn't work since it's N -> N.


Suggests is not demands.

If we discard the requirement to be an increasing function, one recursively defined solution takes f(3x + 1) = 3x + 2, f(3x + 2) = 3 * (3x + 1), and f(3x) = 3f(x), with f(0) = 0.

Or analogously for any way of splitting the positive integers not divisible by 3 into ordered pairs (a, b). Then define f(3^n a) = 3^n b and f(3^n b) = 3^(n + 1) a. This is the general form of the solutions.
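
The recursive definition is easy to sanity-check mechanically; a Python sketch (names mine):

    def f(n):
        # f(3q+1) = 3q+2, f(3q+2) = 3(3q+1), f(3q) = 3f(q), f(0) = 0.
        if n == 0:
            return 0
        q, r = divmod(n, 3)
        if r == 1:
            return n + 1
        if r == 2:
            return 3 * (n - 1)
        return 3 * f(q)

    assert all(f(f(n)) == 3 * n for n in range(10000))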


assume f exists

increasing requires that x < f(x) < f(f(x))

f(f(0)) = 3*0 = 0 ≮ 0

therefore f does not exist, qed


The definition of an increasing function is a function f such that for all x and y, x < y implies f(x) <= f(y).

For a strictly increasing function change <= to <.

Your argument fails because you drop the antecedent in: x < f(x) => f(x) < f(f(x)). If f(0) = 0, you don’t have 0 < f(0), so you can’t require that f(0) < f(f(0)).

Note, however, that it is easy to show that f must be strictly increasing if it is increasing: if x ≠ y and f(x) = f(y), then 3x = f(f(x)) = f(f(y)) = 3y, which cannot be the case.


Ignoring zero, I get the summation, starting at [1]=2, of {1,3,1,1,1,3,3,3,1,1,1,1,1,1,1,1,1,3,...} = {..{1}^3^i,..{3}^3^i for i in ℕ}

Edit: derp, it's ℕ, not ℤ⁺.


f(x) = sqrt(3) * x. Then f(f(x)) = sqrt(3) * sqrt(3) * x = 3x.

So there is at least one solution.


That's not increasing (√3 * 0 = 0 ≮ 0) and it's not on natural numbers (√3 * 1 = √3 (≈ 1.732) ∉ ℕ).


I think in this context "increasing" means f(x+ε)>f(x) for all x, not f(x) > x for all x.


What is ε? The original problem is about natural numbers (ie nonnegative/unsigned integers).


Well then, f(x + 1) > f(x). I don't know where you're getting this x < f(x) as "increasing".

The point about sqrt(3) not working for the naturals is a valid point, however.


> Well then, f(x + 1) > f(x). I don't know where you're getting this x < f(x) as "increasing".

Headdesk Yes, that's correct; I misparsed "increasing functions f" as "functions f that increase [the value passed through them]" and then got distracted by the discrete vs continuous confusion.


Well, I missed that the domain was N, so... let's both just blame the tryptophan and call it good.


ε is epsilon, conventionally "an arbitrarily small positive number".


Interesting! The form of the solution can be written conveniently in base 3.


I really like this answer on Quora [1], about the precise example you mention. The author is worth following for his insights, and the clear way he talks about math.

[1] https://qr.ae/TaQ8y2


The J language makes it trivial to solve this problem without a tetration operator:

       (^~^:_1)5
    2.12937


How's that implemented? Newton's method like in the OP? I.e. if there's at least one solution there's an okay chance to find at most one solution?


Essentially, yes.

> (^~^:_1) 5

First, let me break this down, in the interest of sharing the joys of J.

The thing in parentheses is our inverse operation of x^x; we then evaluate that at 5. The first ^ is exponentiation, and the following ~ makes binary operators into unary ones by evaluating on the diagonal. In other words it turns f(x, y) into g(x) = f(x, x), which for x^y corresponds to x^x.

Next we have the ^: modifier. This is function iteration, i.e. exponentiation over function composition. The function is on the left (here ^~) and the count on the right (here _1). So f ^: 2 means f(f(x)). In fact, this function exponentiation is extended to negative integers as well! In J, negative numbers are prefixed with _ (underscore), and f^:_1 actually gives us the inverse of f (if it exists)!

Anyway, we can inspect the definition for an operation's inverse using the b. verb. From the jconsole:

        ^~ b. _1
    3 : '(- -&b@(*^.) % >:@^.)^:_ ]1>.b=.^.y'"0 :.(^~)
The part we are interested in is in single quotes. Pulling this one apart would involve introducing the various operator combinators in J, so suffice it to say that the ^:_ part numerically approximates a fixed point of the part in parentheses, which is essentially a straightforward transcription of the recurrence in Newton's method.
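
For comparison, here's what a direct Newton iteration for x^x = y looks like outside J (a Python sketch of the same idea; this is my transcription, not J's actual internals):

    import math

    def inverse_xx(y, x=2.0, tol=1e-12):
        # Newton's method on g(x) = x ln x - ln y, with g'(x) = ln x + 1.
        target = math.log(y)
        for _ in range(100):
            step = (x * math.log(x) - target) / (math.log(x) + 1)
            x -= step
            if abs(step) < tol:
                break
        return x

    print(inverse_xx(5))  # ~2.12937, matching the J session above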


We can express the solution in terms of the Lambert-W function, the inverse of f(x) = x*exp(x).

We still need a new function, but at least Lambert-W comes up in other places as well; not much worse than introducing log.


All this talk about centers, operations and inverse operations reminds me of monoids from learning Haskell: https://en.wikibooks.org/wiki/Haskell/Monoids


Is it just Belgium where we call these "centers" neutral elements of an operation, or did the rest of the HN geography just forget that term?


I think what the original author meant by "centers" of addition and multiplication is closer to "empty sum" and "empty product" respectively. From what they wrote, it seems that they wanted more than just neutrality. In any case I don't think we should retroactively label their novel and necessarily unique concept with a familiar name.


I couldn't resist chucking it into Wolfram Alpha: https://www.wolframalpha.com/input/?i=solve+x^x+-+y+=+0 https://www.wolframalpha.com/input/?i=x%5Ex%3Dy I think someone else mentioned the W function, which is shown in the Wolfram solution. The thing I found interesting is the graph: there is a local minimum at x = 1/e, so the number of solutions depends on whether y > 1 or e^(-1/e) <= y <= 1. Very interesting curve.
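
The minimum is easy to confirm numerically (a Python one-off; the grid search is just for illustration):

    xs = [i / 10000 for i in range(1, 10000)]
    xmin = min(xs, key=lambda x: x ** x)
    print(xmin, xmin ** xmin)  # ~0.3679 (= 1/e), ~0.6922 (= e^(-1/e))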


Reminds me of this fascinating paper: https://arxiv.org/pdf/math/0112050.pdf

It's about an infinite chain of operations where each is to the last as multiplication is to addition.


Here is another fun problem for you folks:

x^x^x^... (infinite power tower) = 2.

If you solved it, try the general case of

x^x^x^... = y, for some arbitrary constant y.


Seems to me this series converges to 1 on the interval between 0 and 1, and nothing else converges. Is that correct?

EDIT: actually I guess it's more complicated than that. If I'm understanding the order of operations correctly, there's an alternating-parity thing going on (where n is the number of exponentiations) for values between 0 and 1.


I don't think this is possible for any constant y that is not between 0 and e.


My hypothesis:

• for 1/e ≤ y ≤ e: x = y^(1/y)

• no solutions elsewhere

Can anyone prove the above? :-)


How do you define the infinite power? As the limit of the sequence, if it exists?

{x, x^x, x^x^x, ...}


That would be equal to x^y, which means that the solution is the y'th root of y?

However, don't you have to prove that the series converges? Is the approach even valid if it doesn't?


Note that according to your solution, x=√2 for both y=2 and y=4. But of course the sequence cannot converge to both 2 and 4 at the same time!


Fixed point of x^(...), I guess; you don't actually need to define it properly:

  x^x^x^... = y
  x^(x^x^x^...) = x^(y)
  x = y^(1/y) # there you go
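
A numeric check of the fixed-point answer, and of the y=2 vs y=4 puzzle above (a Python sketch; the tower helper is mine):

    import math

    def tower(x, n=100):
        # Evaluate x^x^...^x (n levels) by iterating t -> x**t.
        t = x
        for _ in range(n):
            t = x ** t
        return t

    print(tower(math.sqrt(2)))  # ~2.0; x = 2^(1/2) = y^(1/y) for y = 2
    # Note 4^(1/4) is also sqrt(2), but the tower converges to 2, never 4.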


Might be useful if it was stated at the start of the post exactly what the ^ operator is supposed to do - it does different things in different programming languages.


In context, I think it’s pretty clear that it’s exponentiation. Mostly because the context is math where the meaning is a lot more unambiguous.


I always felt the Fortran exponentiation operator (**), which is inherited by JavaScript and PHP, was more intuitive notation.


Interesting, the caret looks better to my eyes, since the real notation is x², thus x^2 is pointing at where the 2 should be.

But I get the reading of double-asterisk as 'multiplication of multiplication' as well, either is fairly transparent really.


Indeed. To me C's ^ as xor has always been the weird one since it actually looks like a logical and: https://en.m.wikipedia.org/wiki/List_of_logic_symbols



