I think it's also important to make a distinction between a pair of programs which compute the same function using an identical amount of space and time and a pair of programs which compute the same function with different amounts of either space or time (or both). Two programs might compute the same function and be considered formally identical in that sense but may be in radically different complexity classes [O(1) vs O(n) vs O(n^2) vs O(2^n)].

Formally we may not be interested in this distinction but practically we definitely are. One program may be extremely practical and useful whereas the other might not finish computing anything before the heat death of the universe on anything but trivial-sized inputs.
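
For a concrete toy illustration (a minimal sketch in Python; the function names are just for exposition):

    # Same function, very different complexity class: both return the nth
    # Fibonacci number, but fib_slow is exponential while fib_fast is linear.

    def fib_slow(n):
        if n < 2:
            return n
        return fib_slow(n - 1) + fib_slow(n - 2)   # recomputes subproblems

    def fib_fast(n):
        a, b = 0, 1
        for _ in range(n):                         # one pass, two running values
            a, b = b, a + b
        return a

    assert all(fib_slow(k) == fib_fast(k) for k in range(20))

fib_slow(100) would take longer than anyone is willing to wait; fib_fast(100) is instant.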




On the other hand, compiler tricks like tail call optimization can e.g. reduce an O(n) algorithm to an O(1) algorithm. Is it a “different program” if the same source code is compiled with a new compiler?


Yes, same source code; different program. The point of compilers is to output a different program that (hopefully) has similar enough I/O but is better suited to the hardware. The program that runs when a human adds 1+1 is still largely unknown to us, but the source '1+1' isn't.

Why use a new compiler if your program isn't meaningfully changed by it? I'd consider running on two completely different machines to already constitute "meaningfully different".


Tail call elimination is not an optimization because it changes the semantics of the program. The feature can take a program that would previously fail with a stack overflow and cause it to terminate without error.

Perhaps TCO is better thought of as a language extension.
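
A minimal sketch of that behavioural difference in Python, which performs no tail call elimination, so the recursive version shows the "before" and the loop approximates what an eliminating compiler would emit (the names are just illustrative):

    import sys

    def countdown(n):           # the recursive call is in tail position
        if n == 0:
            return 0
        return countdown(n - 1)

    def countdown_loop(n):      # roughly what tail call elimination produces
        while n != 0:
            n -= 1
        return 0

    big = sys.getrecursionlimit() + 1000
    try:
        countdown(big)          # every frame is kept: RecursionError
    except RecursionError:
        print("recursive version blew the stack")
    print(countdown_loop(big))  # constant stack space: prints 0

Same source-level recursion, but whether the call is actually implemented as a tail call decides whether the program errors out or terminates normally.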


That seems no different from any other optimization: plenty of optimizations directly reduce stack usage, which can change a given input from a stack overflow to a successful execution. Similarly, anything that reduces heap memory usage or code size can do the same.


How many of those other optimizations reduce stack usage from O(n) to O(1)?


That's irrelevant - your assertion was that a transformation which changes semantics by preventing stack overflow cannot be called an optimization. The commenter showed that this is false, and gave reasons why.

Whether tail call optimization goes from O(n) to O(1) doesn't change any of the above.


Tail call optimization does not turn O(n) algorithms into O(1) algorithms unless you're talking about the space used and not the runtime.


At a certain level of abstraction, that's easily an example of converting an O(n log n) algorithm into an O(n) one.

In practice, of course, the effect is far more dramatic with an MMU.


Can you show an O(n log n) algorithm with tail calls (but no TCO) that's O(n) after being optimized with TCO?


Computing f(0)=0; f(n)=f(n-1) is O(n log n) without tail calls because you need O(log n) addresses to hold your stack frames.


> Computing f(0)=0; f(n)=f(n-1) is O(n log n) without tail calls because you need O(log n) addresses to hold your stack frames.

There are two principal ways of applying asymptotic analysis to algorithms: time or memory used. In both, your procedure is O(n) without TCO. With TCO it is O(n) for runtime (though further optimization would reduce it to O(1) since it's just the constant function 0, but TCO alone doesn't get us there) and O(1) for space since it would reuse the same stack frame.
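
Concretely, for f(0)=0; f(n)=f(n-1), the counts work out to:

    without TCO:  n+1 calls, up to n+1 live frames  ->  time O(n), space O(n)
    with TCO:     n+1 calls, 1 frame reused         ->  time O(n), space O(1)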

What O(log n) addresses do you need to hold the stack frames when there are O(n) stack frames needing O(n) addresses (without TCO, which, again, reduces it to O(1) for memory)?

Also, regarding "without tail calls", your example already has tail calls. What do you mean by that?


I assume they mean the size of the address is log n, since there are >n addresses.


If we don't treat almost all integers in an algorithm as fixed size then the analysis gets really messy and annoying in a way that has nothing to do with real computers.

And if the algorithm actually did anything with the value that made it grow or shrink with the recursion, the TCO version would stop being O(n) under such a framework. This only works because it's passing 0 around every iteration. And this probably already applies to the TCO version's flow control depending on how you expect to run it.


I was going to write something similar.

Regardless, the comment I replied to is fundamentally confused (it presented a tail-recursive algorithm and said it didn't have tail calls, and presented a linear algorithm that uses a linear amount of memory but claimed it's O(n log n) for some reason, without clarifying whether that's in time or space). I'd rather hear from the person I responded to than whatever guesses the rest of us can come up with, because it needs several points of clarification to be understood.


> If we don't treat almost all integers in an algorithm as fixed size then the analysis gets really messy and annoying in a way that has nothing to do with real computers.

I guess nobody does numerical computing or distributed systems with real computers anymore, huh.


Wrong way around. Almost all the numbers in those systems are fixed-size. There are a few types of variable-width payload, but all the little ancillary types are 64 bits or whatever.


> There are two principal ways of applying asymptotic analysis to algorithms: time or memory used.

It is usually both, but I meant time, because to be able to address your stack frame you need a stack pointer that can take as many distinct values as your nesting depth, so it must have Ω(log n) width.

It is easy to dismiss this as irrelevant because your integers are usually fixed-width, but then you'd need to parameterize your algorithm on the size of input you're willing to handle (at which point you're no longer doing asymptotic analysis). Arbitrary-precision arithmetic really does work this way.

> Also, regarding "without tail calls", your example already has tail calls. What do you mean by that?

I mean "without tail call optimization", or if you're particularly keen on getting into a pedant-off, "with the function call in the tail position not implemented as a tail call".


From complexity analysis we can adopt the concept of polynomial-time reducibility and might define a type of equivalence relation where two algorithms are equivalent only if both are pt-reducible to each other. Intuitively it’s not a sufficient condition for "sameness" – otherwise, for example, all NP-complete problems are the "same" and solvable with the "same" algorithm – but it’s arguably a necessary one.
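
Spelled out, one way to read that relation (writing ≤_p for the usual polynomial-time many-one reduction between the underlying problems):

    A ~ B  iff  A ≤_p B  and  B ≤_p A

Since ≤_p is reflexive and transitive, ~ is indeed an equivalence relation, and, as noted, it puts every NP-complete problem into a single class, which is why it can only be a necessary condition for "sameness" rather than a sufficient one.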



