Maybe I'm dense, but why not the same way a compiler implements it? In a recursi...

kvb · on Dec 3, 2013

Right, my thought was that doing real tail calls everywhere that's eligible would almost certainly be a performance disaster. I'm not sure that hotspot could help, at least if applied naively, since correctness depends on never missing a tail call that could lead to unbounded stack usage.

For example, consider an F#-like language on the JVM. Given the definition `let apply f x = f x`, you could imagine `apply` getting called in all sorts of contexts that don't result in unbounded recursion, so a hotspot-like system would not generate a tail call when compiling the method given the first N invocations. But then if another function is defined as `let rec loop n = if n < 1 then n else apply loop (n-1)` and invoked as `loop 10000000`, then the lack of a tail call in `apply` is fatal.

MichaelGG · on Dec 3, 2013

So if something kicked in at 75% stack usage (via a guard page?) and forced analysis, wouldn't that work? It'd be like the plethora of JVM options now: another configurable switch.

kvb · on Dec 3, 2013

How much of the stack are you willing to look at when you hit a guard page? The non-tail call might not be right near the top, depending on the structure of the code. I'm not saying it's impossible, but I think the approach of having the higher-level language add the annotations is probably more practical.

MichaelGG · on Dec 3, 2013

If the tail-call-needing function isn't found rather quickly (say, in the first 100? frames), then isn't it unlikely that the tail call will help? That seems like a much rarer case. Is the cost of going 1000 functions deep that expensive? Most threads aren't gonna hit a high % of stack without having some sort of problem anyways.

Of course it'd be better for the JVM to support this in bytecode, just like generics, stack allocation, and pointers. But it seems to me that if there was a real solid need for TCO then a JVM could implement it with tunable heuristics (most Java code doesn't need TCO, so if your code really depends on it in complex scenarios, you can always pass a -XXTcoInspectionDepth argument.)