Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Guido mentions that you can't get an accurate stack trace in the presence of tail-call optimization. That alone makes it an absolute no-go for just about ANY production code (where ease of debugging is orders of magnitude more important than ease of writing in the first place).


That makes it sound like a. you can never get an accurate stack trace, and 2. the stack traces you do get that are "inaccurate" are worthless.

I can't speak to every optimizing interpreter or compiler, but in the few that I've used, TCO doesn't do anything to code that doesn't have tail calls, so lots of the code has "accurate" stack traces.

And when it does perturb the stack trace, it does so in a very obvious way, it abbreviates the tail calls. Such stack traces no longer have a 1:1 mapping with the function "calls," but are still quite informative for debugging purposes. Not as informative as they would be if you turn TCO off just to debug that code, but informative enough that I rarely had to turn it off.


Never mind the fact that tail cails are just fundamentally loops, which don't generate any stack frames anyway!

If you're application is complex enough that inspection won't reveal the source of the bug, then stack traces are almost strictly less useful than logging.


Tail calls can be used to increase the efficiency of loops simulated using recursion. However, not all tail calls are loops, e.g. void foo() { bar(); }

A good backtrace for a crash or exception can tell me the entire call stack. Logging at that granularity (i.e. every time a function is called) is not a good idea.


Exactly, it's not like a loop counter variable will show up in a stack trace (even if it is a perl style trace, with values, rather than a Java style, line numbers only, stack dump).

This whole debug dump issue could be avoided by saving a copy of the initial entry frame for debugging and the current frame with elipses shown between them when TCE was in effect.


Lua does TCO and implements it with a special opcode that tracks the number of times a function is called, and you can get an accurate stack dump.

    function r(x)
      if x == 1 then
        return 1
      else
        return r(x-1)
      end
    end

    r(100000000)

    [spc]saltmine:/tmp>lua t.lua
    ^Clua: t.lua:6: interrupted!
    stack traceback:
            t.lua:3: in function 'r'
            t.lua:6: in function <t.lua:2>
            (tail call): ?
            (tail call): ?
            (tail call): ?
            (tail call): ?
            (tail call): ?
            (tail call): ?
            (tail call): ?
            (tail call): ?
            ...
            (tail call): ?
            (tail call): ?
            (tail call): ?
            (tail call): ?
            (tail call): ?
            (tail call): ?
            (tail call): ?
            (tail call): ?
            t.lua:10: in main chunk
            [C]: ?
    [spc]saltmine:/tmp>
But one reason why I like TCO is in writing code to handle protocols. Create a state machine that describes the protocol. Convert each state to a function. Transitions to a another state is a function call. It works something like:

    function start() 
      i = getinput() 
      if i == A then
        return state_A()
      elseif i == B then
        return state_B()
      else
        return state_error()
      end
    end
    
    function state_A()
      return state_B()
    end
    
    function state_B()
      i = getinput()
      if i == MORE then
        return state_A()
      else
        return DONE
      end
    end
    
    function state_error()
      return ERROR;
    end
Because of TCO, each function call is effectively a GOTO. I've used this method to implement TFTP.


This is something of a pet peeve, so I'll reply. Let's say that the only stack trace entries TCO will eliminate are not only useless, but actually impede debugging. Let's say 'rf' is a recursive function. If it dumped core, without TCO you would see something like 'rf rf rf rf ...'. With TCO, you would see 'rf' once, just as if it dumped core while on a loop. Which is actually what TCO means: translation of recursive functions into functions that use loops, at the AST level. The result is code that's easier to debug (because it's not recursive at the stack trace level), easier to understand (because it's recursive at code level) and that performs better (because it's a loop at machine code level).

So yeah, you could make the case that it's not "an accurate stack trace", because it isn't. But, you don't want that. You want TCO. Trust me on this.

Preemptive strike for the bikeshedder crowd in the back :) : if you think you can deduce some useful fact from a recursive function that's stacktraced, like for example where in the loop the function gave an error, then you're wrong. Go learn to use a debugger.


> Which is actually what TCO means: translation of recursive functions into functions that use loops, at the AST level.

That's not correct. It's not TRO, TCO. So as it's normally understood any call in tail position would be eliminated.

In Python today, this program:

    def foo():
        bar()

    def bar():
        baz()

    def baz():
        raise Exception('OH NOES')

    foo()
will output:

    Traceback (most recent call last):
      File "raise.py", line 10, in <module>
        foo()
      File "raise.py", line 2, in foo
        bar()
      File "raise.py", line 5, in bar
        baz()
      File "raise.py", line 8, in baz
        raise Exception('OH NOES')
    Exception: OH NOES
If Python did TCO, you'd get:

    Traceback (most recent call last):
      File "raise.py", line 8, in baz
        raise Exception('OH NOES')
    Exception: OH NOES
Not wanting to lose helpful information like this is a valid cause for concern. I used to think Guido was wrong for not wanting TCO in Python but I've yet to see anyone give a clear example of what having it would let you express more easily than you can now with iterators/loops/yield etc.

TCO is a fine feature in languages where its deeply engrained and part of its natural idioms. Bolting it on twenty years later doesn't seem that helpful to me.


Off the top of my head I can think of a doubtless unsophisticated way to recover the "stack" in this situation: namely, the same way you recover the "stack" in the factorial example, by using an accumulator hidden in the interpreter. When foo calls bar(), the interpreter passes along, when it makes the tail call, an explicit list of predecessors (in this case ["foo()"]); when bar calls baz(), the invocation of baz receives ["bar()", "foo()"], etc.


Some Schemes (like MIT Scheme) do this with a circular buffer of bounded capacity. Another Scheme interpreter (a little one by Jake Donham) had a switch to turn off TCO to help debugging. Yet another approach is full omniscient debugging (being able to go backward in time).


You are recovering the stack by creating another one.


Yeah. But you could decide how much of the stack to keep, at least. Nevertheless, I know that some smart persons (I just can't remember who!) have addressed this issue in a significantly subtler way---wish I could find the reference.


That seems to defeat the main advantage (not eating up lots of memory with deep recursion)?


Parent wasn't suggesting to keep full stack frames, but something like (file/function name/line number) tuples, that's not the end of the world. By making it a fixed-size list, you can ensure your memory usage won't blow up.


It's largely a red herring. The cases where TCO drops stack information are mostly the cases where you would've written the code as a loop in a language without TCO. So the equivalent is losing iteration history, which you don't have anyway.


I'm afraid this is just not true. With full TCO, if function A calls function B in tail position, and B gets an error, the frame for A will not be on the stack; it will in general be harder to tell how the code got to B.

Although TCO is often essential and I have written large blocks of code that depend on it, I have also long advocated better heuristics for when to turn it off so as not to impede debugging. The compromise taken, for example, by some Common Lisp implementations, where calls from one top-level function to another are not tail-optimized but local calls (calls to functions defined with LABELS or FLET) are, in my experience works very well most of the time.


Which is why I said "mostly." I don't think incidental tail calls come up often enough to be a major impediment to debugging. Moreover, often times when non-loop tail calls do come up, I don't want to see the intermediate calls anyway. E.g. proxies or thunks.


I think it's a lot easier to ignore uninteresting stack frames than to reconstruct those that have been optimized away.

But for experienced programmers, I agree global TCO doesn't quite rise to the level of a major impediment, though I have found it an annoyance at times. It particularly bugs me when the code I'm debugging is not recursive, so I'm not getting any benefit from TCO in this particular case.

For novice programmers, however, I think it could be a serious barrier. I think van Rossum was right not to have Python do it by default. But I think he should also have provided a way to define a set of mutually-tail-recursive functions.

I have long experience with Franz Allegro CL, which does TCO only on local calls. I think it's the best of both worlds: debugging usually isn't interfered with, and on those occasions where I really want to write a set of mutually-tail-recursive routines, it gives me a way to do that. I have certainly also used CL implementations that do global TCO -- I think that's all of the major open-source ones -- and they're certainly usable, but I prefer the Allegro compromise.


Most heavily optimizing compilers do strange things to the code. You wouldn't want to debug an -O3 program, just as you probably wouldn't want to debug with TCO turned on.

Luckily, in the lisp and scheme families (possibly also other image based PLs), it's easy to mix compiled (and optimized) and interpreted code. Whenever i run into a bug such that i need the debugger to fix it, I always replace the compiled function with an interpreted version first, making it much easier to debug. What you get in the stacktraces is the same as you read on the screen.

In most lisp implementations, this maneuver can be performed completely on line, you don't have to restart the program or recompile anything but the function under inspection itself. When the runtime signals an exception or fault, just tell lisp to interpret the suspect function. Then move up the stack to the function that called the suspect, and restart that stack frame et voilá, bob is your uncle.


An optimized tail-recursive call is just a loop. As far as I know, there are no stack traces for loops in imperative languages.


That's bullshit, however, because:

(a) there are techniques to deal with that, and (b) where's the accurate stack trace from a state-maintaining while loop, or a trampoline?


I think you could chose a better analogy.

In any production code I would rather prefer performance than debugging. Isn't that the point why C/C++ programmers have debug and release configurations? to turn off debugging features on a production code?


I guess you never had to analyze stack dumps from optimized release builds.


OK, but you also get no stack traces when you use a loop. I am not really seeing how this is an argument against TCO, it sounds more like an argument against anything other than unoptimized recursion.


And that's one reason why loops are hard to debug!

Losing stack frames for recursive functions is not so bad. The real violence that TCO does is to non-recursive tail calls, which form the majority of tail calls, at least in non-functional languages.


Isn't that an argument against many optimizations, such as function inlining?

In any case, it's possible to have both, if the compiler outputs some (optional) extra debug information. For example, gcc has options that let you reconstruct a 'virtual' stack trace in gdb even for inlined functions.


But it's not true.

As someone said in the comments below Guido's post, providing meaningful stack traces in presence of TCO is possible, it just requires a little more bookkeeping on language part.


Exactly, that's why you can't have loops in production code!




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: