For the most part, programming constructs like these split the world into two camps:
- People who point out things can be done without them, who largely see them as useless due to lack of familiarity
- People who've used them, and see critical ways to restructure code to make it cleaner using said constructs
That's been the case for a lot of progress in programming. Python, and ES2015, have done a wonderful job of bringing many previously-academic (including functional) programming constructs to a broader community, together with models like reactive programming.
That's true of about half of the bits of progress in programming. Garbage collection was seen as garbage by C/C++ programmers ("What's the big deal with adding a free() call? Stupid lazy people."). Garbage collection wasn't useful because it omitted free() calls, but because it allowed the design of flows with many exit points (for example, different kinds of exception handling, exiting in the middle of a function, or giving up an object in a half-dozen places an hour later in code, where tracking for free() requires building an ad-hoc reference counter or garbage collector).
The place where tail calls are a huge deal is when dealing with deep (or even infinitely deep) tree-like structures:
if condition:
    return red(left_child)
else:
    return blue(right_child)
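The same applies to deep linked structures. A minimal JS sketch (the node layout is made up), where every call is in tail position, so with PTC the walk runs in constant stack space however deep the structure is:

// Walk an arbitrarily long linked list. The self-call is a tail
// call, so a PTC engine reuses the frame; without PTC, a long
// enough list overflows the stack.
function length(node, acc = 0) {
  if (node === null) return acc;
  return length(node.next, acc + 1); // tail call
}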
I don't mind opt-in versus opt-out versus neither. All the reasons listed for not having them are dumb, though, and have good and easy work-arounds. The major one -- debugging -- is basically solved by having a list of functions called, without having the whole tree. A 10,000-element stack trace is no help at all. You can, for example, keep the first 20 elements of the stack trace (don't start PTCs unless the stack is of a certain depth), and then still keep a list of functions called. I have literally never seen a case where having a list of 100k calls in a stack trace is at all useful for anything.
On a tangent, I've always felt garbage collection, and, more importantly, safe memory management, was important because it allows you to mostly pretend that memory allocation isn't a globally side-effecting operation, as such operations are difficult to reason about. It's the same logic that leads to design choices that don't explicitly use global variables or global state.
In reality, all memory allocation is still globally side-effecting, and you'll find that out when your program starts GC spiraling, or consuming more memory than you want it to, but being able to pretend otherwise and mostly get away with it means automatic memory management brings a tangible, measurable productivity multiplier to programming that few, if any, other programming language features can boast of.
I disagree. My experience with codebases written by programmers who think like that is that you eventually get memory leaks, some of which are near-impossible to debug or fix. You pay for that kind of thinking down the line.
Enabling new design patterns == good
Not thinking about what happens under the hood == bad
Catastrophes are rare, but expensive enough to cost more than thinking things through.
In my experience, memory leaks in GC languages are very easy to track down, since the tooling for analyzing the heap is excellent. While discovering that you do have a memory leak is not very easy, once you know about it you can easily compare heap snapshots, find the growing objects, and track down their GC roots, all from the same tool. With Java especially, you can even do this on a running service in production, with minimal downtime.
In contrast, it's much harder to track down memory leaks in C, since the runtime has little information about the contents of the heap. You typically end up using valgrind to instrument your code, assuming it can still run fast enough to reproduce the problem.
The only exception is Go, which has awful tooling for analyzing memory. For some bizarre reason, it can't even show you the GC roots of an object in memory, even though the GC obviously needs this information to work properly.
My experience is that Python has *many* more memory leaks than C code.
They're less visible since I ran C code on a computer with 32MB of RAM, and I run Python code on a computer with 32GB of RAM, but they're very common in modern Python programs.
As another comment pointed out, memory issues in Python are less dangerous too. I'm glad not to worry about buffer overflows in Python.
Memory leaks appear in BOTH (specifically, any data structure that accidentally keeps references to stuff will leak memory in Python). The biggest difference is that Python won't create the particularly bad/dangerous memory leaks (e.g., use-after-free).
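For example, a minimal sketch of the reference-retention kind (JS here, but the same pattern leaks the same way in Python; the cache and helper are made up):

// A module-level cache that is never pruned: every key ever looked
// up stays reachable, so the GC can never reclaim the results --
// a leak even though memory management is fully automatic.
const cache = new Map();
function lookup(key) {
  if (!cache.has(key)) {
    cache.set(key, expensiveCompute(key)); // hypothetical helper
  }
  return cache.get(key);
}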
Use after free is not a consequence of memory leaks. A memory leak is, specifically, memory which is still allocated but not referenced. Use after free errors can't happen if you don't free the memory.
The problem with tail calls destroying stack traces isn't with the recurse-for-the-tree functions. It's with code like this:
function frobnicate_something(foo, bar) {
  let baz = antifrobnicate(foo, bar);
  // Hey, I'm a tail call!
  return exfoliate_meteor(baz);
}
With tail calls, you now lose the frobnicate_something stack frame and you just see a call to exfoliate_meteor, which can produce confusing results. This lost stack frame is potentially quite injurious, especially when the function that's lost actually does a serious amount of work.
This is the rationale behind the proposal to support tail calls only in explicit scenarios, where the developer opts in to throwing away stack frames.
A tail call is just a goto; using a special syntax for it (`for…` or `while…`) is simply syntactic sugar. Nobody complains about the lack of a stack frame in a for loop.
Goto can be anything - a function call, a loop, an if clause, exception handler. That's the whole reason we created structured programming - so we don't have to guess what it is THIS time.
Tail call optimization is nice, but it has its problems. What's the downside to requiring special syntax for it? It frees the programmer reading the code from determining if it's a tail call every time.
The downside is that your code doesn't optimize for free.
iOS is over 50% of the mobile browser share in the US, and desktop Safari is 11-12% (roughly 1 in 9); both implement proper tail calls and have for around 6 years now. Despite this, the world has not collapsed and people aren't complaining that websites are constantly breaking.
Optimizing calls using the same syntax as regular ones risks introducing stack overflows without anyone noticing.
function rec(x, y, z) {
  if (something(z)) {
    return rec(x, y, z-1);
  }
  return 1;
}
Now you need to add 1 to the result.
function rec(x, y, z) {
  if (something(z)) {
    return rec(x, y, z-1) + 1;
  }
  return 1;
}
Oops, now it fails with stack overflow for big z values. Hope you have good unit tests to catch this.
To avoid this problem, whenever you modify a possibly recursive function in a language with transparent TCO you have to look through the code to see if it's relying on TCO. This is wasted time. And it can be non-trivial if you have recursive functions calling each other and other funny stuff.
If instead the language required special syntax for TCO:
function rec(x, y, z) {
  if (something(z)) {
    return TAIL_CALL rec(x, y, z-1) + 1;
  }
  return 1;
}
This would be a compilation error, because it's not a tail call contrary to the declaration.
Idk, it rarely (never?) happened to me that I turn a tail call into a non-tail call because I ad-hoc add something to it like in your example. Seems like a beginner mistake. When I design a function to be tail recursive, which is all the time in langs supporting that, I won't suddenly disregard all care and add code like that. And hopefully other people will at least spend a minute reading the function they are modifying before slapping that change on and calling it a day. If not, then some syntax won't protect you from that either.
I don't quite get the argument. Isn't this similar to e.g requiring the `z` parameter to be annotated with something that ensures that it is a small number? ("What is the downside of having special syntax for that?")
My understanding is that a tail call is a tail call, and the variant with +1 is not that, thus being a candidate to be looked at.
Perhaps also IDEs can help here if it's difficult to spot?
> Isn't this similar to e.g requiring the `z` parameter to be annotated with something that ensures that it is a small number?
You mean something like "unsigned short int" :) ? Another example would be "const". What's the point -- either the variable is const or it isn't. The programmer can just remove the mutation.
> My understanding is that a tail call is a tail call, and the variant with +1 is not that, thus being a candidate to be looked at.
Sure, but don't you agree it's an easy mistake to make? Especially when the recursion goes through several different functions. Additionally - you reading the code might not realize it's important for this function to continue to be tail-recursive.
> Perhaps also IDEs can help here if it's difficult to spot?
Regarding compiler vs IDE, I'd say that since it doesn't change the meaning of the code, just the depth of recursion, the IDE would fit better. Perhaps you could also ask the compiler for info on tail-call optimized functions by giving it a flag, but otherwise I don't think whether something was optimized or not matters much (depending on what you work on, of course).
TCO is kinda binary. If you use a recursive function on small data it doesn't matter. If you use it on big data it fails with stack overflow so they will fix it.
AFAICT, your question doesn't make sense. But I guess you are thinking, "what if we should be using the TAIL_CALL syntax for performance but most people don't, and it's exacerbated inside a tight loop?" If so, I'll try to explain:
TCO is not about making code faster, it's about making it not eat up all the (stack) memory. It's a memory optimization, not a speed optimization[1]. (OK, using less memory does make it somewhat faster too, even when not swapping, due to fewer memory accesses, but it's usually not a big difference, and not what we worry about. What we worry about is eating up so much memory that the computer starts swapping because of it; yes, at that point it would become much slower, but that doesn't happen with "small data in a loop"... let me finish.)
Not using the TAIL_CALL syntax in a tail recursion would use up the stack memory iff the recursion is deep (in this context because the data is not small).
If "inside a loop" in your question means, a self-recursive function call (tail recursion),
function foo(...) {
  if ... {
    foo(...)
  }
}
then the answer would be, if it's on small data, it only uses a small amount of stack space. No problem. The problem only comes up if the data is large.
If "inside a loop" means that you're using a for or similar loop syntax:
for ... {
  foo(...)
}
then a call to a function inside that loop is not actually a tail call (so the compiler would report an error if you were to use the TAIL_CALL syntax), since at least the test in the loop has to run after returning from the function call.
Does that make sense?
[1] And the memory is only being used until the end condition in the recursion is met, i.e. temporarily; it doesn't contribute to bloat, just uses memory for a bit then releases it again; except when it uses so much memory that you run out of RAM (or stack space if the VM limits stack space separately).
I assumed it was a speed optimization too, since a goto is faster than creating a whole stack frame and then destroying it again. Is that not the case?
By “inside a loop”, I meant a loop that is independent of the function, like this:
function foo() {
  if (…) {
    foo();
  }
}

for (let i = 0; i < 1000000000; ++i) {
  foo();
}
Here it won't cause a stack overflow either way (if foo is used on small data), but I'd assume that the loop would amplify the effect of even a small performance optimization.
> since a goto is faster than creating a whole stack frame
Yes, as I mentioned in parentheses--creating a stack frame costs because it is writing to memory, other than that it's just an increment of the stack pointer register in compiled/JITted code. But it's not a large cost, so you usually won't notice it or at least not much. Especially if your recursion depth is small, as then the memory writes might not leave the CPU cache. Sure, making some code 10% faster by way of TCO can still be useful, and it's probably why C compilers are doing it (they don't guarantee general TCO so you're limited in what you can do with it, so the only guaranteed advantage is a little speed gain), but if we're talking some JavaScript app there will be worse problems. Especially, the problem that with large enough data you're going to allocate lots of temporary memory for the stack--that's something a user will notice. Especially if the memory temporarily used is not given back to the OS, which is the case in C, and I rather expect JavaScript as well, which will mean that every tab running JavaScript will have some amount of RAM tied down to (even if rare) temporary stack use.
> and then destroying it again
This is then really just a decrement of the stack pointer.
That makes sense, thanks. Now I'm curious, if popping a stack frame is just decrementing a pointer and the contents stay there, isn't it a security vulnerability?
Only if there's a way to access it. From within JavaScript you don't have access to it, unless there's a security hole in the VM. But yes, programs handling encryption keys generally overwrite the memory holding them before freeing it / letting it go, at least those written in C/C++. Either for the case of a security hole allowing access to such memory, or so that it doesn't stay around in process memory where another process on the same system with the necessary permissions (on Linux either root or the same user) can access its memory via debugging APIs. But it's not done for normal data.
> Python, and ES2015, have done a wonderful job of bringing many previously-academic (including functional) programming constructs to a broader community
This is kind of amusing to hear considering that Python seems to actively design itself around making typical functional programming constructs difficult and JavaScript has a bunch of weird quirks with how it implements things (you’ve seen the “map” joke perhaps?)
The major thing Python has done is take major functional design patterns, like maps, reductions, and everything in itertools, and make human-friendly versions and syntax for them. Right now, that covers 70% of the stuff I did in Scheme over C/C++ (weighted by volume of use, rather than by feature list). A lot of beginner programmers now use features like closures and decorators, and they're accessible.
The remaining 30% is awkward, but usually possible.
JavaScript is convoluted.... but after the learning curve, it does functional as well as anything, and better than most other things it does. It certainly does functional better than OO.
Before C#, he designed Delphi, which was a pleasure to use. It made Pascal beautiful, which is not something many considered possible. Functionally, Delphi combined the ease-of-use of VisualBASIC with the power of C++, with an elegance unparalleled in any environment before.
He did Turbo Pascal. It's hard to overstate how much of a revolution that was. Compile-wait-wait-wait-run turned into run.
Now, he's doing TypeScript.
I'm not sure it's really fair to compare anyone to Hejlsberg.
"Sure, your kid won a Nobel Prize for his research, but Einstein did relativity...." "Sure, your business hit a billion dollars, but it's no Apple...."
Acknowledging brilliance elsewhere doesn't reduce my appreciation of Python. It's a good language.
I program professionally in both languages and in my experience the two are very different when it comes to functional programming. FP is viable in JavaScript, if imperfect. And as the language continues to evolve to better support that paradigm its community continues to embrace it. But Python’s choices seem to be actively antagonistic to FP. I’d like to be able to use a functional approach in Python when it makes sense but it’s too hard, too awkward, too flawed, and too against the grain of the language.
I’d agree in the sense that JavaScript gives you poor out-of-the-box support but it definitely can be shaped into something pleasant. Python will make you hate yourself if you try, yes.
> Garbage collection wasn't useful because it omitted free() calls, but because it allowed the design of flows with many exit points
This misses the mark a bit. C++ destructors allow for multiple exits. Design at the level of an individual subroutine is not very important, ultimately.
Garbage collection is useful because it enables greater modularity, at the level of full applications and protocols. It does not force details of memory management and ownership into api contracts, leaving them free to be changed. (This, incidentally, is one of the reasons smarter people than I take issue with rust: it forces details of ownership into api contracts, reducing modularity. It is also the reason why reference counting is less modular than tracing gc: it does not deal with cycles, so cyclicality is part of api contracts, and a very subtle part at that.)
> People who point out things can be done without them, who largely see them as useless due to lack of familiarity
There's no need to dismiss those who disagree as just having a lack of familiarity. There is a technical trade-off to be made, with pros and cons each way. The only people who are objectively wrong are those who claim the decision is clear cut.
The con, in particular, is language complexity. You only have to look at C++ to see that it is possible to include too many features in a language. There's no disputing that tail calls are useful. The question is whether they are so useful that everyone who learns the language has to understand them.
> Garbage collection was seen as garbage by C/C++ programmers ("What's the big deal with adding a free() call? Stupid lazy people.")
As a former C++ programmer working on a large code base when/before Java was introduced, garbage collection was seen as expensive and slow, but definitely not a waste of time. I would have given my right arm to be free (pun intended) from the need to manually manage my heap across threads. Adding a free() call was never not a big deal.
Exactly, it was never about attitude, it's about deterministic real-time. C/C++ are low-level systems languages for real-time control applications. In that case garbage collection, a non-deterministic, unpredictable stop-the-world pause, simply is out. So you have to have a base language without GC. You can use garbage collection via libraries in C/C++ if you don't need deterministic real time. It was never about "not liking" GC or "thinking it's garbage".
So, what happened to syntactic tail calls? I think that's what I would prefer anyway, both because it makes it more clear from a debugging standpoint, since you opt in, and because you can get a warning (or compiler/linter error if using a transpiler or linter) when your function isn't actually tail recursive.
The whole "issue" is very strange to me. Proper tail calls (PTC) without the extra syntax are literally free performance boosts for existing code.
The whole "stack frames" argument is a red herring:
* Nobody expects stack frames to exist for every `for` loop, which is the biggest practical use for PTC
* Stack frames go away the second you release control back to the event loop, which is by far the more pernicious problem.
* Stack frames essentially just capture the continuation anyway. If my function `blah()` calls `foo()` and `bar()` before blowing up on `baz()`, neither of those functions will be captured by the stack frame, which is no different than CPS (continuation passing style) with PTC where you have `foo()` that returns `bar()` that returns `baz()` and `baz` throws. In BOTH cases, you'll see the stack frame for `baz`, a stack frame for `blah()`, and frames for whatever called `blah()` up to the top of the stack or where the event loop made the stack frames disappear anyway (see the sketch after this list).
* EDIT: I almost forgot to mention, but you can activate a "shadow stack" when the debugger is open (just like they already disable most optimizations when it's open) which can give you your reams of useless stack traces as your function executes a million times in a loop.
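A minimal sketch of those two cases (all names made up):

// Sequential style: by the time baz() blows up, foo() and bar() have
// already returned, so their frames are gone from the trace anyway.
// Trace on throw: baz, blahSequential, whatever called it.
function blahSequential() {
  foo();
  bar();
  const x = baz(); // throws here
  return x;
}

// CPS with PTC: foo() tail-calls bar(), which tail-calls baz(), and
// the intermediate frames are reused. Trace on throw is the same:
// baz, blahCPS, whatever called it.
function blahCPS() {
  const x = foo(); // foo: return bar(...); bar: return baz(...);
  return x;
}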
In short, programmers have performance to gain and not much of real value to lose by implementing PTC without syntax.
> The whole "issue" is very strange to me. Proper tail calls (PTC) without the extra syntax are literally free performance boosts for existing code.
It's bad to create a dangerous performance cliff. There could be some TCO based code that works fine, and then a junior coder makes an 'innocent' change that makes it ineligible for TCO, then that code eventually starts getting OOM crashes on heavy data.
I think if they're gonna do TCO then it really should be syntactic. If someone is writing their code around an assumption of TCO, then they almost always want an explicit guarantee of TCO, and they want to fail fast if TCO isn't happening.
It would be the other way around, you'd get the stack overflow issues in non-Safari browsers.
But because nowadays nobody is coding with TCO in mind, there are no such issues.
What paulhodge is pointing out is that once people start coding with the assumption of TCO, then issues crop up when someone changes a tail call into a non-tail call (or, a browser that doesn't implement TCO runs the program).
But right now, unless you are developing exclusively against safari, you can't depend on getting TCO. If all browsers implemented it, you could, and you get the performance cliff.
Maybe the performance gains are worth it, but the current situation doesn't really refute the claim that it could be a problem.
> Stack frames go away the second you release control back to the event loop which is by far the more pernicious problem.
Sadly the opposite is true today: If you're doing async/await programming in Chrome (and Firefox too, I think?) the runtime actually tries to carry your stack across event loop turns and this will be visible in Error.stack. This happens even with the debugger closed in my experience (the massive stacks are really annoying)
Having had to debug issues in unfamiliar async-using code without the benefit of that feature (Now who the hell originally called this function?), I'd tend to agree.
I agree, syntactic tail calls seem like a great compromise if automatic tail calls are too risky. I'd love to have them in the language.
Here are some strawman syntax examples from the Syntactic Tail Calls proposal.
Return Continue
function factorial(n, acc = 1) {
  if (n === 1) {
    return acc;
  }
  return continue factorial(n - 1, acc * n);
}
let factorial = (n, acc = 1) => continue
  n == 1 ? acc
         : factorial(n - 1, acc * n);

// or, if continue is an expression form:
let factorial = (n, acc = 1) =>
  n == 1 ? acc
         : continue factorial(n - 1, acc * n);
Function sigil
// # sigil, though it's already 'claimed' by private state.
#function() { /* all calls in tail position are tail calls */ }

// Note that it's hard to decide how to readably sigil arrow functions.
// This is probably most readable.
() #=> expr

// This is probably most in line with the non-arrow sigil.
#() => expr

// rec sigil similar to async functions
rec function() { /* likewise */ }
rec () => expr
!-return
function () { !return expr }

// It's a little tricky to do arrow functions in this method.
// Obviously, we cannot push the ! into the expression, and even
// function level sigils are pretty ugly.
// Since ! already has a strong meaning, it's hard to read this as
// a tail recursive function, rather than an expression.
!() => expr

// We could do like we did for # above, but it also reads strangely:
() !=> expr
Reads like a sad state of affairs, but the article itself doesn't really explain what the actual concerns were that caused the proposals to be put on ice.
From reading, I mostly get "PTC was un-implemented and put on ice because some browser vendors had issues with it; the alternative proposal, STC, was put on ice because other browser vendors had different issues with it. Then everyone (from the browser vendor side) kind of lost interest."
But what were the actual issues that blocked the two proposals?
Edit: Ah, I'm sorry. The issues with PTC are indeed described, but STC was brought forward specifically to address those reasons. So why wasn't STC implemented then?
> It makes it more difficult to understand during debugging how execution arrived at a certain point since the stack contains discontinuities
That's a weird complaint, considering that stacks don't describe "how execution arrived at a certain point". In fact, stacks don't describe the past at all; rather, they describe the future of what's left to do (AKA the "continuation").
For example, consider this code:
function foo() {
  const bar = someComplexFunction();
  performSomeEffect();
  baz(bar);
}
If an error occurs somewhere inside `baz`, the stack trace won't mention anything about `someComplexFunction`, or `performSomeEffect`, or the vast majority of "how we arrived at" the call to `baz`. Yet it will tell us exactly what was remaining to do (namely, `baz` and `foo`).
If we eliminate tail calls, stack traces are still an exact description of the continuation. The difference is that the "remaining work" doesn't include a bunch of useless identity functions (i.e. redundant stack frames with no further work to do).
For execution, a stack is a continuation. For debugging, we pretend like it's a historical record, and mostly get away with it. Various things break the correspondence slightly. TCO breaks it a lot more.
Debugging is important. It doesn't get enough respect. Stacks are a pretty critical component of debugging, for better or worse.
It would be great if we didn't depend on this fiction quite so much. With native code, there are definitely alternative options now, such as rr[1] and Pernosco[2] where if you want to look back in time—well, you just go back in time. For JavaScript, that's becoming more and more possible with things like Replay[3]. Perhaps before long, the debugging argument will just go away.
The hardware stack has always been a crutch that in retrospect was probably a bad idea. We use it for jobs it's not well-suited for (like parameter passing, local variables, and debugging) and it has held back better flow control mechanisms like delimited and first-class continuations. And of course TCO, which wouldn't even be a thing if everybody didn't automatically assume a stack pointer was involved with every call. (Hard to imagine? Yes, but plenty of other flow control models exist.)
Stacks are still useful for low-level jobs like register spilling and interrupt handlers, and they make memory management of such data easy. Nevertheless, on modern machines with multicore processors running message-passing programs, the limitations of what can be done in high-level code with a one-dimensional stack pointer should now be obvious.
Stack frames also capture locals, and those often provide a lot of information about what just happened. I've had cases before where this was instrumental to figuring out the cause of the bug, and other cases where it likely would have been if TCO hadn't wiped out that information (in C++).
If you're writing imperative code with side-effects (or mixed-style) like much classic JS code is, the existence of foo on the call stack indicates that performSomeEffect has been run, and thus its side-effects on our global state have to be accounted for when we enter baz.
Is it an ideal style to write code in? No.
Does real code have this problem? Yes!
- STC is part of the spec and will take too long to change.
- Now that they've implemented support for PTC, they don't want to regress web pages that rely on it.
- They don't want to discourage vendors from implementing PTC by agreeing to STC.
- They don't want to introduce confusion.
Some of these arguments about confusion and delays seem wrong in hindsight, since on every point things would have been better if they'd just agreed to the compromise of STC.
- It would have been part of the spec years ago
- STC would have had a clear way for web pages to know when tail calls could be relied on (and PTC would have been optional)
- Other vendors didn't implement PTC in any case, despite no agreement on STC
One advantage of syntactic tail calls is that you can give an error if you are unable to transform to a tail call.
Otherwise, you could have a program that seems to work fine, and then you refactor and now your recursion isn't a tail call anymore, and your stack blows up.
I thought the @tailcall annotation in OCaml was cool. It's not essential to use it to receive the optimisation, but rather it's a way to tell the compiler "I need this call to be optimised, so let me know if you can't do it".
The call needs to be at the end of the calling function so the calling function's stack frame can be thrown away; anywhere else, the frame is still in use. Getting rid of the calling stack frame is what proper tail calls are about.
Worth reading only if "because ignorance" is useful knowledge to you. In any case, you'd be better off starting at http://funcall.blogspot.com/2009/04/you-knew-id-say-somethin..., which tries to clear up some of the misconceptions Guido was laboring under at the time.
> So in my mind, the applicability of reduce() is pretty much limited to associative operators, and in all other cases it's better to write out the accumulation loop explicitly.
Grandparent is right. What you pointed to is in functools, a "functional" utility package in the stdlib, which has even more exotic stuff.
But Python doesn't have a reduce function as a primitive, and Guido and co discourage such uses. So much so that Python 2 did have one as a builtin, and it was explicitly removed.
Outside pure functional languages, I think the best way to implement them is to have explicit tail call declarations for functions and/or tail positions, and the ability to disable them for debugging (or have the debugger aware of them). This way you can actually use them to implement useful stuff and rely on them. You get a compiler error if a tail call can't be implemented.
- TCO only addresses recursion that can easily be replaced by a loop
- losing stack frames makes debugging harder
- it's not just an optimization, as soon as code depends on it to not blow the stack it's a required feature for all implementations
- functional languages with no side effects need recursion, everyone else really doesn't
Personally, I think TCO is bad in a similar way as async functions. It is an exception from the general mental model of how function calls work, it makes tooling much more complex, and the alternative is just writing a loop, which I know is beneath the average Lambda The Ultimate subscriber, but it's just how some people earn their money.
> TCO only addresses recursion that can easily be replaced by a loop
This is simply false. Specialized tail recursion optimization, which I've seen a few places, approximately does that, but generally TCO covers a lot of things that aren't easily replaceable with for loops.
> it's not just an optimization, as soon as code depends on it to not blow the stack it's a required feature for all implementations
True, though that's an argument for it, not against it.
> losing stack frames makes debugging harder
This is a weird argument to include with the for-loop thing, since the stack frames "lost" would never exist in the for-loop form for the case where there is a straightforward equivalence. In general, I find that this is mildly true (it sometimes makes spotting the source of an error from the dump of an unhandled exception harder, but doesn't really make any debugging that involves more than that harder).
> functional languages with no side effects need recursion, everyone else really doesn't
Regardless of whether the language has side effects available, functional code without side effects is easier to analyze and assure important properties of, and it's beneficial to be able to leverage that even if you have a language that allows side effects.
As far as I understand, TCO (or PTC?) gives you a "structured goto", i.e. something that would be a pain to write as a loop, that won't mess up the state/scope, but still achieves that "loop speed" without stressing the stack.
Being able to guarantee tail calls is a useful optimization in far more cases than code replaceable by loops. I’ve used it myself where dispatch targets themselves are dynamic (so can’t be trivially made into a loop) for significant performance gains.
Some other folks who’ve done the same thing (and can post publicly) for code that’s not just “must recurse because loops are for lowly imperative serfs”:
Interesting use case, didn't occur to me that tail calls can also just be a performance optimisation technique to help out the compiler and branch predictor. I assumed hot loops could be implemented just as well using GOTOs, but maybe not?
Everything can be implemented using IF statements and GOTOs. That's how early processors worked, and Turing completeness and all. We don't _really_ need function calls, while loops, or for loops either.
In the particular linked case of protobuf parsing, a loop with goto's doesn't produce very well optimized code because of specific internal details about how modern C compilers do optimizations. You could certainly imagine a compiler that can fully optimize a go-to heavy program, in which case the code cleanliness argument would be the only reason.
Tail calls are basically GOTOs, yeah. They bring a HUGE benefit of very clearly defining state flow between gotos, which makes the compiler's job super easy.
> TCO only addresses recursion that can easily be replaced by a loop
This is fundamentally false.
TCO addresses recursion that is implemented through the arbitrarily complex composition of functions, enabling one to define arbitrarily complex loop constructs through function composition.
There are many such interesting compositions that cannot be “unrolled” into a single high-level imperative for loop without essentially having to rewrite the code of all the functions being composed, including code that controls looping in interesting ways (e.g. automatically terminating on error, collecting all errors, parallelizing execution, etc.)
> It is an exception from the general mental model of how function calls work
It’s not, though — unless you have an incorrect mental model of how function calls work.
When calling a function, the return address is first pushed on the stack (or on some architectures, stored in a link register).
The target function returns from execution by popping the return address from the stack (or reading it from a link register), and jumping to that address.
When calling a function, if the call is in a tail position, the calling function can provide its original return address, and then jump to the target function it's calling.
That’s how functions actually work. That’s the mental model.
> makes tooling much more complex
What significant complexity does TCO add to tooling, exactly?
> the alternative is just writing a loop
That’s simply not true. See first paragraph above.
> There are many such interesting compositions that cannot be “unrolled”.
You're arguing semantics; sure, you cannot mechanically transform code which relies on TCO into a loop. That is not the same as the parent's point that TCO functions are isomorphic to loops. In a language without TCO you wouldn't ever find yourself in a mess of composition of functions.
Another quote:
> One common misunderstanding about TCO is due to the word 'optimization'. It is indeed a space optimization (don't use more space than goto, as Guy said in the 1970s) but a language should implement TCO in support of PROPER DESIGN. To wit, go through the OO design pattern books and inspect all the little patterns. Pick some -- say interpreter or composite -- and design your Java program accordingly. Then run a stress test and weep. Java blows up even if all the method calls are tail-recursive because it doesn't support TCO. Now do the same in PLT Scheme's class system and smile. It works -- for all inputs.
Replacing a stack of function calls with a loop may be the same in terms of what happens, but it's like inlining code in general. Sometimes packaging it up makes it cleaner and more reusable and testable, relative to inlining it (or sticking it in a for loop).
Yeah, that could work in principle. However, if it's a language that's using recursion for looping, then you'll lose that history every time you have a loop with more than n iterations (which could be quite often). Given that recursion can be indirect, you can't entirely eliminate that problem just by special-casing direct recursion. It might still be better than nothing, though, I agree.
I would opt for balance, there is a reason why some languages compile in debug and release mode, because of the tradeoffs.
Having code that is optimal in performance often implies a tradeoff in debuggability. If debugging helpfulness is the major design decision of a programming language, that designer is trading off performance.
Tail calls fundamentally aren't (just) about performance, they're about language capability. Tail calls allow you to recurse indefinitely (because required stack space remains constant), which is not possible without tail calls (the stack grows indefinitely). For example, tail calls make it okay to recurse on variable-length user input, which would be ill-advised in languages not supporting tail calls.
And not just direct recursion as is often considered. Mutual recursion is also handled neatly by this permitting you to write very clear state machines via functions and mutual recursion, if the tail calls get optimized. Very handy for parsing and similar tasks.
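For instance, a minimal sketch (assuming PTC; the word-counting machine is made up) where two states hand control back and forth purely through tail calls:

// Each state is a function and every transition is a tail call, so
// with PTC the machine runs in constant stack space regardless of
// input length.
function inSpace(chars, i, count) {
  if (i === chars.length) return count;
  if (chars[i] === ' ') return inSpace(chars, i + 1, count);
  return inWord(chars, i + 1, count + 1); // entering a new word
}
function inWord(chars, i, count) {
  if (i === chars.length) return count;
  if (chars[i] === ' ') return inSpace(chars, i + 1, count);
  return inWord(chars, i + 1, count);
}
const wordCount = (s) => inSpace([...s], 0, 0);
console.log(wordCount('count these words')); // 3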
So the whole topic is not terribly hard to understand. When you see
return f(x', y', z')
in some function g, then g's stack frame just describes a forwarding proxy, “let me take the value returned by f and hand it to whoever called me.” And like with all forwarding proxies you can just delete the middleman and it works fine. You would do this because it gives you an alternate, debatably simpler, model for looping. In other loops you either have to return out midloop, or have to explicitly marshal your inputs and outputs of each step into mutable variables, see. So here is the same loop written two ways, the second is probably less familiar to you:
function fib(n) {
  let curr = 0, last = 1;
  for (let i = 0; i < n; i++) {
    [curr, last] = [curr + last, curr];
  }
  return curr;
}
function fib(n, i = 0, curr = 0, last = 1) {
  if (n == i) return curr;
  return fib(n, i + 1, curr + last, curr);
}
These are only different styles for the same thing if you can trust that the call stack does not overflow in the second, which it doesn't have to because it returns a call to a function. So the problem is, if you have too many forwarding proxies in a chain, the language gives up on you.
Guido gives four reasons, you are quoting the first. The counterpoint there is, tail calls are just rewrites for looping constructs, as Guido admits in point 3. Should we ban loops as not “maximally helpful for debugging” because not every iteration appears on error stack frames? Perish the thought!
So at the end of the day that one just turns out to be, I am a lazy developer and don't want to figure out how to track this looping info in a way that makes sense outside of the call stack, the call stack exists and works, let's just keep it. And like, that's respectable!
The other 3 reasons are better? Reason 2 is correct, Python has multiple implementations and they'd all have to play, cf. the OP where JS implementations didn't. Reason 3 is correct but unimaginative, there's no reason you can't write a loop in this style and use Python's data structures, for that matter you can write in this style and not use any data structures, like the example above! Because the technical objection isn't really there, again, this boils down to just, Guido wants to read Python code and he finds recursion hard to read, and wants the language to make it deliberately slow so that he never has to read it. That one is valid, but it sounds almost borderline unethical? And I will admit that reason 4 fooled me at first! This appears to be a damning problem but in fact it's just smoke and mirrors, right? “I might not know who the forwarding proxy is forwarding in advance.” Yes that's true but we can agree that the forwarding proxy is unnecessary no matter what it is forwarding. Sure, your language sucks at referential transparency, but if you are conflating these two things you are confusing the issue, no?
Okay, so I started out this comment wanting to defend Guido and here I am at the end disagreeing with him...
We should switch on time travel debugging when it’s worthwhile, and be aware of its actual costs, rather than always paying for stack frames because they might serve as an incomplete history (missing loop iterations and calls that already returned).
I use recursion in JS when implementing generic trees/dags. I'm not worried at all about growing the stack because I use the language for UI stuff, where the depth of the trees is quite shallow and the data is small overall.
I don't really know what the utility of TCO/proper tail calls would be. You already have UX constraints that nudge to avoid having a ton of stuff on the screen.
As an example of where recursion of generic trees could be applied in a UI: Look at HN threads. Even exceptionally large threads have what, a couple hundred responses? With depth of maybe a dozen? Also you typically have affordances to navigate such a tree and only see the parts of it that you want. So it becomes even more trivially small.
I ran out of JS stack, walking the dependency graph of a big computation. Dependency graphs can be very deep relative to their overall size. It was a big pain to rewrite with an explicit stack of to-be-visited nodes.
TCO wouldn't have saved me, though. It just needs a big stack. Node defaults to just under 1 MB, which doesn't go very far.
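The rewrite ends up looking something like this sketch (the node shape and .deps field are made up), trading the call stack for a heap-allocated one:

// Depth-first walk with an explicit to-visit stack instead of
// recursion; the array lives on the heap, so the ~1 MB call-stack
// limit no longer applies.
function visitAll(root, visit) {
  const stack = [root];
  const seen = new Set();
  while (stack.length > 0) {
    const node = stack.pop();
    if (seen.has(node)) continue;
    seen.add(node);
    visit(node);
    for (const dep of node.deps) stack.push(dep);
  }
}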
Yes, but even there it is typically used for stuff that leans towards front-end. People don't typically write databases and messaging systems in Nodejs.
I wonder about specific use cases where stack allocating recursion actually becomes an issue in the JS world.
For some examples: People have 3D game engines running in JavaScript. There's a lot of cryptographic work in JavaScript, including but not limited to cryptocurrencies - a lot of groundbreaking stuff from a technological perspective. Full blown emulators, developer tools, virtual machines... the world of JavaScript is way larger than CRUD applications.
> I don't really know what the utility of TCO/proper tail calls would be. You already have UX constraints that nudge to avoid having a ton of stuff on the screen.
The only thing you can think of using recursion for is walking the DOM?
Not specifically walking the DOM, but doing DOM/UI related stuff with JS is where I sometimes use recursion. This might also be data processing, but that data is often small and shallow enough for stack growth not to matter, because it is typically related to the UI in _some_ way.
I don't typically use JS for heavy computation of large datasets that lend themselves to recursion. The only thing I can think of that is somewhat large is data visualizations and drawing graphs interactively, but there you typically have a matrix or just a flat array.
Yes absolutely. After learning about problem solving through recursive algorithms (SICP/HTDP) I was quite sad to find that JavaScript/TypeScript didn't have this.
The few real world discussions I've had about the topic of tail calls revealed that people mostly don't know about them and are thus more "afraid of the unknown" rather than against the thing itself.
> Despite its inclusion in the 2015 language specification, PTC is currently only supported by Safari
Kind of related: I kept hearing about how backward Safari is, but is it really bad?
To me, even as a web dev myself, the constant stream of new things pushed into modern browsers seems unsustainable. Saying no sometimes doesn't seem clearly bad, especially considering the fact that all browser vendors have some sort of agenda. Safari and Firefox are the only neutral ones, but I'm not sure about the latter anymore.
> Unfortunately, the TC39 process at this time did not require heavy implementation involvement and so while many implementers were skeptical, the feature was included and standardized as part of ES6.
I've seen a lot of chatter about tail calls, but I don't think I've ever seen actual examples of what they look like. Does anyone use them or is it just something for language nerds to obsess about?
It's a compiler optimization, not something an end user should really ever use. As an end user, it should just look like a function whose final statement is the call to another function (often itself, recursively).
One may be familiar with "stack overflow" which is when a series of function calls (often recursive) go so deep before actually returning that it exhausts the maximum stack space (essentially an array of scopes containing variables/state for each function). A proper tail call will realize that it can reuse the same existing stack frame instead of adding a new one, which essentially makes exhausting the stack impossible and removes the need for the CPU to do all of that bookkeeping.
In performance sensitive workflows it can make a large difference.
It's broader than compiler optimization. You have to implement algorithms in such a way that the recursion is in tail position. Sometimes you may also have to specify that the function is tail-recursive (I believe that's the case in Scala?)
Any time a function ends by calling another function and doesn't do anything else besides perhaps return the value, you're using a tail call. It's just the name for a function call that's in tail position.
> Any time a function ends by calling another function and doesn't do anything else besides perhaps return the value, you're using a tail call. It's just the name for a function call that's in tail position
It sounds like you already know this, but many people, including myself at one time, think of tail call optimization as a trick for not blowing up the stack when writing recursive functions. However, it's much more general. Tail call optimization doesn't have to be recursive, it can be applied any time the final statement or expression of a function is another function call. It's something like a special form of inlining. And as you say, it's actually possible to apply the optimization in limited cases where the tail called function returns a value![2]
That general optimization is very useful for implementing threaded[1] or continuation passing style VMs since it compiles function calls and all their associated baggage down to a jump and maybe some assignments.
For me the mental leap was that when I call a function, what I'm saying is "do this, then come back to me so I can finish." What if I don't care if the function comes back, because I've already done all the work I need to do? What if I'm really just handing off my results to the function for it to finish the job itself? Then I should give the function the address of whoever called me, and leave. The function I'm calling is replacing me, not assisting me. If a call stack is a series of waypoints that have to be revisited in reverse after a goal is achieved, with a tail call I'm saying "don't bother coming back to me."
I might leave my house, go to an ATM, then go to the grocery store to do my shopping. TCO means that I don't have to stop by the ATM again on the way home.
So my mental model actually doesn't have anything to do with recursion.
One way to look at it is they let you do "low level functional programming", where you can write code that jumps around without using stack space.
If you think about all the standard statements in a language - if, while, async, etc - all of these kinds of constructs can be built out of normal functions once you have tail calls. This makes the language more extensible and reduces the need to introduce new kinds of built-in language constructs and allows more experimentation.
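For example, here's a minimal sketch (assuming PTC) of a user-land while loop built from nothing but a tail-calling function:

// cond and body are plain functions; the self-call is in tail
// position, so with PTC this iterates in constant stack space,
// just like the built-in while statement.
function whileLoop(cond, body, state) {
  if (!cond(state)) return state;
  return whileLoop(cond, body, body(state)); // tail call
}

const result = whileLoop(
  s => s.i < 5,
  s => ({ i: s.i + 1, sum: s.sum + s.i }),
  { i: 0, sum: 0 }
);
console.log(result); // { i: 5, sum: 10 }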
They're really useful for input processing/parsing, recursion, implementing algorithms, etc.
Another way to think about it is that functions can become little 'named blocks' with defined inputs and outputs, and you can combine them together without worrying so much about the runtime costs of functions that take up stack space. So if you have a function with lots of if conditions, while loops, etc, you can convert it into these 'named blocks' instead, which makes maintenance easier.
There are lots of examples. In a way, the fact that none of the main imperative languages support this simple runtime feature is what's stopping people from realising how great tail calls are. There's a bit of a Catch-22 happening.
From my understanding, tail calls are used more in functional languages where mutations are not welcomed or not allowed. This construction basically gives a way to have an unlimited loop without adding more stuff to the call stack. The downside is that you lose the call stack, and it may be harder to debug.
I've ended up writing a number of things in an equivalent iterative way due to this... which in retrospect feels like a positive thing because I find it far clearer.
There are some forms of control flow which are difficult or impossible to represent in an iterative manner. The big example is VMs, where tail calls or goto provide noticeable performance improvements over a large switch statement in a loop. Compilers have trouble optimizing such a large function, just as people have more issues maintaining one.
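As a minimal sketch of that style (the opcodes and handlers are made up), each handler ends in a tail call to the next dispatch, replacing the loop-plus-switch:

// With PTC every dispatch below is effectively a computed jump: no
// giant switch, no stack growth, and each handler stays a small
// function that a JIT can optimize in isolation.
const handlers = {
  inc: (code, pc, acc) => dispatch(code, pc + 1, acc + 1),
  dbl: (code, pc, acc) => dispatch(code, pc + 1, acc * 2),
  end: (code, pc, acc) => acc,
};
function dispatch(code, pc, acc) {
  return handlers[code[pc]](code, pc, acc); // tail call into handler
}
console.log(dispatch(['inc', 'inc', 'dbl', 'end'], 0, 0)); // 4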
For what it's worth, syntactic tail calls seem to be the way to go when adding this to imperative languages, as it gives more control over stack usage. The WebAssembly VM has a proposal for a 'return_call' instruction, and Rust has a reserved 'become' keyword.
After thinking about this for a bit, the decision to avoid including this functionality is probably for the best.
Even though I would love for this feature to exist, I can see people unintentionally shooting themselves in the foot and not understanding why.
Often when you run up against call stack limitations, you actually need to reconsider the algorithm being used.
Trampolines can be used to bypass the call stack limitation. As an advanced technique, the majority of people having issues with a call stack problem will reconsider their solution before thinking about jumping on the trampoline.
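For reference, a trampoline looks something like this minimal sketch (names made up):

// Instead of tail-calling directly, the function returns a thunk;
// the driver loop keeps invoking thunks until a non-function value
// comes back, so the stack depth stays constant.
function trampoline(fn) {
  return (...args) => {
    let result = fn(...args);
    while (typeof result === 'function') result = result();
    return result;
  };
}

const sumTo = trampoline(function step(n, acc = 0) {
  return n === 0 ? acc : () => step(n - 1, acc + n);
});
console.log(sumTo(1e6)); // 500000500000, no stack overflow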
If the algorithm can be PTC optimized, then it is and everything works as efficiently as possible.
If not, then it blows the stack either way.
Finally, a trampoline is objectively worse. The programmer has to have an even bigger understanding of tail calls. Trampolines involving complex patterns are MUCH more difficult to follow. The trampoline is implemented in JS rather than C++. The trampoline will require additional function overhead that cannot really be eliminated. The trampoline isn't anywhere near as optimizable by the JIT either.
Trampolines are all downsides in comparison with proper tail calls.
You are coming from the side of someone who already has a well-planned algorithm that doesn't get stuck in infinite looping or end up diving too deep.
My concern was for those without a well planned algorithm, who don't see that it can get stuck in a loop or dives too deep too quickly. In these situations blowing your stack is a good indication you have a problem.
This is just my bias of dealing with programmers who don't do well with recursion or love to introduce function call hell.