I don't like the proposed syntax though, because it introduces a new meaning for the & operator. In all other uses, the & operator modifies the type of the operand, to be the opposite of the * operator.
In the proposed syntax, & becomes something a little bit different: a C++-style reference. I think it would introduce too much language confusion.
Defer() statements would be executed at scope-exit, in last-in first-out order. Standard block scoping would apply for anything declared inside of the defer().
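To make the ordering concrete, here is a minimal sketch using the lambda-style defer spelling seen elsewhere in this thread (the exact syntax in the final proposal may differ):

#include <stdio.h>

void example(void)
{
    defer [&]{ puts("registered first, runs last"); }
    defer [&]{ puts("registered last, runs first"); }
    puts("body");
    /* at scope exit the defers run in LIFO order:
       body
       registered last, runs first
       registered first, runs last */
}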
But I don't think there's any need to differentiate between different captures. Capture by value is enough, because what do you do in C when you need a reference? You capture the value of a pointer.
That's an interesting point, and very C-like. You're right that capture-by-value can emulate capture-by-reference.
Capture-by-value does have a weakness though, in that it requires allocation (whether stack or heap). Capture-by-reference doesn't have any storage overhead.
In my example, I suggested having both. But if we had to pick only one, capture-by-value would be the one to use.
> Defer() statements would be executed at scope-exit, in last-in first-out order. Standard block scoping would apply for anything declared inside of the defer().
But this still doesn't address the problem this change is supposed to solve: should defer free a variable's value as it is when the defer is written, or the value it holds at the end of the scope?
And is it really such a huge change? I don't find it so far removed that it becomes confusing, and I appreciate the additional control over what should be freed and where. You can imagine a scenario where a variable is being reused multiple times, where this kind of additional control might be useful.
That's the capture-by-reference vs capture-by-value distinction in a nutshell, yes.
Capture-by-value takes a snapshot of the value instantly, and stores it until the block is executed.
Capture-by-reference waits to evaluate the variable until the block is executed.
Between the two, capture-by-reference is the least surprising behavior IMO. It acts like the deferred code is copy/pasted onto the end of the function.
Capture-by-value is the most flexible, and can also emulate capture-by-reference by using pointers. Any syntax should either support both, or only capture-by-value.
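A rough sketch of that emulation, using a C++-style value-capture list as seen elsewhere in this thread (hypothetical syntax):

#include <stdio.h>

void example(void)
{
    int result = 0;
    int *p = &result;                  /* capture the pointer's value...          */
    defer [p]{ printf("%d\n", *p); }   /* ...by value, yet dereferencing it later */
    result = 42;                       /* still sees the final contents: prints 42 */
}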
I disagree. Every programming language is suffering from bloat and complexity now. We need one language, at least, to stand up against the madness and remain simple and stupid, even to the point of being dangerous - the old man yelling at the clouds of programming languages. That language could have been javascript, before all the nonsense with classes and templates and integers, but now maybe it should be C. Like, C doesn't even know what a string is, and I respect that. What even is a "string?" Sounds like unnecessary complexity to me. Raw pointers and bytes should be enough for anyone.
An individual or handful of people make a proposal, the Working Group looks at the proposal, maybe it gets revised (this is version 2). Working Groups themselves do not, on the whole, produce documents like this.
There is no ISO magic ensuring working groups are all-knowing. There's an excellent chance that no members of WG14 have deep, in-depth knowledge of GCC, or that the members who do weren't particularly interested in that part of this paper (e.g. they were already staunchly for or against; remember, ISO Working Groups are democratic and a proposal does not need consensus, so a working group member who thinks your idea is inherently good or bad might just skim the introduction and move on).
I like the idea of automatic resource cleanup in C and was excited when I read the headline, but I agree with others that the proposed syntax doesn't seem very C-like.
I previously shared a "with" statement idea on HN that I thought fit with the rest of the language.
with (FILE *fp; fp = fopen(path, "wb"); fclose(fp)) {
    ...
} else {
    perror("Failed to open file");
}
I'm glad you suggested that, because one of the first comparisons I drew in my article[1] about the "with"-syntax was how it looks like a for-loop.
If you are interested, I did provide a few macros[2], but as another commenter pointed out, they won't handle an early return. You can use "break" to exit early though:
That's a little bit like saying, it's impossible to write functioning code because you could make a mistake. But an abstraction that can be misused might still be an abstraction worth using.
Here is a macro that can be used to emulate a defer
#define CONCAT_(a, b) a ## b
#define CONCAT(a, b) CONCAT_(a, b)
/* a unique identifier per source line, so nested SCOPEs don't collide */
#define UNIQUENAME() CONCAT(i_, __LINE__)
/* run init_stmt on entry, the block exactly once, exit_stmt on exit */
#define SCOPE_(counter, init_stmt, exit_stmt) \
    for (int counter = ((init_stmt), 1); counter--; (exit_stmt))
#define SCOPE(init_stmt, exit_stmt) SCOPE_(UNIQUENAME(), (init_stmt), (exit_stmt))
Granted it's a hack, but it can be useful at times. I've used something like it to define a large data hierarchy in code for example, as having to close all the nodes manually is tedious.
You could wrap a second for-loop around the definition of the macro to at least be able to catch misplaced "break" statements.
SCOPE(Resource *ptr = acquire_resource(),
release_resource(ptr))
{
// do stuff with resource ptr.
}
Actually, to allow variable declarations in the init_stmt like above, you'll need to use two nested for-loops:
/* requires <assert.h>; the outer loop exists only to catch a stray break:
   breaking out of the inner loop leaves name == 0 and trips the assert */
#define SCOPE_(name, begin_stmt, end_stmt) \
    for (int name = 0; !name; assert(name && "should never break from a SCOPE")) \
        for (begin_stmt; !name; name++, (end_stmt))
It is natural to add another layer of usage-specific macros on top, like this:
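The specific macros from the original comment didn't survive here, so the following is a hypothetical reconstruction (WITH_FILE is illustrative; the FILE * is declared by the caller so the example works with either version of SCOPE_ shown above; assumes <stdio.h>):

#define WITH_FILE(f, path, mode) \
    SCOPE((f) = fopen((path), (mode)), ((f) ? fclose(f) : 0))

FILE *fp;
WITH_FILE(fp, "log.txt", "w")
{
    if (fp)
        fprintf(fp, "hello\n");
}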
I'm interested. Defer is one of the handful of achievable goals that make Zig interesting to me, and I hold some hope for it. It's easy to call C from Zig or Zig from C. And Zig can also compile C and target cross-platform easily.
const sprite_sheet_surface = c.SDL_LoadBMP("res/tile.bmp") orelse {
    c.SDL_Log("Unable to create BMP surface from file: %s", c.SDL_GetError());
    return error.SDLInitializationFailed;
};
defer c.SDL_FreeSurface(sprite_sheet_surface);
I don't mind a sane defer in C, that is, something which follows scope in the same way stack-allocated destructors are invoked. This follows the behaviour in other languages like Swift.
Why anyone would want "on function exit" style defers (already known to be a bad idea from Go) is beyond me. Such a solution more or less requires storage for a potentially unbounded number of allocations.
Foo *f;
for (int i = 0; i < very_big_value; i++) {
    f = get_foo(i);
    // BOOM
    defer return_foo(f);
}
I would sort of understand if this was proposed for a high level language garbage collection where actual memory usage is secondary.
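For comparison, with a block-scoped defer (hypothetical syntax, in the spirit of Zig's) each Foo would be returned at the end of its own iteration, so nothing accumulates:

for (int i = 0; i < very_big_value; i++) {
    Foo *f = get_foo(i);
    defer [f]{ return_foo(f); }   /* runs when this iteration's block exits */
    /* ... use f ... */
}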
> Why "on function exit" style defers - already known to be a bad idea from Go - is beyond me
Is there something you can point me to about this? I write Go professionally and from a readability and utility standpoint I really like it in common scenarios. I hadn't heard it's a known bad idea and am just curious. Thanks.
I think the parent means that there are languages with scope-based cleanup (e.g. in rust/c++ a value will be cleaned up at the end of the scope that contained it, so one can even create a separate block inside a function), which is a better choice than forcing people to do cleanup at the end of the function.
Note that Rust isn't dropping things "at the end of the scope" but at the end of their lifetime; it's just that if you declare local variables, their lifetime ends when they fall out of scope, so the two often (but not always, which is why it's worth caring about lifetimes rather than scopes) coincide for values in those variables.
Making things more confusing, Rust is inferring scopes you never explicitly wrote, for example Rust brings a new scope into existence whenever you declare a variable with a let statement:
let good = something.iter().filter(is_good).count(); // good is a usize
let good = format!("{} of them were good", good); // a String
This is fairly idiomatic Rust, whereas it would sound alarm bells in a lot of languages because their shadowing is dangerous (if you hate shadowing you can tell Rust's linter to forbid this, but may find some other people's Rust hard to read so I suggest trying to see if you can live with it instead).
Obviously that first variable named "good" is gone by the time there's a new variable named good, and so that usize was dropped (but, dropping a usize doesn't do anything interesting, beyond making life harder for a debugger on optimised code since this "variable" may never really exist in the machine code). On the other hand the String in that second variable named "good" has a potentially long lifetime, if it gets out of this local variable before the variable's scope ends.
Because Rust is tracking ownership, it will know whether the String is still in good when that scope ends (so the String gets dropped), or whether it was moved somewhere else (e.g. a big HashMap that lives long after this stack frame). Because it tracks borrowing, it will also notice if in the former case (where the String is to be dropped) there are outstanding references to that String alive anywhere. That's prohibited, the lifetime of the String ends here, so those references are erroneous, your program has an error which will be explained with perhaps a suggestion for how to fix it.
NieDzejkob is correct: In Rust, shadowing a variable has no effect on when the destructor of the previous value runs. Thus there's no problem with retaining a reference to the previous value:
let a_string = String::from("foo");
let retained_reference = &a_string;
let a_string = String::from("bar");
dbg!(retained_reference);
dbg!(a_string);
Similarly, "non-lexical lifetimes" have no effect on when a destructor runs. The compiler will infer short lifetimes for values that don't need to be destructed (don't implement Drop), but adding a Drop implementation to a type will force every instance's lifetime to extend to end of scope. (Though as in C++, temporaries are still destroyed at the end of the statement that created them, if they're not bound to a local variable.)
The only exception to this rule that I'm aware of is what you mentioned about move semantics: Moving a value means that its destructor will never run. That's the big difference from C++. Everything else to do with destructors is very similar, as far as I know.
To my mind, move semantics being "the only exception" is a pretty bad joke. Unlike C++, Rust's assignment semantics are moves. So you're not opting in to anything here as with C++ moves; this is just how everything works.
For example, if you were to make the second a_string mutable, and then on the next line re-assign it to yet a third string containing "quux", the "bar" string gets dropped immediately, as a consequence of move semantics again.
In C++ you'd have to go write a bunch of code to arrange that, although I believe the standard library did that work for you on the standard strings - but in Rust that's just how the language works, you assigned a new value to a_string so the previous value gets dropped.
> In C++ you'd have to go write a bunch of code to arrange that
I don't think it's quite that bad. If you define a new struct or class that follows the "Rule of Zero (or 3 or 5)", the copy-assignment and move-assignment operators will have reasonable defaults. For example, the following Rust and C++ programs make the same two allocations and two frees.
Rust:
struct Foo {
    m: String,
}
fn main() {
    let mut x = Foo {
        m: "abcdefghijklmnopqrstuvwxyz".into(),
    };
    x = Foo {
        m: "ABCDEFGHIJKLMNOPQRSTUVWXYZ".into(),
    };
}
C++:
#include <string>

struct Foo {
    std::string m;
};
int main() {
    auto x = Foo{"abcdefghijklmnopqrstuvwxyz"};
    // Foo's default move-assignment operator is invoked on the temporary.
    x = Foo{"ABCDEFGHIJKLMNOPQRSTUVWXYZ"};
}
The high-level "you assigned a new value so the previous value gets dropped" behavior is indeed what's happening, and it's automatic in most cases. But when we do violate the Rule of Zero and override the default constructors/operators, things get quite complicated, and it's easy to make mistakes. (Also in general we often get more copies than we intended, when we're not dealing with temporaries.)
The "moves are implicit and destructive, and everything is movable" behavior in Rust is substantially simpler and often more efficient, and personally I strongly prefer it. But I'll admit that trying to contend with destructive moves without the borrow checker would probably be painful.
If you do this with C++ destructors, they will sure enough fire at the end of the scope. Even if your String is long gone, the destructor fires anyway, destroying... a hollowed out String left behind to satisfy the destructor.
But go ahead and try it in Rust, your print doesn't happen because nothing was actually dropped. The String was moved, and so there isn't anything to drop.
In this case you wrap the loop in an anonymous function if you need it to be cleaned up within the scope of the loop.
Or move the functionality to another function.
Adding a statement that would require a dynamic allocation for every iteration of a loop is kind of insane for C.
It doesn't matter what the defer does, if it's allocated in a for statement and doesn't go off until the surrounding function exits, then it's going to have to stuff tons of function pointers and parameters somewhere, presumably alloca'd onto the stack over and over.
That's not C. That shouldn't be C.
There are a dozen languages for doing clever dynamic magical things in code. C is still fairly straightforward. Use one of the other languages instead of bogging down C with complexity until it turns into another difficult C++ variant over time.
I list a few defer-style approaches for C that I know of, including the previous version of this one, in the “Related work” section of my block-based defer for C (presented on HN last August https://news.ycombinator.com/item?id=28166125):
Why is this feature entangled with lambdas? Why is defer not a block which is sugar for a goto? Probably the answer is "we want capturing" but this only exists in Go because of its defer scoping weirdness. It also doesn't make the feature foolproof, as in what if I call fclose() outside the defer lambda? Also lambdas and capturing aren't free? What allocates space for them? What is going on with this feature?
Because there are multiple defer proposals and this one is made to be compatible with the lambda proposals. N2589 for example proposes a simpler defer statement that just defers an expression, so e.g. "defer fclose(f);"
A goto may jump over a variable's initialization, so the cleanup code after the label may, in theory, use a variable that was never initialized.
Also, goto-based cleanup becomes complex if you have, say, a dozen variables to clean up.
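A small sketch of the first hazard (names are illustrative):

#include <stdlib.h>

int f(int fail_early)
{
    if (fail_early)
        goto out;               /* jumps over buf's initialization */

    char *buf = malloc(16);
    /* ... use buf ... */

out:
    free(buf);                  /* buf is indeterminate if the goto was taken */
    return 0;
}

This is why the usual goto-cleanup idiom declares and NULL-initializes everything at the top of the function.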
Anyway, I think this is really syntactic sugar for the C++ destructor. I presume that the lambda will create an object which hooks itself into the same mechanism that calls C++ destructors at the end of blocks.
And, as all major C compilers are also C++ compilers, this seems a good idea to me. And of course the lambda syntax looks the same as C++'s. So that's where all this comes from.
The new defer proposal has the potential to make code easier to read, remove some uses of goto's and help to fix some resource leaks that are so common in old non-GC languages. Certainly a feature I'm rooting for.
In all of those examples that talked about which pointer value to capture, it would perhaps have helped to use the built-in (standard, although in my opinion way too few programmers use it) way of indicating that a value won't change, i.e. const:
double * const q = malloc(something);
/* use whatever magic new syntax here, 'q' will not change in this scope */
That wasn't so hard, in my opinion. They could simply let 'defer' fail if the referenced pointer isn't constant. Also, as a micro-wtf, I was confused by the use of 'double'; for a generic allocation I would have expected void.
So then you can't for example use defer with a pointer that may get realloc'd in a loop? Or pointer to the root node of tree-to-be-built? What a half-assed defer, in my opinion.
No problem. Another point I just thought of is that it is very common to have functions that allocate resources and only free them if an error happens; otherwise said resources are returned to the caller or otherwise retained, and not freed.
I don't know how well the proposed defer would work here but as long as you can look at actual variables (and not constants), you could do something like this:
int error = 0;
void *resource = malloc(N);
defer [&]{ if (error) free(resource); }

if ((error = do_stuff(resource)) != 0)
    return NULL; // something went wrong, resource freed

if ((error = do_more_stuff()) != 0)
    return NULL; // something went wrong, resource freed

return resource; // all ok, let the caller keep it
Some say they don't like it because it's implicit control flow, i.e. "I don't like that my code is being put at the end of the function without it being written at the end of the function." I mean OK, but that's what for loops do, right? The i++ is effectively put at the end of the equivalent "while" loop body, together with the exit condition.
I think the more important problems are: 1) why be function scoped and not block scoped? 2) why should it be so tied up with lambdas?
1) I don't see why they chose function scope over block scope, so please enlighten me.
2) The proposal said there were debates about whether defer free(ptr) refers to the ptr as it was where the defer first appears, or to the current ptr when the defer block is being executed. As someone mentioned, gotos already work in the latter way. Same goes for i++ in a for loop: I can do whatever I want with i inside the for body, and the i++ or the exit condition will use the latest value of i, not the value at the start of the block.
> Some say they don't like it because it's implicit control flow, i.e. "I don't like that my code is being put at the end of the function without it being written at the end of the function."
Wait, what? The whole point of defer is to let you do that. People like defer because it's different control flow. If that's why X doesn't like defer, X shouldn't use it. No big deal.
> I don't see why they chose function scope over block scope, so please enlighten me
I don't like it either, but it allows for a simpler syntax; consider e.g. "f = fopen(...); if (f) { defer fclose(f); }". You'd need special syntax for the defer to escape the block.
Although this example is a bit forced, because usually you'd return from the function on error anyways.
* There is no generic way to say whether a defer should free up a variable based on the value it has when the defer was encountered, or a potentially-different value the variable might have when the defer is actually run.
* So the best thing to do seemed to be to give the programmer the ability to make the choice.
* Lambdas are the means they chose to make that happen.
But this all seems to stem from a desire to enable a use case beyond what an obvious use of a goto idiom would support.
I don't like the nonlocality of this. When I see a closing curly brace, I want to be able to tell what execution will do when it gets there just by looking immediately before the corresponding opening curly brace, but now I'm going to have to look in the entire body of the block too. If I wanted things to be this implicit and nonlocal, then I'd have chosen C++ instead of C.
Except, when you start writing code with defers half your cleanup flags disappear, you end up with fewer branches and it makes it much harder to forget to free temporary allocations. It's really nice to look at only 1 place in code where something was allocated to ensure it also gets deallocated, instead of looking at the curly brace and then thinking: ok, what should be deallocated here, and when should it not be?
"Whereas gcc’s cleanup attribute is attached to functions, POSIX’ cleanup functions and the try/finally are attached to possibly nested blocks.
This indicates that existing mechanism in compilers may have difficulties with a block model. So we only require it to be implemented for function bodies and make it implementation-defined if it is also offered for internal blocks.
Nevertheless, we think that it is important that the semantics for the feature are clearly specified for the case of blocks, such that all implementations that do offer such a feature follow the same semantics. Therefore it is also a constraint violation to ask for the feature for blocks on implementations that don’t have support for it."
To call this incompetent would be too kind. Completely wrong about the scope of gcc's cleanup attribute, wrong to omit the prior art from Dlang, utterly wrong about what it "indicates", wrong about making it implementation-defined ... and that's just this one section.
If the purpose of defer is to replace the `goto single_exit`, I think the value captures are unnecessary. Any single_exit I've implemented uses the value at the time of function exit, so that I can realloc in the middle of the function.
It would be a shame if this defer was limited to function scope. It would be very useful in nested blocks as well. But, I would still appreciate it.
defer and auto are the only things I would love to see in C.
The semantics of block-scope defer are much simpler, easier to understand, and faster. The compiler always statically knows which defers execute at any given point, which aids optimization. With function-scope defer, the semantics are extremely dynamic, requiring bookkeeping of a runtime stack of defer thunks, and compilers have a hard time optimizing it.
Function-scope defer is something that surprises everyone I explain it to. Programmers naturally expect defer to be block-scoped.
>With function-scope defer, the semantics are extremely dynamic, requiring bookkeeping of a runtime stack of defer thunks, and compilers have a hard time optimizing it.
As far as I know, most functions have only one defer, and Go optimizes such cases quite trivially, by inlining calls to the deferred function at every function exit at compile time, without managing an additional stack. If there are several defers, then yes, the slow path is used.
..Except that in C that'll turn into stack allocations and will easily segfault due to a stack overflow. So you can't actually use it like that in practice (unless you can guarantee your loop will run a small number of times). You probably won't even notice while writing the code, and will just get random segfaults wherever you have a defer in a loop once you hit a large enough iteration count. Never mind it being very inefficient even when it doesn't crash.
Well, we are earnestly analyzing a silly little example program, but okay :). The equivalent C code wouldn't produce stack-allocated mutexes unless that's what the programmer wanted. E.g. the POSIX pthread functions don't care where your mutexes are allocated, since they are always passed by reference.
It's not the mutexes that'd be stack-allocated, but the list of things to call back to at the end of the function. The locks list could be modified or freed by the end of the function, but something still must hold the list of things to deferred-unlock.
The pthread_cleanup_push/pthread_cleanup_pop thing presumably keeps its own heap-allocated vector, backed by malloc or something. C itself can't willy nilly heap-allocate, so that list will be on the stack. But the stack is tiny compared to how long loops can be. Hence stack overflow.
C libraries -- including the runtime implementations of features like this -- can heap allocate just fine. They just need to return pointers on the stack to the heap allocated values, either directly or indirectly (e.g. buried in a struct return value). As is always the case with C, the burden to free the memory is on the caller: no problem.
Given a possible implementation, this loopy mutex example could have a tiny stack footprint: a single pointer to the shared cleanup() function; and a single pointer to the head of a (heap allocated) linked list of pointers to mutex (i.e., the function arguments). And the function pointer would not necessarily require allocation at all, as we can statically point at the function definition here. So we are down to a single word of stack allocation.
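For what it's worth, a sketch of the kind of bookkeeping being described here (all names hypothetical; allocation failure handling omitted):

#include <stdlib.h>

struct defer_node {
    struct defer_node *next;
    void *arg;                        /* e.g. a pthread_mutex_t *          */
};

struct defer_list {
    void (*cleanup)(void *);          /* single shared cleanup function    */
    struct defer_node *head;          /* head of heap-allocated LIFO list  */
};

static void defer_push(struct defer_list *dl, void *arg)
{
    struct defer_node *n = malloc(sizeof *n);   /* failure not handled here */
    n->arg = arg;
    n->next = dl->head;
    dl->head = n;
}

static void defer_run_all(struct defer_list *dl)
{
    while (dl->head) {
        struct defer_node *n = dl->head;
        dl->head = n->next;
        dl->cleanup(n->arg);          /* LIFO: last registered runs first */
        free(n);
    }
}

Only the defer_list itself has to live on the stack; the nodes are on the heap.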
Who's going to construct the linked list, and where does it live? That's what the parent comment is pointing out.
In the general case I see no alternative to either the compiler generating one alloca() per defer or heap allocating defer callbacks. Both are terrible solutions for C, because alloca can overflow, while heap allocations can fail with an error code and defer has no way to catch that error. Besides, C programmers just won't use the feature if it requires allocation out of performance concerns. Block-scoped defer is the only reasonable semantics.
Same question in return: who's going to alloca() or heap-allocate the defer callbacks? How is that substantively different from maintaining a linked list? As soon as compiler support is on the table -- i.e. we're not limited to using some too-clever cpp macrology and a support library -- then virtually any implementation is possible. There's obviously more than one way to do it.
> C programmers just won't use the feature if it requires allocation out of performance concerns.
I agree that many C programmers wouldn't touch the feature for performance reasons. But let's not pretend that every C program is a video driver, a AAA game or a web engine. Many, many large C programs would benefit immensely from `defer` semantics -- otherwise, why would the GCC feature exist -- and they are performance-tolerant enough that a little heap allocation would be a reasonable tradeoff for increased safety.
But I'm not really defending `defer` in the first place...
> Block-scoped defer is the only reasonable semantics.
I agree with you completely. :) I was never defending function-scoped `defer`, but answering claims about the necessity of stack allocation. There are possible defer implementations that wouldn't blow the stack: that's my only point.
It's healthy for culture-conscious programmers to reflect, now and then, on just how small a segment they represent. Cultured programming is fine, but it's like opera: the ordinary programmer recognizes a few of the tunes, but they don't sing along. It's hard to appreciate just how much uncivilized business code is out there, when nobody is getting HN likes for keeping that 1980's ERP running.
> Besides, C programmers just won't use the feature if it requires allocation out of performance concerns.
Nevermind embedded platforms where heap might be unavailable or just so scarce that its use beyond early initialization is strictly verboten. Or interrupt handlers where you simply can't call an allocator.. Block scoped defer could still be useful on such systems (e.g. with locks).
Of course. And if you were writing your embedded system in C++, you'd avoid EH and other non-zero-cost features. That doesn't mean that these features aren't useful in other contexts. I submit that the world of C and C++ programming is vastly larger and more diverse than the world of interrupt handlers and embedded systems.
...But as I pointed out in a sibling comment, I was never defending function-scoped defer. I agree with you. I was pointing out that the implementation of such a feature wouldn't require excessive stack allocation.
defer, as proposed here, isn't a library feature though, it's a part of the core language. I'd like to be able to use it, but if it can ever do a malloc (which is horrifically slow compared to not having one), it's just infeasible.
But I looked through the spec again, and it actually just says that defer outside the top level is implementation-defined, so this is irrelevant anyway.
I admit that I didn't read the actual article. :) I was using "library" in a broad sense, just to mean the runtime code that you didn't have to write yourself.
C doesn't exactly have "runtime code that you didn't have to write yourself" though. There's libc, but you can easily disable it, and having any form of defer be unavailable then is just bad. Everything else comes from your code, the headers it includes (which don't contain function definitions), and statically linked things.
Sure. And just like libc, you could easily disable (by not using) a theoretical libc_defer that provided defer semantics. Kind of like libm, you use it when you need it.
This has been a fun conversation, and I really enjoyed chatting with you about this. :) Take care.
> To acquire locks in a loop and release them at the end of a function.
That looks like a super big gotcha. Haven't there been enough bugs with alloca() being called in a loop to show how risky and unintuitive this is? Block-scoped is explicit, and explicit is good.
That code is odd (though I assume idiomatic Go). I'd much rather have block scoping with something like this:
for _, lock := range locks {
    lock.Lock()
    // Do something with lock
}
And I don't really do Go, but in Rust, if I wanted to lock some arbitrary list of locks for a whole function, it would just be something like this at the top:
let _guards = locks.iter().map(|l| l.lock().unwrap()).collect::<Vec<_>>(); // must be a named binding; `let _ = ...` would drop the guards immediately
I can see one use for that when using lock ordering to prevent deadlocks (http://tutorials.jenkov.com/java-concurrency/deadlock-preven...), but then the locks to be taken are a fixed set, and one would probably have functions take_locks(lock *) and release_locks(lock *), and one could do defer release_locks.
“In most systems, this information is unavailable, making it impossible to implement the Banker's algorithm. Also, it is unrealistic to assume that the number of processes is static since in most systems the number of processes varies dynamically”
I see this code and all I can think is that you’re going to be spending the rest of your life debugging threadsafety issues. I can contrive a case where this is useful, but where in the real world?
What in your view makes defer a bad feature of Go? Maybe my bar is low, but each time I jump back to C I wish I had defer and end up abusing __attribute__((cleanup)) instead.
The C you will get to learn is also planned to have lambdas, and they are being based on the C++ design rather than Apple's blocks, so the example assumes they will also be in C2X.
It already exists in other languages, and it exists because a function can have multiple different exit points. Even if Go had different error handling there might well be different exit points. Hence defer isn't a Go-specific feature.
RAII is fine. If it's a language feature, APIs can provide that to you so you don't have to write RAII wrappers all the time.
Otherwise a generic _bracket_ operator could be introduced, similar to Python's `with` statement. In C it might be a bit hard to parameterize, but even if it just requires you to use a void *, it's still an improvement over not having anything at all. (Talking about language features, so `__attribute__((__cleanup__))` doesn't count.)
>Also why is RAII bad? It's an awesome feature in C++
I didn't mean it's bad (I used to be a C++ developer myself and enjoyed RAII a lot), just wondering what are the alternatives that the OP doesn't consider "hacks". RAII would require to introduce constructors/destructors in the language, with all the gotchas (and probably you'll want a full-fledged OOP after that), which is apparently against Go's design principles as a simple language.
>Auto-Closable interface with syntactic support.
I don't see much difference here in practice; the whole difference is that in C#, for example, you use "using" on a whole object, while in Go it's a "defer" on a specific method of the object (or a standalone function). You are not limited to a single method and can use it on any method you deem necessary.
Auto-closeable/RAII, however, is less flexible in ad hoc situations specific to a certain function (you have to define dummy classes just to make sure a function is called no matter what), whereas Go allows you to use "defer" on a lambda. Auto-closeable also ties control flow to an object, which makes sense in an OOP-focused language, but Go isn't one.
I agree with most of your points, but I wanted to also point out that you don't need C++'s ctor/dtor spaghetti to have useful RAII: Rust achieves it without even having first-class constructors (and opt-in custom destructors via Drop).
Defer CAN help with error handling, but you need other language features for that. You need well defined error types or exceptions in some way or another.
Zig has errdefer which only runs if an error occurred below the statement within that scope. It allows you to always keep cleanup locally, but you can still handle errors for your business logic somewhere else.
errdefer is cool, but I don't think C should support it. Mainly because you'd need to spec errors at a language level, and doing that today is probably impossible.
I usually use the single-exit-point idiom with gotos to handle cleanup. I now wonder if there is a compiler that helps you enforce that so you don’t randomly stick a return in the middle of your function.
Anyway, this would be great to have in C as it would simplify how resources are handled.
You can use the non-standard, but typically implemented, cleanup attribute today for function-scoped cleanup. I don’t know if compilers inline this kind of “destructor”. It works well — I’ve seen it used successfully at multiple employers.
Not only inline it, but also optimize it away (e.g. with _cleanup_free char *ptr = NULL; the cleanup path just disappears on paths where ptr is never assigned a non-NULL value).
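For reference, the macro pattern being referred to looks roughly like this (systemd-style; names approximate):

#include <stdlib.h>

static inline void freep(void *p)
{
    free(*(void **)p);               /* free(NULL) is a no-op */
}
#define _cleanup_free_ __attribute__((cleanup(freep)))

void work(int flag)
{
    _cleanup_free_ char *ptr = NULL; /* nothing to clean up yet */
    if (!flag)
        return;                      /* compiler can often elide the free(NULL) here */
    ptr = malloc(64);
    /* ... ptr is freed automatically on every exit path ... */
}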
Yes! I've been waiting for C to get a defer statement. However, my opinion of this proposal hinges on whether the implementation of lambdas requires runtime support. Not all systems running C have access to the C standard library, much less a runtime to manage lambdas (depending upon how they are implemented). Ideally defer would entirely be handled at compile time.
It's annoying to see everyone adopting defer (Jai, Go, Zig) and not destructors. Destructors are strictly more powerful. Defer only lets you delay something based on scope, while destructors let you delay based on object lifetime, which might be tied to scope (for an object allocated on the stack) but could also be on the heap. With destructors you can have a smart pointer type where you can't screw up the incrementing and decrementing, with defer you cannot do this.
Destructors bring about complications with lifetimes and move vs copy semantics among other things. Simple scope-based defers are trivially understandable (if you don't abuse text replace macros...) and are mostly just syntactic sugar allowing for better, more maintainable code in almost all codebases. IMO seeing such an overcomplicated version of defer even being considered for C is embarrassing to say the least.
This is actually a feature, not a bug. With defer the control flow is more obvious and predictable. That’s one of the reasons new languages are adopting it.
Certainly if the only thing I want to do is to make sure a piece of code runs when I leave the block or function, defer is the most direct way to do it. I don't care for "strictly more powerful" if it requires something as roundabout as defining a (zero-sized) object with a destructor and allocating an instance of it.
I want "call this function later", not "allocate (and later deallocate and do stuff with) this object." Using objects with destructors is just a roundabout way to achieve the former and fails to express the intent directly. Even though it might have zero runtime cost, it's not free of conceptual baggage.
defer is nearly always 'I want to clean up this resource acquisition'. When it's not you should probably be using something else anyway. IMO destructors on objects representing lifetimes specify intent even better than defer does.
Only if you shoehorn some kind of object model into things that aren't inherently so. For example, though you can "acquire" locks, it's really just flipping a bit of state that already exists somewhere, more than it is granting you a new something. I prefer to call it just what it is, a state change; you don't "acquire" a lock, you just lock it. Performed by a plain old function call that knows how to manipulate said state. Don't force me to accept the conceptual baggage of objects for what is a pair of function calls that flip a bit. That would be too opinionated for a language that isn't all about shoving OOP down my throat.
defer for unlock() after a lock() makes a lot of sense.
> For example, though you can "acquire" locks, it's really just flipping a bit of state that already exists somewhere more than it is granting you a new something.
That's true of literally every resource acquisition, including malloc and fopen. It's all bits in memory.
And IMO, the defer model is forcing conceptual baggage because of its strict definition of when it fires. You can always simply return the object and keep the resource acquired in your calling function, pass the acquired resource to another thread, etc. Defer enforces much more of a code structure in order to guarantee cleanup.
> That's true of literally every resource acquisition, including malloc and fopen. It's all bits in memory.
Hard disagree. Taking an action that causes a finite resource (fds, memory, ports, etc.) to be allocated and used up is fundamentally different from just flipping a bit in a resource that already exists. The lock doesn't come into being when you "acquire" it, the lock already existed. Nothing was allocated when you acquired it. There's no new resource. There's no "oops I don't have enough {RAM/fd/whatever} to lock the lock for you" if the lock exists already.
> And IMO, the defer model is forcing a conceptual baggage because of it's strict definition of when it fires.
Is automatic storage and block scopes conceptual baggage too then? I think that is only true if you reject the execution model that implies a stack. Even if C isn't a perfect model of what happens under the hood, it's a reasonably thin veneer over how your machine executes the code. Forcing an object model is strictly adding new concepts that do not exist at the level of the bare CPU (or the abstract machine as defined in the C spec, for that matter). It's also not required for programming. In that respect it is superfluous and if you don't want or need it, I think it's fair to call it conceptual baggage. Again, I don't need objects, and my CPU doesn't need objects, we just need a function call (or just a bare expression) to manipulate some state. There's no need to add to that.
> Taking an action that causes a finite resource (fds, memory, ports, etc.) to be allocated and used up is fundamentally different from just flipping a bit in a resource that already exists.
The lock itself is the finite resource. Only one consumer can lock it at a time. And there are very much systems where semaphores are kernel objects that can run out (although they tend to be older systems that thankfully we've moved away from for the most part).
> Is automatic storage and block scopes conceptual baggage too then? I think that is only true if you reject the execution model that implies a stack.
RAII very much uses the stack model of functions and blocks at its core too; defer doesn't have a monopoly on that.
It's true it's less typing, but it's not too hard to engineer a macro on top of destructors to get defer behavior (C++ at least will let you define a type inside a function, so the macro can expand to a trivial type with a destructor containing the user's code, then make a dummy instance on the stack). It is however impossible to do it in the other direction. Also, no reason we can't have both.
I don't like it. The great advantage of C is the very simple virtual machine and therefore very easy to reason about code. This bolting on language features is the hallmark of C++, which don't get me wrong has advantages, but in case we want the advantages we can already use C++.
It's slightly less explicit, but at least it's still all explicitly contained within your function. For example, you don't have to open up the definitions for a dozen different classes and read their destructors to know what happens (if anything) when your function returns. (That's the kind of stuff that can make C++ nigh impossible to reason about.)
It seems barely more implicit than expression-3 in a for loop.
So its better to make the control flow explicitly complicated with a bunch of labels and resource-flags and doing so inconsistently (because everyone has a slightly different view on how to implement it), than having the compiler do it in a consistent way?
Pragmatic point of view:
In 99% of cases, the developer doesn't care what the compiler produces and doesn't have to. A mechanism providing a complex control flow in the compiler output but is incredibly easy to read and reason about in the source code is useful in the vast majority of cases.
And for the 1% of cases where the programmer actually needs to know what the compiler produces, and / or full control over the control flow, the solution is simple: don't use `defer` and there, done, full control to the developer.
You don't need to code with "a bunch of labels and resource flags", though. There are most often good ways to clean up in a reasonable way. Even the simplistic malloc()/free() works with straight-line code: just initialize to NULL and you can call free() without even checking that the resource was acquired. It's also often valid to not clean up at all, because the OS cleans up after process exit.
And besides, if resources aren't released in the same function you'd need to come up with a different way to clean up anyway.
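A small sketch of that straight-line pattern (function names and sizes are made up):

#include <stdlib.h>

int do_work(void)
{
    int rc = -1;
    char *a = NULL, *b = NULL;

    a = malloc(64);
    b = a ? malloc(64) : NULL;
    if (b) {
        /* ... real work with a and b ... */
        rc = 0;
    }

    /* unconditional, straight-line cleanup: free(NULL) is a defined no-op */
    free(b);
    free(a);
    return rc;
}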
> There are most often good ways to clean up in a reasonable way.
Yes, and if a defer keyword were to be introduced in C, all these reasonable ways could still be used by anyone who wants to, while we could let the compiler handle it when we don't want to, making the source easier to read and reason about.
The comment was about the generated code, not about arcane OS hacks that were introduced to add asynchronous notifications to otherwise blocking OS APIs.
Ugh please don't do this. Just standardize the existing __attribute__((cleanup)) mechanism which is already implemented by compilers and widely used in many free software projects.
__attribute__((cleanup)) is ugly because for some reason, the gcc folks decided that the parameter passed to cleanup needed to be a function pointer of signature void(void**) instead of the much more sane void(void*), with no way to allow implicit casting of the parameter to a function pointer of different type, so none of the basic cleanup functions (e.g. free, fclose) work with it out of the box - you need to pollute your codebase with one wrapper for each cleanup function that you want to use.
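For example, instead of passing fclose or free directly, you end up writing per-type shims like these (names are just illustrative):

#include <stdio.h>
#include <stdlib.h>

static void cleanup_fclose(FILE **fp)
{
    if (*fp)
        fclose(*fp);
}

static void cleanup_free(char **p)
{
    free(*p);
}

void use_them(void)
{
    __attribute__((cleanup(cleanup_fclose))) FILE *f   = fopen("data.bin", "rb");
    __attribute__((cleanup(cleanup_free)))   char *buf = malloc(128);
    /* ... both cleaned up automatically when they go out of scope ... */
}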
They should have made it so that cleanup took an expression block:
Sure - I agree! But, it exists and it's widely used already. The C working group should concentrate on standardizing existing practice and not start striking out on unproven and weird new syntaxes and keywords.
The "good" way to go about that would be to have a second "cleanup_value" attribute.
(Or, since the standard would be creating new names, have "cleanup" and "cleanup_ptr" instead. Assuming that this is a separate attribute namespace, which I believe it is[?])
FWIW, you do sometimes need the address of the variable. Particularly if it's some struct that is made a member of some container (e.g. linked list) temporarily - you need the original variable's address to unlink it.
What are you going to do about it? I suppose you could delete all the C software from your system. Or make a list of all the C programs that your system depends on, and persuade each of their maintainers to change languages.
Or you could just put up with C, like everybody else does, and be quietly thankful that the myriad layers of our modern, complex systems are being maintained by other people.
“this feature cannot be easily lifted into C23 as a standard attribute, because the cleanup feature clearly changes the semantics of a program and can thus not be ignored.”
That's a very weak objection. In any case since most programs are hiding __attribute__((cleanup)) in a macro (eg. glib's g_autoptr macro or systemd's family of _cleanup_* macros) you could use another kind of annotation.
The point here is that the C working group should be standardizing existing practice and helping the existing users of C, and not striking out making bizarre new syntax choices and keywords which are completely unproven.
> This indicates that existing mechanism in compilers may have difficulties with a block model. So we only require it to be implemented for function bodies and make it implementation-defined if it is also offered for internal blocks.
Can someone explain to me how a feature used for freeing memory, closing file descriptors, releasing locks and all other cleanup scenarios is now possibly entering undefined behavior territory? I get that's not the intent but... Ugh...
Implementation defined is quite different from undefined...
But I agree, making this bit implementation defined is not cool. I don't have strong opinions on whether defer should be in C or not. It does seem like a potentially useful feature. Whatever they do though, I think they should do it right or not do it at all. The last thing I want is to bloat the language with half-assed features that you can't use half the time because they're not defined well enough. If people want implementation defined features, geez, just keep using compiler extensions until the WG can define a proper standard.
Apart from this bit, the proposal seems pretty decent to me. I kinda like the lambda syntax and the control it gives you over the value you capture. Just make it block scoped and don't leave it implementation defined.
The Java AutoCloseable seems like a cleaner version of the MS __try/__finally:
try (MyCloseable c = useSomething()) {
// stuff
}
Why isn't this the path explored? The main difference would be not having extra nested blocks. That seems relatively minor, no? The arg list could also be improved:
Features that seem like a good idea at the time often don't stand the test of time 20-30 years in the future. In the mid-90s Object-Oriented Programming was super-hyped so a bunch of other languages bolted on OO, such as Fortran and Ada. But now we have Go/Rust/Zig rejecting brittle OO taxonomies because you always end up having a DuckBilledPlatypus that "is a" Mammal and "is a" EggLayer.
A great strength of C is that if you want more features you just go to a subset of C++, no need to add them to C. C++ is the big, ambitious, kitchen-sink language. When C++ exists we don't need to bloat C.
Fortran was originally carefully designed so that people who aren't compiler experts can generate very fast (and easily parallelized) code working with arrays the intuitive and obvious way. But later Fortran added OO and pointers making it much harder to auto-parallelize and avoid aliasing slowdown. Now that GPUs are rising it turns out that the original Fortran model of everything-is-array-or-scalar works really well for automatically offloading to the GPU. GPUs don't like method-lookup tables, nor do they like lambdas which are equivalent to stateful Objects with a single Apply method.
Scientists are moving to CUDA now, which on the GPU side deletes all these features that Fortran was bloated with. Now nVidia offers proprietary CUDA Fortran, which is much more in the spirit of original Fortran, deleting OO and pointers for code that runs on the GPU. If the ISO standards committee hadn't ruined ISO Fortran for scientific computing by bloating it with trendy features, we could all be running ISO Fortran automatically on CPUs and GPUs with identical code (or just a few pragmas) and not be locked into proprietary nVidia CUDA.
But GPUs are now mainly used for crypto greed instead of science for finding cancer cures or making more aerodynamic aircraft so maybe it all doesn't matter anyway.
Yeah. I think I'm much less informed on this topic, but my initial thought on reading the "Rationale" section was that this sort of feature would only be helpful in cases where C offered almost no advantages over C++.
> A great strength of C is that if you want more features you just go to a subset of C++, no need to add them to C. C++ is the big, ambitious, kitchen-sink language. When C++ exists we don't need to bloat C.
This is a rationalization, and a bad one. When your solution is "just pull in another programming language", you have a problem.
"Another programming language" cannot even meaningfully exist if all programming languages are forced to have the same feature set. Should Python get C-like low-level pointer manipulation so that Python users don't need to "pull in another programming language" of C to do pointer manipulation?
C doesn't need "defer" because C programmers have managed since the 1970s to implement operating systems, compilers, interpreters, editors, etc., just fine without it. Those who want a bigger C can use C++, this pond is big enough for two fish.
> all programming languages are forced to have the same feature set
Good straw man there. Did I say all languages need to be exactly the same? This comment just looks like something you can fall back on to reject any feature addition to C. It's too bad really, as it's sentiment like this that is killing the language. Many people are sick and tired of old, crusty C, where it takes close to a decade to add or change anything. I like the idea of a small, performant language, but when you put such a stranglehold on changes, you choke out most chances of innovation.
So the earliest C compilers were under 5000 lines of C+asm:
https://github.com/mortdeus/legacy-cc
If you want a minimal "standard committee approved" C89 compiler then David Hanson's lcc and Fabrice Bellard's tcc both come out to over 30,000 lines. To understand C89 fully you at a minimum have to read a ~220 page (14,248 line) copy of the (draft) ANSI standard:
http://port70.net/~nsz/c/c89/c89-draft.txt
I don't know what the smallest C23 compiler would be with all the new features since C89 added, but it's at the point where a single human can't implement a C compiler anymore. It's becoming a language only rich corporations have the wealth and power to implement and steer.
On the other hand, some features turn out to be a very good idea and do stand the test of time. Designated initializers and compound literals, introduced in C99, are perfect examples of C features that stuck and became very widespread, while keeping the spirit of the language. C shouldn't be set in stone.
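For anyone who hasn't used them, a quick illustration of both C99 features (names are illustrative):

struct point { int x, y; };

/* designated initializers: name the members you set, the rest are zero */
struct point p = { .y = 2, .x = 1 };
int lut[8] = { [0] = 1, [7] = 100 };

void draw(struct point pt);

void example(void)
{
    /* compound literal: an unnamed object constructed in place */
    draw((struct point){ .x = 3, .y = 4 });
}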
The fact that goto-based solutions and a non-standard GCC extension are common methods of resource cleanup in C today seems to suggest that a standardized language construct for resource cleanup would be appreciated.
> A great strength of C is that if you want more features you just go to a subset of C++, no need to add them to C.
What is C for then? Cleanup of function-scoped resources is a major concern in every large C codebase I've seen.
If one has trouble writing correct cleanup code conventionally (with "goto out" and a single function exit), then allowing them to use defer will only lead to more obscure issues.
And if defer is meant to make code slimmer, it still doesn't belong to C, because it leads to implicit execution and memory/stack allocation.
C is an explicit and verbose language. What you see is what you get. This is the spirit of the language. Unlike with, say, C++, where "a + b" may actually produce kilobytes of machine code, because + just happened to be overloaded.
> If one has trouble writing correct cleanup code conventionally (with "goto out" and a single function exit), then allowing them to use defer will only lead to more obscure issues.
I've written countless functions in this style and I don't enjoy it. I think it's better than the other styles of resource cleanup in C, but it's not ideal. In this style, whenever I add a resource to a function, I have to go to the top, add the declaration (with a sentinel value,) then go to the out label, check for the sentinel value and conditionally destroy it. I'd much rather add the declaration, initialization and destruction of the resource all in one place. That would make it much harder to forget the destruction, for one thing.
> And if defer is meant to make code slimmer, it still doesn't belong to C, because it leads to implicit execution and memory/stack allocation.
I don't get the implicit execution thing, and I don't see how it's like that C++ example. The only code that executes is written in the function itself, inside the defer block.
> I've written countless functions in this style and I don't enjoy it. I think it's better than the other styles of resource cleanup in C, but it's not ideal.
I've got almost a couple decades of C behind me, and I agree. The way we handle cleanup at present is not particularly difficult, but it feels irritating. I'd imagine most C programmers agree that the goto based cleanup handlers just happen to be the best we've got, and aren't necessarily ideal.
> And if defer is meant to make code slimmer, it still doesn't belong to C, because it leads to implicit execution and memory/stack allocation.
I don't see why block scoped defer should cause any more memory or stack allocation than a goto based cleanup handler. It's just a different way to organize the source code. In some instances it might actually allow you to omit some local variables (that otherwise would have to be optimized out by the compiler).
> C is an explicit and verbose language.
It's relatively explicit, I agree. However, defer doesn't change that much. You still see exactly what code runs inside your function. The only real change is that code's location. It's not that different from putting expression-3 in your loop header and having it be evaluated implicitly when you reach the end of the body or do a continue. If you wanted to be explicit, you'd ban for loops and use gotos in a while loop to replace continue. Umm, be my guest, but I prefer the less verbose approach.
And that gets me to the second point... C can be surprisingly terse despite requiring you to be rather explicit, and that's one of the things I really like about C. If anything I'd love to see features that allow it to be even more terse.
> Unlike with, say, C++ where "a + b" may actually produce kilobytes of machine code
Oh, I agree. I really don't want tons of hidden code in C. However, the deferred block is still explicitly coded inside your function and not at all hidden from you someplace else. So it's not like you need to go spelunking through a pile of headers and class definitions to discover that there are destructors running SQL queries when your function returns.
In that respect, defer remains very explicit and transparent so I'm ok with it.
That's the thing. Block-scoped is a better option as far as the language "spirit" is concerned, but it's limiting (see below). Function-scoped is more useful, but when used in loops it may lead to unbounded stack usage, and that sorta goes against the rest of C, because no other _language construct_ comes with such a lovely side effect.
Re: limiting - It's not uncommon for a function to need to grab some resource conditionally and then use it in the rest of the function code, e.g.
void foo()
{
    bar * b = NULL;
    if (x && y)
    {
        this();
        b = that();
    }
    ...
    baz(1, 2, b); // b may be null
    ...
    release(b);
}
This can't be handled with block-scope defers. This needs function-scoped ones.
A better option would (probably) be to allow binding defers to a specific on-stack variable... but that's basically a destructor, and that opens its own can of worms, not all of which are technical.
It seems a bit limiting, yes, but this does not seem like a major limitation to me. Especially if we compare it to how existing practice with goto based cleanup handlers would work in this example. It doesn't really matter that the resource was obtained in a block, the variable holding a reference is still scoped to the function body and will be checked at the end just as it would be with goto.
void foo()
{
    bar * b = NULL;
    defer [&]{ if (b) release(b); }

    if (x && y)
    {
        this();
        b = that();
    }

    if (something_gone_wrong())
    {
        return; // no problem, b gets released if it was acquired
    }
    ...
    baz(1, 2, b); // b may be null
}
If making the release conditional seems a bit hacky, remember that you need that sort of thing anyway for the hugely common case where you allocate & initialize a bunch of things and then let the caller keep the resources, except if there's an error.. in which case you need to clean everything up. Without some additional language features (first class error types or "error returns", then error defers?) these conditions are unavoidable.
Sticking the defer under the var declaration is clever, but it doesn't look like an improvement in terms of code quality to me. It trades the verbosity of the "out:" pattern for the need to register the cleanup code before the acquisition code. That's just weird. It's not complicated, just... backwards. Almost like a solution in search of a problem :)
Dunno, I feel like the goto-out pattern is substantially more irritating any time you actually want to return a value from the function. I'd like to just `return val` instead of `int ret; /* ... */ ret = val; goto out;`
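Roughly the shape I mean (parse_file and path are made up; the resource here is just a FILE):
#include <stdio.h>

int parse_file(const char *path)
{
    int ret = -1;                 /* the value you'd rather just return directly */
    FILE *f = fopen(path, "r");
    if (!f)
        goto out;

    /* ... do the actual work ... */
    ret = 0;

out:
    if (f)
        fclose(f);
    return ret;                   /* with defer, the success path could simply
                                     end with a plain `return 0;` */
}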
I hope it's not mandatory. I'd much prefer static analysis to tell me about use-before-acquire errors: I have a number of hot inner loops where initialization nukes performance.
In the context I'm thinking of, I have fairly sizable arrays -- up to 512 bytes; usually, only the first few bytes are used (for state tracking); an `int` tells me the high-water mark, so the data is only read after being written. The amount of work in the loop is (in almost all cases) only 5 or 6 instructions before termination. Initializing 512 bytes is best case 8 ops, which is >100% overhead. The code is recursive, but bounded to a depth of 8, with internal linkage (only two `int`s and a pointer are passed) in the tail call position, so the calling convention is just three registers. Even a compiler like q9x, or clang with no opts, produces excellent code. But even a sniff of initialization tanks perf to the tune of 3x.
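The pattern is roughly this (heavily simplified, names made up): a big scratch array that is deliberately left uninitialized, plus an int high-water mark so only bytes that were written ever get read.
typedef struct {
    unsigned char state[512];   /* never initialized up front */
    int used;                   /* high-water mark: bytes 0..used-1 are valid */
} scratch_t;

static void scratch_init(scratch_t *s)
{
    s->used = 0;                /* only the mark is initialized, not 512 bytes */
}

static void push_state(scratch_t *s, unsigned char v)
{
    s->state[s->used++] = v;    /* writes always precede reads... */
}

static unsigned char top_state(const scratch_t *s)
{
    return s->state[s->used - 1];   /* ...so unwritten bytes are never read */
}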
RAII in C++ does not mean that all bytes are initialized, only the ones you want. Take:
struct Foo {
    int header = 0;   // initialized: has a default member initializer
    float foo;        // left uninitialized for a local Foo
    int data[64];     // left uninitialized as well
    int footer;       // initialized to 456 by the constructor below
    // just to show that having an explicit constructor / dtor does not change anything
    Foo() : footer{456} { }
    ~Foo() { }
};
Here, `foo` and `data` are never initialized. It's not a matter of optimization: absolutely no compiler will zero-initialize them unless you explicitly ask for it (or you are in a situation where the language mandates initialization, like a global static in C and C++).
Well, I don't know about the parent, but you ruined my day.
The thing I keep coming back to is the proportion of effort you have to spend learning and keeping up with your tools vs the problem domain you're working on. Feels like a big problem with C++ is how much energy the language itself uses up.
A friend of mine is, unlike me, really really smart. He likes golang and hates C++. And it's not like he hasn't spent years professionally writing C++ code. I think the advantage of golang is that he can write it reflexively, so all of his attention is on the problem, not the language.
Me I feel like the problem with C is not so much the language. It's that I'm always worried about stepping off a ledge. Things like defer would help. Because failing to clean up resources is a big problem with C.
The silver lining is that really 99% of the time it just doesn't matter. I just explicitly initialize everything to zero/nullptr (unless I have a specific reason to want a different explicit value), and I only worry about looking up the particular rules if I need them in a situation where I know initialization can be a significant cost like a gigantic array. It is annoying to look up every time that happens but it's not that often.
But is failing to clean up transient resources that are only allocated for the duration of one function (or block) really a big problem with C? These tend to be the easiest ones to handle (but admittedly a bit annoying & verbose without defer) and to spot if missing. (Also: static analyzers are relatively good at pointing out resources leaked in a single function.)
I think most of my functions that allocate a resource only release it on error; otherwise the allocated resource lives on and gets freed later by another function. Defer doesn't help at all here. It just helps with the trivial ones.
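Typical shape (made-up names, trivialized resource type): the resource is only released on the error path; on success it outlives the function, so a scope-exit defer would free it too early.
#include <stdlib.h>

/* hypothetical resource type, just to illustrate */
typedef struct conn { int fd; } conn_t;
static conn_t *conn_new(void) { return malloc(sizeof(conn_t)); }
static int conn_handshake(conn_t *c) { c->fd = 3; return 0; }

conn_t *open_session(void)
{
    conn_t *c = conn_new();
    if (!c)
        return NULL;

    if (conn_handshake(c) != 0) {
        free(c);                /* error path: clean up */
        return NULL;
    }
    return c;                   /* success path: the caller now owns c */
}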
This link is exactly what I'm saying: if you don't ask for it, you don't get initialization. There are multiple ways to ask for it, because the person who writes the type may either want to take on the responsibility of initializing entirely, by initializing in the ctor, or delegate it to the call site, which can then either initialize (T t{};) or not (T t;).
The SO post looks confused; A, B, D, E and F are the exact same case wrt the initialization of the int. It does not matter whether there's a constructor or not, or what shape it has, only whether the variable gets initialized somewhere or not (and that can be in the ctor's member init list, in the struct definition, or even when creating a value if aggregate-initializing).
I'm sure you know what you're talking about wrt the C++ spec, but you ignored the main point. It's easy to be confused by stuff like that. It's hard to read C++ code and know (other than intuit) what it does. For a lot of people, including myself, at least.
Prototypes were in the original C standard of 1985, before the first edition of The C++ Programming Language was released and long before C++ was formally standardized.
While the first draft was released in 1983, the first ratified C standard was in 1989.
The first edition of "The C Programming Language" (from which I first learned the language) was published in 1978, over a decade before the language was standardized.
You are correct that C89 added function prototypes. However, contrary to your implication, C borrowed the concept and syntax from C++.
Everything I said was correct. I was a member of X3J11 (which was established in summer 1983), the committee for the C language standard, and was the first person on the planet to vote to approve it (due to alphabetical order) ... that was in 1985. It was several years before it was ratified, the delay having to do with standardization politics, which required a lot of buy-in. I have no idea why you're mentioning the first edition of K&R from years earlier, which of course did not have prototypes.
There's a notable difference between C's prototypes and C++'s ... `int foo();` in C is not a prototype, whereas in C++ it's equivalent to `int foo(void);` ... C had to maintain compatibility with K&R style declarations, whereas C++ didn't.
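Concretely (declarations only):
int foo();       /* C: NOT a prototype; it says nothing about the arguments,
                    so foo(1, 2, 3) still compiles without a diagnostic */
int bar(void);   /* C: a real prototype taking no arguments */

/* In C++, `int foo();` already means "takes no arguments", i.e. it is
   equivalent to the (void) form. (C23 finally adopts the C++ meaning.) */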
I feel as though they should change the name of the paper, because this must be one of the most complex takes on 'defer' that I have seen.
They make a classic mistake of trying to solve a problem by adding more complexity. They cannot decide on by-value or by-reference, so they take the convoluted C++ syntax to allow specifying one (as well as other semantics?).
It does not fit with C in my opinion. It will also hurt third party tools that wish to understand C source code, because they must inherit this complexity.
I would prefer a solution where they simply pick one (by ref / by value) and issue a compiler warning if the programmer has misunderstood. For example, pick by-value and issue a warning if a variable used inside the defer is reassigned later in the scope.
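To illustrate what that warning would catch, in plain C with no defer syntax (`captured` stands in for the snapshot a by-value defer would take):
#include <stdlib.h>

void example(void)
{
    char *p = malloc(16);
    char *captured = p;     /* the value a by-value defer would capture */

    p = malloc(32);         /* reassignment after the capture: this is where
                               the proposed warning would fire */

    free(captured);         /* scope exit: the by-value defer frees the OLD
                               allocation... */
    free(p);                /* ...and the new one must be handled separately
                               or it leaks */
}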
I think people are getting hung up on the lambda syntax, but it seems they're just taking what they're given. If C23 adds lambdas, at this point I'd say it's more likely to be C++ syntax than a new C syntax, because C++ is already out there. So instead of "to resolve ambiguity we force the user to choose", it's more like "the probable lambda syntax already makes this explicit, so we will use it too." I think that makes more sense than two different syntaxes for lambda and defer, whatever the other merits of the proposal.
Somewhat related: I hacked up some macros in gnu89 C to get a Go-styled defer once upon a time. I felt bad for making my compiler course instructor review that code... Very convenient, though.
C is done. It just is. It has had an amazing run and yes we will continue to write code in it - despite all the better alternatives it will probably never die.
I dunno, this capture stuff is alright, but it also feels a bit like blatant appeasement.
The point of this feature is to be non-canonical RAII. I.e., it is about cleaning up a location. This data-oriented approach is similar to how "locked data" is more correct and intuitive than "critical sections".
Now I get C being low level, so perhaps this is better, but I can't really imagine the thing I am cleaning up (as opposed to auxiliary info like an allocator to deallocate memory with) being captured by value.
BTW, speaking of moves, a "set this bool if I use this variable by value" feature would be very useful. Skip C++'s mistake and do Rust's model. Would work great with this feature.
> Now I get C being low level, so perhaps this is better, but I can't really imagine the thing I am cleaning up (as opposed to auxiliary info like an allocator to deallocate memory with) being captured by value.
I was wondering the same thing; what is the use-case for needing a capture-by-value option at scope exit?
Just make all captures a capture-by-reference and it works for all use-cases.
Since Microsoft has decided to sabotage C by not implementing anything that isn't in C++ already, and this will never be in C++, this feature is already dead.
Projects that target only GCC and Clang can use __attribute__((cleanup)) without waiting a decade for it.
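For anyone who hasn't used it, a minimal sketch (free_charp and demo are made-up names; the attribute itself is the real GCC/Clang extension):
#include <stdlib.h>

/* the cleanup handler receives a pointer to the annotated variable */
static void free_charp(char **p)
{
    free(*p);
}

void demo(void)
{
    /* free_charp(&buf) runs automatically when buf goes out of scope,
     * on every exit path from this block */
    __attribute__((cleanup(free_charp))) char *buf = malloc(64);
    if (!buf)
        return;

    /* ... use buf ... */
}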
> Since Microsoft has decided to sabotage C by not implementing anything that isn't in C++ already, and this will never be in C++, this feature is already dead.
Yes, it is still true; they have merely decided to support newer C standards, after they sabotaged C11 by getting a fundamental feature like complex numbers made optional:
> Support for Complex numbers is currently not planned and their absence is enforced with the proper feature test macros.
Sadly, there is no future for the C language with a corporation like them on the committee. Much better to look at Zig or Nim; they fill more or less the same space and are developed by smart and passionate people.
> A compiler that defines __STDC_IEC_559_COMPLEX__ is recommended, but not required to support imaginary numbers. POSIX recommends checking if the macro _Imaginary_I is defined to identify imaginary number support.
They have not sabotaged anything when the feature is optional to start with.
If ISO wanted everyone to actually support it, it wouldn't be optional.
I really don't like it. The lambda stuff, especially the captures that were ripped from C++, don't fit C at all. The question about whether it should be cleaned up at end-of-scope or end-of-function is too ambiguous and debated. And if I'm understanding this correctly, this is the worst part:
> This indicates that existing mechanism in compilers may have difficulties with a block model. So we only require it to be implemented for function bodies and make it implementation-defined if it is also offered for internal blocks.
I think that idea is pretty cool. I don't like the new closure pointer type, and how all functions that take function pointers need to be retrofitted to take closures as well, but I guess it's necessary. I assume closure pointers would be implemented as a double pointer, one void* to point to the captured variables in memory, and one function pointer to point to the procedure. One of the most annoying parts of C is how callback logic needs to be defined completely separately from the calling logic.
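Something like this is what I have in mind (closure_t and friends are made up, just a guess at the layout): a function pointer plus a void* environment, passed around as a pair.
#include <stdio.h>

typedef struct {
    void (*fn)(void *env, int value);   /* the procedure */
    void *env;                          /* the captured variables */
} closure_t;

static void call_closure(closure_t c, int value)
{
    c.fn(c.env, value);
}

/* example callback that "captures" a running total through env */
static void add_cb(void *env, int value)
{
    int *total = env;
    *total += value;
}

int main(void)
{
    int total = 0;
    closure_t cb = { add_cb, &total };

    call_closure(cb, 2);
    call_closure(cb, 3);
    printf("%d\n", total);   /* prints 5 */
    return 0;
}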
Since variables can be scoped to blocks, and allocation scopes can be arbitrary, needing neither a function nor a block scope to bracket them, I don't see how that will fly either.
What an arrogant use of the word "simple" in the title.
* no decision on access to variables
* lambdas are involved at all
* "appearance in blocks other than function bodies [is] implementation-defined"
They cited D in the implementation, but then used Go as the inspiration for the feature. The answer was staring them right there in the face, scope(exit) [1]. Or, you know, they could have cited Zig for the exact syntax they wanted [2].
This feature is completely unusable.
Let me demonstrate. You can't use this inside an if statement:
if (foo) {
    void *ptr = malloc(...);
    defer [&]{ free(ptr); } // implementation defined. get fucked
}
Furthermore, it's Go-style, so the defer runs at the end of the function. This is not only implementation-defined, it's full-blown undefined behavior, because `ptr` goes out of scope before the defer expression runs.
Scope-based defer works great. Go-based defer is already problematic enough in Go; in C it's worse than not having defer in the language at all.
Point taken, but I will defend my claim: the few use cases where it does work are footguns, because in the future you or another collaborator will be tempted to wrap the code into a block, which is normally 100% safe for every other feature of the language. It would be easy to do without thinking about it. But if you do, it becomes UB, as demonstrated above.
So the reasonable policy would be to use the same cleanup method everywhere, to avoid the footgun firing when code is edited.
Not the parent commenter, but namespaces alone make the switch a no-brainer. Add on top RAII and actual abstractability (good luck implementing std::string in C properly), and C is a strictly worse language.
Why do you think namespaces make it a no-brainer? I would argue the only thing I would want from C++ is operator overloading, and I'm not even convinced about that.
C should be retired and replaced with compiled C# (if the usage allows for garbage collection) or Rust (if deterministic memory management is needed). There's really no use in flogging a dead horse.
C has had its heyday and should simply curl up and die.
In 50 years everyone will have moved on to the hot new language, but your OS will still run, at least some, C code. Whether you like it or not C has enough inertia to last a really, really long time.
I think one difference between a classically trained programmer of a few decades ago and many of the programmers today who entered from JavaScript or bootcamps, or were even self-taught, is a lack of understanding of all the systems below you. For example, do you think the OP has heard of Simple Managed C?
C# is great, but it's not a systems language, it depends on piles of C/C++/etc code in order to run.
Not sure about bootcamps, but as a self-taught programmer I have respect for C/C++ even if I do not use them. And even though I use Rust/whatever, as someone self-taught I am especially humble because of all the knowledge I'm missing.
To be able to program in C# you don't need any .c file in your whole computer.
Of course you need a lot of binaries which were produced somewhere using low level languages in the process, and you probably need to comply with the C FFI to access a lot of libraries. But nothing that cannot be done with a different low level language.
You rely on an entire stack that is programmed and maintained in languages like C, and to the degree that these are provided for whatever chipset you are using, yes, you can code in C++. And of course you expect these libs to be regularly patched and updated, and released as new platforms become available, etc.
I'm not saying "don't code in higher level languages". I'm saying that not everyone can code in higher level languages. There is a whole stack that needs maintenance and development.