ISO C is increasingly moronic (varnish-cache.org)
283 points by phkamp on Dec 20, 2011 | 167 comments


Doesn't seem like the author really thought some of this through:

FTA: "The <nostdreturn.h> file according to the standard shall have exactly this content: "#define noreturn _Noreturn" Are you crying or laughing yet ?"

It's a compatibility constraint. They can't simply add new reserved words to the language because it will break preexisting code (cf. all the old C headers with identifiers like "bool" or "class" or "virtual" that don't build in C++). The underscore-followed-by-capital convention happened to have been reserved by ISO C90, so that's what they picked.

But they allow that this is ugly, so there's a new C1X header file you can include in new C1X code that makes the identifiers look like you expect they should. Obviously old code won't include this header and is insulated from the change. (I guess if you wanted to nit here, having a separate header for each new feature seems dumb to me. Why not a single <c1x.h> or the like?)
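To make the mechanics concrete, here's a minimal sketch (the function name is just an illustration):

    /* new C1X code that wants the pretty spelling: */
    #include <stdnoreturn.h>   /* per the draft, literally: #define noreturn _Noreturn */

    noreturn void fatal(const char *msg);

    /* old code that defines its own "noreturn" macro simply never
       includes this header and keeps compiling unchanged */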

Compatibility is important, especially for languages like C with staggeringly large bases of mature code. For an example of what happens when you break compatibility in the name of "progress" see Python3, which years on is still not installed by default anywhere, and probably never will be.


First of all: You're wrong, the compatibility is the other way around: The old code will have to include <stdnoreturn.h> if they used the "noreturn" compiler-specific keyword, while waiting for the glacial ISO WG progress.

Second: There are two kinds of compatibility: Forwards compatibility and backwards compatibility.

There is a finite number of existing programs whereas the number of future programs to be written is unbounded and very likely to be much higher over time.

Therefore forwards compatibility is always, by definition, more important than backwards compatibility, and you should never penalize future programs and programmers for the misdeeds and sloppiness of the past programs and programmers.

Besides: I have yet to see a compiler that didn't have flags to make it compile to a slew of older dialects of C, so if C1X happens to break your old code, you set such an option, or you fix your source code.

There, fixed it for you.

The "Backwards compatibility, no matter the cost" mentaility is costing us dearly in the quality of the tools we have work with in the future, while providing us no relevant new benefits.

Crap like <stdnoreturn.h> is just pointless ornamentation, cluttering our source code.


No, you're wrong. The old code will have something like

    typedef char bool;

And thus compiling this code in C99 would not work if bool were a keyword of the language. That is why the boolean datatype is called _Bool, so it doesn't clash with old code. In new code you can still use the bool datatype, but instead of defining it yourself as char or int, you should use <stdbool.h>, which defines bool in terms of _Bool.
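A minimal sketch of the two sides, assuming nothing beyond the standard headers (the two snippets live in separate files):

    /* old, pre-C99 file: still valid, because "bool" never became a keyword */
    typedef char bool;

    /* new file: opt in explicitly */
    #include <stdbool.h>   /* defines bool, true and false in terms of _Bool */
    bool flag = true;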


Your attitude is ridiculous... Let code break! Make sure there is a __c1x macro and maintained projects will take 30 seconds to fix. Unmaintained projects will die, yay!


I will never purchase a compiler made by you or someone who thinks like you. Sure, it will take 30 seconds to fix your puny 2000 line program that's contained in just a handful of files all of which you control and with no code generation engines or text transformation steps in its build chain. But it wouldn't be that easy for everyone.


If you don't value backwards compatibility, then perhaps you should not use a 40-year-old language with a standards committee that does value it.


I see no reason why he shouldn't express his opinion on the committee's goals re backward compatibility.

Defending the status quo by saying "well duh, you should know better, take it or leave it" is just a way of rejecting someone's contentions without actually addressing them. In a reasoned debate that's indefensible.

If he's wrong, tell him why rather than telling him to take his ball and go home.


This is from the Rationale for an International Standard for C, in the intro where they state their guiding principles from 2003 (http://www.open-std.org/jtc1/sc22/wg14/www/C99RationaleV5.10...):

Existing code is important, existing implementations are not. A large body of C code exists of considerable commercial value. Every attempt has been made to ensure that the bulk of this code will be acceptable to any implementation conforming to the Standard. The C89 Committee did not want to force most programmers to modify their C programs just to have them accepted by a conforming translator.

Note that this guiding principle was listed first. The author disagrees with one of their fundamental guiding principles. My point is that he then has a fundamental disagreement with the purpose of the standard, and for that reason, perhaps he is using the wrong tool.

If someone has the guiding principle X, and someone else has the guiding principle !X, I contend that difference is irreconcilable.


Dennis Ritchie, C's creator, along with Ken Thompson, Unix's creator, and other Bell Labs people from the Computing Science Research Center thought the C standardization committee was wrong, so the Plan 9 C compiler[1,2] is not ANSI compatible. Not only is the standard library nothing alike, the language itself is changed slightly in an incompatible way.

[1] http://cm.bell-labs.com/sys/doc/comp.html

[2] http://cm.bell-labs.com/sys/doc/compiler.html


tl;dr: if you don't value backwards compatibility, bend over backwards.


> The old code will have to include <stdnoreturn.h> if they used the "noreturn" compiler-specific keyword, while waiting for the glacial ISO WG progress.

Huh? stdnoreturn.h only makes sense with the new _Noreturn keyword. The old code can keep using __attribute__((noreturn)) or (if it was portable code) nothing at all, without including any headers.
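The usual shim looks roughly like this; a sketch, assuming __STDC_VERSION__ ends up at 201112L for the new standard, and die() is just an example:

    #if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L
    #  include <stdnoreturn.h>                     /* new standard spelling */
    #elif defined(__GNUC__)
    #  define noreturn __attribute__((__noreturn__))
    #else
    #  define noreturn                             /* lose the hint, keep compiling */
    #endif

    noreturn void die(const char *msg);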

> Crap like <stdnoreturn.h> is just pointless ornamentation, cluttering our source code.

I think C's "add seemingly randomly named include files to the top of your code to use different parts of the standard library" is pointless, and I'd like to see the whole standard library included by default (using some yet-to-be-invented more flexible PCH magic to ensure it doesn't slow down compilation). But in lieu of that, it's hardly a big jump from library to syntax - and if keeping backward compatibility isn't hard, why not do it?


I tend to agree with you about C include files, but observe that you can get more or less the same effect with a single "myproject.h" file included in your myproject-x source files.


> C's "add seemingly randomly named include files to the top of your code to use different parts of the standard library" is pointless

But don't most languages do this? Like Java's imports (only a really small subset is included by default) or Python. I would have thought that namespace separation is a good thing.


I don't use Java, but Python has a reasonably large set of builtins, and the base modules like os and sys provide a lot more - C would require a longer list of includes. Still, I would like to be able to write something like "mmap::mmap" and have Python automatically figure out what to import, keeping the namespace separation but eliminating the redundancy of explicit imports.

(maybe pick something other than double colons, heh.)


    python -m this | head -4 | tail -1
Or use Ruby and ActiveSupport::Dependencies.


Maybe the ISO committee is conspiring to destroy C with mediocrity so that a shiny new systems language can take its place, correcting C's deeper warts in the process.

That would be nice, if only I believed it.


Exactly. I've long since lost track of which shiny new systems language we're supposed to be using.


What do you think about D? ;)


D seems to be targeting C++ more than it does C.


Therefore forwards compatibility is always, by definition, more important than backwards compatibility, and you should never penalize future programs and programmers for the misdeeds and sloppiness of the past programs and programmers.

Ha. You're adorable.

While I do support the general "hey, let's fix old code and not be slaves to backwards compatibility", the fact of the matter is that legacy code / libraries / whatever in C aren't merely misdeeds/sloppiness--you might as well complain about having to use the same molecules of metal in your tools that our ancestors did building Rome instead of making your own from subatomic particles.

There is a finite number of existing programs whereas the number of future programs to be written is unbounded and very likely to be much higher over time.

But in the future, it's all Javascript/Ruby/CSS/HTML5! Why worry about writing more C code? ;)


"""While I do support the general "hey, let's fix old code and not be slaves to backwards compatibility", the fact of the matter is that legacy code / libraries / whatever in C aren't merely misdeeds/sloppiness--you might as well complain about having to use the same molecules of metal in your tools that our ancestors did building Rome instead of making your own from subatomic particles.""""

Only, you are forced to use the same molecules of metal by the laws of physics, while compilers are something you can IMPROVE. Heck, we even produce new metal alloys and materials all the time. So the analogy is 100% flawed.


I've got one involving cars if you prefer. :)


  The "Backwards compatibility, no matter the cost" 
  mentaility is costing us dearly in the quality of
  the tools we have work with in the future, while
  providing us no relevant new benefits.
That is only because you don't have a vested interest in old code bases. (Taking a rough guess here: you're under 30?)


You have absolutely no idea what, or rather: who, you are talking about: I'm the second most active committer to FreeBSD's kernel over the project's lifetime and I'm the author of Varnish. I even have C code running in ATC systems; heck, when it comes to that: my code scrambles your password.

It's exactly because I am responsible for so much old code that I say we should not cripple the future for it.


First off, I didn't realize. I'm sorry.

     I am responsible for so much old code that 
     I say we should not cripple the future for it.
I had to read that a few times. Are you advocating a clean break, so that whatever comes in the future is clearly not called C and doesn't even purport to be backwards compatible with C? That would be difficult, wouldn't it? It simply wouldn't get traction.


I'm not sure I have a coherent proposal. Clearly ISO-C is a cul-de-sac by now, and will have to be ditched. What we should replace it with is not clear to me.

It is a big problem that so many standards, requirements and tools (from EMACS through lint to Coverity) have their fingers in the C syntax. The autocrap abomination is a further complication.

So I guess a good place to start is to swear that we will never touch or use the botched ISO-C thread API, and beat some sense into the WG14, by whatever means are available.


Perhaps the only place for C to go is for a final consolidation specification, which will be frozen for 50 years. You might even remove some broken parts of previous specifications, but with 50 years of stability, it gives a lot of confidence that any work would be worthwhile.

A long period of stability may also make it easy for C to interact with other languages. I'm out of my depth here. Some standard specification for pinning memory against garbage collection; if Scheme could specify TCO, perhaps a C consolidation release could improve its toolability for things like Coverity; a better autoconf; Makefiles; standard compilation error codes; something that would clean up all the rough edges in supporting C compilers.


Do you think a new language (Go, maybe?) will take C's place, given your distaste for ISO's handling of the standard?


Dead post from ballard:


Java was an over-reaction to C.

Go was a reaction to Java back to almost center.

C is close to hardware, which is important for things like game engine cores, video drivers, Varnish, and the list of use cases goes on and on.

Finally, people tend to complain about their tools even when those tools work. Be thankful if you didn't have to write the tool yourself. But if you would feel compelled to complain, write a better one yourself first instead. :)

@phk: When was the last time you wrote inline asm in C to solve any problem on a non-embedded system? Mine was 1996.


Last time I wrote inline ASM was 2 days ago. I'm a kernel programmer, remember ?


Just curious: what was it for? A link to a public commit etc. would also be OK. Thanks. I was under the impression that all the asm code was abstracted away in rarely changing header-file macros?


I doubt any language will replace C ever, given the penetration it has, but maybe we could get a better C2X than the crap C1X seems to be. It would require a significant fraction of programmers to agree with me that C1X is crap in the first place.


I am mostly a Python programmer who still remembers C and I won't say "crap".

I will, however, say "ugly".


How true is it that one reason C is increasingly unlikely to be replaced is that CPUs these days specifically optimise the performance of C-compiler-generated code?


phk under 30? hahaha. Does FreeBSD count as an old code base?

If you direct your remarks at what's being said instead of who (you think) is saying it, you wouldn't miss so wildly.


I don't think that's entirely fair. Sometimes who someone is, is a decisive factor in what arguments they present. It can give them a valuable jolt of insight to point out that they present those arguments because they are missing part of the view. Of course you should also explain why the arguments in themselves are wrong, but the main argument may be "you've never experienced X; those that have tend to argue Y, because of Z"


Age discrimination just makes you look like a fool.


It's very easy to make those arguments just by saying "From my experience doing X, you're wrong because..." without speculating about what the other person hasn't done.


On the contrary, I find communicating in this way a rather reliable method of talking in shorthand. I was happy to be wrong, since phk illuminated another reason why some people may advocate breaking compatibility, and it was easy to get to his core arguments once his credentials were established.


I don't agree with your assertion that backwards compatibility isn't important, but I would like to point out that the very fact that C compilers allow you to use identifiers that are in the form _Word means that they are breaking backwards compatibility by introducing new keywords.

Many of the libraries on a typical Linux system (anything glib/gtk based, libxml2, etc.) use _Word extensively.


Initial underscore followed by a capital letter has been a reserved "namespace" since 1990, as a previous commenter noted. (And for the record, identifiers that begin with two underscores are reserved for use by the implementation in all contexts, while identifiers that begin with an underscore followed by a lowercase letter are reserved only at file scope, so they remain free for block-scope names and struct members.)
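Roughly, per C99 7.1.3 (the scope comments are the point):

    int _Reserved_anywhere;   /* underscore + uppercase: reserved for any use */
    int __also_reserved;      /* two leading underscores: reserved for any use */
    static int _file_scope;   /* underscore + lowercase: reserved at file scope */

    void f(void)
    {
        int _tmp;             /* underscore + lowercase in block scope: fine */
    }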


Yes, I realize this, but since it's only "reserved" in the specification and the compiler doesn't enforce this, there is a massive amount of code that does this today.

The spec could contain, "don't add any bugs into your C programs" but that doesn't magically mean that everyone would write bug free code.

Reserving a namespace and not having a mechanism to enforce it was a massive fail on the part of the standards committee.


> Reserving a namespace and not having a mechanism to enforce it was a massive fail on the part of the standards committee.

How do you expect the compiler to enforce the "identifiers that begin with two underscores are reserved for use by the implementation" rule? Some things just belong in the documentation.


If the goal is to allow the introduction of new keywords, then it's fairly simple. Don't parse anything in the form __[a-z0-9$_] as anything but a keyword.

Problem solved. Instead, they did something awkward by trying to carve out namespaces for the stdlib too but quite frequently don't adhere to their own rules.


The goal is not just to allow for new keywords. Reserved identifiers are also intended to be used in system headers to protect implementation code from the unfortunate effect of user defined symbols.

Go and read your /usr/include/stdio.h (or equivalent). It will be filled with such symbols. Obviously the compiler must lex them exactly as it would unreserved symbols.

This is an unfortunate legacy of C's decision to pass compilation information around by textual inclusion, and isn't really fixable at this point.


System headers could use a #pragma _SystemIdentifiers, much like TeX's "makeatletter" and "makeatother" that allow packages to use @ in the name of internal symbols to avoid conflicts.

I do think compilers ought to have warnings for using identifiers in the reserved namespace. Most projects probably wouldn't trip over the ^_[A-Z] reserved space; however, numerous projects would trip over the reservation of any identifier containing two adjacent underscores.


That would require changing every single existing header that currently contains reserved identifiers. Do you have any idea how much work that is? Even if you could somehow magically enumerate and alter them all - and test them, since the necessary renaming of identifiers in non-system headers might introduce a bug - legacy versions would be floating around for years.

Conceptually a warning is a nice idea, but it just isn't practical.


Not necessarily. GCC, at least, avoids showing warnings in system headers unless explicitly requested. So, normally you'd only see warnings about the use of reserved identifiers in your own code.


But there are lots of places where these identifiers aren't keywords, and it is valid code (try writing a compliant stdlib without using them).

The problem is that too many people assume that their usage of C is the same as everyone else's. I don't want to be writing a system library and fighting my compiler every time I put something in the symbol table that isn't in the C spec.


It's to allow use of any identifiers anywhere, not just keywords, without having to worry about conflicts with the including code.


Indeed. As a general rule, if you think the ISO committee has done something pointlessly moronic, there's a pretty good chance that there's more going on than you know about.


This is exactly how stdbool worked in C99 too. _Bool is the actual typename and stdbool.h includes a macro to normalize it.


"For an example of what happens when you break compatibility in the name of "progress" see Python3, which years on is still not installed by default anywhere, and probably never will be."

Python 3 is the default on some Linux distros (e.g. Arch Linux)...


Breakage of pre-existing code is really a poor excuse for such craziness. Seriously, do you prefer an error message during compilation that is easy to understand and not so difficult to fix, or a language so tricky that nobody can read the huge reference manual? I fully agree with the author. Seek and hang the culprits!


That's not what C1X is for. It's a straightforward evolution of ISO C99 which adds features without breaking anything. If you want a similar low level language which is not compatible with legacy C you don't have to look far: history is littered with their carcasses.


One of those carcasses happens to be C99.


Think about C++: not only was legacy C++ broken by each standard, it was also broken by each compiler release. The change of compiler release was always a journey. It is normal that the introduction of a new keyword breaks legacy C. A new standard is compatible with legacy code if the only adaptation required is to rename identifiers that collide with new keywords. There is no excuse for producing such crap.


The issue always confuses me. The committee puts in a lot of effort to avoid introducing new keywords or repurposing obsolete keywords. This introduces some horribly overloaded keywords and awful syntax. Of course, new keywords break old code, but what is so difficult about a grep-replace on your code base? The only reason I can come up with is the few corner cases where turning an identifier into a keyword can produce well-formed code with a different meaning, but I have a hard time imagining such a thing. Can someone shed some light on this?


Pain now, or pain later? Most choose to put off the pain.

Python 3 chose pain now - and I think they made the correct choice - but three years later, people are still upset and many, many Python libraries have not made the transition.


To be fair, though, the Python core team envisioned a five-year transition plan and we are starting to see major frameworks make the switch. Django, for example, recently announced full Python 3 compatibility in their nightly builds.


I agree that's fair, hence why I think it's the right choice even given the pain. The reason I pointed out that it's been three years is that sometimes, it's not just a choice about pain now, but a prolonged pain starting now and maybe lasting years.


C++ also goes out of its way to avoid introducing new keywords for exactly the same reasons as C.


I think C++ ultimately proves what a poor idea that really was!


I know this is orthogonal to your point, but Arch Linux ships with python3 installed by default. Of course, python 2.7 is in the repositories, but you have to say /usr/bin/python2 to get to it. That breaks all those stupid build scripts that expect python 2.x to be installed in /usr/bin/python and nowhere else.


Wait... it's stupid to write a script that uses /usr/bin/python (or really "/usr/bin/env python" per the convention in that community)? When those scripts were written, there was no "python2". It's certainly not their fault that "python" changed to be an interpreter for a different language.

This is exactly the problem. Distros (cf. Red Hat, SUSE, Ubuntu) with large installed bases simply cannot ever ship a /usr/bin/python as anything but python 2 for exactly the same reason that the kernel cannot change syscall numbers. It will break their customers' software. Arch Linux doesn't really have "customers" in this sense, so they're free to play. The serious distros aren't.

Just mark my words: python 3 will never be a default "python". We're entering year 4 of the python3 era. You really think anyone's going to jump now if they haven't already?


>Doesn't seem like the author really thought some of this through

Yeah, he says the same thing at the bottom:

>Poul-Henning

I guess it would be nice if he warned you up front, but at least he does include the warning.


The Sydney Opera House is a good example, actually.

It's an iconic building. Australia is frequently represented by a picture of the opera house in front of the Harbour Bridge, possibly with Uluru in the middle distance (it's just outside Sydney, apparently).

There's just one problem. It's not a very good opera house. The structural requirements make it impossible to have an orchestral pit for the musicians, and the shape of the building makes it very difficult to have all the usual invisible magic that makes opera work smoothly. It's cramped and oddly shaped.

The SOH is a perfect example of the triumph of style over substance. The New South Welsh who commissioned it are now facing a $1 billion bill for maintenance over the coming decade.

edit: decade, not 5 years.


Funny you should say that: I've always wondered how it worked on the inside.

I'm not claiming it is a great opera house, but it is a very good example of using modern building tools.

When Utzon died, one of our (Danish) newspapers explained that the required maintenance is now necessary because the architect's original plan to cap the "top-seams" with metal was judged too expensive, and some plastic sealant ("caulk"?) was filled into the seams instead, and that has allowed moisture to seep into the construction. I have no idea if this is true or not.


Basically Utzon designed it to look awesome. And it does. As a work of sculpture it's really lovely, and the engineering to make it stand up is difficult and impressive.

But as I said, Utzon and his offsiders obviously did not care about the actual purpose of the building, or whether their beautiful design would be at all practical. It's not. Even Frank Gehry, whose work I find to be obnoxious, has managed to design a semi-practical music hall.

It's not really architecture, in a Vitruvian sense. It's walk-through sculpture.


Your remarks are superficially plausible and fit many people's pet theories about architects. The only flaw is that they don't have any relation to the reality of the construction of the actual Sydney Opera House.

The internals of the building were not built to Utzon's design; they were redesigned and built after Utzon resigned in 1966.


> The internals of the building were not built to Utzon's design; they were redesigned and built after Utzon resigned in 1966.

My understanding is that Utzon's original design was, basically, impossible. The new design was a compromise made during intense negotiations with physical reality after Utzon had quit.

The further point is that those compromises rule out little things such as the orchestral pit or the tower. Things that are sorta kinda really really useful for orchestras.

If it was up to me I'd build the usual mausoleum somewhere else and turn the "opera house" into something else. A museum perhaps.

> Your remarks are superficially plausible and fit many people's pet theories about architects.

Availability bias. We never hear about the vast majority of architects who stick to designing safe, sensible and non-hideous buildings. We do hear about self-promoting designers of monuments to their own egos, such as Gehry. And naturally this tilts the public view of architecture.


On the other hand ...

As a kid I wanted to be an architect. Maybe I'm suffering from armchair expertise. "I use buildings, therefore I'm an architect".

But sometimes mistakes are so visible that end users can point them out. My personal bête noire is the business school building at the University of Western Australia. Just a stunning array of dumb details, but it's visually bold and jaunty. I guess that's what sold.


> Now, don't get me wrong: There are lot of ways to improve the C language that would make sense: Bitmaps, defined structure packing (think: communication protocol packets), big/little endian variables (data sharing), sensible handling of linked lists etc.

> As ugly as it is, even the printf()/scanf() format strings could be improved, by offering a sensible plugin mechanism, which the compiler can understand and use to issue warnings.

> Heck, even a simple basic object facility would be good addition, now that C++ have become this huge bloated monster language.

As a C user, I would not want to see any of these things in C.

C++ was originally named "C with Classes," and it didn't start out as a monstrosity. Adding "simple" classes to C because C++ is too complicated makes no sense. C++ got complicated because of all of the things that users end up wanting if you walk that path. There's no reason to think that "simple" classes added to C would stay simple.

Plugins for printf()/scanf()?? Not even remotely the kind of thing that needs to be in the core language libraries.

I don't think defined structure packing would really add much; if the idea is to memcpy() the raw data into your structure, then you're just making the actual data processing part more expensive because the compiler has to store the structure in a sub-optimal way (i.e. with non-native endianness). Simple expressions like x.y++ could end up being much more expensive than expected.
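For example, with the GCC/Clang packed extension (not ISO C; shown only to illustrate the cost being discussed), the field ends up unaligned and the increment can turn into byte-wise loads and stores on strict-alignment targets:

    #include <stdint.h>

    struct wire_hdr {
        uint8_t  type;
        uint32_t length;         /* at offset 1: unaligned */
    } __attribute__((packed));   /* GCC/Clang extension */

    void bump(struct wire_hdr *h)
    {
        h->length++;             /* may compile to several byte loads/stores plus an add */
    }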


For the past 10 years or so, I've been using an alternative to printf() that I lifted out of Hanson's _C Interfaces and Implementations_; in my tree, it's "fmt.c", and includes the family of fmt_* functions. They're just string processing code, and so are trivially portable to WinAPI, OS X, Linux, and FreeBSD.

This is so much of a win that I don't understand why everyone doesn't do it:

* Cross-platform counted/allocated/concatenated string semantics with no annoying nits from one platform to another.

* "Native" support for printing IP addresses (%i), which gets rid of inet_ntoa and ilk, and binary.

* A registration system for adding new format codes, which is like being able to create Object#inspect functions in C code.

All of which is a roundabout way of saying I agree that we don't need to improve printf/scanf; we need to burn them with fire.
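The registration idea looks roughly like this; a hypothetical sketch, not Hanson's actual API (fmt_register and fmt_ip are made-up names):

    #include <stdarg.h>
    #include <stdio.h>

    /* box the va_list so handlers can consume arguments through a pointer portably */
    typedef struct { va_list ap; } fmt_args;
    typedef void (*fmt_fn)(FILE *out, fmt_args *args);

    static fmt_fn fmt_table[256];

    static void fmt_register(int code, fmt_fn fn)
    {
        fmt_table[(unsigned char)code] = fn;
    }

    /* e.g. a handler for %i that prints an IPv4 address as a dotted quad */
    static void fmt_ip(FILE *out, fmt_args *args)
    {
        unsigned long a = va_arg(args->ap, unsigned long);

        fprintf(out, "%lu.%lu.%lu.%lu",
            (a >> 24) & 0xff, (a >> 16) & 0xff, (a >> 8) & 0xff, a & 0xff);
    }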

I'm going to tend to believe PHK if he thinks explicit structure packing is a win, noting at the same time how unlikely it is that any implementation of it is going to be a performance bottleneck compared with I/O.


> I'm going to tend to believe PHK if he thinks explicit structure packing is a win, noting at the same time how unlikely it is that any implementation of it is going to be a performance bottleneck compared with I/O.

I've spent a lot of my life writing parsers for various network formats, and one thing that I can say with authority is that at both my current company (Google) and at my previous company (Amazon) the CPU cost of parsing bytes off the network was noticeable enough to spend significant resources optimizing.

I haven't done benchmarks myself about eg. bitvectors, but I'm pretty sure I've heard that the performance of packed bitvectors is noticeably slower than non-packed. I also think the cost would be comparable to 64-bit math on 32-bit processors (ie. an overhead of 3-5 instructions per operation), which is a non-trivial cost.


In the sense that your parsing strategy influences your I/O strategy and, in particular, may incur extra copies, I buy this.

The idea that Amazon has cycle-optimized network parsing code, and that they did it for a significant practical benefit... I have no reason to doubt you, but I'd like to hear more.

I've done a fair bit of high performance network code (not for Amazon or Google, but, for instance, for code watching most of the Internet's tier-1 backbone networks on a flow-by-flow basis) and I'm not sure I could have won much by counting the cycles it took me to pull (say) an NBO integer out of a packet.

This stuff always makes me think about:

http://cr.yp.to/sarcasm/modest-proposal.txt


> The idea that Amazon has cycle-optimized network parsing code, and that they did it for a significant practical benefit... I have no reason to doubt you, but I'd like to hear more.

I can speak better to Google, since it's my more recent experience. Google's internal data format is Protocol Buffers (and all the code is open-sourced, as you probably know). The C++ code that is generated to parse Protocol Buffers is fast (on the order of hundreds of MB/s) as a result of a lot of optimization. This has reached a rough ceiling of what I believe is possible with this approach (generated C++). Even so, Protocol Buffer parsing code shows up in company-wide CPU profiles, and certain teams in particular have performance issues where Protocol Buffer parsing is a significant concern for them.

To address these issues, I wrote a Protocol Buffer parser that improves performance in two ways:

- it is an event-based parser (like SAX) instead of the protobuf generated classes which are a more DOM-like approach (always parsing into a tree of data structures). With my parser you bind fields to callbacks, and you can parse into any data structure (or do pure stream processing).

- I wrote a JIT compiler that can translate a Protocol Buffer schema directly into x86-64 machine code that parses that schema. Without the intermediate C++ step, I can generate better machine code than the C++ compiler does. In an apples-to-apples test, I beat the generated C++ by 10-40%. If you do more pure stream parsing the win is even greater.

My protobuf work is open source: https://github.com/haberman/upb

> I'm not sure I could have won much by counting the cycles it took me to pull (say) an NBO integer out of a packet.

That's certainly different experience than mine. I don't know much about routers.


Bad-ass. I'm happy to be wrong if it elicits comments like this. :)


Is that on your 20%? I noticed you use Lua :)


+1 for Hanson's book in general (which I think I might have initially read on your recommendation anyway!). I don't do a huge amount of C programming these days, but it was a huge eye-opener.


The reason why struct packing would be a win is probably not performance, but code readability and fewer bugs.

Making be/le/native conversion a job for the programmer is not only error-prone but also a waste of time.

The compiler could safely optimize the byte-swizzles away on non-arithmetic operations, whereas most programmers tend to convert everything before they start working on it.
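Concretely, this is the kind of helper people write by hand today; a compiler that knew a field was big-endian could emit (and, for non-arithmetic uses, elide) the equivalent itself. A minimal sketch:

    #include <stdint.h>

    static uint32_t be32_decode(const unsigned char *p)
    {
        return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
               ((uint32_t)p[2] <<  8) |  (uint32_t)p[3];
    }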


Along those lines I wrote a wishful thinking, blue sky blog post a number of years ago: http://porkrind.org/missives/hardware-friendly-c-structures/


Thank you for the excellent book suggestion. I've only been writing in C for a few years now. I've been grappling with finding a better way to manage and abstract some of our utilities in our poorly written legacy application. Every linked list is custom, and every array malloced and freed on the fly -- it's so fragile that sometimes I feel paralyzed. It's been terribly wrong for a long time but I've personally lacked the experience and formal training to make it better through libraries and wrappers. About a year ago I picked up "Mastering Algorithms with C" and it was very helpful at teaching complex algorithms but never really helped me abstract our code base into something portable and reusable.

I'm certain this book will help me tremendously. :P


It is a great, great, great book. Instantly and utterly useful.


Sometimes I wonder if C++ deserves more credit for /saving/ C, by absorbing all the crap that might otherwise have been added. :)


> Plugins for printf()/scanf()?? Not even remotely the kind of thing that needs to be in the core language libraries.

Where then? As far as I can tell, there's no sensible way to extend the functionality of printf()/scanf(), and compilers special-case their format-strings.


> Simple expressions like x.y++ could end up being much more expensive than expected.

He's talking about pulling protocol packets off the wire - who increments values in a packet like that? Typically all you ever do is pull the relevant fields out into your own structure or pass them off as arguments to a function call. Which will always involve unaligned reads from buffer somewhere.

The only real way I can see it being misused in a way that results in poor performance is if you read data into an explicitly-packed struct and then pass a reference to it around everywhere, reading from it willy-nilly instead of extracting the values once.


If you're just going to extract the values once, why not just write the code to actually do that instead of trying to play tricks with memcpy()? Those tricks won't even work in many cases where certain fields indicate the length of certain other fields (which is quite common).

When I first started programming in C I also had this fantasy that I could parse network formats with memcpy(). Since then I've become convinced that writing the actual parsing code is for the best.


> If you're just going to extract the values once, why not just write the code to actually do that instead of trying to play tricks with memcpy()?

Because it's more expressive, doesn't cost performance, and is less error-prone than writing field extraction code manually. And because it's the type of processing people actually use C for.

> Those tricks won't even work in many cases where certain fields indicate the length of certain other fields (which is quite common).

... it's common for a field to specify the size of the data portion of a packet, not other header fields. The one exception I can think of is a version number, which indicates the layout of the rest of the packet, which isn't exactly rocket science to model using explicitly-packed structs.

> When I first started programming in C I also had this fantasy that I could parse network formats with memcpy().

Nobody here has such a fantasy, they're expressing a desire for this to be made possible.

> Since then I've become convinced that writing the actual parsing code is for the best.

Given a programming language grammar, would you prefer to use an LR parser generator, to write an LR grammar by hand, or to write a recursive-descent LL parser by hand?


It seems that the preferred implementation these days is to write an LL parser by hand, if most production compilers and interpreters are anything to go by.


There are some well-publicized LL parsers - C# for example - but I'm not convinced by "most production compilers and interpreters."


The C standards committee has been broken since C99 introduced long long, thus silently breaking conforming code.

How do you print out a size_t, portably, without losing information? The standard says size_t is an unsigned integer type, but doesn't say what size. However, the C89 spec explicitly stated that there are no integer types larger than long, so the conforming way to do this is to explicitly cast the size_t to unsigned long (which, while it may add bits, is guaranteed not to lose them), and print the unsigned long.

In an attempt to save all the broken, non-conforming code that assumed that sizeof(int) == sizeof(long), C99 introduced the long long type. And, among other things, allowed size_t to now be unsigned long long. Which meant the conforming C89 code that wants to print out a size_t is now wrong. Worse yet, it's silently broken- because the type cast is explicit, the compiler has to assume it's correct. They added a new way to print size_t's, granted- but this is no help for the legacy code (or code that needs to continue to support C89-only compilers).

Of course, the punchline here is that I have yet to see code that assumed sizeof(int) == sizeof(long) that didn't also assume sizeof(void *) == sizeof(int). So all that broken non-conforming code they were trying to save? It still needed to get fixed for 64-bit.
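For reference, the two idioms being contrasted (a small sketch):

    #include <stdio.h>
    #include <stddef.h>

    void print_size(size_t n)
    {
        printf("%lu\n", (unsigned long)n);   /* the C89-conforming way; can now truncate */
        printf("%zu\n", n);                  /* C99: the z modifier exists precisely for size_t */
    }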


You overlook the %z printf specifier, which attempts to solve, but does not quite solve, the problem, because size_t comes in both a signed (ssize_t) and unsigned (size_t) variant.

I usually end up doing printf("%jd", (intmax_t)foo);


> because size_t comes in both a signed (ssize_t) and unsigned (size_t) variant.

ssize_t is not ISO C. Also, %z is a modifier, not a specifier, so you can print size_t with %zu, or ssize_t with %zd.


Won't "%zu" vs "%zd" do the trick? `z' is a modifier, like `l'.

(That said, I've only ever used %zu myself... don't think I've ever used ssize_t. I'm pretty sure it's non-ISO.)


It does (other variants like `%zx` are also fine).


I find this author's characterization of the history of C inaccurate. At the time, C was innovative. And its existence did enable computer science research, because with C, operating systems were finally portable. C itself was a worthwhile contribution to computer science research. Kernighan and Ritchie built upon prior languages, but they were able to distil what levels and kinds of abstraction were needed to implement portable systems programs. Many of the concepts in C existed in C's predecessors, but not in the same form we know them as. It's easy to under-estimate how novel that contribution is because so many of us think in C now.

For a history of C from the source, Dennis Ritchie, read this: http://cm.bell-labs.com/who/dmr/chist.html


You are rationalizing: C was not "novel", it was pretty much BCPL-light. There were also portable operating systems before UNIX; some of them were written in PL/1 and FORTRAN.


BCPL had a single data type (the "word") and no structures.

Ritchie's paper above covers the innovations in C quite well. See sections "The Problems of B", "Embryonic C" and "Neonatal C".


And that's where C started out (as 'B').

Adding structures or for that matter pointers to a programming language was not "novel" in 1970.

The sheer success of UNIX and the myth it has built have many contemporary programmers thinking that everybody else punched cards with flint tools around the bonfire. Even MULTICS, a very innovative and in many ways wonderful OS, has gotten a bad rap because of the UNIX-fanboiz cult-building.

Dennis, Ken & Brian broke ground, but very few people can correctly say what new ground they broke. (Hint: namespaces, file structure, what a file contains).


As noted elsewhere, the _Noreturn spelling is used precisely because such identifiers have been reserved since the first ISO C standard, so existing conforming code shouldn't use them and hence shouldn't break.

However, rather than introducing <stdnoreturn.h> to create the pretty spelling, it would perhaps have been better if the standard specified that if you define a macro like _C_SOURCE to a suitable value before including any standard headers then you get that define.

That would mean old conforming code would continue to work, and new code would just have to start out with

  #define _C_SOURCE 20110101
(or whatever the actual date specified is) to declare its allegiance. This kind of thing would also allow older-standard compilers to detect and reject code that requires the newer standard.


Neither approach would allow one to mix the two versions. Consider the following use case: program P uses two third party libraries L and M, so it uses:

  #include "L.h"
  #include "M.h"
Program P gets updated by downloading improved versions of L and M. The new version of L.h does a

  #include <stdnoreturn.h>
or

  #define _C_SOURCE 20110101
Now, program P accidentally gets processed while noreturn is a keyword.

I do not see how to fix this (you could #undef every macro that the new C standard introduces before including M.h, but M.h might have a noreturn macro of its own that is not the ISO C version). It is just as if 'old C' and 'new C' are two different languages that happen to look similar, and that can be compiled by a single compiler.


For precisely this reason, third-party libraries should not be including <stdnoreturn.h> in their external headers unless they are specifically intended to work only on C1x.

Under the _C_SOURCE scheme they certainly shouldn't be defining that macro, since the program which is including them has likely already defined it and you cannot have a duplicate macro definition. If they care, they should instead be testing its current value with #if, not changing it - ie, if a library requires C1x then it would use something like:

  #if _C_SOURCE < 20110101
      #error C1x or better required for this library
  #endif


Provided you were prevented from linking objects together which were compiled with different C versions, that would be both better and safer.

The problem with ISO-C's approach is that the chances of having both "#include <stdbool.h>" and "#define bool char" in a codebase of a few million lines are almost unity, and that causes bugs which are near-impossible to find, unless you happen to know the intricacies of hacks like this.

In general, optional features in a standard are a bad thing. ISO-C seems to think they are the solution to everything they cannot agree to do properly.


Taking bets on when ISO C will acquire a form of templates, or, taking it in an even more generic direction, parametrized namespaces. Something like this (an illustrative example, don't knock me down for including val in the list):

  namespace(T)
  {
    struct list
    {
       T      val;
       list * next;
       list * prev;
    };

    void append(list * l, list * i)
    {
      ...
    }
  }
Then

  list<int> * foo, * bar;
  ...
  append(foo, bar);
No need for namespace nesting, default arguments and other ++isms. Just something to replace multi-line macro blocks.
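For comparison, the multi-line macro blocks this would replace look roughly like this (a hypothetical sketch; note the name-mangled append_int instead of an overloaded append):

    /* hypothetical sketch: insert i before the (circular) head element l */
    #define DECLARE_LIST(T)                                             \
        struct list_##T { T val; struct list_##T *next, *prev; };      \
        static void append_##T(struct list_##T *l, struct list_##T *i) \
        {                                                               \
            i->prev = l->prev;                                          \
            i->next = l;                                                \
            l->prev->next = i;                                          \
            l->prev = i;                                                \
        }

    DECLARE_LIST(int)

    /* usage: struct list_int a, b; ... append_int(&a, &b); */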


Why not just use C++ without namespace nesting, default arguments, and other ++isms?


Well, this sounds very messy. The fact that you can't change the stack size of the thread is a no-go for many purposes; just as an example, in Redis this must be done because otherwise lzw compression will cause a stack overflow.

Also the ISO C guys completely miss how important it is to provide a better libc, with data structures and so forth, very very very well designed.

About the epic quote of C being a language without ambitions: one of the reasons why D or Go or other languages will hardly replace C is that they are, indeed, too ambitious. No one is trying to fix C with minimal but important changes.


Link to the current C1X draft which is mentioned in the article:

http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf


> Concrete has been known since antiquity, but steel-reinforced concrete and massive numerical calculations of stress-distribution, is the tools that makes the difference between using concrete as a filler material between stones, and as gravity-defying curved but perfectly safe load-bearing wall.

Ahem. http://www.google.ie/imgres?imgurl=http://romanconcrete.com/...


Ignoring complaints about the introduction of a new style of _Identifier, the rant about pthreads seems myopic or just plain wrong.

For a start as a vendor-neutral spec (the C standard itself), pthreads would never be adopted as-is. It simply couldn't happen for political reasons (without knowing a thing about the internals of the working group, I'm fairly confident of that). In the meantime it is still useful to define some uniform cross-platform API, even if most organizations continue to use pthreads in legacy code.

Assuming "too dangerous" wasn't a direct quote: stack size is an implementation detail, something they probably intentionally left undefined. There are good reasons for this, not least baking assumptions about the underlying machine into the standard has never been a goal. Secondly, with advances like Go's dynamic stacks and its <5 assembly instruction overhead, why would you go and bake a feature akin to a modern sbrk(2) in when sexy alternatives that might see wider adoption are already seeing deployment.

As for timed sleeps, providing absolute timestamps rather than intervals is important because it prevents drift: given any function manipulating time, or some timeout, calculating some perceptible end time in the function prologue is much more immune to stupid developers introducing drift (via loops), than expecting the average Joe to account for the latency/contention introduced by system calls, the scheduler, power management, etc.
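In POSIX terms the pattern looks like this; a sketch (the helper name is made up), where the deadline is computed once so a retry loop cannot drift:

    #include <pthread.h>
    #include <time.h>

    int wait_at_most_ms(pthread_cond_t *cv, pthread_mutex_t *mtx, long ms)
    {
        struct timespec deadline;

        clock_gettime(CLOCK_REALTIME, &deadline);
        deadline.tv_sec  += ms / 1000;
        deadline.tv_nsec += (ms % 1000) * 1000000L;
        if (deadline.tv_nsec >= 1000000000L) {
            deadline.tv_sec++;
            deadline.tv_nsec -= 1000000000L;
        }
        /* spurious wakeups and retries reuse the same absolute deadline */
        return pthread_cond_timedwait(cv, mtx, &deadline);
    }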

Fears regarding ntpd causing sudden jumps aren't well-founded (ntpd adjusts time very slowly, in sub-second intervals), although the general argument is fair (the sys admin, or other crazy external sources, can cause large time jumps).


This comment was filled with fail.

Pthreads has been the standard for 20 years; either you do better than it does, or you are wasting everybody's time, including your own.

"Too dangerous" is a direct quote from a WG14 member.

Stacksize is not an implementation detail, it is often a crucial resource-management issue for each individual application, in particular in high performance computing and network service applications. The C1X thread API provides no way you can set it.
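For contrast, a minimal sketch of how pthreads lets the creator pick a per-thread stack size through attributes (the names here are made up):

    #include <pthread.h>

    static void *worker(void *arg) { (void)arg; return NULL; }

    int spawn_small_worker(pthread_t *tid)
    {
        pthread_attr_t attr;

        pthread_attr_init(&attr);
        pthread_attr_setstacksize(&attr, 64 * 1024);   /* this thread only */
        return pthread_create(tid, &attr, worker, NULL);
    }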

With respect to timed sleeps: You can simulate wall-clock sleeps with duration-sleeps, but not the other way around. Just that simple fact should make it clear why the API, if it can only offer one kind of timeout, should offer duration-sleeps.

NTPD will step your clock up to +/- 3599 seconds. I happen to know: I helped write it.


With respect to stack size, I still believe it is an implementation attribute independent of the abstract machine for which C is specified. There is no reason why some undefined external vendor-specific control (e.g. some glibc function) can't be provided by implementations, where required (however as in the example I gave, implementations are possible where such a control would be meaningless - its inclusion would be myopic).

Following your argument, complex systems could not be designed without similar controls on memory allocator behaviour ("how can I allocate anything when I don't know if malloc() will introduce syscall latency for a size-32 allocation!?") or environment table size ("how can I possibly execute a subprocess if I can't be certain setenv() will always succeed!?").

The answer to both of course is that you don't design for the standard, you design (and measure) for a particular closed system. I can't see why stack size is any different from the two (of many) examples above.

Re: absolute timeouts, the impossibility of emulating intervals reliably with wall-clock is a very good point.


First: Why should the API for setting stack-size be implementation defined, when a stack is mandated to implement the language ?

Second: You seem to labour under the misunderstanding that all threads in a program have, and should have, the same stack size ? That's simply not true, you can look in Varnish for a good example: We may have 10 "overhead" threads with big stacks and 100.000 worker threads with small stacks.

Third: You cannot add "some glibc function" to set the stack-size of a thread you have not yet created, and setting it afterwards may be impossible (if you want it larger) or cause memory fragmentation (if you want it smaller).

Fourth: What does malloc(3) have to do with thread stacks ??


For the third time, the API for setting stack size should be implementation defined because there are perfectly practical implementations for which any specification would be meaningless.

Another reason is that almost all prevailing contemporary environments use virtual memory, and the prevailing embedded architecture (ARM) is about to go 64 bit (the desktop/server world already has). In a world with virtual memory, paging, and even where we're 10 years away from "complex system" interconnects powerful enough to provide shared memory across 10k+ computers filling a football field, how much longer would setting the stack size have any practical meaning?

It would be like insisting on a frewindtapedrive() library call, because your machine happens to have a tape drive (and aren't files stored on tapes eventually!). Thankfully you weren't on the committee in the 1980s. ;)

Nowhere did I suggest all threads should have the same stack size - if anything by the Go example, I suggested they might have no specific 'size' at all.

malloc(3) has nothing to do with thread stacks, it was just an example of yet something else that has very implementation-defined behaviour, for which any standardized tuning API would probably come up short.


> almost all prevailing contemporary environments use virtual memory

IBM's Compute Node Kernel (Blue Gene) uses offset-mapped memory (no TLB). This is great for reproducible performance (much less than 1% variability on thousands of cores, compared to 30% or more on Cray). Admittedly, most jobs running on CNK will use constant stack sizes across all threads, but not all.

The C standard does not specify that the language be implemented using a stack at all, so it would be inconsistent to provide an API for manipulating stack sizes.


If they don't specify the use of a stack, why is there alloca() vs malloc()?


alloca(3) isn't in ISO C (or even in POSIX).


> Another reason is that almost all prevailing contemporary environments use virtual memory, and the prevailing embedded architecture (ARM) is about to go 64 bit (the desktop/server world already has). In a world with virtual memory, paging, and even where we're 10 years away from "complex system" interconnects powerful enough to provide shared memory across 10k+ computers filling a football field, how much longer would setting the stack size have any practical meaning?

I set the stack size on my worker threads because I don't want my users looking at my process in 'ps', see a big VSIZE, and conclude that my software is bloated and memory hogging. Yes I know I can educate my users but I would rather prevent them from bothering me with these false conclusions in the first place.


Please give an example of an implementation of this thread API where having the ability to specify the desired stacksize would "be meaningless" ?

Even on 64bit machines you can and will have memory fragmentation when you approach a million threads.

PS: A bad analogy is like a wet screwdriver.


1) http://gcc.gnu.org/wiki/SplitStacks

> This is currently implemented for 32-bit and 64-bit x86 targets running GNU/Linux in gcc 4.6.0 and later. For full functionality you must be using the gold linker, which you can get by building binutils 2.21 or later with --enable-gold.

2) You're still conflating heap and stack.


Too bad it is not part of the ABI specification on any known platform, so if you call a library function compiled without this compiler magic, you're totally screwed.

But an interesting research project, I'll grant you that.


There's a section in that link which details how they handle calls to libraries that aren't aware of what's going on. Sounds like it works just fine.


As I'm sure you are aware, there are C compilers for Harvard architecture machines with fixed-sized call stacks.


"stack size is an implementation detail"

Thread stack size is an important performance tuning knob. If your app has 100,000 threads you really do care about running out of memory. If you have 10 threads you don't care. You should of course separate the choice of stack size from your application code. Your library still needs to support this.

"providing absolute timestamps rather than intervals is important because it prevents drift"

I don't know what you mean by "drift". When I put a timeout on a threaded operation (joining, acquiring a mutex, etc.) I am guarding against blocking forever 99% of the time. It is much cleaner to say "this should take no more than 5s" vs. "this should end at XYZ time UTC". Forcing the application to track elapsed time in the face of NTP etc. sucks. Letting the library handle those edge cases is good. It can use a CPU tick counter or similar.

"Fears regarding ntpd causing sudden jumps aren't well-founded (ntpd adjusts time very slowly, in sub-second intervals)"

What if I am sleeping in sub-second intervals? This goes to the general thrust of PHK's article. He is endorsing K&R C and pthreads because they give you minimal tools that can express a very wide range of programs. Building assumptions about how long people want to sleep for into an extremely widely deployed standard is a mistake.


Your comments on thread stack size don't contradict the parent's point. Thread stack size is an implementation detail, and it's also an important performance tuning knob. That implies that it should be a knob that belongs to the system, not the C standard.


Do you mind elaborating on the supposed political quibbles that would prevent a pthreads-like API from being adopted? Are you implying that vendors of competing APIs would oppose anything pthread-like making its way into the standard?


If you read the minutes they borrow generously from PThreads. But I can imagine some major vendors, like Microsoft, may say, "We have a threading library where PThreads conflicts with how we do threading. Rather than just taking PThreads as a whole, let's look at these issues and come up with a solution that will work reasonably for everyone."

Otherwise you end up with C1x coming out and MS just saying, "This is broken for Windows. Our compiler won't implement it." And just let gcc and Intel pick up the load. Which may be fine for some, but seems counterproductive.


As if MS will implement C1X. They haven't caught up to the last C standard from 12 years ago.


FWIW, Microsoft does not bother with C99 not because they are lazy, but because I honestly never remember C being "the focus" at all: they have been strong backers of C++ since before I started doing Windows development (which itself was back in 1994). They mostly seem to ship a C compiler only because it is often easy to dumb down their C++ compiler to do so (and even then, C++-isms sometimes slip into their C modes).

"""Thanks for submitting this suggestion. I've resolved it as Won't Fix, because we currently have no plans to implement C99 Core Language features. While we recognize that a few programmers are interested in those features, our finite development and testing resources force us to focus on implementing features that will have the greatest impact on the greatest number of programmers, which means C++."""

-- http://connect.microsoft.com/VisualStudio/feedback/details/5...

"""Unfortunately 1) There are many, many more users of the Microsoft C++ compiler than there are of the C compiler; 2) Anytime we do customers discussion and/or solicit feedback the overwhelming response is that we should focus on C++ (especially at the moment C++-0x); 3) We just don't have the resources to do everything we would like. So while we are slowly improving our C-99 support (and we are active in the C-1x discussions) I can't promise we'll add any of these features."""

-- http://connect.microsoft.com/VisualStudio/feedback/details/5...


Fortunately, there exist at least two decent C compilers for Windows that I know of: GCC and ICC. GCC provides the same FOSS compiler available on every other platform, and also allows cross-compiling from FOSS platforms to Windows, which helps when producing cross-platform binaries (to avoid needing to have a Windows system around to build release binaries). ICC doesn't use a FOSS license, and it lags behind GCC, but it does provide a replacement backend for Visual C++, which can make it more usable in some environments. Either one provides much better support for C than the Visual C++ C compiler.


Yeah, that would have been neat; unfortunately, they came up with a solution that will work for nobody...


Oddly, that might be the best place to be. If it worked for 50% of the group, there might not be enough votes to get it changed (the 50% can effectively block progress). If it works for nobody, then there's still a chance it can get changed.

And then people wonder why I'm the guy in the room who is generally not excited about standards.


In case you missed it, this article is part of the 'Poul-Hennings random outbursts' section which is a nice collection of thoughts from a very talented programmer. I have always enjoyed reading these:

https://www.varnish-cache.org/docs/trunk/phk/index.html


I don't see this as particularly monstrous - it's reasonably succinct, and in this case a macro implementation does the job just as well as a built-in linked list would, and with much more flexibility.

    #define VTAILQ_INSERT_BEFORE(listelm, elm, field) do {              \
        (elm)->field.vtqe_prev = (listelm)->field.vtqe_prev;            \
        VTAILQ_NEXT((elm), field) = (listelm);                          \
        *(listelm)->field.vtqe_prev = (elm);                            \
        (listelm)->field.vtqe_prev = &VTAILQ_NEXT((elm), field);        \
    } while (0)
(I don't understand why it uses VTAILQ_NEXT((elm), field) instead of what it expands to, (elm)->field.vtqe_next - that would make it more obvious what's going on by paralleling the use of vtqe_prev - but that's the fault of the macro implementation.)


There are many things horrible about it. Its arguments are evaluated multiple times, which is different from what you would expect. The fact that you have to pass field is just plain clumsy.

C could benefit tremendously from a parameterized type mechanism.


> Its arguments are evaluated many times which is different than what you would expect.

True - I would like to see the GCC extension that allows this to be done safely standardized, or improved.
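(For reference, the GCC extension in question is presumably statement expressions plus __typeof__, which let a macro evaluate each argument exactly once. A hypothetical min() shows the idea:

    /* Naive version: evaluates a and b twice each. */
    #define MIN_BAD(a, b)  ((a) < (b) ? (a) : (b))

    /* GCC-extension version: each argument evaluated exactly once. */
    #define MIN_ONCE(a, b) ({              \
        __typeof__(a) _a = (a);            \
        __typeof__(b) _b = (b);            \
        _a < _b ? _a : _b;                 \
    })

Neither of these is standard C, which is exactly the complaint.)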

> The fact that you have to pass field is just plain clumsy.

Sort of. I think the C trick of having linked list pointers inside the struct rather than the usual external wrapper is quite nice: it has good performance (especially in cases with multiple lists, to the extent that Boost has Boost.Intrusive even though it's really gross in C++), and explicitly codifying "struct A is a member of exactly one A-list" makes things feel simpler to me. You could have some kind of magic object in the list head that defines which member field it uses, but I don't know if that's worth the downside of the implementation no longer being trivial (and thus easy to understand).
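To make the "intrusive" point concrete, here is roughly what the embedded-entry pattern looks like, using the stock BSD <sys/queue.h> spellings that the VTAILQ macros mirror (struct and field names are made up):

    #include <sys/queue.h>

    struct request {
        int                  id;
        /* The linkage lives inside the payload itself: one entry field
         * per list this struct can be on.  The "field" macro argument
         * names which of these to use. */
        TAILQ_ENTRY(request) by_arrival;
        TAILQ_ENTRY(request) by_deadline;
    };

    TAILQ_HEAD(reqhead, request);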

I mean, there are lots of generally ugly points to C macros, but I have yet to see a better option for this kind of stuff that doesn't sacrifice either performance or simplicity - although the macro language itself could certainly be vastly improved.


Can anyone explain why this needs to be a macro? What does this get you over a plain function?


The idiomatic way you'd express a generic tailq in C with functions is with void-stars, which cost 4-8 bytes and incur the costs of indirecting through another memory address and, probably, of allocating lots of fiddly little structs at random times.

The macro version expresses the same generalized logic but embeds the link pointers in the structure you're queueing.
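For contrast, a rough sketch of what the function-based, void-star alternative tends to look like (names hypothetical):

    /* Generic node: one extra allocation and one extra pointer
     * indirection per element, and the payload comes back as a
     * void * that the caller has to cast. */
    struct qnode {
        struct qnode *next;
        struct qnode *prev;
        void         *payload;
    };

    /* Insert elm immediately before listelm (head update not shown). */
    void qnode_insert_before(struct qnode *listelm, struct qnode *elm)
    {
        elm->prev = listelm->prev;
        elm->next = listelm;
        if (listelm->prev != NULL)
            listelm->prev->next = elm;
        listelm->prev = elm;
    }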


Thanks! Helpful as always.


The "field" argument cannot be passed as an argument to a normal function in C. FWIW, you could pass it as an argument in C++ (which supports pointers to members), but in C++ there are numerous better ways of implementing this.


It's intended to be used with lists with different payloads, all "inheriting" fields from a standard list structure (usually another macro expansion).


As ugly as they are, even the printf()/scanf() format strings could be improved by offering a sensible plugin mechanism that the compiler can understand and use to issue warnings.

gcc allows this with the 'format' attribute: http://gcc.gnu.org/onlinedocs/gcc/Function-Attributes.html#F...
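For anyone who hasn't used it, the attribute looks roughly like this on a printf-style wrapper (function name hypothetical):

    #include <stdarg.h>
    #include <stdio.h>

    /* Tell GCC that argument 2 is a printf-style format string and the
     * variable arguments start at argument 3, so -Wformat can check
     * every call site. */
    void log_msg(int level, const char *fmt, ...)
        __attribute__((format(printf, 2, 3)));

    void log_msg(int level, const char *fmt, ...)
    {
        va_list ap;

        va_start(ap, fmt);
        if (level > 0)
            vfprintf(stderr, fmt, ap);
        va_end(ap);
    }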


Yes, but gcc gives you no way to explain to the compiler that %Q takes a struct foobar* as argument, so either you don't use extensions, and have the seatbelts provided by -Wformat, or you use extensions and have no seatbelts at all.


For instance, neither pthreads nor C1X-threads offer an "assert I'm holding this mutex locked" facility. I will posit that you cannot successfully develop real-world threaded programs and APIs without that, or without wasting a lot of time debugging silly mistakes.
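The usual workaround is to wrap the mutex and track the owner yourself; a minimal sketch along those lines (not the actual Varnish code, names made up):

    #include <assert.h>
    #include <pthread.h>

    struct lock {
        pthread_mutex_t mtx;
        pthread_t       owner;
        int             held;
    };

    static void lock_acquire(struct lock *l)
    {
        int err = pthread_mutex_lock(&l->mtx);
        assert(err == 0);
        l->owner = pthread_self();
        l->held = 1;
    }

    static void lock_release(struct lock *l)
    {
        l->held = 0;
        int err = pthread_mutex_unlock(&l->mtx);
        assert(err == 0);
    }

    /* The missing facility: only valid as a debugging aid, since the
     * unsynchronized reads are racy when the caller is wrong. */
    static void lock_assert_held(struct lock *l)
    {
        assert(l->held && pthread_equal(l->owner, pthread_self()));
    }

Having to hand-roll this in every project is exactly the kind of thing a new threads API could have fixed.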

It's funny, Alan Cox thinks the same: https://plus.google.com/111104121194250082892/posts/jWjJ9897...


Underscore-capital has been a reserved identifier prefix for many, many years. It's rather late to begin complaining about it. The side note about name-mangling on some platforms also using a leading underscore only shows that you can use a tool for years and still not understand it; that issue is completely orthogonal. I assure you, everyone on the committee is perfectly clear on the difference between name mangling and the reserved namespace.

The section on stack size is where it actually goes off the rails. C does not require that there be a stack. (Actually, the standard does not even contain the word "stack".) From that perspective, being able to control the "stack size" of a thread is nonsensical.

I'd be the first to agree that there have been some questionable additions to the C language, but this criticism comes off as half-baked at best. If you really care about this stuff, you should get involved with the standards process, or at least provide this feedback directly to the committee members--they are, for the most part, a very reasonable and thoughtful group of people.


The deal with underscores isn't anything new, that was there in C99 for sure. Maybe C89 but I don't know. In any case, I think getting upset about a more than decade old change is a little late and detracts from the main point.


I am usually not one to appeal to authority, but I do find it amusing the number of people attempting to argue with somebody like phk over stylings of the C language. Most people on here would be overwhelmed with pride to have written a fraction of the important code he has.


I'm not a C expert by any means, but is there a need for another new version? Last time I checked, some compilers hadn't even got round to implementing C99 (e.g. Visual C++ doesn't support variable length arrays). Even gcc hasn't got everything:

http://gcc.gnu.org/c99status.html


Nothing good ever came from a committee.


Actually I think C emerged from its original ANSI standardisation process in much better shape than when it went in. You don't see too much K&R C being written these days, for good reason.


Most of the K&R-isms were already gone before C89; ANSI just standardized what existing practice had settled on. I learned C from a pre-ANSI book and it definitely wasn't that K&R-ish.


You're probably right about that, ANSI did a couple of good things, but I'm not sure I think they came out in the black in the end.

ISO on the other hand, seems like a total disaster.


It's interesting to note the bumpy road that led to the final standardization of C as ANSI X3.159-1989, as documented in the following post:

http://groups.google.com/group/comp.lang.c/msg/991b9116ffa83...

Dennis Ritchie spent part of that time fighting against the introduction of several new features he felt were not the proper future of the language. Unfortunately, he had little to no involvement in the C99 standardization process. The final outcome was a butchered language. I'm sticking to C89/C90 on all my C development, no matter what.

"I was satisfied with the 1989/1990 ANSI/ISO standard. The new C99 standard is much bulkier, and though the committee has signaled that much of their time was spent in resisting feature-suggestions, there are still plenty of accepted ones to digest. I certainly don't desire additional ones, and the most obvious reaction is that I wish they had resisted more firmly." --dmr


The post you point at just talks at length about two features: const, which Dennis Ritchie liked but wanted changes to, and noalias, which he wanted to kill off. Const works exactly like he suggested; noalias became "restrict", which seems dead outside of a few standard library functions.

That doesn't seem to me like a pervasive sign of problems with C99. Personally, I find the C99 standard incredibly useful and I can hardly stand to write C code that can't rely on C99. I like having designated initializers. I like having a "bool" type. I like having structure literals, for use in arguments or return types. I like having the family of sized integer types, such as uint32_t and uint64_t. And I like having standardized versions of pre-existing features such as "long long", "inline", and variables declared in the middle of code.

Do you really despise all of those features?
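For the record, here is what several of those features look like together (purely illustrative):

    #include <stdbool.h>
    #include <stdint.h>

    struct point { int32_t x, y; };

    static struct point midpoint(struct point a, struct point b)
    {
        /* compound literal with designated initializers */
        return (struct point){ .x = (a.x + b.x) / 2,
                               .y = (a.y + b.y) / 2 };
    }

    static bool is_origin_midpoint(struct point a, struct point b)
    {
        if (a.x == b.x && a.y == b.y)
            return a.x == 0 && a.y == 0;
        struct point m = midpoint(a, b);   /* declaration after a statement */
        return m.x == 0 && m.y == 0;
    }

None of this compiles as C89, which is the practical cost of "sticking to C89/C90 no matter what".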


The constitutional convention of 1787 seems to have done a pretty decent job.


You mean the job where the first 10 amendments were necessary after just two years? :-)


Amendments 1–10 and 27 were products of the same Philadelphia Convention that produced the US Constitution, and were proposed along with it.

You could try Wikipedia: http://en.wikipedia.org/wiki/United_States_Bill_of_Rights#Ph...


Nothing you said negates his argument. Why didn't the committee add them to the constitution itself when it was passed?


They didn't agree about the inclusion of the articles, and they wanted to get the constitution signed and proposed to the states for ratification ASAP. Thus those who wanted the Bill of Rights in there were promised "it would be coming soon", which appeased them enough to "sign off" on the constitution, and those who had some issue with the articles were happy that they weren't initially included. Keep in mind that the "Bill of Rights" originally had 12 articles: only 10 passed initially, 1 finally passed in 1992, and 1 is still outstanding.

Also consider that the Philadelphia Convention wasn't some magical awe-inspiring meeting of the minds, as it is sometimes made out to be. Only 39 of the 55 delegates actually signed the constitution, and several were so pissed off they just walked out. I'm sure lots of bargaining and negotiation was done, and the 12 articles in the Bill of Rights were simply one of the bargaining chips.


Right, that's what phkamp was saying. That the committee was a messy process that produced an artifact that needed to be fixed right out the gate, and didn't get up to a decent standard for a long long time.


Opponents of BofR claimed that was the broken part because it was redundant (and potentially dangerous) to enumerate rights not explicitly granted to the government. Also the built in amendment process allowed the document to adapt as the country evolved, while allowing the highest priority requirements to be met in the first iteration.


Minus the 3/5ths compromise of course...

http://en.wikipedia.org/wiki/Three-Fifths_Compromise

Committees are all about compromise, and compromises are usually a worse solution in everyone's eyes. I'm not saying C1X threads are as bad as slavery, but they are a compromise that I'm sure no one will prefer to their native thread library.

I'm a bit surprised that ISO is even attempting to do generic threads. Who is this really a problem for? We have oodles of libraries that provide cross-platform threading. I don't see why we need the C standard to specify one.


I wonder if he's looked at golang.


phk is one of my personal heroes!


Absolutely wonderful quote from the article:

My tool for writing Varnish is the C-language which in many ways is unique amongst all of the computer programming languages for having no ambitions.

The C language was invented as a portable assembler language, it doesn't do objects and garbage-collection, it does numbers and pointers, just like your CPU.

Compared to the high ambitions, then as now, of new programming languages, that was almost ridiculous unambitious. Other people were trying to make their programming languages provably correct, or safe for multiprogramming and quite an effort went into using natural languages as programming languages.

But C was written to write programs, not to research computer science and that's exactly what made it useful and popular.

This is a great thing to meditate on when trying to understand why the language is still used.


"""The <nostdreturn.h> file according to the standard shall have exactly this content: #define noreturn _Noreturn"""

Damn, this standard is way past the point of no return!


Some of the comments here seem rather harsh and certainly after the fact. There are a lot of knowledgeable, well-intended people on the committee and the standard was ratified by nation states. So it is finished. The time to take issue with it (draft/formal comment period) has passed.



