One FD_CLOEXEC-related bug that got us in Chrome: on Linux, while we were careful to close any open fds when spawning a subprocess, the GTK IME ("input method editor", modules to facilitate typing Japanese etc.) could itself spawn subprocesses in response to innocuous calls (like user keystrokes). Those subprocesses could then inherit (and keep open) fds we wanted to be closed.
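The standard defense against exactly this is to mark descriptors close-on-exec. A minimal sketch in Python (which, since PEP 446, actually sets the flag by default; the explicit fcntl call is shown for illustration):

```python
import fcntl
import os

# Open a descriptor and mark it close-on-exec, so any exec'd child
# (including one spawned behind your back by a plugin) won't inherit it.
fd = os.open("/dev/null", os.O_RDONLY)
flags = fcntl.fcntl(fd, fcntl.F_GETFD)
fcntl.fcntl(fd, fcntl.F_SETFD, flags | fcntl.FD_CLOEXEC)

# The race-free variant: request the flag atomically at open() time,
# so a fork+exec on another thread can't slip in between open and fcntl.
fd2 = os.open("/dev/null", os.O_RDONLY | os.O_CLOEXEC)
print(fcntl.fcntl(fd2, fcntl.F_GETFD) & fcntl.FD_CLOEXEC)  # 1
```

The two-step F_GETFD/F_SETFD dance is exactly the window a library-spawned subprocess can race through, which is why O_CLOEXEC was added to open(2) in the first place.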
/usr/bin/google-chrome is written by Google and analyzed by the security team. /usr/bin/chromium is written by enthusiasts (the distro's package author).
The thing is that spawned processes do not give a damn about already opened file descriptors (except for stdio). But the standard is broken for historical reasons, very unfortunately.
It would be nice to have more detail here, at least why this is "historical" and "unfortunate". A whole slew of programs wouldn't be possible and capabilities wouldn't exist without this, both older and modern. Two examples are inetd's wait functionality (letting the spawned program accept(2) further connections on the inherited listening socket) and systemd's, ahem, socket activation.
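A toy version of the inetd pattern, sketched in Python on Linux: the parent binds and listens, and a child re-wraps the inherited descriptor and can operate on the same socket. (`pass_fds` opts back in to inheritance, since Python 3 marks fds non-inheritable by default; the fd number is preserved across the exec.)

```python
import socket
import subprocess
import sys

# Parent creates a listening socket, as inetd would.
srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("127.0.0.1", 0))
srv.listen(1)
port = srv.getsockname()[1]

# Child re-wraps the inherited fd and reports the port it is bound to.
child_code = (
    "import socket, sys; "
    "s = socket.socket(fileno=int(sys.argv[1])); "
    "print(s.getsockname()[1])"
)
p = subprocess.run(
    [sys.executable, "-c", child_code, str(srv.fileno())],
    pass_fds=[srv.fileno()],  # explicitly allow this fd to be inherited
    capture_output=True, text=True,
)
print(int(p.stdout) == port)  # True: the child holds the same listening socket
```

The child could just as well call accept(2) on it; that's the whole mechanism behind inetd's wait mode and systemd's socket activation.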
main program sends commands to a remote location and closes the socket
If you intend the socket connection to be ended, you should be calling shutdown(2) on it, not close(2). close(2) only asserts that the calling process is done with its descriptor; shutdown(2) actually terminates the connection, so a child holding an inherited copy would just end up with a dead socket fd (though there are bugs related to improperly handling that case; here's one that bit me: https://github.com/brianmario/mysql2/issues/516 ). This is why the failure cases of syscalls should always be examined, and why there should be a sane default failure path. In an ideal world, a leaked file descriptor would merely be just that, and a properly written program would ignore (or close) any file descriptors it wasn't explicitly designed to use.
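The difference is easy to demonstrate with a socketpair standing in for the connection and a dup'd fd standing in for the child's inherited copy (a sketch in Python):

```python
import socket

# shutdown(2) ends the stream for the peer even while other descriptors
# for the same socket remain open; close(2) only releases this one fd.
a, b = socket.socketpair()
dup = a.dup()          # plays the role of a child's inherited fd
a.close()              # peer sees NO EOF: dup keeps the socket alive
b.settimeout(0.1)
try:
    b.recv(1)
    got_eof = True
except socket.timeout:
    got_eof = False
print(got_eof)                 # False: close() alone didn't end the connection
dup.shutdown(socket.SHUT_WR)   # explicitly end the write side
print(b.recv(1) == b"")        # True: peer now sees EOF
```

This is precisely the "main program closes the socket but the connection stays open" failure mode from the article: the close succeeds, and the leak is invisible until the remote end hangs waiting for EOF.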
It would be nice to have more detail here, at least why this is "historical" and "unfortunate".
I think a lot of people feel that way about Unix features that interact awkwardly with threads. I tend to feel that threads are the unfortunate part, but that seems to be minority opinion.
The thing is that spawned processes do not give a damn about already opened file descriptors (except for stdio). But the standard is broken for historical reasons, very unfortunately.
I read that as "it would be better if the default for spawned processes were to _not_ share file descriptors", not as "it would be better if it were impossible to inherit file descriptors"
I had a really "fun" redirection bug. My library was appending its log to stderr, but for various convenience reasons, it opened /dev/stderr (instead of just writing to fd #2).
We got a bug report that if you ran a program that used the library by doing:
prog >& /tmp/log
then stderr from the program before my library started logging would disappear. If you did this instead, it would work fine:
prog |& tee /tmp/log
It turned out (obvious in hindsight, not at the time) that we opened /dev/stderr with O_TRUNC, causing /tmp/log to be truncated and earlier logs to be lost.
TLDR: /dev/stderr can be truncated, which was unexpected.
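The truncation is easy to reproduce without touching a terminal, assuming Linux, where /dev/stderr is just a symlink into /proc/self/fd: opening any /proc/self/fd/<n> path with O_TRUNC re-opens and truncates the underlying file, exactly as open("/dev/stderr", ..., O_TRUNC) did in the bug above.

```python
import os
import tempfile

# A regular file standing in for the shell's ">& /tmp/log" redirection target.
tmp = tempfile.NamedTemporaryFile(delete=False)
tmp.write(b"earlier log output\n")
tmp.flush()

# Re-opening the descriptor by its /proc path with O_TRUNC truncates the
# file that every fd for it points at -- the earlier output vanishes.
fd = os.open(f"/proc/self/fd/{tmp.fileno()}", os.O_WRONLY | os.O_TRUNC)
print(os.path.getsize(tmp.name))  # 0
os.close(fd)
os.unlink(tmp.name)
```

With `prog |& tee /tmp/log` the bug didn't trigger because fd 2 was a pipe to tee, and truncating a pipe is a no-op; only the direct-redirection case lost data.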
I'm not sure if this is POSIX (it's process substitution, supported by bash, zsh, and ksh), but if your program accepts positional arguments, in bash you can do e.g.:

diff <(somecommand) <(someothercommand)

and the shell will replace these with /dev/fd/<n> paths, allowing your program to read them like ordinary files (and still accept standard input) without relying on hardcoded file descriptors.
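A quick way to see what the shell actually hands the program (a sketch driving bash from Python, assuming Linux; the exact fd number varies):

```python
import subprocess

# bash replaces <(true) with a /dev/fd/<n> path before running the command,
# so the program just sees an ordinary filename argument.
out = subprocess.run(
    ["bash", "-c", "echo <(true)"],
    capture_output=True, text=True, check=True,
).stdout.strip()
print(out)  # e.g. /dev/fd/63

# And the substituted path really is readable like a file:
data = subprocess.run(
    ["bash", "-c", "cat <(echo hello)"],
    capture_output=True, text=True, check=True,
).stdout
print(data)  # hello
```

Note the fd number is one the shell picked, not one your program hardcoded; the program never needs to know it was handed a pipe rather than a regular file.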
Great post, and can I also plug the fact that the parent website / app (HTTRACK) is an excellent app and something I use or have used on a daily basis for a few years now.