One of the standard things coreutils does right that many other implementations ...

wahern · on March 9, 2021

That's a matter of taste. Argument permutation is evil, IMHO. It's also dangerous. If someone can't be bothered to order their arguments, they also can't be bothered to use the "--" option terminator, which means permutation is a correctness and security headache.

But it's the behavior on GNU systems, and it's even the behavior of most applications using getopt_long on non-GNU, non-Linux systems (because getopt_long permutes by default, and getopt_long is a de facto standard now). So it should be supported.
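
To make the hazard concrete, here is a minimal sketch using Python's standard `getopt` module, which offers both a stop-at-first-operand parser (`getopt.getopt`) and a permuting one (`getopt.gnu_getopt`). The option string `"rf"`, the file name `notes.txt`, and the "untrusted" value are made up for illustration, and the expected results assume POSIXLY_CORRECT is unset in the environment.

```python
import getopt

def parse_posix(argv):
    """Stop scanning at the first operand (no permutation)."""
    return getopt.getopt(argv, "rf")

def parse_gnu(argv):
    """Keep scanning for options after operands (GNU-style permutation)."""
    return getopt.gnu_getopt(argv, "rf")

# Hypothetical case: a caller appends an untrusted word that looks like an option.
untrusted = "-rf"
argv = ["notes.txt", untrusted]

print(parse_posix(argv))  # ([], ['notes.txt', '-rf'])
print(parse_gnu(argv))    # ([('-r', ''), ('-f', '')], ['notes.txt'])

# With the "--" terminator, the permuting parser treats the word as an operand again.
print(parse_gnu(["--"] + argv))  # ([], ['notes.txt', '-rf'])
```

Without permutation, anything after the first operand is safe by construction; with permutation, safety depends on the caller remembering "--".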

JoshTriplett · on March 9, 2021

If you're writing a script, perhaps.

I'm talking about interactive command-line usage, for which the ability to put an argument at the end provides more user-friendliness.

wahern · on March 9, 2021

But the command doesn't know that, and in general best practice is (or was) for commands to not alter their behavior based on whether they're attached to a terminal or not.

I won't deny the convenience. (Technically jumping backwards across arguments in the line editor is trivial, but I admit I keep forgetting the command sequence.) But from a software programming standpoint, the benefit isn't worth the cost, IMO.

And there are more costs than meet the eye. Have you ever tried to implement argument permutation? You can throw together a compliant getopt or getopt_long in surprisingly few lines of code.[1] Toss in argument permutation and the complexity explodes, both in SLoC and asymptotic runtime cost (though you can trade the latter for the former to some extent).

[1] Example: https://github.com/wahern/lunix/blob/master/src/unix-getopt....

JoshTriplett · on March 9, 2021

> But the command doesn't know that, and in general best practice is (or was) for commands to not alter their behavior based on whether they're attached to a terminal or not.

I completely agree; most commands should behave the same on the command-line and in scripts, because many scripts will start out of command-line experimentation. That's one of the good and bad things about shell scripting.

> Have you ever tried to implement argument permutation? You can throw together a compliant getopt or getopt_long in surprisingly few lines of code. Toss in argument permutation and the complexity explodes, both in SLoC and asymptotic runtime cost (though you can trade the latter for the former to some extent).

"surprisingly few lines of code" doesn't seem like a critical property for a library that needs implementing once and can then be reused many times. "No more complexity than necessary to implement the required features" seems like a more useful property.

I've used many command-line processors in various languages, all of which have supported passing flags after arguments. There are many libraries available for this. I don't think anyone should reimplement command-line processing in the course of building a normal command-line tool.

I personally don't think permutation (in the style of getopt and getopt_long, at least in their default mode) is the right approach. Don't rearrange the command line to look like all the options come first. Just parse the command line and process everything wherever it is. You can either parse it into a separate structure, or make two passes over the arguments; neither one is going to add substantial cost to a command-line tool.

So, this is only painful for someone who needs to reimplement a fully compatible implementation of getopt or getopt_long. And there are enough of those out there that it should be possible to reuse one of the existing ones rather than writing a new one.
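
As a rough illustration of the "parse it into a separate structure" approach, the sketch below walks the argument list once and records options wherever they appear, without ever reordering the list. The `Parsed` class, the `parse` function, and the `-v` and `-o` options are all hypothetical; this is not how getopt or any particular library does it.

```python
from dataclasses import dataclass, field

@dataclass
class Parsed:
    flags: set = field(default_factory=set)        # options without a value, e.g. -v
    options: dict = field(default_factory=dict)    # options with a value, e.g. -o FILE
    operands: list = field(default_factory=list)   # everything else, in original order

def parse(argv, flags=("-v",), with_value=("-o",)):
    """Single pass over argv; argv itself is never rearranged."""
    result = Parsed()
    it = iter(argv)
    for arg in it:
        if arg == "--":
            # Everything after the terminator is an operand, even if it starts with "-".
            result.operands.extend(it)
            break
        if arg in flags:
            result.flags.add(arg)
        elif arg in with_value:
            try:
                result.options[arg] = next(it)
            except StopIteration:
                raise ValueError(f"option {arg} requires a value") from None
        elif arg.startswith("-") and arg != "-":
            raise ValueError(f"unknown option {arg}")
        else:
            result.operands.append(arg)
    return result

parsed = parse(["in.txt", "-v", "-o", "out.txt", "--", "-v"])
print(parsed.flags, parsed.options, parsed.operands)
```

Flags given after the operand are accepted, and the trailing `-v` after "--" stays an operand.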

burntsushi · on March 10, 2021

> and in general best practice is (or was) for commands to not alter their behavior based on whether they're attached to a terminal or not.

Not so sure about that. ls has been changing its output format based on whether it is being used interactively or not for as long as I can remember at least. Both GNU and BSD versions.

JoshTriplett · on March 10, 2021

Fair distinction. There's a kind of unwritten understanding of what programs should and shouldn't do based on isatty, and I've never seen it explicitly documented.

Things many programs do if attached to a TTY: add color, add progress bars and similar uses of erase-and-redisplay, add/modify whitespace characters for readability, refuse to print raw binary, etc.

Things some programs do, which can be problematic: prompt interactively when they're otherwise non-interactive.

Things no program does or should do: change command-line processing, semantic behavior, or similar.
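
A small sketch of the first category, assuming a tool that only wants to add color: the decision is made from the output stream's `isatty()` and affects presentation only, never what gets parsed or computed. The `decorate` function and the `FakeTerminal` helper are hypothetical names used for illustration.

```python
import io
import sys

GREEN = "\x1b[32m"
RESET = "\x1b[0m"

def decorate(text, stream):
    """Add ANSI color only when the stream is a terminal; the text itself is unchanged."""
    if stream.isatty():
        return f"{GREEN}{text}{RESET}"
    return text

class FakeTerminal(io.StringIO):
    """Stand-in for a terminal in this demo: reports isatty() as True."""
    def isatty(self):
        return True

print(repr(decorate("match", io.StringIO())))   # 'match' (StringIO is not a TTY)
print(repr(decorate("match", FakeTerminal())))  # '\x1b[32mmatch\x1b[0m'

# Real use: colored on a terminal, plain when piped or redirected.
sys.stdout.write(decorate("match", sys.stdout) + "\n")
```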

burntsushi · on March 10, 2021

> Things no program does or should do: change command-line processing, semantic behavior, or similar.

Arguably ripgrep breaks this rule. :-) Compare `echo foo | rg foo` and `rg foo`. The former will search stdin. The latter will search the current working directory.

In any case, I bring this up, because I've heard from folks that ripgrep changing its output format is "bad practice" and that it should "follow standard Unix conventions and not change the output format." And that's when I bring up `ls`. :-)
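
For readers who want to see the shape of that kind of check, here is a deliberately simplified sketch. It is not ripgrep's actual logic, which this thread does not show; `default_target` and `FakeTerminal` are hypothetical names, and "-" is used here only as a conventional marker for "read stdin".

```python
import io
import sys

def default_target(stdin):
    """With no path given: search stdin if it is not a terminal, else the current directory."""
    return "." if stdin.isatty() else "-"

class FakeTerminal(io.StringIO):
    """Stand-in for an interactive stdin in this demo."""
    def isatty(self):
        return True

print(default_target(io.StringIO("foo\n")))  # "-"  (like `echo foo | tool foo`)
print(default_target(FakeTerminal()))        # "."  (like a bare `tool foo` at a prompt)

# Real use would pass sys.stdin:
# target = default_target(sys.stdin)
```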

sylvestre · on March 13, 2021

This is one of the things that I loved when I started using rg :)

simias · on March 10, 2021

Technically it's based on whether the output is a tty or piped/redirected into something, not whether it's run from the shell's prompt or a script.

So for instance if you run a bare `ls` from a script that outputs straight into the terminal you'll get the multi-column "human readable" output. Conversely if you type `ls | cat` in the shell you'll get the single column output.

It can definitely be surprising if you don't know about it but technically it behaves the same in scripts and interactive environments.

burntsushi · on March 10, 2021

That's exactly what I meant. I used "interactively" to mean "attached to a tty." Look at what I was responding to:

"for commands to not alter their behavior based on whether they're attached to a terminal or not"

ls is a clear counter-example of that.

I think the behavior is a good thing. I'm pushing back against this notion of what is "best practice" or not. It's more nuanced than "doesn't change its output format."