I appreciate that this isn’t full of Bash-isms—it’s sometimes hard to find out how to do something in a shell script, because you get a bunch of Bash results.
Actually, there are still some bashisms, and real foot traps at that. `trap EXIT` basically only "works" for try/finally behavior in bash, not even zsh. There should really be a warning around this in the article: if you want to have portable cleanup for all exit conditions that works in different shells, you need a lot of busywork.
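For what it's worth, here is a minimal sketch of the kind of busywork meant here. The names `scratch` and `cleanup` are made up for illustration, and `mktemp -d` is widely available but not specified by POSIX, so treat that as an assumption. The idea is that each signal trap calls `exit`, which should in turn fire the EXIT trap, so the cleanup does not depend on the shell running the EXIT trap on signals by itself.

```shell
#!/bin/sh
# Sketch only: scratch and cleanup are hypothetical names.
# Assumes mktemp -d exists (common, but not required by POSIX).
scratch=$(mktemp -d) || exit 1

cleanup() {
    # -f keeps this quiet if the directory is already gone.
    rm -rf "$scratch"
}

# Run cleanup whenever the shell exits normally or via `exit`.
trap cleanup EXIT

# Turn the usual fatal signals into a plain `exit`, so the EXIT trap
# above runs even in shells that skip it when killed by a signal.
# The statuses follow the 128 + signal number convention.
trap 'exit 129' HUP
trap 'exit 130' INT
trap 'exit 143' TERM

echo "working in a scratch directory" > "$scratch/state"
```

This still does not cover everything (SIGKILL cannot be trapped at all), so it is a starting point rather than a guarantee.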
Linux in practice always comes with bash, but sh is not necessarily an alias; e.g. on the world's most popular distro:
> readlink -f /bin/sh
/usr/bin/dash
Also, BSDs and derivatives either have no bash installed by default (e.g. OpenBSD) or, in the case of macOS, a horribly outdated one.
Finally there are good reasons to hate bash: Bourne shell is flawed but brilliant, whereas bash is a bloated mess of unpredictable and counter-intuitive behaviour that also tends to change between versions. It's also much, much slower than e.g. dash. Unfortunately even for pure shell scripting bash in practice is the way to go; posix shell (and dash) unfortunately lack a few things that are pretty crucial and very cumbersome to work around, like process substitution.
That's just for the login shell. It still ships with /bin/bash, and /bin/sh still by default invokes bash (in sh compatibility mode). However there is a systemwide flag you can set to make /bin/sh run dash or zsh instead.
That's probably going to change with macs having it as the default.
Never underestimate the power of "Well, it worked on my machine"
People seem to be willing to do apt-gets (or equivalent) in docker files, I could see several of my old co-workers doing that just to get a script working.
Unlikely. Apple have made the default interactive shell zsh, but /bin/bash still stays around and I believe /bin/sh still aliases to it by default. Meanwhile there is an enormous ecosystem of shell scripts including countless curl ... | bash installers which no one has any motivation to port to some shell which is not installed by default on most Linux distros. Although Microsoft would undoubtedly be delighted with Apple accelerating the move of developers to Windows as the Unix of choice, I'm not convinced even Apple's current management is going to be dumb enough to remove bash altogether.
The main problem I have with bashisms is that most people who use them just assume everyone runs bash as default shell. Instead of specifying #!/bin/bash in their script, they use #!/bin/sh (and they don't check which shell is actually running so that the script can fail gracefully when it isn't bash, nor do they document that the script must be run from bash).
There's no problem about writing in a specific language, if you clearly say you did. If I stretch the example a bit, I wouldn't like a Python script to start with #!/bin/ruby.
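A rough sketch of the "check and fail gracefully" part, assuming that a non-empty `BASH_VERSION` is a good enough sign that bash is running. `require_bash` is a name invented for this example, and the guard itself is written in plain POSIX sh so that it still parses under dash or BusyBox ash.

```shell
#!/bin/sh
# Sketch only: require_bash is a hypothetical helper.
# Assumption: bash sets BASH_VERSION and other shells leave it unset.
require_bash() {
    if [ -z "${BASH_VERSION:-}" ]; then
        echo "error: this script needs bash; try: bash $0" >&2
        return 1
    fi
}

# At the top of a script that uses bashisms one would then write:
#   require_bash || exit 1
```

The guard has to come before the first bash-only construct that is executed, and it cannot help with syntax another shell fails to parse at all.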
If you do an experiment you'll easily find that dash is faster than bash, and since dash is basically just a bare POSIX shell with very few extensions, a script that runs in dash is more likely to run anywhere (ignoring the fact that nearly every program you call from a shell script comes in both a GNU and BSD variant, and possibly a more minimalist POSIX version).
So I write my shell scripts to dash, which is /bin/sh on my system (Debian).
If you use Bash-isms, you should definitely not put #!/bin/sh at the top of your script, you should be putting #!/bin/bash
Just because it is present doesn't mean it's the default or used internally by things that call shell scripts.
For example... Not that long ago I started a new job where I took over as maintainer of a project with a fairly extensive Makefile, including targets for release. I cleaned it up some but missed a couple bashisms. The original author wrote it on a Mac, on which /bin/sh is bash, and used bashisms. I took over on Linux, which has /bin/sh as dash (POSIX). So while the author used it fine, it broke for me in subtle ways that I didn't notice until I made my first, broken release (error was in formatting of md5 signature file).
The problem with bashisms is that people don't think through if their current use case is an appropriate one.
The `/bin/bash` on macOS is severely out of date. So a script that starts with `#!/bin/bash` will be unable to use a lot of features.
If you only rely on `/bin/sh` the user has the flexibility to point `/bin/sh` to whatever shell on their system is fastest and most up-to-date (e.g. `/bin/dash` is a pretty good choice for performance).
As you should. Bash --- at least bash 3.x --- is available literally everywhere and has many features essential for robust programming, like local variables. Instead of writing for some antique shell, we should all just write for bash or zsh or something modern. I don't care about being compatible with some random AIX installation that's from 1870 and powered by a steam engine.
sh is used because it is POSIX sh–you know it will work not "literally everywhere" but really, truly, literally everywhere. And your bashisms aren't going to do all that well on BusyBox, or dash. Just because you don't care doesn't mean that we should make incompatible scripts and not be aware that we are doing so.
I've worked on storage-constrained embedded systems where adding bash and its dependencies expanded my root filesystem by 50%. busybox ash is a much smaller alternative.
This would make sense if it said "zsh" instead. Not only is it better in terms of features, it's also released under a non-problematic Open Source license.
This is a nice resource. Many moons ago when I first heard of Kakoune [1], I wanted to write some of my own plugins for it. Whether or not the philosophy has changed, back then it was 'just use shell scripts and make sure they're POSIX compliant'.
That's when I learned about things like named pipes, the `mkfifo` command, and that it does take quite a lot of conscious effort to not accidentally include a convenient bashism. That said, there was nothing stopping you from writing the main functionality in another language and just shelling out to it. No need for the editor's config language to include primitives to do most of what a programming language would give you.
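To give an idea of where a named pipe helps, here is a small sketch of one common use: feeding a `while read` loop from a command without putting the loop in a pipeline, which is roughly what bash's `done < <(cmd)` does. The variable names are made up for the example, and `mktemp -d` is assumed to be available even though POSIX does not specify it.

```shell
#!/bin/sh
# Sketch only: fifo_dir, fifo and first are hypothetical names.
fifo_dir=$(mktemp -d) || exit 1
fifo=$fifo_dir/pipe
mkfifo "$fifo" || exit 1

# Writer: runs in the background and blocks until a reader opens the pipe.
printf '%s\n' banana apple cherry | sort > "$fifo" &

# Reader: the loop runs in the current shell, so variables set inside it
# are still visible afterwards (unlike with `sort | while read ...`).
first=
while IFS= read -r line
do
    [ -n "$first" ] || first=$line
done < "$fifo"

wait                 # reap the background writer
rm -rf "$fifo_dir"   # remove the pipe and its directory

printf 'first sorted line: %s\n' "$first"
```

The cost is the extra setup and teardown, which is the "conscious effort" part.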
The first Korn shell from 1988 far predates the POSIX shell, and is a much richer language (from which BASH has stolen very much).
As I understand it, ksh88 had to run on a 286 processor with 64kb data and instruction (segmented memory). The code required to do this was byzantine and frightening to maintain.
Because all of these features could not be included in maintainable code on systems with very low memory, the POSIX shell eviscerated the ksh88 language standard.
Surprisingly, checking the HP-UX man page (man sh-posix) gives most ksh88 functionality, which is definitely not in the Almquist shell (AFAIK the most popular POSIX shell).
It is unfortunate that the POSIX shell had to take such a large step backwards when a far more powerful language predated it, but the reasoning is clear (and mostly centered on Xenix on a 286).
Standardization does not work like this. Usually the consideration is that the standard standardizes the greatest common subset, subject to some give and take, of available implementations, which at that time included more than the Korn shell.
There's an entire rationale that accompanies the Single Unix Specification, whose section on the shell command language quite clearly explains the basis for the standard. Talking about how POSIX based things on the Korn shell is to not even have read the very first sentence of that rationale section.
I clearly wrote "the very first sentence of that rationale section". Even though you've erroneously picked the standard from 2004 instead of the current edition, and then erroneously didn't even look at the rationale, the first sentence of that section of that old edition's rationale is the same.
c) Under Solaris and HP-UX, ksh88 is installed in /usr/bin/ksh, ksh93 is installed in /usr/dt/bin/dtksh, but the default shell is the "Posix" shell, a superset of ksh. Is there any hope of getting this mess straightened out?
"c) Since ksh88 is not fully POSIX compliant, some system vendors have modified ksh88 to make it compliant and used that for their POSIX shell. One way to clean up this mess is to get all the vendors to move to ksh93. ksh93 has a single source that compiles on all systems from pc,'s mac's, unix systems, and mainframes. I have no say over what vendors do, but users on these systems certainly can state their preferences."
...
# ls -la `which ksh`
-r-xr-xr-x 2 bin bin 186356 Jul 16 1997 /usr/bin/ksh
A lot of effort was made to keep ksh88 small. In fact the size you report on Solaris is without stripping the symbol table. The size that I am getting for ksh88i on Solaris is 160K and the size on NetBSD on intel is 135K.
ksh88 was able to compile on machines that only allowed 64K text. There were many compromises to this approach.
Most shells needed updates to fully conform with the POSIX standard. The wiki mentions that the Almquist shell did not implement a "test" program that fully conformed.
The sh-posix on HP-UX also had subtle changes to the ksh88 source to bring it into compliance (but this shell still supported arrays and coprocesses).
dash is an Almquist shell. At this point there are several major ones (the Debian one, the FreeBSD one, the NetBSD one, and the BusyBox one) and a whole raft of minor ones (Minix, Cygwin, Android, et al.).
And Wikipedia does not say that. It says that the Minix Almquist shell, specifically, had a non-conformant test utility. The Almquist shell did not, after all, have a built-in test command at all to start with, so the standard conformance of that utility wasn't a matter for the Almquist shell.
Once again Wikipedia is wrong, because Thomas Dickey's original page that purportedly supports this claim points out that the Minix test command was an external command and not part of the Almquist shell. M. Dickey even pointed to the source code for the external command on GitHub.
On the subject of optimizing tput, this article mentions using hard-coded strings. Another approach (for some tput commands) is to run the command once, save its output in a variable, and re-use it.
For example, a slow version of printing a blank chess board:
#! /bin/sh
for rowpair in 1 2 3 4
do
    for colpair in 1 2 3 4
    do
        echo -n " $(tput rev) $(tput sgr0)"
    done
    echo
    for colpair in 1 2 3 4
    do
        echo -n "$(tput rev) $(tput sgr0) "
    done
    echo
done
And a faster version:
#! /bin/sh
rev=$(tput rev)
sgr0=$(tput sgr0)
for rowpair in 1 2 3 4
do
    for colpair in 1 2 3 4
    do
        echo -n " $rev $sgr0"
    done
    echo
    for colpair in 1 2 3 4
    do
        echo -n "$rev $sgr0 "
    done
    echo
done
In many cases, tput isn't doing anything more than looking up a string and printing it. Though in other cases like "tput cols", it is doing more. (And anyway, the number of columns isn't a constant.)
As the headlined page is about portable shell script coding, note that the -n option to the echo utility is not portable. You should be using printf here, to be in the spirit of the headlined page. (And you should also be quoting variable assignments for safety.)
There's a long history to this, which starts with the fact that adding any option support at all to the original echo breaks stuff. The result over the years has been quite a mess. The current Single Unix Specification has a note stating outright that one should simply not use -n or escape sequences at all with echo if one desires portability, and also stating that one should use printf instead.
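As an illustration, the chess board could be written with printf along these lines. This is only a sketch: `board` is a name made up here, and the fallback to empty strings when tput fails is my own addition, not something from the earlier scripts.

```shell
#!/bin/sh
# Sketch only: board is a hypothetical function name.
# Look the escape sequences up once; fall back to empty strings if
# tput is missing or the terminal type is unknown (an added assumption).
rev="$(tput rev 2>/dev/null)" || rev=
sgr0="$(tput sgr0 2>/dev/null)" || sgr0=

board() {
    for rowpair in 1 2 3 4
    do
        for colpair in 1 2 3 4
        do
            # printf does not append a newline unless told to,
            # so there is no need for the non-portable echo -n.
            printf ' %s %s' "$rev" "$sgr0"
        done
        printf '\n'
        for colpair in 1 2 3 4
        do
            printf '%s %s ' "$rev" "$sgr0"
        done
        printf '\n'
    done
}

board
```

Passing the escape sequences as `%s` arguments rather than embedding them in the format string keeps printf from interpreting any `%` or backslash they might contain.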
Thanks. With "echo", I am probably stuck in the past. I recall encountering systems where "printf" (the command) didn't exist. Maybe this is no longer a concern.
I'm less sure whether the quoting is technically necessary. It's probably better style to quote anyway, so this is mostly academic, but I'll ask anyway.
Does word splitting, etc. occur when assigning a command substitution to a variable? I know there are times when it does occur:
$ for i in $(echo a b c)
> do
> echo "$i"
> done
a
b
c
But then, empirically, when assigning to a variable, anything I throw at it seems to come through unscathed even without quotes:
$ x=$(echo " * " b c"'"; echo d;)
$ echo "$x"
* b c'
d
$
No wildcards were expanded, spaces and quotes weren't stripped, etc.
I couldn't find in the standards doc (that you linked) anything that addresses this either way.
It's one of those things where the exact rules are complex and easy to misremember. Ironically, your two scripts demonstrate one way in which the problem arises. The rules are different for each script, because the command expansions are in different contexts. And when someone in the future decides to make your code into a shell function with local variables, suddenly the rules change again.
For maintainability, for safety, for the people who come after one, and for one's own sanity, the general advice is to always quote.
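A small sketch of the contexts in question. `value_with_spaces` is a made-up helper, and the remark about unquoted `export` is an assumption about how some shells treat it (as an ordinary command whose arguments get split and globbed), so no output is claimed for that variant.

```shell
#!/bin/sh
# Sketch only: value_with_spaces is a hypothetical helper.
value_with_spaces() {
    printf '%s\n' 'a  *  b'
}

# Plain assignment: no field splitting or pathname expansion happens
# here, so the unquoted form keeps the value intact.
plain=$(value_with_spaces)

# Quoted assignment: the same result, and it stays correct if the line
# is later moved into another context.
quoted="$(value_with_spaces)"

# As an argument to a command such as export, quoting matters: without
# the quotes, a shell that handles export like any other command may
# split the value and expand the '*'.
export exported="$(value_with_spaces)"

printf '%s\n' "$plain" "$quoted" "$exported"
```

With the quotes in place, all three variables hold the same string regardless of which of these contexts the assignment ends up in.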
> If a character sequence beginning with "((" would be parsed by the shell as an arithmetic expansion if preceded by a '$', shells which implement an extension whereby "((expression))" is evaluated as an arithmetic expression may treat the "((" as introducing as an arithmetic evaluation instead of a grouping command. A conforming application shall ensure that it separates the two leading '(' characters with white space to prevent the shell from performing an arithmetic evaluation.
Shells such as dash and ash, which are the most common test targets for POSIX scripts, do not support it. It should also be noted that pre-increment itself is considered optional by POSIX, although += should work.
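A short sketch of how one might write this to stay within what dash and ash accept, assuming only the `$(( ))` arithmetic expansion is available. The variable `i` is just a counter for the example.

```shell
#!/bin/sh
# Sketch only: uses $(( )) expansion instead of the (( )) extension.
i=0
while [ "$i" -lt 3 ]
do
    # Instead of ((i++)) or ((++i)): plain expansion plus assignment.
    i=$((i + 1))
done

# An assignment operator inside the expansion; ':' discards the
# expanded value and keeps only the side effect.
: "$((i += 2))"

# Nested subshells: keep white space between the two opening
# parentheses so that no shell reads them as arithmetic evaluation.
( ( printf 'nested subshells\n' ) )

printf 'i is %s\n' "$i"
```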
I wish Bash had proper support for in-process subshells. I timed my code recently, and a variety of loop constructs was dramatically faster without subshells - but often required crazy contortions to avoid spawning one.
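One of the milder contortions, as a sketch: have the function hand its result back in a variable instead of printing it, and use parameter expansion instead of an external command. `strip_dir` and `strip_dir_result` are names made up for this example, and the expansion is not a full replacement for `basename` (it does not handle trailing slashes, for instance).

```shell
#!/bin/sh
# Sketch only: strip_dir and strip_dir_result are hypothetical names.

# The usual form costs a subshell (and a process) per iteration:
#   name=$(basename "$path")
# This form stays in the current shell and returns via a variable.
strip_dir() {
    strip_dir_result=${1##*/}
}

names=
for path in /usr/bin/dash /bin/sh /usr/local/bin/bash
do
    strip_dir "$path"
    # Append with a separating space, except before the first entry.
    names="${names}${names:+ }$strip_dir_result"
done

printf '%s\n' "$names"
```

The price is a global variable per function, which is exactly the sort of thing that gets awkward in larger scripts.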