Pure sh bible – Posix sh alternatives to external processes (github.com/dylanaraps)
137 points by IA21 on Oct 19, 2020 | 56 comments


I appreciate that this isn’t full of Bash-isms—it’s sometimes hard to find out how to do something in a shell script, because you get a bunch of Bash results.


Actually, there are still some bashisms, and real foot traps at that. `trap EXIT` basically only "works" for try/finally behavior in bash, not even zsh. There should really be a warning around this in the article: if you want to have portable cleanup for all exit conditions that works in different shells, you need a lot of busywork.
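
A minimal sketch of that busywork, assuming the common workaround of turning fatal signals into an ordinary exit so that the EXIT trap gets a chance to run. The helper name `with_scratch` and the choice of signals are illustrative assumptions, not something taken from the article:

```shell
#!/bin/sh
# with_scratch FILE CMD [ARGS...]: hypothetical helper for this sketch.
# Creates FILE, runs CMD, and removes FILE however the subshell ends.
with_scratch() (
    scratch=$1
    shift
    : > "$scratch" || exit 1

    cleanup() { rm -f "$scratch"; }

    # Runs on a normal exit, including "exit N" from the handlers below.
    trap cleanup EXIT
    # Convert common fatal signals into an ordinary exit so the EXIT trap
    # also runs in shells that would skip it when killed by a signal.
    trap 'exit 129' HUP
    trap 'exit 130' INT
    trap 'exit 143' TERM

    "$@"
    exit "$?"    # propagate the command's status explicitly
)
```

Something like `with_scratch /tmp/demo.lock sleep 30` could then be interrupted with Ctrl-C and should still remove the file. Signals that cannot be trapped (KILL) are out of reach either way.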


Where do you see such usage in the article?



why the bash hate? I thought it was really uncommon for any unixlike os to come without bash (sh symlinked to bash)


Linux in practice always comes with bash, but sh is not necessarily an alias; e.g. on the world's most popular distro:

    > readlink -f /bin/sh
    /usr/bin/dash

Also, BSDs and derivatives either have no bash installed by default (e.g. OpenBSD) or, in the case of macOS a horribly outdated one.

Finally there are good reasons to hate bash: Bourne shell is flawed but brilliant, whereas bash is a bloated mess of unpredictable and counter-intuitive behaviour that also tends to change between versions. It's also much, much slower than e.g. dash. Unfortunately even for pure shell scripting bash in practice is the way to go; posix shell (and dash) unfortunately lack a few things that are pretty crucial and very cumbersome to work around, like process substitution.
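
To illustrate how cumbersome the workaround is, here is a rough sketch of emulating bash's `comm -12 <(sort f1) <(sort f2)` with named pipes. The function name `common_lines` is hypothetical, and `mktemp -d` is an assumption: it is widely available, but worth checking on the target system.

```shell
#!/bin/sh
# common_lines FILE1 FILE2: hypothetical helper that prints the lines the
# two files share, standing in for process substitution with two FIFOs.
common_lines() {
    # mktemp -d is an assumption here; it gives us a private directory.
    fifo_dir=$(mktemp -d) || return 1
    mkfifo "$fifo_dir/a" "$fifo_dir/b" || { rm -rf "$fifo_dir"; return 1; }

    # Each writer blocks until comm opens its FIFO for reading.
    sort "$1" > "$fifo_dir/a" &
    sort "$2" > "$fifo_dir/b" &

    comm -12 "$fifo_dir/a" "$fifo_dir/b"
    status=$?

    wait                  # reap both background sorts
    rm -rf "$fifo_dir"
    return "$status"
}
```

Compared with the one-line bash form, the script has to create, feed, and clean up the pipes itself, which is the kind of overhead being described.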


    > or, in the case of macOS a horribly outdated one.
The latest version of macOS defaults to zsh instead of the outdated version of bash.


Doesn't matter when shebang is #!/bin/bash


That's just for the login shell. It still ships with /bin/bash, and /bin/sh still by default invokes bash (in sh compatibility mode). However there is a systemwide flag you can set to make /bin/sh run dash or zsh instead.


That's mostly irrelevant, since basically no one writes shell code in zsh and this is unlikely to change that.


That's probably going to change with macs having it as the default.

Never underestimate the power of "Well, it worked on my machine"

People seem to be willing to do apt-gets (or equivalent) in docker files, I could see several of my old co-workers doing that just to get a script working.


Unlikely. Apple have made the default interactive shell zsh, but /bin/bash still stays around and I believe /bin/sh still aliases to it by default. Meanwhile there is an enormous ecosystem of shell scripts including countless curl ... | bash installers which no one has any motivation to port to some shell which is not installed by default on most Linux distros. Although Microsoft would undoubtedly be delighted with Apple accelerating the move of developers to Windows as the Unix of choice, I'm not convinced even Apple's current management is going to be dumb enough to remove bash altogether.


Why speak of hate?

The main problem I have with bashisms is that most people who use them just assume everyone runs bash as default shell. Instead of specifying xxx/bin/bash in their script, they use xxx/bin/sh (and they don't run any check about which kind of shell is running to ensure that bash is running and fail gracefully otherwise, and they don't either document that the script must be run from bash).
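
A minimal sketch of such a check, assuming that testing `BASH_VERSION` is good enough (it can be spoofed, and the function name `require_bash` is made up for illustration):

```shell
#!/bin/sh
# require_bash: hypothetical guard. Returns non-zero with a readable
# message when the running shell does not look like bash.
require_bash() {
    if [ -z "${BASH_VERSION:-}" ]; then
        printf '%s\n' "error: this script uses bash features; run it with bash" >&2
        return 1
    fi
}
```

A script would call `require_bash || exit 1` near the top, before the first bash-only construct is executed. Syntax that a plain sh cannot even parse would still have to be kept out of the lines read before the check.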

There's no problem about writing in a specific language, if you clearly say you did. If I stretch the example a bit, I wouldn't like a Python script to start with xxx/bin/ruby.


There is absolutely no Bash hate here.

If you do an experiment you'll easily find that dash is faster than bash, and since dash is basically just a bare POSIX shell with very few extensions, a script that runs in dash is more likely to run anywhere (ignoring the fact that nearly every program you call from a shell script comes in both a GNU and BSD variant, and possibly a more minimalist POSIX version).

So I write my shell scripts to dash, which is /bin/sh on my system (Debian).

If you use Bash-isms, you should definitely not put #!/bin/sh at the top of your script, you should be putting #!/bin/bash


Just because it is present, doesn't mean it's the default or used internally by things that call shell scripts.

For example... Not that long ago I started a new job where I took over as maintainer of a project with a fairly extensive Makefile, including targets for release. I cleaned it up some but missed a couple bashisms. The original author wrote it on a Mac, on which /bin/sh is bash, and used bashisms. I took over on Linux, which has /bin/sh as dash (POSIX). So while the author used it fine, it broke for me in subtle ways that I didn't notice until I made my first, broken release (error was in formatting of md5 signature file).

The problem with bashisms is that people don't think through if their current use case is an appropriate one.


just change the shebang


It's not bash hate. It's that a lot of the time, you don't have bash available (e.g., BusyBox environments), and you want your scripts to still work.


The `/bin/bash` on macOS is severely out of date. So a script that starts with `#!/bin/bash` will be unable to use a lot of features.

If you only rely on `/bin/sh` the user has the flexibility to point `/bin/sh` to whatever shell on their system is fastest and most up-to-date (e.g. `/bin/dash` is a pretty good choice for performance).


None of my FreeBSD machines have bash installed because it's not part of FreeBSD.


I don't know if it still is, but for a decent while, Debian sh was not bash.


Debian uses dash for non-interactive shell, but bash for the interactive shell (if I recall).


Indeed /bin/sh is still dash, not bash.


If someone asked you to list all the UNIX-like OS you know about, would BSD or Plan9 make the list?


As you should. Bash --- at least bash 3.x --- is available literally everywhere and has many features essential for robust programming, like local variables. Instead of writing for some antique shell, we should all just write for bash or zsh or something modern. I don't care about being compatible with some random AIX installation that's from 1870 and powered by a steam engine.


sh is used because it is POSIX sh–you know it will work not "literally everywhere" but really, truly, literally everywhere. And your bashisms aren't going to do all that well on BusyBox, or dash. Just because you don't care doesn't mean that we should make incompatible scripts and not be aware that we are doing so.


One counterexample: bash isn’t available in the base system of typical BSD variants (though, yes, you can install it from ports).


I've worked on storage-constrained embedded systems where adding bash and its dependencies expanded my root filesystem by 50%. busybox ash is a much smaller alternative.


Security teams love stripping bash out of linux images. I was forced into living a pure sh diet while working for a bank.


If I want to use modern features, I’ll just use Perl or Python, both of which are also available literally everywhere.


Or I'll just write it in Go and easily cross compile an executable binary for whatever platform I need.


It is however useful to know the difference, and in some cases you actually need to write POSIX (code meant to be sourced for example).


As others have mentioned... not everywhere. And sometimes you’re just writing something that really shouldn’t require a dependency on bash.

You wouldn’t write a Python script to run a few quick shell commands like copying a few files around. Same concept.


This would make sense if it said "zsh" instead. Not only is it better in terms of features, it's also released under a non-problematic Open Source license.


This is a nice resource. Many moons ago when I first heard of Kakoune [1], I wanted to write some of my own plugins for it. Whether or not the philosophy has changed, back then it was 'just use shell scripts and make sure they're POSIX compliant'.

That's when I learned about things like named pipes, the `mkfifo` command, and that it does take quite a lot of conscious effort to not accidentally include a convenient bashism. That said, there was nothing stopping you from writing the main functionality in another language and just shelling out to it. No need for the editor's config language to include primitives to do most of what a programming language would give you.

[1] http://kakoune.org/


The first Korn shell from 1988 far predates the POSIX shell, and is a much richer language (from which BASH has stolen very much).

As I understand it, ksh88 had to run on a 286 processor with 64kb data and instruction (segmented memory). The code required to do this was byzantine and frightening to maintain.

Because all of these features could not be included in maintainable code on systems with very low memory, the POSIX shell eviscerated the ksh88 language standard.

Surprisingly, checking the HP-UX man page (man sh-posix) gives most ksh88 functionality, which is definitely not in the Almquist shell (AFAIK the most popular POSIX shell).

It is unfortunate that the POSIX shell had to take such a large step backwards when a far more powerful language predated it, but the reasoning is clear (and mostly centered on Xenix on a 286).


Standardization does not work like this. Usually the consideration is that the standard standardizes the greatest common subset, subject to some give and take, of available implementations, which at that time included more than the Korn shell.

There's an entire rationale that accompanies the Single Unix Specification, whose section on the shell command language quite clearly explains the basis for the standard. Talking about how POSIX based things on the Korn shell is to not even have read the very first sentence of that rationale section.


The first sentence of the standard is:

"This chapter contains the definition of the Shell Command Language."

I'm not familiar with the advocacy. I'm not sure that I would see much value to it.

https://pubs.opengroup.org/onlinepubs/009695399/utilities/xc...


I clearly wrote "the very first sentence of that rationale section". Even though you've erroneously picked the standard from 2004 instead of the current edition, and then erroneously didn't even look at the rationale, the first sentence of that section of that old edition's rationale is the same.


Isn't dash the more popular/common POSIX shell nowadays? Not sure how to go about evaluating this claim.


David Korn explains what happened here:

c) Under Solaris and HP-UX, ksh88 is installed in /usr/bin/ksh, ksh93 is installed in /usr/dt/bin/dtksh, but the default shell is the "Posix" shell, a superset of ksh. Is there any hope of getting this mess straightened out?

"c) Since ksh88 is not fully POSIX compliant, some system vendors have modified ksh88 to make it compliant and used that for their POSIX shell. One way to clean up this mess is to get all the vendors to move to ksh93. ksh93 has a single source that compiles on all systems from pc,'s mac's, unix systems, and mainframes. I have no say over what vendors do, but users on these systems certainly can state their preferences."

...

   # ls -la `which ksh`
   -r-xr-xr-x 2 bin bin 186356 Jul 16 1997 /usr/bin/ksh

A lot of effort was made to keep ksh88 small. In fact the size you report on Solaris is without stripping the symbol table. The size that I am getting for ksh88i on Solaris is 160K and the size on NetBSD on intel is 135K.

ksh88 was able to compile on machines that only allowed 64K text. There were many compromises to this approach.

https://news.slashdot.org/story/01/02/06/2030205/david-korn-...


Dash is the Almquist shell.

Most shells needed updates to fully conform with the POSIX standard. The wiki mentions that the Almquist shell did not implement a "test" program that fully conformed.

The sh-posix on HP-UX also had subtle changes to the ksh88 source to bring it into compliance (but this shell still supported arrays and coprocesses).

https://en.wikipedia.org/wiki/Almquist_shell


dash is an Almquist shell. At this point there are several major ones (the Debian one, the FreeBSD one, the NetBSD one, and the BusyBox one) and a whole raft of minor ones (Minix, Cygwin, Android, et al.).

* https://www.in-ulm.de/~mascheck/various/ash/

And Wikipedia does not say that. It says that the Minix Almquist shell, specifically, had a non-conformant test utility. The Almquist shell did not, after all, have a built-in test command at all to start with, so the standard conformance of that utility wasn't a matter for the Almquist shell.

Once again Wikipedia is wrong, because Thomas Dickey's original page that purportedly supports this claim points out that the Minix test command was an external command and not part of the Almquist shell. M. Dickey even pointed to the source code for the external command on GitHub.

* https://invisible-island.net/autoconf/portability-test.html#...

* https://github.com/Stichting-MINIX-Research-Foundation/minix...

* https://commons.wikimedia.org/wiki/File:Don't%20abbreviate%2...


FWIW shellcheck can help flag cases where you used a bashism in a file with the #!/bin/sh shebang.


On the subject of optimizing tput, this article mentions using hard-coded strings. Another approach (for some tput commands) is to run the command once, save its output in a variable, and re-use it.

For example, a slow version of printing a blank chess board:

   #! /bin/sh
   
   for rowpair in 1 2 3 4
   do
     for colpair in 1 2 3 4
     do
       echo -n " $(tput rev) $(tput sgr0)"
     done
     echo
   
     for colpair in 1 2 3 4
     do
       echo -n "$(tput rev) $(tput sgr0) "
     done
     echo
   done

And a faster version:

   #! /bin/sh
   
   rev=$(tput rev)
   sgr0=$(tput sgr0)
   
   for rowpair in 1 2 3 4
   do
     for colpair in 1 2 3 4
     do
       echo -n " $rev $sgr0"
     done
     echo
   
     for colpair in 1 2 3 4
     do
       echo -n "$rev $sgr0 "
     done
     echo
   done

In many cases, tput isn't doing anything more than looking up a string and printing it. Though in other cases like "tput cols", it is doing more. (And anyway, the number of columns isn't a constant.)


As the headlined page is about portable shell script coding, note that the -n option to the echo utility is not portable. You should be using printf here, to be in the spirit of the headlined page. (And you should also be quoting variable assignments for safety.)

There's a long history to this, which starts with the fact that adding any option support at all to the original echo breaks stuff. The result over the years has been quite a mess. The current Single Unix Specification has a note stating outright that one should simply not use -n or escape sequences at all with echo if one desires portability, and also stating that one should use printf instead.
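
As a rough sketch of that advice applied to the faster chess board, with printf in place of `echo -n` and quoted assignments. The `2>/dev/null || true` fallback and the `print_board` function wrapper are assumptions added here so the script still runs when tput or TERM is missing:

```shell
#!/bin/sh
# Look the escape sequences up once. The fallback to empty strings is an
# assumption for this sketch, not part of the original script.
rev="$(tput rev 2>/dev/null || true)"
sgr0="$(tput sgr0 2>/dev/null || true)"

# print_board: draw the 8x8 board with printf instead of echo -n.
print_board() {
    for rowpair in 1 2 3 4; do
        for colpair in 1 2 3 4; do
            printf ' %s %s' "$rev" "$sgr0"
        done
        printf '\n'

        for colpair in 1 2 3 4; do
            printf '%s %s ' "$rev" "$sgr0"
        done
        printf '\n'
    done
}

print_board
```

Passing the escape sequences as `%s` arguments rather than embedding them in the format string keeps printf from interpreting anything inside them.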

* https://unix.stackexchange.com/q/65803/5132

* https://pubs.opengroup.org/onlinepubs/9699919799/utilities/e...


Thanks. With "echo", I am probably stuck in the past. I recall encountering systems where "printf" (the command) didn't exist. Maybe this is no longer a concern.

I'm less sure whether the quoting is technically necessary. It's probably better style to quote anyway, so this is mostly academic, but I'll ask anyway.

Does word splitting, etc. occur when assigning a command substitution to a variable? I know there are times when it does occur:

    $ for i in $(echo a b c)
    > do
    >   echo "$i"
    > done
    a
    b
    c

But then, empirically, when assigning to a variable, anything I throw at it seems to come through unscathed even without quotes:

    $ x=$(echo " * " b c"'"; echo d;)
    $ echo "$x"
     *  b c'
    d
    $

No wildcards were expanded, spaces and quotes weren't stripped, etc.

I couldn't find in the standards doc (that you linked) anything that addresses this either way.


It's one of those things where the exact rules are complex and easy to misremember. Ironically, your two scripts demonstrate one way in which the problem arises. The rules are different for each script, because the command expansions are in different contexts. And when someone in the future decides to make your code into a shell function with local variables, suddenly the rules change again.

For maintainability, for safety, for the people who come after one, and for one's own sanity, the general advice is to always quote.
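
A small sketch of how the context changes the outcome, assuming a POSIX-style shell such as dash or bash with the default IFS (zsh does not split unquoted expansions by default). The helper `count_args` is made up for the illustration:

```shell
#!/bin/sh
# count_args: hypothetical helper that reports how many arguments it got.
count_args() { printf '%s\n' "$#"; }

value=$(printf 'a b  c')          # assignment: no word splitting happens here

unquoted=$(count_args $value)     # argument context: split on IFS into words
quoted=$(count_args "$value")     # quoted: passed through as a single word

printf 'value=[%s] unquoted=%s quoted=%s\n' "$value" "$unquoted" "$quoted"
```

Under those assumptions the same variable arrives as three arguments unquoted and as one argument quoted, while the assignment itself keeps both spaces intact.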

* https://mywiki.wooledge.org/Quotes

* https://unix.stackexchange.com/q/97560/5132

* https://unix.stackexchange.com/q/68694/5132

* https://unix.stackexchange.com/q/131766/5132


And you can of course put the whole chessboard in a string ...

   blackwhite=$(tput rev; echo -n " "; tput sgr0; echo -n " ")
   whiteblack=$(tput sgr0; echo -n " "; tput rev; echo -n " ") 
   row_odd=$(for col in {1..4}; do echo -n "$whiteblack"; done; echo)
   row_even=$(for col in {1..4}; do echo -n "$blackwhite"; done; echo)
   chessboard=$(for i in {1..4}; do echo "$row_odd"; echo "$row_even"; done;tput sgr0)


General comment ... I'm hesitant to fiddle with IFS; if the function exits early it can make the rest of the script nonsensical.

Same goes for other 'globals' or externals such as globbing and using tput. You have to be careful to trap errors and return things to sanity.


Use a subshell.
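
A minimal sketch of what that could look like, with a hypothetical `split_fields` helper whose body is a subshell, so the IFS and globbing changes disappear when it returns, early exit or not:

```shell
#!/bin/sh
# split_fields STRING: hypothetical helper that prints each comma-separated
# field on its own line. The ( ... ) body runs in a subshell.
split_fields() (
    IFS=,
    set -f                # disable globbing so a field like * stays literal
    for field in $1; do
        printf '%s\n' "$field"
    done
)

split_fields 'one,two words,*'
```

The trade-off is that nothing else set inside the function survives either, so results have to come back through stdout or the exit status.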


This uses e.g. i=$(($i + 1)) to increment.

The simpler and possibly faster way is just

   ((i++))

or

   ((i+=1))

This syntax works in macos ksh, so pretty sure it's POSIX.

Also, ksh93 supports sequence generation.. so as an alternative to the loops, use

   start=0
   end=10
   for i in {$start..$end}; do echo $i; done


I don't think it's actually specified by POSIX, it only mentions arithmetic expansion with $(()) and there is no mention of (()) for arithmetic operations in the grammar: https://pubs.opengroup.org/onlinepubs/007904875/utilities/xc...

The example given for arithmetic expansion also uses x=$(($x-1)): https://pubs.opengroup.org/onlinepubs/007904875/utilities/xc...


Interesting! I wonder if any shells that support $((..)) do not support ((..)) though.

The updated 2017 standard mentions ((..)) but not in the obvious place. It's under 'compound commands/grouping' ...

https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V...

> If a character sequence beginning with "((" would be parsed by the shell as an arithmetic expansion if preceded by a '$', shells which implement an extension whereby "((expression))" is evaluated as an arithmetic expression may treat the "((" as introducing as an arithmetic evaluation instead of a grouping command. A conforming application shall ensure that it separates the two leading '(' characters with white space to prevent the shell from performing an arithmetic evaluation.


Shells such as dash and ash do not support it, which are the most common test cases for POSIX scripts. It should also be noted that pre-increment itself is considered optional by POSIX, although += should work.
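
For reference, a short sketch of the forms that stay inside `$(( ))`, plus a counting loop as a stand-in for the `{start..end}` range mentioned upthread. Variable names are only illustrative:

```shell
#!/bin/sh
i=0
i=$((i + 1))        # plain arithmetic expansion, as in the POSIX example
: "$((i += 1))"     # assignment operator inside $(( )); ':' discards the value

# Counting loop instead of a ksh93-style {start..end} range.
start=0
end=10
n=$start
total=0
while [ "$n" -le "$end" ]; do
    total=$((total + n))
    n=$((n + 1))
done

printf 'i=%s total=%s\n' "$i" "$total"
```

This leaves `i` at 2 and sums 0 through 10 into `total` without relying on `((..))`, `++`, or brace ranges.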


I wish Bash had proper support for in-process subshells. I timed my code recently, and a variety of loop constructs was dramatically faster without subshells - but often required crazy contortions to avoid spawning one.


they also have a pure bash bible: https://github.com/dylanaraps/pure-bash-bible



