Insufficiently known Posix shell features (apenwarr.ca)
140 points by rcfox on Feb 28, 2011 | 34 comments



Are those really POSIX or just bash? Namely the $() example. I was under the impression that only `` was POSIX, and that $() was bash/zsh-specific (maybe ksh too).

Also,

  ${CC:=gcc}
is only more readable than:

  [ -n "$CC" ] || CC=gcc
if you know what that syntax means. There is a ton of syntax in bash/zsh to modify variables inside of the ${} construct. I usually have to look them up every time that I see them, even when I was the one who wrote the original code. I would say that more people know the 'test' syntax than the ${:=} syntax.

  > 3. You can assign one variable to another without quoting
This is somewhat annoying, because something like this:

  command="ls -l"
  $command /path/to/dir
doesn't work. It tries to run a command called "ls -l" which probably doesn't exist. Passing that to exec will work, though. A more common example of this issue is building a command line in a shell script before executing it. You might have a base level of options:

  base_options="-l -d --debug=1"
  command $base_options $opt1 $opt2
But $base_options gets passed as a single parameter to command.


$() is standard, not a bashism. Every code example in the article was tested on multiple shells. Use $() and be happy, and I recommend never touching the backquotes again. They're a disaster as soon as you need to nest them.
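To illustrate the nesting point, here is a minimal sketch comparing the two forms. The path is a made-up value used only for the example.

```shell
# /usr/local/bin/tool is a hypothetical path, used only for illustration.
path=/usr/local/bin/tool

# Nested $(): the inner substitution needs no escaping.
parent_new=$(basename "$(dirname "$path")")

# The same thing with backquotes: the inner pair must be backslash-escaped.
parent_old=`basename \`dirname "$path"\``

echo "$parent_new"
echo "$parent_old"
```

Both lines print the name of the directory containing the file; the backquote version needs one more level of escaping for each additional level of nesting.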

${CC:=gcc} is a special case of syntax, but one of the points in the article is: as with any programming language, you should actually learn it so you can write better code. If people start using this one, it won't be any lesser known than any other weird sh syntax. The other variants (=, -, :-, etc) the article recommends against anyway.
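A small sketch of how ${CC:=gcc} behaves in the three cases that matter (unset, set, and empty). Using it as an argument to the no-op `:` command is one common idiom, assumed here rather than taken from the article.

```shell
unset CC
: "${CC:=gcc}"    # CC was unset, so it is assigned the default
first=$CC

CC=clang
: "${CC:=gcc}"    # CC already has a value, so it is left alone
second=$CC

CC=
: "${CC:=gcc}"    # with the colon, an empty value is also replaced
third=$CC

echo "$first $second $third"
```

This prints "gcc clang gcc".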

As for your comments on #3, both of your examples are untrue; they work fine in a POSIX shell. Try it. (They fail in zsh, when zsh isn't in "sh compatibility" mode, so perhaps you're using that. Try switching it to sh compatibility mode.) Either way, those examples aren't actually related to point #3 in the article.
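One way to check this yourself is to count the arguments a command actually receives. The sketch below assumes a POSIX-style sh with the default IFS; count_args is a helper made up for the example.

```shell
# Hypothetical helper: print how many arguments it was given.
count_args() { echo "$#"; }

base_options="-l -d --debug=1"

unquoted=$(count_args $base_options)     # field splitting applies
quoted=$(count_args "$base_options")     # quoting suppresses splitting

echo "unquoted: $unquoted, quoted: $quoted"
```

In sh, dash, and bash this reports 3 arguments for the unquoted expansion and 1 for the quoted one. In zsh outside sh compatibility mode, the unquoted expansion is not split.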


  > Use $() and be happy, and I recommend never touching
  > the backquotes again.
Any clue why the 'sh' filetype in Vim highlights $() as an error unless it reads "#!/bin/bash" on the first line of the file? (Zsh actually has its own filetype, but unfortunately no built-in folding support like the sh filetype.) There's gotta be a reason behind that. Maybe $() isn't POSIX, but unofficially standard b/c most shells support it?

[ Note (IIRC) in vim shell filetypes are like this: sh=(sh, bash, ksh) zsh=(zsh) csh=(csh, tcsh) ]

  > Either way, those examples aren't actually related to point
  > #3 in the article.
They were sort of related in that I was talking about the shell implicitly quoting the contents of the variable. I was just pointing out a different context where (I believe) it also does something similar.

  > They fail in zsh, when zsh isn't in "sh compatibility" mode,
  > so perhaps you're using that.
I ran into issues with that a few years ago, and then steered clear of using things in that way. I may well have been using zsh at the time as I was using it for a while before switching to bash, and then back again to zsh (and I didn't keep notes on that timeline).


Looks like $() is really really POSIX:

http://pubs.opengroup.org/onlinepubs/009695399/utilities/xcu...

In general, beware of trusting the people who wrote your editor instead of just trying it to see if it works :)


I think it would be nifty if every title linked to the relevant section of the POSIX spec (maybe with a [POSIX] link next to the title itself).

Also, OpenBSD's pdksh seems to stick quite closely to POSIX shell features, it's probably a good test environment.

> In general, beware of trusting the people who wrote your editor instead of just trying it to see if it works :)

That runs the risk of relying on a feature that the current shell adds on top of POSIX. This is very much the source of all bashisms: just trying to see if it works.


pdksh (at least the one in Debian) has numerous bugs; try the shelltest.od script in redo to see some of them.

I agree that "just see if it works" isn't a good portability strategy if you only test one shell. But the idea is to get as many shells as possible and try them all, or (as redo does) to write a script that accepts only shells with all the expected features. Blindly assuming POSIX compliance is a dangerous way to go, since there's no guarantee that it's in a shell just because it's in the spec.

I'll grant you that using a feature that's both in POSIX and in all the shells you test is the best option, though :) $() is in both categories.
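A rough sketch of the "try as many shells as possible" idea: probe each installed shell for one feature. The list of shells is illustrative, and this is not how redo's own shell test is written.

```shell
# Candidate shells are an illustrative list; adjust for the system at hand.
supported=""
for shell in sh dash bash ksh mksh zsh /usr/xpg4/bin/sh; do
    # Skip shells that are not installed.
    command -v "$shell" >/dev/null 2>&1 || continue
    # Run a tiny feature probe in that shell: does $() work?
    if out=$("$shell" -c 'echo "$(echo ok)"' 2>/dev/null) && [ "$out" = ok ]; then
        supported="$supported $shell"
        echo "$shell: supports \$()"
    else
        echo "$shell: FAILED"
    fi
done
```

A real test would run one probe per feature the scripts depend on, and accept only shells that pass all of them.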


Don't rely on vim syntax highlighting. By default it highlights shell scripts according to original Bourne shell syntax rather than POSIX. See the note on this page http://www.pixelbeat.org/programming/shell_script_mistakes.h... and this linked mailing list thread http://groups.google.com/group/vim_dev/t/41139a32772b2f5f


I'm not 'relying' on Vim, I'm just wondering. I got the impression that backticks were the only POSIX-approved way outside of Vim, but Vim highlighting it as such did help to reinforce that idea (along with the exclusive usage of backticks in other people's code that was attempting to be strictly POSIX-compliant).


I didn't mean to come across as critical. That's a very reasonable conclusion to come to, and it's the reason why it's such a shame that vim upstream don't want to highlight #!/bin/sh scripts as POSIX syntax by default.


Contrast this with emacs' syntax highlighting, which tends to break badly on any non-trivial use of backquotes. That actually shows the advantage of $() - it is simply easier to scan, both for highlighters and human readers.

That said, there are many shells that don't support $() and other random POSIX shell features. In fact, assuming that POSIX = portable is asking for trouble, because most systems aren't entirely conforming to POSIX (even when they are certified as such). Another problem with relying on POSIX for portability is that in some places the specification is worded in a way that suggests one meaning and actually means something else (see, for example, the behavior of glibc when operating on already terminated threads, which is POSIX compliant, though that this is correct is not immediately obvious from the specification).


I heard recently that Solaris 5.10 doesn't like $().

http://article.gmane.org/gmane.lisp.guile.devel/11707


I'm pretty sure Solaris is one of those nasty OSes that continues to include a broken /bin/sh for "backwards compatibility." They presumably also have a non-broken shell around somewhere in the standard install. This is why redo goes through extra effort to choose a non-broken shell for running your .do scripts.


Solaris has a POSIX shell in /usr/xpg4/bin/sh. Unfortunately, there is no way to make /bin/sh POSIX-compliant, e.g. by setting an environment variable, a feature many other Unixes have.


Sweet. I've changed redo to try that as one of its possible shells.


The Solaris /bin/sh is not a POSIX shell.


For writing portable scripts, I strongly advise against using anything not in the POSIX spec. No matter how many shells you've tested it in, there will always be one somewhere that doesn't implement it. Conversely, if a POSIX feature is found to be broken in at least one moderately common shell, it should be avoided or worked around if at all practical.


I would just like to mention that, in order to really take the pain out of portable scripting, the easiest way to go is python scripts - and forget about the differences between win32, bash, csh and the rest.


That's what I do (s/Python/Tcl/). I feel pretty comfortable doing it, but a -little- uneasy. I feel like I'm cheating; I usually prescribe (and follow) "as dumb as possible (but no dumber)" rule, and use /bin/sh as my interactive shell to keep from relying on non-portable "creature-features". Using /bin/sh satisfies the "as dumb as possible..." rule because it's basic, standard, and everywhere. However, as soon as a shell script starts getting moderately complicated, I feel more comfortable moving to something (Tcl) that's (to my mind) less obtuse and less surprising.

edit: and then I read http://news.ycombinator.com/item?id=2271004 further below, which makes Tcl (or python, or ...) look even better.


> I usually prescribe (and follow) "as dumb as possible (but no dumber)" rule

to what end?


To my mind, the "dumb" tools are the simple ones that often ship w/ Unix, and using them forces understanding of principles, rather than rote memorization of meta-commands. From that (the hope is that) one can _understand_ what a problem is, and how to fix it.


This works great as long as your definition of portability extends only to systems with python (or perl, or tcl, or php, or ruby) installed.

Shell is the least common denominator.


This is generally good advice, but redo tries to do something else: it actually tests all your shells, and picks the one that supports the features we're expecting. That way we can set a minimum baseline.

All the features in the article are tested by redo, so at least in your redo scripts, you can assume those features are available. Maybe someone can generalize the "redo-sh" feature somehow.


As the converse, if you don't know what's in the POSIX spec, or haven't tested widely, please put "#!/usr/bin/env bash" at the top of your script. Most people who don't have bash as the default shell at least have it somewhere. And those who don't would rather have your script fail with an obvious error than a subtle one.


If you don't know what's in the POSIX spec, you should go and read it. If you do this, you will learn not only what the shell must do, but also that /usr/bin/env might not exist at all. The env utility is required, but it isn't always in /usr/bin.
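As a small illustration, a script can ask the shell where env actually lives rather than assuming /usr/bin/env. This is only a sketch: it does not help with the #! line itself, which needs an absolute path.

```shell
# Ask the shell where it would find env instead of assuming /usr/bin/env.
env_path=$(command -v env)
echo "env resolves to: $env_path"
```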


I like the ability to have local variables though, which isn't in the POSIX standard. However, the Debian Policy Manual does state that any /bin/sh shell should be posix-compliant and support local variables (along with a couple of other minor extensions). This means I'm fairly comfortable using them despite their absence from the standard. http://www.debian.org/doc/debian-policy/ch-files.html#s-scri...
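A minimal sketch of what that buys you, assuming a shell that provides `local` (dash and bash do; it is not part of POSIX). The function and variable names are made up for the example.

```shell
# "local" is an extension, not POSIX; this assumes the shell supports it.
counter=outer

bump() {
    local counter    # shadows the global inside this function only
    counter=inner
    echo "inside: $counter"
}

bump
echo "outside: $counter"
```

The assignment inside bump does not leak out, so the last line still prints "outside: outer".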


Here's another (much cleaner) way of reading two variables:

    $ { read A; read B; } < <(echo "C"; echo "D")
    $ echo $A
    C
    $ echo $B
    D
See http://tldp.org/LDP/abs/html/process-sub.html for more information.


That only works in bash.
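For comparison, a sketch of the same two-variable read that stays within POSIX sh, feeding the command output through a here-document instead of process substitution. Note the `;` before the closing brace, which sh requires.

```shell
# The command substitution inside the here-document supplies the two lines.
{ read -r A; read -r B; } <<EOF
$(echo "C"; echo "D")
EOF

echo "$A"
echo "$B"
```

Because the brace group runs in the current shell, A and B are still set afterwards.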


Is there any modern serious OS that does not provide bash?


Yes, and /bin/sh is often not bash. A sufficiently hosed system may give you only /bin/sh until you can repair things, and knowing these tricks in those cases can be a real lifesaver.


A sufficiently hosed system may not even boot. Under normal circumstances, however, you have a full set of shells, depending on how... feature-rich is the OS you are using.


> Is there any modern serious OS that does not provide bash?

By default? Windows (heh). And most Unix servers don't come with bash by default. Though I think you can install it on most anything, it is not a good idea to change the default root shell on Unix servers.

That being said, OSX and Linux have Bash as the default shell. If that's all you deal with then go ahead and learn Bash (and the other GNU utilities) instead of POSIX sh. Unless you're an admin there's no point in being so pedantic.


> By default? Windows (heh)

This is neither serious nor modern, nor particularly operating :-P

> And most Unix servers don't come with bash by default

Most Unix systems are either Mac, Linux or BSD boxes, and all of them provide bash (it may be a tiny little bit more complicated under BSDs). I will risk saying the list of recent Unixes (as in "with versions published in the last 5 years") that don't provide bash is limited to 3 items.

Perhaps four. IRIX 6.5.30 was launched in late 2006.

edit: maybe five. Tru64 had a launch (patches) in 2010. Guess I should have said "major version"...


There is a keen difference between "modern serious OS" and "serious production OS".


In that case, here's a very generic method for reading a specific number of lines:

    $ mkfifo /tmp/pipename
    $ (echo "C"; echo "D") >/tmp/pipename &
    $ { read A; read B; } </tmp/pipename
    $ echo "$A"
    C
    $ echo "$B"
    D
    $ rm /tmp/pipename
Using named pipes does introduce the issue of cleaning them up afterwards, but if you're managing this yourself you can do that manually. I'm still investigating fun alternative solutions to this problem.
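One possible way to handle the cleanup automatically is an EXIT trap. This is a sketch under assumptions: it relies on mktemp, which is widely available but not specified by POSIX, and the variable names are made up.

```shell
# Create the FIFO inside a private temporary directory.
# mktemp is an assumption here: common, but not required by POSIX.
tmpdir=$(mktemp -d) || exit 1
fifo=$tmpdir/pipe

# Remove the directory (and the FIFO in it) whenever the script exits.
trap 'rm -rf "$tmpdir"' EXIT

mkfifo "$fifo" || exit 1

# Writer runs in the background; it blocks until the reader opens the FIFO.
(echo "C"; echo "D") >"$fifo" &
{ read -r A; read -r B; } <"$fifo"
wait

echo "$A"
echo "$B"
```

Putting the FIFO in its own directory also avoids a fixed, guessable name in /tmp.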



