Safe ways to do things in bash (github.com/anordal)
892 points by signa11 on May 14, 2018 | 240 comments



I've written a ridiculous amount of shell script in my day, especially when doing "devops" before we had a term like "devops" to describe it. I've fallen in almost every pit bash has. With that background, here is my opinion.

1. This article contains excellent advice and should be starred for later retrieval.

2. Having basic scripting skills will make you a way better programmer. Many times I've done huge refactors and needle-in-hay-stack searches using only shell commands.

3. Shell is the universal language.

4. Bash isn't that bad once you get used to it (seriously. I'll grant you tho that arrays are still nasty ;-) ).

5. Bash is not that dangerous if you follow best practices. Don't be lazy!

6. You will not regret getting really good at shell script. You'll have to take my word for it now because you don't know what you're missing.


Re 6: I am an /extremely/ mediocre developer, but I have crafted some real 99th-percentile bash skills (within my company, not globally) and I sustain myself completely on just that. (In case you're wondering how many people write bash where I am, the answer is somewhere between 8 and 15 thousand people.)

It's funny how many things seem kind of incredible for a pure bash solution, looking back over the past 21 years of doing this drivel. This year is the first in which I had to learn how to export all of the shell's variables and function definitions so that a forked clone could run asynchronous background processing alongside the main thread. Bash! Bash of all things. Probably 10 seconds in any other language, but here is bash kind of hopping alongside. Callbacks, reflective programming: there's always a really awkward way in bash to do what's happening in the popular languages.
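For the curious, the trick looks roughly like this (a sketch with a hypothetical worker function, not my actual code):

    # main shell: set up state, then hand it to a fresh bash for background work
    work_dir=/tmp/job
    process_queue() { echo "processing everything under $work_dir"; }

    export work_dir             # plain variables need a regular export
    export -f process_queue     # bash-specific: export the function definition too
    bash -c 'process_queue' &   # the forked clone picks both up from the environment
    wait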


> It's funny how many things that seem kind of incredible for a pure bash solution, ...

My go to example of “Bash can do whaaat?!” is the source for xip.io. A custom DNS server written in a handful of lines of Bash!

https://github.com/basecamp/xip-pdns/blob/master/bin/xip-pdn...


I have made a lot of DNS products, and doing the same thing that's happening here is a few lines of code in most any language. This code, while entertaining, cannot handle very simple DNS packets because of its "compression encoding".


This is ran behind PowerDNS, which handles the compression prior to handing off to this code, so that shouldn't be a problem.


Which means it’s not really doing any of the heavy lifting.


Which is also the case for most bash scripts: the heavy lifting is done by all the executables that are called from bash, not in native bash.


Nobody said it couldn't be done, just that it shouldn't. Fun stunt but ultimately suitable only for toys or as a gimmick.


I've developed on consumer grade routers from big vendors (Asus, Linksys, Netgear, etc.), and they all use scripts of much the same quality for a large majority of the router's data processing. These scripts may not be pristine, but they are certainly not always just toys.


Over 20 years ago, I was home on a sick day in the middle of refactoring a build system based on Bourne shell. We could not use any of the new features of BASH but had to stick to what worked as /bin/sh across a bunch of supercomputer vendors. I was finding a lot of arcane tricks to optimize execution on those systems, some of which had improbably expensive fork-exec.

In my fevered state, I wrote some arithmetic functions to do the equivalent of `expr $a + $b` and `expr $a \* $b` without any fork-exec to external commands, just to prove to myself that it could be done. I think I abused things like $IFS, for loops, recursive function calls, and case statements to make an arbitrary-precision decimal adder function.
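To give a flavor of what that looks like (a reconstructed sketch in the same spirit, not the original code; it uses POSIX ${var%?} expansion, which those old /bin/sh implementations may or may not have had), here is a fork-free increment function built from nothing but case statements and recursion:

    incr() {
        case "$1" in
            '') REPLY=1 ;;                           # carried past the leftmost digit
            *0) REPLY="${1%?}1" ;;  *1) REPLY="${1%?}2" ;;
            *2) REPLY="${1%?}3" ;;  *3) REPLY="${1%?}4" ;;
            *4) REPLY="${1%?}5" ;;  *5) REPLY="${1%?}6" ;;
            *6) REPLY="${1%?}7" ;;  *7) REPLY="${1%?}8" ;;
            *8) REPLY="${1%?}9" ;;
            *9) incr "${1%?}"; REPLY="${REPLY}0" ;;  # carry: increment the prefix
        esac
    }
    incr 1999; echo "$REPLY"   # prints 2000, with zero fork-execs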


I think it's time people started using something better than bash/zsh, which are decades old, like fish, or even came up with a more modern shell.

Even just looking at these examples, you can see there is less verbosity (no "then" and "do"), you can reference arguments as $argv instead of the cryptic $@, and the exit status as $status instead of $?, which is easy to confuse with $! and the like.
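Something like this, as a rough sketch of the same loop in both shells:

    # bash: needs do/done, arguments are "$@", last exit status is $?
    for f in "$@"; do
        grep -q error "$f"
        echo "grep returned $?"
    done

    # fish: no do/done, arguments are $argv, last exit status is $status
    for f in $argv
        grep -q error $f
        echo "grep returned $status"
    end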

https://blog.codeship.com/lets-talk-about-shell-scripting/

https://fishshell.com/docs/current/tutorial.html

Shell is such an integral part of admins' and programmers' workflows, yet I find it hard to believe this field has been so slow to improve. Even the fish site jokingly states "Finally, a command line shell for the 90s", implying the others are even older.


You might find the Oil Blog[0] interesting, there are a lot of interesting thoughts there about what a truly modern shell might look like.

[0]: https://www.oilshell.org/blog/


Great read, thanks.


I think a big issue is that bash is available everywhere, while fish might not be. There's also the fact that a lot of us have fancy dotfiles for our work/home computers, and switching to another shell would mean having to rewrite them in the target shell language.


So, you'd rather use something that is decades old because you can't/don't want to (ask to) install 1 new program on the server and take a weekend to rebuild your config file?

You can simply import aliases as is and in some cases, you may be able to even simplify parts of your config.

Personally, it was easy for me as I'm administering the servers, and it was just a matter of installing fish on every server (some dozens).


> So, you'd rather use something that is decades old because you can't/don't want to (ask to) install 1 new program on the server and take a weekend to rebuild your config file?

Absolutely, yes. Because long story short, bash is TriedAndTrue® technology. Stuff that works.

It takes a bit of dedication to be mastered at a decent level but it pays off immensely. It's so ubiquitous it is one of those tools that you can learn once, use for the rest of your life, and use in a lot of contexts.

It's so widespread that it can bring you very far with very little.

All this being said, as someone who used to write code for a living (and now works as a System Engineer, using the bash shell everyday) I must say that if you do not do input verification (according to the language of your choice) and something goes wrong then it's your fault.


"works"

More like "better the devil you know" (if you can even say that much).

Bash doesn't scale. Every shop larger than 1 has been burned by bash gotchas. Use a real scripting language and shell out to the commands and builtins when necessary.

At my shop bash is strictly disallowed in production environments and we're all better off for it.


> Bash doesn't scale.

Thanks mate, I had a good laugh.

I wouldn't expect bash to scale anyway. That's not what it's meant for. It's meant for system administration task automation.

On a more serious note ...

In many occasions, the performance you get depends on how you tackle the problem you have, though. Even using bash and the tools from the unix toolbox, sometimes you can gain significant improvements on how you manage your data.

Anecdotal: I cannot remember the details, but I remember that by rearranging the order of sorting, searching, and removing duplicates (sort, sort -u, grep, uniq mainly) I saw a significant speedup.
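The general idea, as a made-up example (hypothetical access.log, not the actual job): do the cheap filtering before the expensive sort, so the O(n log n) step sees far less data.

    # slower: the expensive sort sees every line
    sort -u access.log | grep 'ERROR'

    # usually much faster: throw away uninteresting lines before sorting
    grep 'ERROR' access.log | sort -u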

Anecdotal (2): I cut the execution time of a night-running job from hours to minutes (tens of minutes, to be honest - but still less than an hour) just by slicing the problem into smaller parts and handling each slice in parallel (the machine had 48 cpus, but the problem was being solved "sequentially" on one cpu alone). I wrote some 30-50 lines of python, just to implement parallelism control: the rest of the problem was still handled with bash script. Partial results were reassembled at the end. Bash has coprocesses, so I might have handled that in bash as well, but python was more handy at the time (meh, I just wanted to optimize that problem).
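For reference, a bash-only version of that slicing idea looks roughly like this (a sketch assuming GNU split and a hypothetical ./process-slice worker, not what I actually ran):

    split -n l/48 input.txt slice.          # 48 line-aligned chunks: slice.aa, slice.ab, ...
    for f in slice.??; do
        ./process-slice "$f" > "$f.out" &   # one background job per chunk
    done
    wait                                    # let all 48 finish
    cat slice.??.out > result.txt           # reassemble the partial results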

What I am trying to say is that sometimes the "scaling" you get is justified by the size of the problem, sometimes it's not.


It's worth noting that bash is really a glue language to call other programs. If you mainly use the tools from the unix toolbox (i'm thinking of grep, for example) you really get the "scaling" (the performance) of native executable code.

Again, it really depends on how you handle your data.

Having a number of filters chained via pipes is really efficient, for example, when compared with looping over an array and executing some python/perl/ruby one-liners every time.


I should have clarified. I'm not talking about micro-optimizations. Bash doesn't scale operationally. You might be a bash wizard who never fucks up, but you're never going to be able to keep non-wizards from having to use your awful bash codebase.

Bash is not at all meant for "system administration task automation", the very idea is ludicrous. It has to be the most singularly ill-purposed invention ever to be applied towards task automation.


> Use a real scripting language and shell out to the commands and builtins when necessary.

Not to mention, for example: https://julialang.org/blog/2012/03/shelling-out-sucks


It is not webscale!


Do you guys have a different, approved shell for prod?


Nobody should be using any shell in prod


It's not just me; it's convincing my team to switch to a new scripting language, especially if they've never heard of fish and have been using bash/sh their whole careers.


Availability is a big issue. It is nearly impossible to get a seasoned sysadmin to install fish for you when bash is available.


If you're convinced your productivity may change by ditching bash, I'm not sure what kind of counterargument the sysadmin will bring.


Avoiding the next Shellshock? As bad as it was, who knows what goodies an even less-audited shell has in store


Using bash never prevented the problem in the first place.



What's wrong with something that's decades old? Bash is great for the command line and small scripts. It's once you surpass 100 lines that things get problematic. At that point, you're not writing a shell script, you're writing a tiny application and need to treat it as such.


> What's wrong with something that's decades old?

In the case of shells, not benefitting from the extra decades of experience. There are some horrible things that you can do with Bash just by mistyping a single character or using the wrong type of quotes or failing to appreciate how something will expand in some edge case. This is not a desirable property for an environment that people use after being paged at 4am, while their employer is losing $XXX,000/minute because something critical is down, and where the half-asleep operator is one short command away from deleting the universe without so much as a confirmation prompt.
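To make that concrete, a classic hypothetical example of the single-character/quoting failure mode (echo added so nobody actually runs it):

    dir=""                          # imagine a typo or a failed assignment earlier on
    echo rm -rf $dir/*              # the unquoted expansion becomes: rm -rf /bin /boot /dev ...
    echo rm -rf "${dir:?not set}"/* # quoting plus ${var:?} aborts loudly instead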


I've started using Ammonite for scripting, which certainly feels more solid than bash.

https://github.com/lihaoyi/Ammonite

Many times previously I've started really quickly testing/prototyping an idea in bash, which eventually grew large and unwieldy. Ammonite appears like it might allow quick hacking, but also the option to easily transition your scripts into Scala proper while leveraging the good bits of Java.


Why not both?

> Bash has arrays and a safe mode, which may make it just about acceptable under safe coding practices, when used correctly.

> Fish is easier to use correctly, but lacks a safe mode.

> Prototyping in fish is therefore a good idea, provided that you know how to translate correctly from fish to bash.


Doesn’t the original article say fish isn’t safe?

> Fish is easier to use correctly, but lacks a safe mode. Prototyping in fish is therefore a good idea, provided that you know how to translate correctly from fish to bash.


Just make the start. When a modern shell becomes reliable and popular enough, it'll slowly replace the old ones.


Re 4: bash is an acceptably-good domain-specific language.

Which leads right into 5: that domain is unsafe-by-design and failure-tolerant. Which is phenomenally useful and efficient for one-off tasks, and phenomenally dangerous for something you allow others to inject behavior into.[1]

There are a fair number of things which bash could do better, and probably should. But many of the gotchas like "if you don't quote it, it'll expand into multiple arguments" are features when you want to throw things around efficiently. It wouldn't be as efficient for plugging so many disparate tools together if it weren't such an efficient footgun.

[1]: If you need anything even remotely fault-resistant, `set -euo pipefail` and use `shellcheck` (and the rest of this resource, it is indeed great) and then consider using a "normal" programming language instead. But for behavior you fully control it's not too bad if you're careful.
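The usual preamble for that, for reference (a sketch; opinions differ on the IFS line):

    #!/usr/bin/env bash
    set -euo pipefail   # exit on errors, unset variables, and failures inside pipes
    IFS=$'\n\t'         # optional "strict mode" extra: tame word splitting on spaces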


> 6. You will not regret getting really good at shell script. You'll have to take my word for it now because you don't know what you're missing.

Stories! C'mon, it was a long Monday :-)


Haha. Mostly it is dozens, probably hundreds of little things over time when turning to the shell proved highly productive and effective. The shell is surprisingly really, really good at processing text (using the good ol' unix tools), and it's amazing how often "text processing" type problems come up. Whether you are doing find/replace on source code, looking for where certain strings are used/defined, or curling some page and extracting data out of it, the shell can often deliver very quickly and effectively. The bonus feature is that people are often very impressed when you whip out the shell skills and nail something they thought would be difficult.

I also started capturing most tasks as bash functions or aliases in my `~/.bashrc` file. This doubles as both a handy reference for me to look at later to remember how to do something, and automating simple things.
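A couple of made-up examples of what those ~/.bashrc entries tend to look like:

    alias ll='ls -alF'
    serve() { python3 -m http.server "${1:-8000}"; }        # quick static file server in $PWD
    logsearch() { grep -rn --color=auto -- "$1" /var/log; } # so I never forget the grep flags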


That's really cool, thanks. It seems that whenever I come across an annoying text-processing problem, it's always really hard or too generic-sounding to get search engine help with potential bash solutions. However I'd somehow like to improve in that area.


Check out the sed and awk tutorials here: http://www.grymoire.com/Unix/Sed.html

sed in particular is really amazing in what it can do, very simply. (Note, depending on your system, you might want to get GNU sed, for "extended" regex support.)
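For instance, extended regexes give you grouping and alternation without the backslash soup (config.txt is a made-up example):

    sed -E 's/(password|secret)=[^ ]+/\1=REDACTED/g' config.txt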

Those pages do a great job breaking down the basic behavior and explaining pitfalls. They're skim-able, and pretty easily searchable after that, so you can come back and remind yourself of the thing you learned last week.


Thanks!


Back before DevOps was a thing, we sysadmins were building the same automation tools in shell scripts. Using rsync, git (cli) and such like instead of docker, GitHub plugins, concourse / AWS Code Deploy etc. And ssh keys on VMware / xen stacks instead of puppet / terraform etc.

As much as I think some of the new generation of tools are pretty awesome for getting stuff done, some of it honestly feels like a step backwards because you're relying on 3rd party services or requiring new server(s) to be provisioned to host your management servers; some of which require extra man power just to keep running. And as much as Bash gets a bad rep for hidden traps, I'd take that any day over Terraform's "I'm not really a programming language but I like to pretend to be" markup, HCL.

But that's the nature of the game now; it's less about hacking stuff together and more about using preformed Lego bricks to build the same approximation.


You need to fix your mentality. HCL not being a complete language is a feature. Bash is a liability. If you honestly think you can build more reliable modern systems in bash, you deserve to be laughed right back into the 70s where you belong.


Wow. It's a bit ironic to comment about fixing mentality while you're blatantly ignoring a rational argument with nothing.

I concur with the original statement. Newer domain-specific syntaxes and tools are increasing the dependencies and cognitive load to get things done. One would have to remember the yaml syntax and version-specific keywords for Kubernetes, Docker, Ansible and everything in the container world just to deploy a statically linked Go binary. That's where we've come to now. It is worth pondering whether we are increasingly moving towards a "give me the lego bricks, and I don't care to know what it specifically does" attitude, and whether that's good or bad.


I rarely bother responding to troll accounts but I do feel the need to point a couple of things out:

1/ HCL not being a complete language is a burden when you're writing build scripts that need to target multiple environments (eg dev, UAT, staging, live) as it means you have to use the `count` syntax to include/exclude resources. Also I've frequently run into issues where output will expect a variable or resource to exist even when it's inside a ternary operator where the condition isn't met (isn't the point of an `if` condition that you don't evaluate all blocks of code in the structure?).

2/ The reason HCL isn't a "complete language" isn't because "complete languages" are a liability (clearly that's nonsense); it's because Hashicorp wanted something that met the middle ground between JSON and scripting so that all ranges of technical expertise were comfortable; and they wanted a format that could still be compiled back down to JSON. Personally I feel the monster they created is the worst of both worlds, but I do think the goal they were trying to achieve is an honorable one.

3/ I was never trying to make the point that I've built more reliable systems in Bash. However once you start writing pretty complex Terraform projects (some of my build scripts are several thousand lines of code, inc modules) you really do start to bang your head with Terraform - not all of it because of HCL though. Don't get me wrong, Terraform is definitely the best tool we have at the moment for deploying to "the cloud"; but that doesn't mean it isn't still bloody annoying to work with at times. A few bugs I've run into in the last month:

3a/ Error message wasn't being passed from AWS to the user so infra would fail to build but without explanation. A few hours of debugging and we discovered what resource was failing and why. This was a bug in Terraform so issue raised on Github.

3b/ Workspace name pushed the resource name tag over its character limit. We did get an error message here but it was inaccurate. Thankfully this is an issue we've run into before so it was pretty quick to resolve.

3c/ The aforementioned problem of all code getting evaluated inside both conditions of a ternary operator

3d/ If your AWS token expires before the Terraform apply is complete you're left with incomplete infrastructure and no valid state file to rollback. This is particularly annoying if your builds do certificate management or ELK; both of which can take 15 to 20 minutes just on those resources alone.

3e/ You don't get line numbers nor even file names when errors are raised which makes debugging through large projects painful.

4/ I'm not about to sing the praises of Bash as a modern deployment tool as it can be a complete nightmare to work with if you're not proficient in its hidden traps. But to answer your question; yes I have built reliable systems in Bash. It's a tool that has been around for decades, people can and have built some pretty reliable stuff in it, and I personally consider myself a veteran. (For what it's worth I have nearly 3 decades of dev skills and have used a lot of other languages we might now consider "dangerous" - some of which are still used in production). As an aside; it's the decades of experience in Bash and other, much lower level, languages that inspired me to write my own $SHELL and scripting language. So I tend to use that more than Bash these days.

5/ I didn't proof read this so apologies if some parts didn't make any sense. It's a long post and in replying to you I'm interrupting fixing one of those massive Terraform projects I described earlier.


I probably have a similar history but I've come to the belief that you should not use Bash if you're doing anything fancy or long (>10 lines).

Just use Perl, Python or Ruby. One or all of them is installed on every machine you are likely to use.


Yes, but which one, and which version? This is especially painful with Python ;)


Use Ruby. It doesn't change much from version to version, thank God. I curse python every time I have to install a package. Nothing Just Works.

If you install just one gem, bundler, you can get dependency management in a one-file script. Check out Bundler Inline:

https://github.com/bundler/bundler/blob/master/lib/bundler/i...


What's painful about Python but not so for others? Writing anything complicated in bash/zsh is painful enough I wouldn't even think about it.


2 vs 3 is not a small change. and tools for helping you do things (like a 3+2 compatibility lib) get installed with pip... which is horrifyingly fragile. `pip install your-lib` -> you have no idea if you have the dependencies you require, because pip doesn't behave rationally. it often Just Works™, but when it doesn't there's no help and a massive minefield of potential problems.


I currently find some users have python2, some have python3, some both. It's fine in a controlled environment, or if your users are Devs, but a pain with normal users who I just want to send a script to.


KotlinScript is yet another option that handles that worry nicely.

https://github.com/holgerbrandl/kscript


Is there an easy way to run shell commands in Python, like backticks in PERL, Ruby or even PHP?

I'd really like to use Python more often, but most of the time it only complicates things with those verbose syscalls.


IPython. Seems to work for simple stuff like `ls blah`, but I'm not aware how far this stretches, and is probably not super safe to use.

EDIT: Seems I've only tried too basic things - it works thanks to %automagic, and seemingly %man is a thing. Other option is to use %%sh or %%bash, but that's a bit verbose.


Check out python sh: https://github.com/amoffat/sh


>sh.ls("-l", "/tmp", color="never")

Nope. I really need to be able to test commands on the terminal and then copy/paste them without any modifications. I won't even try to get used to that stupid syntax.


Unfortunately not. I have used Ruby a little bit and that is a really nice feature.


Part of the complication is due to actually having the proper data separation that you have to jump through hoops to get in bash. If you go with backtick-style calls you lose a lot of that.



Except that perl, python and ruby are rarely on the container images and AMIs I interact with unless they are needed for the application itself, to minimize attack surfaces and vectors. Perl is installed sometimes due to dependent packages that themselves depend on perl, but that's becoming much less true as time goes on. With Docker it's also very easy to keep your image super minimal. Most of the time we don't even have bash installed (just sh).


I did a talk on this a while ago:

https://www.youtube.com/watch?v=pb3k0sGKrjQ&t=457s

'Take bash seriously'


It's time we got a better shell than bash, instead of deciding it's great after bleeding with it for years and getting used to it.


We have just had the 30th anniversary of someone acting upon that very thought.

> Perl is a interpreted language optimized for scanning arbitrary text files, extracting information from those text files, and printing reports based on that information. It's also a good language for many system management tasks. The language is intended to be practical (easy to use, efficient, complete) rather than beautiful (tiny, elegant, minimal). It combines (in the author's opinion, anyway) some of the best features of C, sed, awk, and sh, so people familiar with those languages should have little difficulty with it. (Language historians will also note some vestiges of csh, Pascal, and even BASIC-PLUS.) Expression syntax corresponds quite closely to C expression syntax. If you have a problem that would ordinarily use sed or awk or sh, but it exceeds their capabilities or must run a little faster, and you don't want to write the silly thing in C, then perl may be for you. There are also translators to turn your sed and awk scripts into perl scripts.

-- Larry Wall, perl 1.0 release announcement, comp.sources.unix v13i001, 1988-01-02.


I’d vote for PowerShell in a heartbeat. It passes real objects instead of strings. It’s cross platform and open source. It’s imperative but borrows some functional concepts.


PowerShell is good. I don't like some of the hardened conventions, but it's highly convenient without being a mess. Nice output, regular sort/filter functions. I like sed but it gets old quickly.


I really wonder if PowerShell wouldn't have been much bigger had it been cross-platform from the start. Definitely better late than never (:clap: Microsoft) but at least in my sphere we never even thought seriously about PowerShell because we had too many linux machines in our environment.


As I say in the talk (and get a big laugh) 'Powershell is boring. It's too well-designed.'


The title 'bash is awesome' was explicitly _not_ what I wanted it to be called. If you watch the talk you'll see what I meant.


Can't edit the original, but to answer the question regarding good sources for learning, I highly recommend The Linux Command Line. Great book: http://linuxcommand.org/tlcl.php


What is your opinion on using other languages to augment bash? Awk, perl etc.


I personally consider awk, sed, grep, cut, etc. to be "part" of bash. I don't think I ever write a script without invoking those venerable tools at least once. I often have many pipes, such as

  some_command \
    | grep -E 'some.*regex$' \
    | sed -e 's/erase_text//g' \
    | sed -e 's/erase_more//g' \
    | awk '{ print $2 }' \
    | cut -d ':' -f 1
I used to reach for Perl one-liners a lot, and still do sometimes when I need a gross regex that `sed -E` can't handle (See Perl Pie[1]), but Ruby has been making an appearance more and more often for handy one-liners. Roblog has a great post on using Ruby [2].

If the script is complex enough and I'm willing to take a dependency on Ruby, that is usually where I turn. Where that line is does change from time to time. Ruby makes it super trivial to call shell commands and process them easily as well, which makes it easy to integrate into existing scripting [3].

Hopefully that answered your question :-)

[1] http://technosophos.com/2009/05/21/perl-pie-if-you-only-lear...

[2] https://robm.me.uk/ruby/2013/11/20/ruby-enp.html

[3] https://stackoverflow.com/a/2400/2062384


You might already be aware of this, but you can chain multiple sed functions together into one command. eg

    sed -e 's/erase_text//g; s/erase_more//g'
But I appreciate your example there is more for illustrative purposes rather than an actual pipeline you have in production.


These types of pipelines are why bash and the traditional tools are so maligned. Expert knowledge of sed|(g)awk|bash tools is necessary otherwise you end up with companies telling you that you can't use them.

some_command | gawk '/some.*regex$/ { gsub(/erase_text/, ""); gsub(/erase_more/, ""); split($2, a, ":"); print (length(a[1]) ? a[1] : "STRING ERROR") }'


I'll definitely agree that it can get out of hand. If it's a script/tool that will have many eyes on it and people that need to understand it, bash tools probably aren't the best way to go. I usually turn to ruby in those cases. But even for one off commands, I find myself using pipes like that all the time and I'm the only one that will ever see it, so if it's greek it's no problem :-)


> sed -e 's/erase_text//g' \

I like the pattern. Copied it to my notes to try next time I am grepping through logs.

I usually just string a few 'grep -v ignore_this' with pipes but then it ignores the whole line. But I can see how erasing some parts of the line would be very helpful sometimes.


Drive-by code review: you can end a line with the pipe (also && and ||) to avoid a backslash.

    some_command |
      grep -E 'some.*regex$' |
      sed -e 's/erase_text//g' |
      sed -e 's/erase_more//g' |
      awk '{ print $2 }' |
      cut -d ':' -f 1


>> I'll grant you tho that arrays are still nasty ;-)

If you're interested, here's an article I wrote recently that attempts to explain the method in the madness: https://medium.com/p/the-weird-wondrous-world-of-bash-arrays...


Highly recommend shellcheck[1]. There is a SublimeLinter plugin[2] that automatically checks your shell scripts as you code them. It generally makes best practice suggestions including quoting.

[1] https://github.com/koalaman/shellcheck

[2] https://github.com/SublimeLinter/SublimeLinter-shellcheck


Not only does it make suggestions, almost all of the 'error codes' have extensive documentation on why something is wrong and often multiple alternative solutions for each use case (eg: https://github.com/koalaman/shellcheck/wiki/SC2086). I learned more bash from Shellcheck than all tutorials and references combined.


Same. I was a bit cocky when I first tried out shellcheck because I had been doing bash for years. Shellcheck flagged something I'd been using for a while, and after reading the docs I realized shellcheck's suggestion was a much cleaner and just-as-safe way to do what I was doing. Really impressive piece of software.


Seconding this. Shellcheck is fucking amazing and whenever I visit the wiki page for a specific issue it's extremely descriptive and helps you understand why the flagged behavior is a problem.

Some of the tips are obscure stuff that I never would have realized, like using `command -v` instead of `which` in my (arch) linux install script because command -v is more portable.

And then you have the classic stuff like printf instead of echo -e, using printf string formatting, writing `\\n` instead of `\n`, double quoting variables, etc.
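For instance, a made-up snippet of the kind of thing it nudges you toward:

    if command -v pacman >/dev/null 2>&1; then   # instead of: which pacman
        printf '%s\n' "pacman found"             # instead of: echo -e "pacman found"
    fi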


I'm all for "command -v", but do you know which OSes don't have "which" by default?



From the article:

> Should I use curly braces?

    Bad: some_command $arg1 $arg2 $arg3
    Extra bad (cargo culting unnecessary braces): some_command ${arg1} ${arg2} ${arg3}
    Correct: some_command "${arg1}" "${arg2}" "${arg3}"
    Better: some_command "$arg1" "$arg2" "$arg3"
> In the "extra bad" and "correct" examples, braces compete with quotes under the limits of tolerable verbosity.

> Shellharden will rewrite all these variants into the "better" form.

I prefer the "${bracey}" form for all variable usage. Yes it's marginally more verbose but it has the advantage of being consistent, easier on the eyes due to less overall quoting when part of full string interpolation[1], and cleaner diffs[2] as converting "${foo}" to "${foo}-bar" only leads to word-diff of "-bar".

[1]: "${foo} bar baz" v.s. "$foo"" bar baz"

[2]: You are source controlling your shell scripts right?


> You are source controlling your shell scripts right?

My personal heuristic for shell scripts is that if I care enough about them to put them under source control, they shouldn't be a shell script.


I have seen entirely too many Python scripts that are poor reimplementations of shell scripts that don't get the edge cases right, have more security vulnerabilities, etc. than a shell script a fifth of the size. There are things you simply can't do in other languages without excessive verbosity. (One that came up at my workplace recently is using <(...) to load a password from a secret store without writing it to disk or putting it in the environment.)
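For the password case, it looks roughly like this (get-secret and deploy-tool are hypothetical stand-ins):

    # the secret arrives as a file descriptor (/dev/fd/63 or similar); it never
    # hits the disk, the environment, or the command line of deploy-tool
    deploy-tool --password-file <(get-secret deploy-password)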

Use the right tool for the job, and then put it in version control (and add tests, too, by the way). Shell is the right language for tasks that primarily involve running lots of subprocesses, whether they are simple or complex. If a small part of your task needs functionality that can't be done well in shell, fortunately, shell is very good at running subprocesses, and it's a perfectly reasonable approach to do something like this:

    foo () {
        python3 -c 'import sys, foo; print(foo.bar(sys.argv[1:]))' "$@"
    }

    a="$(foo "$baz" "$quux")"
I regularly do this with the requests and json modules in particular, because being an HTTP client or a JSON parser is not a thing shell is good at. (For the specific problem of manipulating JSON, jq is another fine option if you have it installed.)


Beware that "python -c" prepends empty string (i.e., cwd) to sys.path, which is most likely not what you want in the context of a shell script.


> without excessive verbosity

If no-one (including you, unless you dream in bash) is ever going to modify or extend the script, OK. Otherwise python's readability/maintainability trumps excessive verbosity every time.


By "excessive verbosity" I do mean poor readability and poor maintainability. If you're invoking 5 commands and most of your Python code is wiring up the commands to each other right, just write 5 lines of shell, don't make me pull up the subprocess docs to see if your 30 lines of Python are doing the same thing and how to make a small change without risking pipes deadlocking.

Python is readable and maintainable when it's doing the things Python is suited towards doing. Running lots of external processes is not one of those. This is not a complaint or an insult to Python as a language (which I use regularly!), it is just a statement that different languages have different strengths and you should use the right tool for the job.

If you know of a good Python library that handles things like shell pipelines, <(...), and automatic creation of process groups (so that signal handling does the right thing), I'd be extremely interested, because I would like to use Python for these use cases. But it's currently the wrong thing for maintainable code for this one use case, and there is a very good language that handles this use case very well and is extremely stable and widely deployed.


I'd recommend checking out Plumbum (https://plumbum.readthedocs.io) -- at the very least, it has a solid base for easily setting up pipelines, input/output redirection, and signal handling.


Wow, thanks! I'll have to check this out.


Does the Plumbum library grant your wishes? I've not used it myself so I don't know, but its purpose is to make you "never write shell scripts again", so it might.


My personal experience would disagree, and I feel that is throwing caution to the wind.

As any software person knows, you really can't tell what's going to happen to the code/scripts you put out; not committing them to source control is a dangerous game to play.

If you have a simple shell script sitting on a server doing some basic task, why would you not have it under source control where it can be viewed by future teams and seeing what changes have been made to it over time? Seemingly simple problems can be caused by minor changes which are very visible if its under source control.

Just because its simple doesn't make it any less important. Complexity is not a good measure of its importance.

Especially when you start trying to implement IAC in legacy areas...


At that point, you care enough to put it under source control, and it shouldn't be a shell script. That's the entire point of the comment you replied to


And that's a stupid heuristic. Simple bash scripts can be critical to a processing pipeline while still being the best tool for the job.


It's that arbitrary distinction that I am responding to. I don't agree that you should reserve source control for the complex.

A shell script can be the appropriate tool for many important tasks.


I agree that simpler scripts deserve to be in source control too. I read it to mean what I also would say, myself: if it's anything but the most trivial of scripts (so, rather _programs_), they shouldn't be in BASH. Some people in this discussion are clearly very well versed in BASH. Great. For average developers, though, it's hard to build (good) programs in BASH, and the rabbit hole swallows them. Every time.


I completely agree, source control is not used to track complexity. It's to track changes and the reasons for those changes... however small they may be.


Straight from teenage wisdom on freenode ##linux circa 2003


Not sure I agree with this. We have lots of scripts that are shell scripts and are important enough to be in source control. I don't view shell scripts as solely throwaway.


My sequence is wiki steps, script, shebang (better scripting language), application. Every step is under version control of some form or other.

The shell script is just locking down some sequential command line interaction into something less error prone.


Do Makefiles fall under that category?


Chiming in for myself, no. Makefiles are more about dependency graphs and reproducible pipelines. I use them to make more sophisticated things than bash scripts, which are purely imperative.

I very much dislike cmake, and I've not yet been impressed by any of the other many Make replacement candidates.

How else would you accomplish what a Makefile would?


I think for C compilation, Tup is probably better than make. It also has a couple of small syntax improvements compared to make. In particular, it's much better with commands that loop over targets and dependencies.

However, as someone who doesn't write much C, I went back to make. Makefiles seem to adapt really well to different workflows.

A lot of my makefiles look like "download this publically available data, scp this secret data off a company server, mungle all the data with some scripts, shallow clone this branch of that repository, run the program in the repository a few times under different conditions, make a chart of the results".

Tup really doesn't like steps like "clone this repository". Make isn't exactly great with keeping track of targets that output entire directories, but it does work.

I also feel like, for me, make has become more powerful over time as I've learned other tools. Specifically: being able to just magic up any old environment using Nix and be confident that it won't break, mashing data with things like jq, cut, sed and awk, and quick database operations with SQLite.


I haven't used them enough to have an opinion because of questions like these, which I need time to answer for myself.


Honest question: what happens when the variable has a quote in it? If FOO is ‹xyz"; rm -r *; "xyz› (where I've used ‹› as delimiters) then won't even "${FOO}" expand to multiple arguments/commands? Or does bash automatically escape the quotes in the situation?


Quotes that arise from substitutions aren't considered to be quotes.

Unfortunately, the way GNU Bash handles this sort of requirement is to internally translate these protected quotes into some character that "nobody" would ever use, and then recover them later.

That character's code is none other than ASCII 1 (SOH/Ctrl-A). It's known by the preprocessor symbol CTLESC in the Bash sources.


jesus christ is nothing sacred


Both of those examples end up being fine. When quoted the entire variable is passed as a single WORD and the shell won't interpret the `;` as the end of the command. Even without quoting the variable it would be fine in this case because of where you've put the quotes, though you should still quote the variables. As a simple illustration take a look at the following:

    function e() {
        echo $#
    }
    $ x="hello world"
    $ e $x
    2
    $ e "$x"
    1


I agree. Consistency wins over typing 2 extra characters. It's much easier (for me at least) to scan shell scripts and notice the variables when they are written consistently.. and that means using braces.


When I saw that recommendation it made me sad. I love seeing my variable names with a quick glance. Came to the comments mostly to see if anyone had the same preference.


Can't help but think of a recent article on Medium about "The irrational love for curly braces" (in the context of web frontend architecture and markup languages, but still).

[1]: https://medium.com/@fagnerbrack/front-end-separation-and-the...


> [1]: "${foo} bar baz" v.s. "$foo"" bar baz"

Why would you quote the second option like that? You can just write: "$foo bar baz"


I don't think your example backs up your point about it being easier on the eyes. This would also work:

    "$foo bar baz"
Since there's a space after foo, the braces don't actually do anything and thus _can_ be eliminated. Your other points make a decent argument for _not_ eliminating them even though you can, but removing extra characters makes it much easier to read.


I always think — when a programming/scripting language requires this much bizarre knowledge just to write basic code that performs basic tasks, perhaps it is time for that language to be retired.

I really don't understand why bash still exists. I've switched to fish and am much happier with the change.


I agree with you 100%

It basically comes down to tradition, ubiquitousness, and stubbornness. Which is sad to me. We fix other stuff. If an exploit appears in the OS it gets patched. Yet here we have a system that basically defaults to bad, insecure behavior. It should be redesigned/replaced with something that makes it hard to do it wrong. If not replaced, then maybe add a "safe" mode and slowly deprecate the non-safe way. Sure it may take years. So did moving from python 2 to 3, but it is happening.

Maybe it will take the powers-that-be starting to have their data stolen through shell exploits to realize that if they made the shell hard to get wrong by default they'd be helping themselves as well?


Because it is ubiquitous. You can virtually guarantee that bash will be found on any arbitrary unix-like system.


That makes it a good idea for a compiler target for a better programming language. Not using it, though. ;)


Someone thought it was a good idea to use a portable subset of the shell as a target for a worse programming language: layers of M4 macros.

They didn't back away with anything like "not using it though".

So now we have GNU Autotools.


At least the person behind that tragedy apologized in the Generation Lost in the Bazaar article's comment section.

https://queue.acm.org/detail.cfm?id=2349257

That one person did something damaging doesn't refute the point, though. Just avoid another M4 situation.


I keep seeing this statement but this absolutely does not apply to everybody.

How often can you just install fish or use other programming language instead of being forced to use bash?

It's sad that people keep using the default just because they're afraid the next system they touch might not have it, wasting a tremendous amount of productivity by not using something better.

I know a guy who used vim with the default config for that reason. Utter nonsense.


No one ever thinks of interoperability between people. Bash is known, to a variable depth, by pretty much every sysadmin (and a big part of being a sysadmin is being able to write shell scripts).

So I might prefer fish, and whoever came before me might have had a preference for zsh. So now I have to deal with three different shells: bash (default), fish (for the scripts I'll be writing from now on) and zsh (because of compatibility). Congrats.

> I know a guy who used vim with default config for that reason. Utterly nonsense.

I started using less and less emacs/vi(m) customization and learning more and more of the defaults for the same reason: whenever I log on to a client system I am instantly proficient with the editor, without stupid complaints like "but on my box it's different...".

Anecdotal: I have seen people lose their editing speed/proficiency because all of a sudden they were in a clean vim session and had none of their shiny and colorful plugins.


Why can you not modify the editor environment that you use often?

It's hard to see why I'd have to deal with environments where I can do nothing but keep the defaults every time I use them.

Once you've got things tuned to your liking for speed, the defaults feel like walking with your legs strapped together.


> Why can you not modify the editor environment that you use often?

Because I am not the only one accessing those environments (please note the plural here).

Currently there are about 13 other sysadmins in my team and we manage clients' infrastructures among other things (managed services). From time to time someone from another team accesses those environments (not a sysadmin, but still familiar with the bash shell). Sometimes the client accesses those systems (rare; we try to discourage and avoid that).

Can you even imagine what a mess it would be if we all started applying our own favorite settings?

(edit: fix grammar)


How hard is it to create a user for each of the system admins?

It seems it's not a good practice to share a single account as it makes it hard to tell who did what.


Don't make that assumption. I worked on older Solaris installations; they did have bash but the production scripts were written in ksh 88. I had to dig up obscure features of this shell because they hadn't even upgraded to the newer version, ksh 93 (yes, those numbers come from their release dates... 1988 and 1993).


Same reason I always cook my eggs using a shoe.


Because a pair of shoes is always standing ready for cooking eggs in, in 99.9% of all homes?


As a CS history lesson I think bash is an extremely good fit. The UNIX shell started off at some point, and in the meantime it became possible to use newlines and spaces and unicode, and having that tool still prove its usefulness after all these years is not only incredible, I'd even go as far as to say that learning bash makes you more aware of general weirdness whenever you attempt to use similar patterns in your own programming. CSV needs quotes, SQL needs quotes. Streaming data and delimiting it correctly isn't just a bash "problem domain". It's pretty much universal, really.

For a recent example of a related problem: https://bugs.chromium.org/p/chromium/issues/detail?id=533361


The shell is acceptable for configuring the build of a better language. The excuse there hinges around the argument that we don't want to require the user to already have an executable version of that language installed in order to build that language from sources. And we also don't want to require some competing better language X to build our better language Y, because that doesn't look good.


Absolutely agree. As the fish site (jokingly) states, it was the shell for the 90s; perhaps it's time we wanted one for the '20s next.


> POSIX mandates /bin/sh

Nope. On the contrary, it says: Applications should note that the standard PATH to the shell cannot be assumed to be either /bin/sh or /usr/bin/sh

Source: http://pubs.opengroup.org/onlinepubs/009695399/utilities/sh....

A pedantically-compliant shell script should not have shebang at all.


> it says: Applications should note that the standard PATH to the shell cannot be assumed to be either /bin/sh or /usr/bin/sh

It also recommends a script: "Installation time script to install correct POSIX shell pathname".

But I wonder how they execute this script. So this is all crap. They should remove that and instead put in something like "you have to test your script on the platform where you deploy.".

The article is missing the one big problem I encountered the most: /bin/sh instead of /bin/bash, or the wrong version of sh/bash.


> But I wonder how they execute this script.

As I mentioned earlier, the trick is to have no shebang.

If the first line of a file of shell commands starts with the characters "#!", the results are unspecified.

[...]

If the execve() function fails due to an error equivalent to the [ENOEXEC] error [...], the shell shall execute a command equivalent to having a shell invoked with the pathname resulting from the search as its first operand, with any remaining arguments passed to the new shell, except that the value of "$0" in the new shell may be set to the command name.

Source: http://pubs.opengroup.org/onlinepubs/009695399/utilities/xcu...


What's wrong with `#!/usr/bin/env sh`?


I've worked on systems where env was installed as /bin/env and not as /usr/bin/env. (I think it was SunOS 4.)

For that matter, under Termux on Android it's /data/data/com.termux/files/usr/bin/env (but termux has a hack to make normal shebangs work).


https://www.in-ulm.de/~mascheck/various/shebang/#env mentions other OSes that have only /bin/env, although admittedly they are all quite old.


Termux rewrites scripts via termux-fix-shebang.


It now also does rewriting on the fly for certain shebang lines, as part of its C library, so unmodified scripts can run.


I didn't know that, thanks.


No problem. I wrote it because I version control my scripts and didn't want to fork them for running them on my laptop (Chromebook with Termux).


"/usr/bin/env" is relying on a binary being present in a specific location just as much as relying on a binary being present in "/bin/bash". For most distros and BSDs, "/usr/bin/env" is more likely to be present, but it's not guaranteed for neither "/bin/bash" nor "/usr/bin/env".


POSIX doesn't include shebang (#!) at all. It should, but it currently doesn't.


It doesn't include /usr/bin/ either. The "env" utility could be somewhere else.


> currently doesn’t

Yeah, we should be gentle with POSIX, it’s only 30 years young :)


Newest revision is from 2017, so they certainly had time to fix that, if they consider it a problem.


Try that where env ain't there.

Android, motherfuckers.


Distros that place symlinks to bash in all those locations solve this, and allow all POSIX-compliant shebangs for sh and bash. On Arch Linux, /bin/bash, /bin/sh and /usr/bin/sh all point to /usr/bin/bash.


If you've reached a point where you need to require bash and not a POSIX shell, and need to enforce these rules, just use python or lua if you can, or some scheme or whatever else... it's not worth the time wasted hunting bash cruft.

If you are on busybox with ash, none of this helps (except shellcheck, which is great).


Sometimes you just want some 30-50 lines of piping a few commands and a couple of conditionals.

Python (the language with the most community traction to replace bash for scripts) is a royal pain to use for this without libraries that are present in no default system, and even with those it often ends up being more verbose than it should.


> Sometimes you just want some 30-50 lines of piping a few commands and a couple of conditionals.

Maybe I was not clear - nothing against shell scripting, but doing some weird dancing like in the article is imho useless, and depending on bash is a stupid idea imho.

POSIX sh + shellcheck is all you need. If you can't solve your problem in POSIX sh, rethink your code/approach and simplify until it works.


I agree.

I've used this page successfully as reference for portable syntax:

http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3...

Some sections on features I use heavily:

- Parameter Expansion (specifically :-, %, %%, #, ##)

- Special Parameters (specifically "$@", $#, $?)

- set --, this lets you set the $1, $2, etc. variables. I use this with "$@" for arrays, primarily for building command strings.

Here's an small shell script demonstrating some of them:

    #!/bin/sh
    
    # err function
    err() { echo "$1" >&2; exit 1; }
    
    # init variables
    unset src
    unset dst
    dry_run=false
    
    # get arguments
    while [ $# -gt 0 ]; do
        case "$1" in
            -d|--dry-run) dry_run=true ;;
            --) shift; break ;;
            -*) err "unknown option: $1" ;;
            *)
                if [ -z "$src" ]; then src="$1"
                elif [ -z "$dst" ]; then dst="$1"
                else err "unexpected argument: $1"
                fi
            ;;
        esac
        shift
    done
    
    # sanity checks
    ## TODO: print usage
    if [ -z "$src" ]; then err "source not specified"; fi
    if [ ! -d "$src" ]; then err "source does not exist"; fi
    if [ ! -r "$src" ]; then err "cannot read source directory"; fi
    
    if [ -z "$dst" ]; then err "destination not specified"; fi
    
    if ! rsync --version >/dev/null 2>&1; then
        err "missing rsync(1)"
    fi
    
    # build rsync command
    set -- rsync -aq "$src" "$dst"
    
    # log command
    echo "copying $src to $dst"
    echo "    $@"
    if ! $dry_run; then
        if "$@"; then
            echo "success"
        else
            # rsync will have printed an error message
            err "rsync exited with error code $?"
        fi
    else
        echo "dry run; not executing"
    fi

Also be sure to read the man page for test (the [ command).


For typical system scripts, ansible is much better than bash. It has all the commands that are missing from Unix and it's not subject to the bash problems mentioned here.

I agree that python is a pain. It's too low level for any task.


I still use bash a lot though; it's just handy. With lua and python you will have to deal with their various packages sooner or later; with bash it is just one binary and you get 90% of scripting tasks covered, zero dependencies.


If you need arrays, use mksh - https://www.mirbsd.org/mksh.htm - if you need complex scripting with a small footprint and standards compliance, avoid bash and use mksh.


You can count on bash being present on almost any Linux system, other than tiny embedded ones. You can't count on mksh being anywhere unless you install it.

Doesn't matter if it's better or worse, it isn't omnipresent.


Actually I do lots of embedded work; busybox is about 2MB and bash is 1MB, so unless resources are extremely restricted I just install bash for scripting purposes.

python is about 4.5MB and Lua is about 200K, both run on top of a shell anyways, and need some packages to be fully useful.

To be fair adding 200KB Lua (no luarocks etc) on top of a shell is useful sometimes on embedded system, but I rarely need that.


In addition to bash itself being widespread, most of the common "bashisms" are also supported by zsh and most variants of ksh (and many systems actually use some variant of ksh disguised as /bin/sh). This is unsurprising given that bash itself was originally meant to be a clone of ksh. The places where they differ tend to cluster around uncommon/newer features (like coprocesses), certain syntactic quirks (but if you're aware of them they're usually easy to work around), and interactive features (which are mostly irrelevant for scripts). If you're writing simple scripts (or you're writing complex ones in a careful way), you can reasonably "target" bash and expect them to work on other common shells.

If you're going to be distributing shell scripts, it's probably a good idea to test them on a handful of common shells anyway. If portability between shells isn't something you're worried about, it's exactly as reasonable to ask users to install bash as it is to ask them to install mksh or zsh (or even python or lua or whatever) anyway. If you're most comfortable with bash, just use bash!

But for the love of god and all that is holy, please don't write csh scripts ;)


Bash is available on almost every system. The portability is desirable sometimes.


How many people need that portability? Most shell scripts are quick work that needs to run in a small number of environments.


Interesting stuff, but given bash's

- Relative unportability

- Poor noncompliant sh interpreter

- Poor performance (can be 4x slower than POSIX sh shells like dash for certain tasks)

I personally think bash is always the wrong choice. Use POSIX sh (or a real language). POSIX sh is 98% the same thing, there's no good reason to even use bash over sh in the vast majority of cases. It's just this blight that won't go away.

Given that bash is mostly sh anyway, most of this writeup applies to sh, too. AFAIK the only thing bash-specific here is arrays.


I stick with bash to avoid implementation differences due to POSIX ambiguity [0]. I know that all my bash scripts will run on at least bash-4.0 but I have no idea what shell /bin/sh is going to be for any given system. I haven't written a single shell script where the performance difference matters, nor do I expect to. And there are a few nice bashisms besides arrays. I'm not sure what portability issues you're referring to.

[0] https://stackoverflow.com/a/16376043


[[ alone is reason enough to prefer bash over sh on teams where you inevitably have non-experts making changes. Safe Bash is hard to teach, posix is far harder.
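A small example of why (hypothetical filename):

    f="my file.txt"
    [ -e $f ] && echo found     # breaks: word-splits into [ -e my file.txt ], "too many arguments"
    [[ -e $f ]] && echo found   # fine: no word splitting or globbing inside [[ ]]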


I agree with you, but there's one fly in the ointment. POSIX Shell does not support local variables! Writing functions without being able to have local variables? That's pretty damn gross.

So I use the `local` keyword, and now my scripts aren't strictly POSIX Shell any more.
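For illustration, this is what's being traded away; `local` works in practice in dash, busybox ash, and bash, but it isn't in the POSIX spec:

    name="outer"
    greet() {
        local name="$1"    # without `local`, this assignment would clobber the caller's $name
        echo "Hello, $name"
    }
    greet world
    echo "$name"           # still prints "outer"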


A little heads-up: "$var" does what you think, but "$(cmd)" likely does not do what you think:

- The former just gives you a string whose contents are identical to that of var.

- The latter would do the same for the output of cmd, except that it strips away any trailing newlines. This is often not an issue, but can be crucially important in some cases, and can catch you off-guard.

The point I'm making here is that it's actually quite difficult to get a string that literally has the contents you want. The fact that *nix lets you put pretty much any characters in file names (even newlines) means that, just like in Windows, your scripts can actually fail even when you try to quote things "properly".
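A small demonstration of the stripping (assumes bash for the $'...' quoting), which also bites when e.g. a file name itself ends in newlines:

    s=$'line\n\n'
    echo "${#s}"              # 6: "line" plus two newlines
    t="$(printf '%s' "$s")"
    echo "${#t}"              # 4: the command substitution removed the trailing newlines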


> The point I'm making here is that it's actually quite difficult to get a string that literally has the contents you want.

It's not difficult, it's just tedious.

  foo=$(whatever; printf .)
  foo=${foo%.}


That's not general POSIX, right? I seem to recall it's Bash-specific? (P.S. I think you forgot quotes?)

The other problem (which I guess I accidentally brushed under the rug when I singled out "variables") is that having to do this actually means you need to put it in a variable. If you're nesting subshells, this gets pretty darn tedious, easy to forget about, and difficult to read pretty quickly... it seems you wouldn't consider that "difficulty" and think of it as "just tediousness", but I think if something is too easy to do incorrectly and too tedious to get right, that's also a kind of added difficulty.


> That's not general POSIX, right? I seem to recall it's Bash-specific? (P.S. I think you forgot quotes?)

Wrong. (And no, I did not.)

POSIX, or rather SUS (I've never had access to POSIX), mandates the ${foo%...} syntax and its three cousins. And assignment is not subject to word splitting for variable expansion.
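For reference, the four expansions in question (% and %% strip from the end, # and ## from the start; a single character means shortest match, doubled means longest):

    file=archive.tar.gz
    echo "${file%.*}"     # archive.tar
    echo "${file%%.*}"    # archive
    echo "${file#*.}"     # tar.gz
    echo "${file##*.}"    # gz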



Whoa interesting, thanks! Didn't know that.


Shellcheck made me a MUCH better Bash developer.

Also, I prefer using [ condition ] for tests instead of the less-portable [[ cond ]] syntax despite the latter being more feature-rich. Didn’t see that one in there.


Why do you prefer the former?


Portability


If you get to the point when you need to write

    IFS=$'\v' read -d '' -ra a < <(printf '%s\v' "$s")
it is a good sign that it's time to switch to some other language. For example in python, it will be just

    a = s.split("\v")
(and yes, sometimes you have a legacy system, or are writing an initrd script. But how often does that happen? And why does your initrd have full bash anyway, as opposed to dash or sh?)


Most of this is fine, but not all. In particular, command failure conditions are a huge source of bugs, but `set -euo pipefail` is not going to solve all your problems. `set -e` is just as likely to cause problems because it can cause scripts to silently fail. And pipefail can pass through errors that aren't relevant.

These are great tools to have, but don't blindly invoke them as magic spells if you don't actually understand the implications.
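One example of pipefail surfacing an error that isn't relevant: when the reading end of a pipe exits early, the writer can die of SIGPIPE and the whole pipeline is reported as failed even though nothing actually went wrong (a sketch; the exact status can vary):

    set -o pipefail
    yes | head -n 1      # head exits after one line; yes is then killed by SIGPIPE
    echo "status: $?"    # typically 141 (128 + SIGPIPE), not 0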


> Gotcha: Errexit is ignored depending on caller context

It proves the point that it's a gotcha but those examples seemed sensible to me. As far as I understand it, `set -e` doesn't turn every unchecked, non-zero exit code into an exception because there's no way of knowing whether the function or sub shell you're invoking was written by you or pulled in elsewhere, and as a result you don't know if a non-zero exit code is a legitimate, show-stopping failure.

Those functions and subshells might as well be mini inline executables and in that context it makes sense to only check the final output. If that's horribly wrong and confusing, maybe there should be a move to make `set -e` the default so all error handling is explicit, but you've got other languages for that that don't involve throwing `|| true` onto the end of every unimportant command you run.

I also realise that this doesn't make a case for Bash being intuitive. Precisely the opposite. But I suppose you have that with a shell where it's more important to be adaptive to the person behind the keyboard at the expense of the purity of the implementation. Especially considering the history of it all.
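A minimal sketch of the gotcha in question: once a function is called as part of a condition, errexit is suspended inside it for the whole call:

    set -e
    check() {
        false                 # would normally abort the script under set -e...
        echo "still ran"      # ...but not when the caller is a condition
    }
    if check; then
        echo "check passed"   # printed: the function's status is that of the final echo
    fi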


Recently I've replaced most of my bash scripting with the python library invoke.

It's written by the same people who wrote the python ssh scripting framework fabric.

http://www.pyinvoke.org/

Works great!!


I'll have to try that one. I've done a few things with sh, the Python process-launching library: http://amoffat.github.com/sh


Be careful with the sh library! It runs programs under a tty by default, so you get random effects like ANSI color sequences and truncated git output. There is an option to change this, but it's not the default, and it is easy to forget.


I worked with `sh` for a while. I strongly prefer invoke. It handles signal processing and TTY flawlessly.

You can even get it to ssh into remote terminals and open a console and it works.

It's also a CLI builder where you can chain commands together


Backticks are very error-prone; $(cat foo.txt) is much more explicit and visually clearer for command substitution.


And this way also allows nesting.
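Compare the two forms when nested (same result; GNU readlink assumed here):

    here=$(dirname "$(readlink -f "$0")")    # quotes and parentheses nest naturally
    here=`dirname \`readlink -f "$0"\``      # inner backticks must be escaped, and quoting gets murky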


fish forces you to use parentheses. (Even without the dollar sign.)


A tip from me, based on a mistake I made yesterday: don't `source ~/.bash_history` instead of `~/.bash_profile`.

(Luckily it entered vim relatively soon, from where I could kill the process).


At least it's not recursive.


Bash strikes me as a bit of a mess, as in people threw the kitchen sink into it for 'portability'.

Things like being able to open a socket e.g. using the /dev/tcp/<ip>/<port> stuff give me the willies a bit.
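For reference, the construct in question, which bash handles itself (it's not a real device node) - a crude HTTP request with no error handling:

    exec 3<>/dev/tcp/example.com/80
    printf 'GET / HTTP/1.0\r\nHost: example.com\r\n\r\n' >&3
    cat <&3
    exec 3>&-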

I lean towards Ruby if it's going to be anything longer than a few lines, or requires anything but the simplest of logic/commands. Otherwise I was always told /bin/sh is likely to be the most portable, so tend to use that in the absence of any other good reason.


I've written two network/telnet game bots in Bash. It's terribly hacky but fun in its own way. Using the /dev/tcp trick and arrays and lots of bashisms.


Nice write-up, starred it for future handouts to my friends.

I do find that once my bash script is going over a hundred lines or so it's likely a good time to move to python.

I love bash, it's great; but at some point, setting up a proper script (in bash) with arguments, options, logging and/or validation means you spend more time getting it to work than solving the actual problem. Enter your favorite programming language here.


The advice to double-quote everything is an interesting way to circumvent "detailed knowledge may be required". I wish it noted that you can't quote regular expressions, though:

  $ cat - > foo
  #!/bin/bash
  a="BCD"
  [[ $a =~  .C.  ]] && echo 1
  [[ $a =~ ".C." ]] && echo 2
  ^D
  $ bash foo
  1
  $


this is intentional, since you can store your regular expression in a variable:

    $ a=BCD; b=.C.; [[ $a =~ $b ]]; echo $?; [[ $a =~ "$b" ]]; echo $?
otherwise, interpolating variables in regular expressions as text would require other syntax (more confusing).

also, "cat -" is redundant, use "cat". this behavior is specified by POSIX.


I use "cat -" so that my code makes more sense to people. I want STDIN declared somehow, and the dash is effective. Technically I shouldn't have bothered with the cat at all in an HN code snippet, but it was a courtesy to provide a familiar environment for the block I wanted to convey. It worked so well that you linted it! I really appreciate the thought.


While I agree with the guide, there is one thing I was missing while writing bash scripts with 'set -e', and that was some kind of stack tracing. So I added a nice trap function to my personal bash script template. Be aware that this version does not include all the best practices described in the guide.

  #!/usr/bin/env bash
  #--------------------------------------------
  # Default Bash Script Header
  set -eu
  trap stacktrace EXIT
  function stacktrace {
          if [ $? != 0 ]; then
                  echo -e "\nThe command '$BASH_COMMAND' triggered a stacktrace:"
                  for i in $(seq 1 $((${#FUNCNAME[@]} - 2))); do
                          j=$(($i + 1))
                          echo -e "\t${BASH_SOURCE[$i]}: ${FUNCNAME[$i]}() called in ${BASH_SOURCE[$j]}:${BASH_LINENO[$i]}"
                  done
          fi
  }
  
  SCRIPT_DIR="$(dirname "$(readlink -f "$0")")"
  #--------------------------------------------


I notice a pattern in all articles about BASH. Some people say it should die. Others praise its ubiquity and versatility at handling certain tasks. Of course, both sides are right.

From my point of view, BASH will never disappear, because it's not a living thing that runs out of food or habitat and dies. That doesn't happen, unless there's a major revolution in computing that makes current paradigms obsolete, something on the scale of the disappearance of the dinosaurs.

Until that happens, I welcome any projects that aim to decrease the amount of buggy BASH in the wild. I avoid it as much as possible, but if someone's going to use it, at least they'll have safety nets to reduce the possible damage.

PS: I have a feeling my metaphors are all over the place, but I hope that doesn't detract from the message.


Overwhelmingly a great resource, though I will nitpick at this one:

>Globbing is easier to use correctly than find.

For very simple purposes or small file trees, sure. Outside that, find is vastly more powerful, and worth using in many cases. If nothing else, learning to use `find . -path ./ignore_this_path -prune -o -name '*.ext' -print` can change e.g. a Go project script from visibly slow to instant (the latest place I used this went from 1-5+ seconds (hot vs. cold) to 5-50ms).
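Spelled out, the prune pattern looks like this (the path and extension are just placeholders; note that -name wants its pattern quoted so the shell doesn't expand it first):

    # skip the vendored tree entirely instead of descending into it
    find . -path ./vendor -prune -o -name '*.go' -print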


How does one properly quote an argument for a command that can contain spaces?

For example:

    parent_dir = "$(basename $dir)"
How to quote $dir here? What if it contains spaces or other special characters?

That's what I dislike about Bash.

Also, always start your scripts with

    set -e
This prevents the script from continuing after an error, though without printing any message.

Also, I always make mistakes when using [ and [[.


You quote it like this:

    parent_dir="$(basename "$dir")"
Quotes are syntactic in bash, so it understands that the outer pair of quotes are surrounding the command substitution and the inner pair are inside it. This will work regardless of any spaces or special characters in $dir.


$(..) is its own context so you just put double quotes around any variables like you normally would:

    parent_dir="$(basename "$dir")"
[ and [[ can indeed be tricky. You may find ShellCheck useful, since it recognizes common problems like [ a=b ], [-e .bashrc], [ 1 > 2 ], [ -n var ], [ false ] and several others.


$( allows any syntax until a non-escaped ). It's Command Substitution[1], and they run in "subshells".

    echo hello "$(
        echo hello \
        | sed 's,hello,world,g'
    )"
[ is an actual binary file that gets invoked, called test. You can "man [" or "man test" to learn about all of its uses.

I use [ exclusively, but with && and ||, instead of -a and -o respectively.

e.g. [ -e "$file" ] && [ -w "$file" ] instead of [ -e "$file" -a -w "$file" ].

[1] http://pubs.opengroup.org/onlinepubs/009695399/utilities/xcu...


You just add quotes around $dir. Bash understands the quotes properly/recursively.

    parent_dir = "$(basename "$dir")"


So I finally found the answer. Thanks.


Bear in mind (as the article also points out) that variable assignment doesn't allow spaces around the equals sign, and that in this case the outer quotes are unnecessary anyway.


> Quoting inhibits both word splitting and wildcard expansion, for variables and command substitutions.

The result of a variable substitution isn't subject to wildcard expansion, whether quoted or not.

If your only reason to quote "$foo" is because you think foo expands to a globbing pattern, and no other reason is justified, you can drop the quotes.


It is subject to wildcard expansion. You can verify this with

    var="/*"; echo $var


Ooops, you're right! I somehow mixed this up with special contexts.

  case $var in
  '*' )
    echo asterisk ;;
  esac
But wildcard expansion is suppressed there unconditionally. Same as in assignment context: other=$var.


those both suppress word splitting too.


what the balls are you talking about?

    $ touch bad; a='*'; echo $a
    bad
I tested with shbot and this is the case in at least original bourne sh, ksh, mksh, dash, bash 1, and bash 4.4.


Indeed; see https://news.ycombinator.com/item?id=17070872

I'd erase my hasty comment if I could. I'm just praying it gets downvoted to invisibility now.


> Furthermore, set -u must not be used in Bash 4.3 and earlier.

This is dangerous advice. set -u indeed requires more verbose array handling in Bash < 4.4, but it also catches code like this:

  rm -rf "${OOPS_UNDEFINED}/"


Step 1 to write safer shell scripts: use a safer shell. Zsh gives the user much more control, has safer defaults, and is itself quite portable (even if Zsh scripts are not portable to other shells).


unfortunately, zsh is installed on probably about 1% of Linux systems worldwide. maybe it can be installed almost anywhere, but the fact is it isn't, and you might as well use Python or something at that point.


And how many on the systems that you actually touch? How hard is it to (ask to) install a new shell? Obviously if you're distributing your script publicly, no one would write it in a zsh-compatible way.


Reason number 650 why I love Ruby: it offers ridiculously gradual transitions from shell scripts. One time I built up a pretty weighty conditional-heavy script in Bash and didn't want to keep adding to it. So I simply made all the bash calls use backticks. Took me all of a few minutes to get option parsing working again.

One day I'll learn the Ruby way of doing shell one-liners and that'll hopefully keep me out of man pages for that sort of thing forever.


Is this project ready to share?

It's not published to crates.io, and doesn't have a license or Cargo.toml


I don't agree with the advice to use arrays and set -u together. Sometimes you just need to get something done, and being pedantic works against you.

This advice only works in bash 4.4, and many common distros are on bash 4.3, like Ubuntu 16.04 LTS. (bash 4.4 was released September 2016.) Because of this bug, the section "how to begin a bash script" is version-specific and awkward IMO.

If you need to process untrusted filenames, use arrays, but otherwise it's probably more trouble than it's worth. [1]

I want my scripts to work on older versions of bash, and I think 'set -u' is important, so I mostly get by without arrays. It's not ideal, but shell is full of compromises.

A workaround is to use ${a[@]+"${a[@]}"}, which avoids the bug in bash 4.3, but that seems too ugly to recommend.
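Roughly, the behavior and the workaround (the exact error message varies by version):

    set -u
    a=()
    echo "${a[@]}"            # bash <= 4.3: "a[@]: unbound variable"; 4.4+: prints an empty line
    echo ${a[@]+"${a[@]}"}    # expands to nothing when the array is empty or unset; safe on both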

More details in this comment I wrote on the same article: https://lobste.rs/s/4jegyk/how_do_things_safely_bash#c_kmldw...

[1] Thirteen Incorrect Ways and Two Awkward Ways to Use Arrays http://www.oilshell.org/blog/2016/11/06.html


I gave up long ago on trying to learn shell scripting and remembering it. Switching to xonsh for my shell made shell scripting easier.

http://xon.sh/


For further portability you can convert bash shell to C and compile it into binaries.

https://github.com/neurobin/shc


if it needs to be safe, use python.

please don't down vote me


Safe bash is what Perl was invented for. Don't downvote me.


someday i might even add it to my toolkit


I think Python is made for other tasks than shell. Shell programs mainly consist of invocations for external command-line utilities that you can develop and generalize from interactive shell sessions and one-off automation scripts. Python OTOH is a general-purpose programming language.


Yes, Python is a general purpose language; however it has powerful libraries to safely interact with command line utilities. The main drawback is their verbosity. But you get a lot in the exchange. For small scripts of a few hundred lines, bash (or some equivalent) is the way to go. But anything more than that, you need something like python or perl (in my opinion, of course)


Why ever resort to shell scripts when we have languages like ruby / python for anything more complex than installs?

Clarity is king people


Because

  foo | awk '/bar/ {print $3}'
is more clear than

  import subprocess
  foo = subprocess.Popen(['foo'], stdout=subprocess.PIPE, universal_newlines=True)
  for line in foo.stdout:
    if 'bar' in line:
      try:
        print(line.split()[2])
      except IndexError:
        print('')
Sometimes shell scripts are more clear. Especially for tasks that involve running lots of external commands.


Nobody doubts that pipes and languages like awk are great for one liners, but I think that's a little besides the point of this post, which is advocating for things like the use of bash arrays:

    array=( a b )
    array+=(c)
    if [ ${#array[@]} -gt 0 ]; then
        rm -- "${array[@]}"
    fi

    array = ['a', 'b']
    array << 'c'
    array.map { |i| `rm #{i}` } if array.length > 0

There's also nobody stopping you from using text processing tools like awk and sed, or bash one liners in ruby/python either, but I think we should leave the logic and arrays for scripting languages, no?


You're moving the goalposts! Previously it was "anything but install scripts". Now it's "logic and arrays".

I actually think we agree with each other, but we express it differently. I never write bash scripts, I write POSIX shell, so all the array juggling of bash is something I never deal with. As you say, by the time you need arrays, you should have switched languages already.

That said, I think there's a fairly large domain of problems - apart from install scripts - that are better solved with shell scripts, because of their clarity. Anything which relies on invocation of lots of other tools, and in particular problems that fit the pipe model (take output from this tool, extract interesting bits from it, and feed it to that tool, etc). And this I say as an otherwise almost slightly fanatical Pythonista :-)


    if [ ${#array[@]} -gt 0 ]; then
a better way to write it:

    if (( ${#array[@]} > 0 )); then


the idiomatic way:

    if (( ${#array[@]} )); then


I prefer not to use implicit boolean conversions. It's also less readable.


If your everyday language is JavaScript, use shelljs. It’s great. You’ll get portable scripts in no time. Sure it’s “slow”, in a way that likely doesn’t matter because of what you’re calling from the script.


From the author of "Russian roulette: how to probably not die", "Staying healthy with fast food" and "Self-immolation for dummies".

If you need safety don't use Bash.


Amen. Writing bash scripts (I have been programming on Unix since 1985) is an unnatural, fragile, and error-prone endeavor. If it's more than 5 lines long, I use Perl (which is still far from ideal). When an article is almost solely about what not to do, that tells you something.



" Should I use backticks?

Command substitutions also come in this form:

    Correct: "`cmd`"
    Bad: `cmd`
While it is possible to use this style correctly, it looks even more awkward in quotes and is less readable when nested. The consensus around this one is pretty clear: Avoid."

This is how one can tell a rookie just learning to program in shell: usage of the $() syntax is limited to the Bourne family of shells, which implement that particular aspect of the POSIX specification.

Backticks, on the other hand, although they incur a performance penalty since they spawn a subshell, make one's code instantly portable across all UNIX-like operating systems and even across disparate shell families, as they work exactly the same in C-shells. (Whether one should program in a C-shell family is a different discussion.)

The subshell performance penalty is negligible in 99% of the cases as this 1970's technology has tiny memory and processor overhead due to the fact that it's been developed on systems with small memory and a slow CPU, so it had been optimized for performance.

Over my 30+ years of shell programming, I know of only one documented instance where the $() construct (which doesn't spawn a subshell) made a difference, and it was the only time it was actually a valid requirement:

https://www.joyent.com/blog/building-packages-at-scale

but even then, the author ended up using dash, not bash.

For maximum portability and closest adherence to POSIX, program in Korn shell, ksh93. (Modern versions of ksh implement ksh93 functionality.) Then you may safely use $() and be assured it will work in all Korn shells across different operating systems (even in ksh88).

Otherwise, DON'T avoid using backticks, because you will be giving away portability for no good reason. Don't program in bash, but in original Bourne shell (sh) for maximum portability across different operating systems; don't assume that you can use bash constructs in Bourne shell (as /bin/sh on GNU/Linux tells bash to run in Bourne shell emulation mode, but that mode isn't implemented completely or correctly, since bash constructs are still accepted). Always test your shell code on a traditional UNIX like HP-UX or a Solaris derivative like SmartOS if possible, with a real Bourne shell.


> "This is how one can tell a rookie just learning to program in shell: usage of $() syntax is limited to Bourne family of shells which implement that particular POSIX specification aspect. Backticks, on the other hand, although they incur a performance penalty since they spawn a subshell, make one's code instantly portable across all UNIX-like operating systems and even across disparate shell families, as they work exactly the same in C-shells."

The '$()' notation is a standard shell feature de facto. The fraction of people who care about their script working on all UNIX-like operating systems' default shells is very close to 0. The recommendation is fine - this notation is more readable and nestable.

> "Always test your shell code on a traditional UNIX like HP-UX or a Solaris derivative like SmartOS if possible, with a real Bourne shell."

Only if your script needs to be "original Bourne shell" compatible. Which is almost never the case for most script writers.


“Long live the GNU/Linux hegemony and monoculture, the only truth and true religion”.

Lovely.


Long live the inventiveness and free spirit of contributors who bring us useful improvements and progress to what would otherwise be a cumbersome legacy computer interface.


Which interface? Are you sure you could not have made a broader generalization and a more nondescript statement?


I think you might be confusing $() with (). There is no functional difference between backticks and $(), the only difference is in how they are parsed with regards to escaping. There is no subshell involved in either case.


After I've been professionally programming in the original Bourne shell, tcsh, and ksh for more than 30 years, you took it upon yourself to tell me that I'm confusing $() and ()? Only on "Hacker News". Appalling.



