A lot of people suggest Python; I'd argue that Python is a bad fit for what many bash scripts do.
A bash script is basically scripting what you do on your terminal. I feel like scripting in the same language that is your terminal interface is underrated.
You can basically take a look at your command history and wrap it up as a script.
Yes, bash is clunky as hell. It has some really ugly warts. But learning the basics will make you more proficient in the terminal. The synergies are really nice.
Interesting that a lot of the (general, not specifically here) commentary is "don't use bash, use python", which I find quite clunky, noisy, and boilerplatey for doing the same thing as a shell script. Kinda Master Foo and the Ten Thousand Lines†.
Ruby's https://github.com/ruby/shell (which used to be bundled in Ruby itself) doesn't get much mention but is much more clear and useful to me.
```
# system commands are not defined by default, only builtins, for safety
Shell.def_system_command 'ls'
Shell.def_system_command 'grep'
Shell.def_system_command 'jq'

Shell.new.transact do
  # simple redirs just work
  ls | grep('test') > 'some/file'
  ls | grep('other') | tee('foo') >> 'some/file'

  # pops at the end of the block
  cd 'some/dir' do
    # nested redir for demo
    (cat < 'foo.json') | jq '.[] | { bar: .bar }'
  end
end
```
plumbum is nice, although the optional parens of Ruby make things more shellish.
Also Ruby's Shell was part of the stdlib, so if a system had Ruby you could use it without having to go through system ruby `gem install` conundrums. I think they made a mistake unbundling that one.
As a linux admin, bash and python are my bread and butter.
I have two heuristics for choosing between them:
- If it needs more math than counting, choose Python
- If it's >200 lines (500 if I'm brave), choose Python
You could use both Bash and Python for tasks they are good at: concisely creating one-line shell pipelines in Bash and implementing complex logic that glues together these pipelines in Python.
Just skimmed through the book and noticed several examples where it doesn't use argument quoting, e.g. the first chapter about variables doing `echo $name`. In fact, there is no mention of argument quoting at all. This should be the number one lesson before touching any bash script; otherwise your scripts will break on word splitting and globbing, and can even be vulnerable to injection. I recommend running the whole book through shellcheck.
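To make that concrete, a minimal sketch of what goes wrong (hypothetical filename):

```
name="my file.txt"
cat $name     # word-splits: cat gets two arguments, 'my' and 'file.txt'
cat "$name"   # quoted: cat gets the single argument 'my file.txt'
```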
It also recommends wrapping variables with curly braces like ${name}, without explaining why, and then never does so itself throughout the rest of the book. There are a lot of such missing explanations in the whole book. It does make it short and fun to jump straight into the examples, but I think it could benefit from some facts and reference material, at least as links.
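(For reference, the braces matter when the variable name abuts other word characters; a tiny hypothetical example:)

```
prefix="back"
echo "$prefixup"    # expands the unset variable 'prefixup': prints an empty line
echo "${prefix}up"  # prints 'backup'
```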
Anyway, seems like most people here are more interested in arguing over bash vs python than discussing the actual content. A little surprising to see this get so many points.
Using python and constraining yourself to only use a basic subset of the standard library modules so you can run the script in pretty much any environment is almost always a better choice than trying to write even one loop, if statement, or argument parser in a bash script.
A bash script is "okay", I guess, if your "script" is just a series of commands with no control flow.
Hard disagree. I've written plenty in both. They both have their strengths, but bash is just more efficient if you're working with the filesystem. The UNIX philosophy of "do one thing and do it well" shines here. Python is more powerful, but it's a double-edged sword. If I want to read a file containing API endpoints, send a request to each for some JSON, and do some parsing, I don't want to deal with importing modules, opening file objects, using dictionaries, methods, functions, etc.
Why do that when I can literally just
```
N=0
while read -r URL; do
    curl "$URL" | jq '.data[].someProp' | grep -v "filter" > "$N.data"
    N="$((N+1))"
done < ./links.txt
```
The other thing is that bash makes it exceptionally easy to integrate across different system tools. Need to grab something from a remote with `rsync`, edit some exif, upload it to a CDN? That's 3 commands in a row, versus god knows what libraries, objects, and methods you need to deal with in Python.
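Something like this hypothetical sketch (the tool choices are assumptions: exiftool for the metadata, an S3-backed CDN):

```
rsync -avz remote:/photos/ ./photos/                # grab from the remote
exiftool -all= -overwrite_original ./photos/*.jpg   # strip the exif metadata
aws s3 sync ./photos/ s3://my-cdn-bucket/photos/    # upload to the CDN
```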
Libraries are nice, until you have to write the glue code between the modules and functions. But sometimes you already have the features you want as programs and you just need to do some basic manipulation with their arguments and outputs. And the string model can work well in that case.
> Why do that when I can literally just ``` N=0 while read -r URL; do curl "$URL" | jq '.data[].someProp' | grep -v "filter" > "$N.data" N="$((N+1))" done < ./links.txt ```
? That code's meaning is extremely clear: it reads from a list of URLs that return JSON lists of JSON objects. For each URL, it pulls out some property and checks whether each line does not contain the string 'filter'. Those lines which clear the 'filter'-filter are written to a file whose name is the (0-based) line number of the URL in the original input file, suffixed with the extension '.data'.
It's very easy to read and modify if you just write it out longwise, which is what I'd always do in some actual script. (I also like to put reading data at the beginning of the pipeline, so I'd use a useless use of cat here.) To illustrate:
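Something along these lines (note the while loop now runs in a subshell, which is fine here since $N is only used inside it):

```
N=0
cat ./links.txt |
    while read -r URL; do
        curl "$URL" |
            jq '.data[].someProp' |
            grep -v "filter" \
            > "$N.data"
        N="$((N+1))"
    done
```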
It's a very simple pipeline with a single loop. It's not very different from pipelines you might write to transform some data using a thread macro in Clojure, or method chaining against a collection or stream in OOP languages like Java or Scala or Ruby or whatever you like.
That's really not that hard to add above. A lot of folks act like it's impossible to handle errors etc. in bash, but it's pretty straightforward -- certainly no more difficult than in any other language. The hard part, like with all languages, is deciding how to handle error cases. The rest is just code.
On mobile so no idea if this a) looks good or b) runs (especially considering the command substitutions, but you could also redirect to temp files instead), but it's just something like this:
You forgot the "-f" flag to curl, which means it won't fail if the server returns an error. Also, "jq" returns success on empty input, pretty much always. Together, this might mean that networking errors will be completely ignored, and some of your data files will mistakenly become empty when the server is busy. Good luck debugging that!
And yes, you can fix those pretty easily... as long as you're aware of them. But this is a great illustration of why you have to be careful with bash: it's a 6-line program which already has 2 bugs that can cause data corruption. I fully agree with the other commenter: switch to Python if you have any sort of complex code!
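For reference, a hedged sketch of those fixes (`-f` makes curl fail on HTTP errors, `-e` makes jq exit nonzero on null or missing output, and `pipefail` propagates either failure; the grep stage is dropped for brevity):

```
set -o pipefail   # a failure in any stage fails the whole pipeline
N=0
while read -r URL; do
    if ! curl -fsS "$URL" | jq -e '.data[].someProp' > "$N.data"; then
        echo "error fetching $URL, skipping" >&2
        rm -f "$N.data"   # don't leave a misleading empty file behind
    fi
    N="$((N+1))"
done < ./links.txt
```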
```
set -euo pipefail # stop execution if anything fails

cleanup () {
    rm -f "$tmp_file"   # quoted so a space in the path can't break it
    popd
    # anything else you want to think of here to get back to the state of the
    # world before you ran the script
}
trap cleanup EXIT # no matter how the program exits, run that cleanup function.
```
But why bother? The moment you start doing all that, all the arguments of "oh look how much I can solve with my cool one-liner" go away. The Python version of that code is not only safe by default, it's also shorter and actually readable. Finally, it is a lot more malleable, in case you need to process the data further in the middle of it.
```
import requests
from pathlib import Path

for N, url in enumerate(Path("links").read_text().splitlines()):
    resp = requests.get(url)
    resp.raise_for_status()
    prop = resp.json()["data"]["someProp"]
    matches = (line for line in prop.splitlines() if "filter" not in line)
    Path(f"{N}.data").write_text("\n".join(matches))
```
I'm sure there is something about the jq [] operator I am missing, but whatever. An iteration there would be a contrived use case, and the difficulty of understanding it at a glance just proves I'm not interested. As someone else mentioned, both curl and jq require some extra flags to not ignore errors; I can't say if that was intentional or not. It would either way be equally easy to solve.
There's real value in 'one-liners', though. A 'one-liner' represents a single pipeline; a single sequence of transformations on a stream of data— a single uninterrupted train of thought.
Chopping a script into a series of one-liners where every command in the pipeline but the first and/or last operate only on stdin and stdout, as far as is reasonable to do, is a great way to essentially write shell in an almost 'functional' style. It makes it easy to write scripts with minimal control flow and easy-to-locate effects.
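A hedged illustration of that style, with hypothetical stage names; each stage reads stdin and writes stdout, so stages compose like functions:

```
extract()   { jq -r '.items[].name'; }      # pull names out of the JSON
normalize() { tr '[:upper:]' '[:lower:]'; } # lowercase everything
dedupe()    { sort -u; }                    # sort and drop duplicates

curl -fsS "$API_URL" | extract | normalize | dedupe > names.txt
```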
Such scripts are IME generally much easier to read than most Python code, but then I don't find Python especially easy to read.
Choice is a privilege you rarely have in a day to day job. Bash most of the time is already there and you have to live with it.
Also, I've been forced to work on a huge SCons-based project, and I guarantee Python can make your life quite miserable when used for something it's not suited to.
I'm not suggesting you build a whole build system with Python (which is basically Bazel, and it seems to be good enough for Google).
A lot of originally-little automation/dev scripts bloat into more complicated things as edge cases are bolted on, and bash scripts become abominations in these cases almost immediately.
Bash may be native, but a lot of the programs you'll want to call may not be, or will differ between platforms in subtle ways. This won't be a concern for small/trivial scripts, but if we're talking about Python as an alternative, my point probably still applies.
This. People using bash extensions and util-linux as if they're standard are my bane.
If you can't do it in POSIX (sh and utilities) and don't want to do an extensive investigation of portability for what you need, pony up for Python/Tcl/Perl (all in the macOS base system, by the way).
True, though there’s a whole world of people who will yell at you for using Bash-isms rather than pure posix precisely because Bash (at least up to date versions) isn’t everywhere either.
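A classic example of the kind of Bash-ism they mean (the `[[` test, which dash and other strictly POSIX shells don't have):

```
# bash-only: a syntax error under /bin/sh on Debian/Ubuntu (dash)
if [[ $answer == y* ]]; then echo yes; fi

# POSIX-portable equivalent
case $answer in
    y*) echo yes ;;
esac
```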
I agree, but I've been having a hard time even with python recently. I had a small script (50-100 lines) to format a data drive on boot I refactored, 3 or 4 obvious undeclared variables and who knows how many more I didn't notice - mypy found 0 issues.
I was looking up statically typed alternatives and stumbled upon Ammonite and Scala-CLI for scala. I haven't used them much, but Ammonite bundles some basic libraries including command line parsing, http, and json, which are probably 99% of what I used in Python too? And Scala seems like an interesting language too with decent editor integration.
> I had a small script (50-100 lines) to format a data drive on boot I refactored, 3 or 4 obvious undeclared variables and who knows how many more I didn't notice - mypy found 0 issues.
To make mypy strict enough to compare your dev experience to a typed language, you have to declare all sorts of configurations, otherwise there are huge swaths of things it’ll allow compared to most typed languages.
I use the config below, and only when necessary use `# type: ignore` comment pragmas when third-party libraries are not typed.
As a person who’s been doing shell programming for 35 years and Python for 15 years, I completely disagree.
Bash scripts and Bash control flow have been and are used in highly critical scripts all over the place, including on other planets.
We’ve been writing reliable, well-designed scripts for many decades. Some of my scripts are several hundred lines long and older than the system engineers currently occupying their positions.
Python is fine too. Use the right tool for the right job.
> Used to be a viable strategy until they started to drop modules from the standard library at every single release.
That’s a bit of a ridiculous statement; there’s a small number of very-long-deprecated modules removed in 3.12, and some more recently deprecated modules in 3.13. And these things are old, largely or completely unmaintained, and usually completely obsolete.
I’d be surprised if anyone has a script that’s been adversely affected by this, and if they did, it’s because they stopped maintaining it years ago (and also chose to both silence warnings and upgrade interpreter versions without reading the release notes).
Consider that the Python foundation absolutely has the resources to put a developer on maintaining them.
If they don't, it's because they don't want to.
> and usually completely obsolete
The amount of modules I've had to patch to keep working on 3.12 tells me they aren't as obscure and unused as you think they are.
> I’d be surprised if anyone has a script that’s been adversely affected by this
I'd say that over 99.9999% of python users do not download python from python.org. They use whatever is on their system. Which means that updating an LTS distribution will create mayhem. And that's considering that most modules have already been patched by the distribution maintainers to fix all the brokenness introduced by the new python version.
Also, a bash script from 30 years ago still works fine. A python script from 5 years ago doesn't start.
>Consider that the Python foundation absolutely has the resources to put a developer on maintaining them.
The resources to pay someone doesn’t mean that someone with interest and knowledge exists, especially for modules that were formally deprecated in Python 2 and which will never be reinstated. Lots of this stuff is just cruft, most of which has an obvious replacement, and if it doesn’t, there’s a decent chance it's not been used in years by anyone and if it ever had a reason to be in the standard lib, that reason is long gone.
> The amount of modules I've had to patch to keep working on 3.12 tells me they aren't as obscure and unused as you think they are.
If that number is at all significant, where are the issues pushing back against deprecation and removal? It’s not like there hasn’t been a formal process for all these modules. What got deleted in 3.12 was well documented and easily caught just by catching DeprecationWarning… anyone getting surprised by these modules going missing isn’t doing due diligence.
> I'd say that over 99.9999% of python users do not download python from python.org. They use whatever is on their system. Which means that updating an LTS distribution will create mayhem.
And I’ll pretty much guarantee you that 99.9999% of those users haven’t heard of, much less imported, any of the modules that have been removed.
> And that's considering that most modules have already been patched by the distribution maintainers to fix all the brokenness introduced by the new python version.
But again, where are the issues, and where are the hands being waved that these issues are widespread enough to halt or reverse the deprecation process? If distro maintainers are simply patching everything for users (who are constantly advised to leave their system Python alone) and they’re not reporting the issues, then those distro maintainers are harming everyone.
> Also, a bash script from 30 years ago still works fine. A python script from 5 years ago doesn't start.
I’ve written plenty of Python scripts that are still running on the interpreter and stdlib they were authored for, decades later. I’m also keenly aware that most of those scripts could not be written in Bash without reimplementing a significant portion of the Python standard lib and ecosystem, none of which was materially affected by the 3.11>3.12 removals.
For instance, some fairly commonly used Linux apps like ulauncher, autokey, fail2ban and xpra depend on pyinotify, which hasn’t been maintained for the last 6 years or so, which is why Fedora, Arch and NixOS now include patches to make it 3.12-compatible. I don’t find it very unlikely that your in-house script could be using it too.
> The resources to pay someone doesn’t mean that someone with interest and knowledge exists
That's why you can pay people: so that despite their disinterest they will read the code and acquire the knowledge needed.
> especially for modules that were formally deprecated in Python 2
??? I'm talking about modules removed now, in 2024. They were not deprecated since python2. Please don't steer the conversation to different topics.
> Lots of this stuff is just cruft, most of which has an obvious replacement
distutils? Is it cruft? The thing used to install modules? Can you tell me which stdlib replacement it has?
> it’s not been used in years by anyone
Why did I have to patch over 10 modules?
> and if it ever had a reason to be in the standard lib, that reason is long gone
Is installing modules no longer a thing then?
> those distro maintainers are harming everyone.
Aaah yes, the evil distro maintainers that keep things working instead of breaking everything. They're the real culprits here… really?
> I’ve written plenty of Python scripts that are still running on the interpreter and stdlib they were authored for, decades later.
Decades later? That's at least 20 years. If that were true, they'd be written in Python 2, and I can promise you they wouldn't work with Python 3.12. So I'll consider this statement a lie.
Please try to be more honest when you interact with me the next time. This hasn't been pleasant.
I think the point GP was making was that you restrict yourself to only the bundled standard library, which covers most of the basics needed for scripting.
This is why you force yourself to use nearly zero dependencies. The standard library sys, os, subprocess, and argparse modules should be all you need to do all the fancy stuff you might try with bash, and have extremely high compatibility with any python3.x install.
I am new to bash and always get stuck using tools like `awk` and `sed`. You need to be good at regular expressions to write a good script using these tools.
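For what it's worth, the everyday cases need very little regex; a couple of hypothetical one-liners:

```
# awk: print the second whitespace-separated field of every line
awk '{ print $2 }' access.log

# sed: replace every occurrence of 'foo' with 'bar' (basic regex)
sed 's/foo/bar/g' input.txt > output.txt
```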
I've coded in a bunch of languages over my career, but somehow managed not to use bash. But a few weeks ago I found myself needing to create a somewhat involved script using Bash. Using GitHub Copilot, I was able to generate a script that worked great and it included comments, arrays, arguments, case statements, loops, conditionals, menus, etc--basically covering the first dozen chapters of this book. I guess my point is that we need to teach people what's possible and to help them explain what they want clearly. Memorizing syntax will be a less relevant skill at some point.
I recently was running a variety of compressors over the course of days, and using Bash was great because it doesn't parse the whole file before it starts running - I could edit and reprioritize next steps while the script was running. "Oh, that one's not working, skip the slower version" or "oh, space is an issue, work on freeing that up." It's a weird hack I wouldn't suggest in production, but it worked well for me.
Many people are saying that ChatGPT or Copilot could write great scripts which is indeed true, but the eBook was written way before those LLMs were a thing. It is still good to learn the basics though.
Can someone name me examples of where a bash script is actually the _correct_ tool for the job in modern times? Or is it still used only because it’s the easily available tool?
There's a server. You can ssh into it, but you don't have root. You don't know if it'll have python (2 or 3), perl, ruby, node, whatever. You need to write a cronjob that'll run every 5 minutes, or a script that'll do something on-demand, or just a one-off to clean some stuff up or grab particular log files or something. Performance isn't going to be a problem. User input isn't going to be a problem.
If you never run into this problem, fair enough, maybe you don't need bash.
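The cron half of that is one line in `crontab -e` (paths here are hypothetical); everything else is plain shell:

```
# run every 5 minutes, appending both stdout and stderr to a log
*/5 * * * * $HOME/bin/cleanup.sh >> $HOME/cleanup.log 2>&1
```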
I've used linux brew a couple of times to escape this problem, able to get a newer version of python or ruby instead of bash: https://docs.brew.sh/Homebrew-on-Linux
Nix is great for this, if you have the disk space. It's nicest if Nix is already installed on the box by a sysadmin, but you can use it in a pivot_root environment if you don't have a cooperative sysadmin.
Any time you're running lots of CLI utilities directly is a good start. Bash is very very good for that, and other than perl, most other things aren't particularly good at it.
Unless you need to run programs with arguments that contain spaces.
Not that long ago, I needed to write a script that would grab the last command in a shell file that had a list of commands, add an argument or two to it, and run it. Which proceeded to barf because the command had quotes, since some of the arguments had necessary spaces in them, and I couldn't figure out a way to get the necessary string in a form that bash could properly execute. My eventual solution was to just use Python's shlex module to parse the line and output an executable command (and to remark that I should have just started the script in Python in the first place)...
I have a variable, say $X, that contains a command, and the command has quoted arguments with spaces, let's say <<echo "A B">>. Now how do I execute it? If I do $X, the result is <<"A B">>, not <<A B>>. If I do "$X", it fails, because there's no filename <<echo "A B">> to execute.
In a program with proper arrays, I can instead parse X to the strings ["echo", "A B"], and then run execve to run ["echo", "A B"] without any issues. And adding another argument to make ["echo", "A B", "C"] is damn trivial.
No, bash doesn't handle arguments with spaces just fine. If you do the basic, obvious way to do things, things break, and you're expected to know magic invocations to do the right thing to handle potential spaces.
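For completeness, bash arrays do give you the execve-style behaviour described above, as long as the command is built as an array from the start rather than parsed back out of a string (hypothetical example):

```
cmd=(echo "A B")   # two elements: 'echo' and 'A B'
cmd+=("C")         # appending another argument is trivial
"${cmd[@]}"        # each element becomes exactly one argv entry: prints 'A B C'
```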
I see your point, but I still have to disagree. Your task would be inherently complicated for any language - imagine trying to properly add another argument to a function call written in python. In fact it has nothing to do with spaces in arguments, but rather re-parsing strings as shell commands which is always a tricky task.
>In a program with proper arrays, I can instead parse X to the strings
Well, if I were to write it in Java or Javascript, both of which have excellent support for arrays (the shell does as well, actually - via "set"), it still wouldn't get me an inch closer to achieving the goal. It seems to me that the real problem is lack of a first-class shell grammar parser, which affects most languages, not just the shell itself. Python just happens to have a shell grammar parser that is shipped with the installation.
Shell is good for running a bunch of other programs and doing file operations. In that niche it seems unbeatable, and outside that niche it's generally a terrible choice.
I don’t know if it’s the correct tool, but my LLM used it to replace a weather-widget iframe hazard on a static site. It’s a cronjob that lynx-dumps the weather forecast, writes the values into an HTML block, and then sshs into the client's server that hosts the site (but has no lynx) and replaces the HTML block in all HTML files. Runs every hour and might work for years; we shall see.
Write scripts in whatever tooling you have set up for your app. In a TypeScript codebase, I write TypeScript scripts. In a Go codebase, I write Go scripts. It's so easy to write a couple of wrappers to make execing something a one-liner, and then the advantages of bash have disappeared. You already have a toolchain for running a program in your language; just use that to build your scripts.
Debating the pros and cons of different languages misses the point IMO. Shell scripts are useful not because of the language but because the interpreter is already available absolutely everywhere.
For anyone wondering whether or not you really need to know bash: Yes. Yes, you do.
Somewhere in the middle of every stack is a bunch of bash and makefiles. Don't replace that or avoid it. Embrace it. It's not perfect, but it's important to know.
I love when people proclaim this as if it's some kind of sage wisdom when it's actually just hypermyopia. For anyone wondering whether this is true: of course it's not since there are plenty of stacks that don't run in environments that are anywhere near a shell.
Do you think your phone runs a shell for IPC? Like passes strings to and fro? How about the runtimes/hypervisors/etc that manage vcpus on AWS? And about that embedded dev - do you think that that's a small niche industry?
I'm pretty sure Android has a bunch of shell under the hood, and I don't have a jailbroken iOS device to check but I'd be a little surprised if it was different. AWS may or may not, I don't know enough to comment. And yes, about embedded dev - there's the very very low level stuff of course, but the moment you get above tiny microcontrollers it's all Linux again and there's shell everywhere. Actually, even for the really tiny embedded stuff I'd still expect to find a mess of shell scripts, just in the build environment rather than on-device.
And the kernel also doesn't run shell, but the claim was that you would struggle to find
> software that doesn't involve shell at some level
which would seem to me to encompass more of the stack. Like... okay, Android IPC isn't pipe-based. Does that release anyone from touching shell? Anyone working on a ROM is going to get their hands dirty with the rest of the OS. And I struggle to believe that any app developer is avoiding it (it's always the build step that does you in). Approximately nobody is working on just IPC in isolation.
Just curious: where exactly is the shell in `cmake -GNinja`? Or is CMake not a build system hmmm? Nevermind that some people use bazel or meson or something ... other than shell...
Shell scripts in your build system are a code smell. In cpp projects it's covering up for people that don't know how to use the actually correct tool for the job: the CMake scripting language.
> Just curious: where exactly is the shell in `cmake -GNinja`? Or is CMake not a build system hmmm? Nevermind that some people use bazel or meson or something ... other than shell...
Right above it; cmake replaces make, but something still runs it. And IME that something is usually either a developer running it from a shell (ideally eventually with another shell command wrapping it to automatically build on file changes) or a CI system running it which is always a massive pile of shell scripts.
To be fair, I don't doubt that if you really tried you could build a dev environment that used zero shell. But you would have to really try, and half of the things you had to build to make it work would inevitably converge to an informally-specified, bug-ridden, slow implementation of half of Common Lisp^w^wBASH.
> In cpp projects it's covering up for people that don't know how to use the actually correct tool for the job: the CMake scripting language.
That’s like saying that people use lotion because they don’t know how to correctly jerk off with sandpaper.
Bash is ubiquitous and can be used everywhere, CMake scripting language can be used only in CMake, guess how many people know one better than the other?
Shell pops up in a ton of places even where it's not used directly. Build systems are a good example. I have yet to come across something that doesn't touch shell at some level unless it also doesn't touch an OS. But maybe that's just me. Phones aren't an exception to that, though.
Those things might not run bash/sh, but I’m willing to bet a hundred quid that nearly all of their development environments had at least one tiny shell script that did some transformation of data.
Whether they run in environments where a shell is nearby has no bearing on whether they are built or tested in environments where a shell is near and sufficient for the task.
Pasting what I responded to the other person talking about build systems:
Just curious: where exactly is the shell in `cmake -GNinja`? Or is CMake not a build system hmmm? Nevermind that some people use bazel or meson or something ... other than shell...
Shell scripts in your build system are a code smell. In cpp projects it's covering up for people that don't know how to use the actually correct tool for the job: the CMake scripting language.
That's a good thing, isn't it? It's not too scary for someone who's not well versed in shell scripting. And you can later move on to other resources (TLDP is great!).
I looked at it this morning and I think it's helpful for those brand new to the shell. My only criticism of it is that it needs a once-over edit by a native English speaker. The exercises seem fine to me.
I know basic bash and can read most bash scripts, but never spent much time on it to become an expert. However I found ChatGPT to be extremely valuable -- given the description of a task, it can create very high quality scripts with best practice and comments, often correctly using third party tools. That's probably what I'll do going forward -- my time is better spent creating a good prompt and reviewing output.
Same, and another great thing I have been using since ChatGPT made it trivial to do so is systemd. I was using pm2 quite a bit for running scripts but high quality bash (or python or php or whatever really) plus systemd is really killer.
I’ve also been doing way more with google cloud shell. I was always such a point and click guy when it came to setting stuff up in GCP but automating and scripting with bash in cloud shell using ChatGPT is so much faster and repeatable.
And most humans have MUCH MUCH lower eval scores on bash, yet they write it anyway, and the rest of us get stuck with trying to use and understand and debug and maintain that crap, which bursts into fire and explosive diarrhea whenever it hits an edge condition or a file name with a space in it.
Depends on the platform. macOS has a terminal, and ChatGPT is highly Linux-biased; it's quite happy to keep spewing trash even after you point out "hey bozo, I said macOS; useradd isn't a real command on macOS".