This is a pretty good collection of small utilities but I would recommend using GNU Parallel[0] instead of the included parallel. It contains a ton of extremely useful additional features such as distributing jobs to remote computers.
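For example, distributing a compression job across two machines looks roughly like this (a sketch, assuming passwordless SSH access to hosts named server1 and server2; --trc transfers each input file to the remote host, returns the output file, and cleans up afterwards):

    parallel -S server1,server2 --trc {}.gz gzip ::: *.log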
GNU Parallel will also infloop if any process on the system (run by any user) has an empty argv[0] (which can actually happen; parallel scans /proc to find instances of itself). When I reported the bug, the author refused to fix it on the grounds that this behavior is a "malware detector". Between this weirdness and the citation-interface weirdness, I'm not keen on GNU Parallel.
One big turnoff with GNU parallel is that the first time it is run on a machine, you must interactively accept a EULA. There are ways around it, but as a rule of thumb I avoid command-line tools that do this, because this inconsistent runtime behavior is annoying at best and can easily break and complicate automation.
And no, I don't want to set an environment variable on a fleet of machines to suppress behavior which shouldn't be enabled by default to begin with! <cough> .NET telemetry </cough>
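(For completeness: the suppression mechanisms I'm aware of are running 'parallel --citation' once interactively and typing the confirmation, or noninteractively pre-creating the marker file it checks for:)

    # one-time, per user; the --will-cite flag also silences a single run
    mkdir -p ~/.parallel && touch ~/.parallel/will-cite

But having to do any of this at all is the problem.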
A real shame, because it is an otherwise useful tool IMHO.
jobflow[1] is what I'd suggest unless there's a very specific reason to use something as massive as GNU Parallel. It's used as part of Sabotage Linux's package manager.
Surprised that nobody has mentioned 'xargs -I{} -P N' yet. The -P flag is a nonstandard extension, but it's quite handy, and xargs comes preinstalled on pretty much every Linux machine.
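For example, to gzip files four at a time, one file per invocation:

    printf '%s\n' *.log | xargs -I{} -P 4 gzip {}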
I propose a new project: lessutils.
You post about some program that you use for some simple task that's not among the "standard UNIX utils". (ed, sed, AWK, lex, etc.)
Then we show you how to do the same task with only the standard utils (i.e. no install needed) -- inevitably someone shows us how to do it in AWK and makes us all feel stupid. :)
If we are successful, you get to eliminate one more dependency from your system, not to mention reducing attack surface.
Yes.
letsencrypt.sh is a nice case study: it spends a lot of effort parsing JSON with essentially bash & grep. It works for now, but it is full of assumptions ("this array of hashes will not contain [ anywhere").
There is nothing in POSIX that's well suited to manipulating JSON the way jq is.
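Compare extracting a field from an array of objects (made-up JSON, for illustration):

    echo '[{"domain":"a.example"},{"domain":"b.example"}]' | jq -r '.[].domain'

Doing the same robustly with grep and friends means writing a half-correct tokenizer by hand.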
Unless the programs are setuid, is that a real problem? I mean, anyone who can call one of these utilities with some arguments can also call "sh -c ...", no?
My point is that if you have a meter-wide hole in your parachute (sh and other interpreters you can call to execute any code you want), also having small rips in the fabric probably doesn't really matter - if you have to rely on it, you're screwed anyway.
I'm a fan of the unix tools philosophy, but I sometimes wonder if there's much room for new tools to be added to that toolbox. I've always wanted to come up with my own general-purpose new unix tool.
Just yesterday, I asked on lobste.rs [1] how I could take my small utility to compute the minimum, maximum, and expected value of a dice roll expressed in D&D notation and make it a better Unix citizen. The commenters were helpful, and with a few changes I was able to make the program usable in a pipeline, even if I don't really expect that to be a use case for me.
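(Concretely, the main change was reading expressions from stdin when no arguments are given, so it can sit in a pipeline; the tool name here is illustrative, not my real interface:)

    echo 2d6+3 | dice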
As an aside, my utility is also written in Rust; nice to see more of these small programs written in Rust.
Once you realize that awk, while useful, is really just a terrible programming language with terrible syntax that you have to write inside a string, with cryptic special variables hacked in to provide useful functionality, then you are finally free.
awk is a full-blown scripting language too. It has a versatile implied top-level control structure plus code-golf qualities (which is a plus for livecoding). But it's also crazy that we switch languages just because one's top-level structure happens to be a good fit for the task.
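That implied structure is just pattern-action pairs run against every input line, with optional BEGIN/END blocks around them:

    # count lines matching ERROR; the END block runs after the last line
    awk '/ERROR/ { n++ } END { print n, "errors" }' app.log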
sponge only writes the file once all input has been read, and - if the file already exists - tries to do the operation atomically by writing to a temporary file and renaming it over the old file.
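That's what makes in-place edits like this safe, where redirecting straight back with '> config.txt' would truncate the file before grep ever reads it:

    grep -v '^#' config.txt | sponge config.txt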
Good utility, but I would recommend using the GNU one, as mentioned by another user. Although it can infloop on such processes, you can work around that from the command line.
From the lckdo manual:
"Now that util-linux contains a similar command named flock, lckdo is deprecated, and will be removed from some future version of moreutils."
Also check out combine. It allows you to combine lines from two files using boolean operations:
combine file1 [OPERATION] file2
Where [OPERATION] is one of [and, not, or, xor]. It allows you to quickly pull out interesting data from your files (the input files don't need to be sorted).
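For example, to print the lines of file1 that also appear in file2, or only those that don't:

    combine file1 and file2
    combine file1 not file2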
[0]: https://www.gnu.org/software/parallel/