Four features that justify a new Unix shell (oilshell.org)
242 points by diegocg on Oct 23, 2020 | 180 comments



I'm sorry to say that for me, shell scripting has lost. All my scripts are written in Python, with a thin shell script wrapper to launch them.

Look at Oil. It tries hard to be bash compatible, with optional buy-in to extensions. Just use a superior language and give up on shell scripts; that's my advice.


The author addresses this argument:

> However, Python and Ruby aren't good shell replacements in general. Shell is a domain-specific language for dealing with concurrent processes and the file system. But Python and Ruby have too much abstraction over these concepts, sometimes in the name of portability (e.g. to Windows). They hide what's really going on.

From: http://www.oilshell.org/blog/2018/01/28.html#i-dont-understa...


Even something that should be as simple as piping output from one process to another in Python is a nightmare, and fraught with opportunities to shoot yourself in the foot with buffering deadlocks and other nonsense.
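
For reference, a minimal sketch of the Popen idiom for connecting two processes without shell=True (the commands are just placeholders); the explicit close and communicate() are exactly the easy-to-forget details behind those deadlocks:

    import subprocess

    # Roughly `ls -l | grep txt`, without invoking a shell.
    p1 = subprocess.Popen(["ls", "-l"], stdout=subprocess.PIPE)
    p2 = subprocess.Popen(["grep", "txt"], stdin=p1.stdout, stdout=subprocess.PIPE)
    p1.stdout.close()          # let p1 receive SIGPIPE if p2 exits early
    out, _ = p2.communicate()  # drain the pipe so neither process blocks
    print(out.decode(), end="")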

The consequence of this is that many of my projects which require these overlapping domains (think stuff like build automation) end up being the two interleaved: either an outer shell script that calls small Python programs (for parsing, URL fetching, and basically any error-prone stuff where sane recovery/cleanup may be necessary), or a Python program that renders little templates of bash into strings and passes them to bash's stdin.

Or in some cases, it's CMake that's the outer orchestration piece, using configure_file to fill templates and then invoke them either at configure time (with execute_process) or at build time (with add_custom_command).


> Even something that should be as simple as piping output from one process to another in Python is a nightmare, and fraught with opportunities to shoot yourself in the foot with buffering deadlocks and other nonsense.

This hits the nail on the head for me. What I'd like is something bash-like to handle running programs, pipes, tab-complete for file names, and so on, and something like Python syntax for control flow and strings (quoting / escaping), and access to something powerful like Python's standard math libraries (and ability to import other stuff like requests). I don't know how you'd roll all that into a single shell, but as close as you can get is what I'd like to see in a new shell.

(I've briefly looked at Oil before, but it has seemed a bit complicated to merit trying to switch full time at this point. I'm definitely following it for future developments though.)

Edit: just found Xonsh from another user's comment, which seems almost exactly what I'm looking for, if a bit hacked together at first glance. Going to try that out.


> Even something that should be as simple as piping output from one process to another in Python is a nightmare, and fraught with opportunities to shoot yourself in the foot with buffering deadlocks and other nonsense.

This is true, but when I am writing a script in Python I generally don't need to pipe as often as I do in bash. Python offers many replacements for things that in Bash require calling another process.

That is, of course, if performance isn't a concern. When it is, bash is still unbeatable because of the pipes.


> Python offers many replacements for things that in Bash require calling another process.

True, you're never going to pipe something to grep in Python when you can just capture the output and use the inbuilt regex capability.

That said, it can be surprising how much worse performance-wise Python can be than shelling out to external utilities. Consider the simple case of downloading and extracting a tarball. I found that check_call("curl path/to/thing.tar | tar -x", shell=True) was significantly faster than anything I could figure out how to do with requests streaming, httpx (async), and the inbuilt tarfile module.
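
For comparison, roughly what the pure-Python streaming version looks like; requests is a third-party dependency and the URL is a placeholder, so treat this as a sketch rather than the exact code benchmarked above:

    import tarfile

    import requests  # third-party dependency

    # Stream the download and extract on the fly; mode "r|" lets tarfile read
    # from a non-seekable stream.
    url = "https://example.com/path/to/thing.tar"  # placeholder
    with requests.get(url, stream=True) as resp:
        resp.raise_for_status()
        resp.raw.decode_content = True  # undo any Content-Encoding transparently
        with tarfile.open(fileobj=resp.raw, mode="r|") as tar:
            tar.extractall()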


Here's how to pipe two commands in Python:

    import subprocess

    subprocess.check_call("command1 | command2", shell=True)
Depending on your use case there are other options, e.g., fabric (http://fabfile.org) or plumbum.

Use shell for what it is really good for: a concise DSL for running commands (one-liners). Leave complex logic (branches, explicit loops) to sane general-purpose languages such as Python.


Right, and I do that, but it's an obvious hack; it causes a separate sh process to spawn, you open yourself to shell injection issues if you're not careful, teeing the pipe is trickier than it should be, etc etc.


You might have known this, but practice shows that it has to be repeated: there are no shell injection issues with a literal string in your own code.

YMMV, but for those rare cases when I care about shell injection, I just don't use the shell and run the command directly from Python. A combination of shell one-liners and the main logic in Python works well in practice.
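
Concretely, that means passing an argument list instead of a command string, so nothing ever goes through a shell (the file name and input value here are placeholders):

    import subprocess

    # With a list of arguments there is no shell to inject into; the untrusted
    # value reaches grep as one literal argument.
    untrusted = "foo; rm -rf /"  # hypothetical user-supplied value
    subprocess.run(["grep", "--", untrusted, "notes.txt"])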


Oh yeah, for sure. I think it's just frustrating having that one more piece of mental overhead. Like, it should be that there's one sane way to do all this stuff in Python. Instead it's literal blobs of shell in some cases, and subprocess.STDXX pipes linked up in other cases, and maybe sometimes you use shlex to sanitize your arguments, or you give up on streaming and just use communicate() to get the whole result in memory at once. Blah.


I don’t see it as an overhead, I see using shell syntax as just another DSL like regex.

There are many different use cases related to running processes, so it is natural that different solutions may be preferred for different cases: check_{call,output}, with/without shell, run, Popen, pty/PIPE, threads/asyncio—all may be useful. And it is all just stdlib.


That’s not a very convincing post because it’s a bit too shallow:

> I encountered a nice blog post, Replacing Shell Scripts with Python, which, in my opinion, inadvertently proves the opposite point. The Python version is longer and has more dependencies. In other words, it's more difficult to write and maintain.

The Python version uses only the standard library, so it doesn’t make sense to worry about dependencies if you aren’t counting every program your shell scripts call similarly. Having just one thing providing a consistent baseline beats learning that your script needs to upgrade RHEL to get a feature which would make something safer or easier.

Similarly, the Python script is longer because it does better work and has consistent, useful error handling. That adds lines of code, but it's usability and correctness, not overhead.

It’s similarly incorrect to treat line count as a proxy for difficulty, as anyone who’s ever had to deal with quoting or data structures in shell scripts knows. The use of pathlib or os.path sometimes gets grumbles for extra characters but I’ve found it nearly inevitable that the grumblers will have something fail or destroy data because they hit a filename or argument with a space or special character in it. On one notable occasion, that resulted in an `rm -rf something something /`. (If you’re writing a shell script, shellcheck is mandatory)

The best criticism I’d make is that it could be easier to replace a shell pipeline. Python’s subprocess makes that pretty easy but you still need a minimum of 3 calls per command plus whatever you need to actually do with the output. Since subprocess.{run,check_output} handles most of my needs this isn’t a big deal and there are excellent modules if you don’t mind dependencies.
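
For a sense of scale, here is a sketch of replacing a short two-stage pipeline (roughly `sort words.txt | uniq -c`) with subprocess.run; the input file is a placeholder:

    import subprocess

    # Feed the first command's output to the second instead of using a shell pipe.
    sorted_out = subprocess.run(["sort", "words.txt"],
                                check=True, capture_output=True).stdout
    counted = subprocess.run(["uniq", "-c"], input=sorted_out,
                             check=True, capture_output=True).stdout
    print(counted.decode(), end="")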


This.

The Bash script actually has more dependencies: it relies on a number of external programs (ps, kill, mkdir, sort, ls, cp, echo). What versions are on your system? What versions are on the systems of the people running the script? Do they support the same features? If you're running on a Mac, are you using the Mac-shipped programs or the GNU coreutils from homebrew?

Also, that Bash script that the author is defending has a subtle bug in it, because Bash is subtle. The bug is here:

       $APP $filename >"${output_dir}/summary_${name}.txt"
       if [ $? != 0 ]; then
           echo "Error $? in app"
       fi
The return value of [ ... ] overwrites the $?, so the inner $? is the result of the test, not $APP.

Which is exactly the kind of crap that everyone's talking about when saying that we should avoid Bash scripts. It's full of landmines like this.

Bash is great for interactive work on the terminal. It is not great for writing correct, maintainable programs.


Good catch: I've written many thousands of lines of shell scripts over the years and that's exactly the kind of rake in the grass which is too easy to miss. Using set -eu -o pipefail helps, as does shellcheck, but it's still easier than it should be to get unexpected behavior from something you thought was well tested.

About 15 years ago I switched to Python for anything which doesn’t fit on a single screen or uses any advanced feature and have had zero reasons to regret that decision. It’s especially good for anything you use with other people since you spend your time talking about features rather than how to trick the shell into working correctly.


> However, Python and Ruby aren't good shell replacements in general. Shell is a domain-specific language for dealing with concurrent processes and the file system. But Python and Ruby have too much abstraction over these concepts, sometimes in the name of portability (e.g. to Windows).

Ruby has pretty good thin wrappers around running shell commands (e.g. using backticks to run a command and get the output as a string, or calling system() on a string to execute it as a command and get back the error code), and my understanding is that it doesn't really have great Windows support. My impression here is that the author of this FAQ is probably not super familiar with Ruby, based on the way they seem to equate the abstractions in both languages.


Sounds like we need a new standard!


There is a better option and its name is Perl. Perl mixes the wins of shell scripts, like easy argument and output passing, with complex and easy-to-use control structures.

You could use Python, but having been forced to use Perl extensively, I'd say Perl is the superior choice for a complex shellscript-like workload. There's less boilerplate in Perl. There are a couple of Python projects that come close, like Fabric, but Perl was literally made for this type of workload.


There is nothing in the (IT) world I fear more than... Perl projects. The shear madness it will unleash upon you when the dependencies fail is maddening. Never will I voluntarily touch anything written in Perl ever again in my life.

I've seen the power of Perl, the beauty and elegance of it, but everything was ruined by the absolute shit show when it comes to its package managers.

/rant.


I have 20-year-old Perl code I run in production. I had to re-install it for our move to Amazon Linux 2. My 10-year-old install process for the cpanm setup still worked. My 10-year-old Python had to be moved to Python 3, but the 20-year-old Perl still runs.


I’m a big time python fan (80% of the code I write is python) but growing up I loved Perl (after starting out as a PHP dev) as a scripting language and I hate how much flak people give it. Though I do understand since there’s a million ways to do everything in Perl if you aren’t familiar with the language it can look daunting. But I’d argue the same can be said about JavaScript and I still love that language as well.


I think the OP is mostly talking about "perl one liners" on the shell like this: https://catonmat.net/ftp/perl1line.txt

It is not that hard either (unless perl is completely foreign to you).

Ironically, just today I wrote a python script (using concurrent.futures) to batch re-encode 10 GB of mp3 podcasts into 3 GB of opus files to save space. Before I deleted the old mp3s, I did a quick file count of mp3 and opus files, only to find 5 extra opus files... I beat my head really good trying to glue a bunch of bash stuff together, but I could not find where the 5 extra opus files were coming from.
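
For context, a concurrent.futures batch re-encode looks roughly like this; the paths and ffmpeg/opus settings are illustrative placeholders rather than the actual script:

    import subprocess
    from concurrent.futures import ProcessPoolExecutor
    from pathlib import Path

    def encode(mp3: Path) -> Path:
        # Re-encode one mp3 to opus with ffmpeg; flags are illustrative.
        opus = mp3.with_suffix(".opus")
        subprocess.run(["ffmpeg", "-y", "-loglevel", "error", "-i", str(mp3),
                        "-c:a", "libopus", "-b:a", "32k", str(opus)], check=True)
        return opus

    if __name__ == "__main__":
        files = list(Path("/podcasts").rglob("*.mp3"))  # placeholder path
        with ProcessPoolExecutor() as pool:
            for done in pool.map(encode, files):
                print("encoded", done)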

I defected back to perl. All I needed was an alphabetical list of all the files (there were many sub-directories), perfect for a little perl oneliner:

  find /podcasts/ -iname "* \.ogg" | perl -nle 'if( $_ =~ /\/([\w-]+.\w\w\w)$/) { print "$1"}' | sort >ogg.txt
I did the same thing for * .mp3 files and then diffed the two files. I quickly noticed the extra 5 opus files were from me doing test encodes a couple of weeks ago that I forgot about.

Sure it looks hideous, but the explanation is easy: for 'perl -nle', the 'n' means loop over each line from stdin without auto-printing it, the 'l' means handle line endings automatically (chomp the newline on input), and the 'e' is basically the perl 'one liner' mode, taking the code from the command line. Either way, each line comes in and is stored in '$_'; I use a capturing regex to grab the file name without the parent directory names, basically everything after the last '/' in a path that ends with a three letter extension (\w\w\w). The "captured" part of the regex ends up in '$1'. Also see 'perldoc perlrun' for all the runtime one liner options.


If I've not wildly misunderstood what you're doing with that line (and apologies if I have), could you not just stick `-printf '%f\n'` on the end of the `find` command?

(By definition, since you're looking for `.ogg` files, everything `find` finds will have a path that ends in a three letter extension.)


Ah, that is a good one! I've never even messed with the -printf options.


It's one of those things that's immensely helpful but difficult to discover organically, I think, since a lot of people will go "find outputs this, now I pipeline to transform that, this is The Unix Way" and never realise that `find` can do a bunch of transforms for you for free.


It would be simpler (and match "everything after the last '/'") with

    perl -nle 'if( $_ =~ /([^\/]+)$/) { print "$1"}'
I don't quite get how a stream of '* \.ogg' names can make it through the /[\w-]+.\w\w\w$/ filter. Notice the space.

And it allows ",ogg" (the dot is not escaped, so it isn't restricted to an extension); thankfully find already guarantees /\.ogg$/


Space was because HN uses it as italics and I don't know the escape. And yes there are cleaner ways, but the point was a real world example of having something not work out in pure shell but yet quickly barfing out something working with perl. And yes, looks like I didn't escape the dot, funny!


Got it. Sure, I like one liners and I like piping. Ruby is strongly influenced by Perl:

    $ find . -iname '*.ogg' | ruby -nle 'puts $1 if $_ =~ /\/([\w-]+\.\w\w\w)$/'
I think the reason people badmouth Perl is because it is safe to do so. People claim PHP improved, but all I can see is a language without design. JavaScript has lots of bad parts and we should not talk about them. I like AWK matchers, though not its functions. I am not familiar with Perl, but it is designed.


Or, in actual posix compatible way:

    find /tmp/test -type f -iname '*.ogg' -exec sh -c 'for f; do printf "%s\n" "${f##*/}"; done' - {} +


Why am I not surprised that the Perl part could be solved with bash in the same space. Why are you so eager to use language on top of language in such simple cases?


Feel free to demonstrate. Take a long path with nasty characters and give me the ending file name. I want "track1.mp3" returned from:

  /podcast/with/nasty'--c h a r$/and/stuff/track1.mp3


If you have that in a variable, which you presumably would if you're reading them one at a time, you can use "remove longest matching prefix" substitution.

    $ i="/podcast/with/nasty'--c h a r$/and/stuff/track1.mp3"
    $ echo "${i##*/}"
    track1.mp3
[Edit: 'thezilch beat me to it by a couple of minutes]


This part is really cryptic, I would prefer

    $ function remove_longest_matching_prefix () { echo "${1##$2}"; }
    $ remove_longest_matching_prefix "$i" '*/'
    track1.mp3
awk -F/ accepts a stream, as in the perl example


You still get +1 for providing a pointer to the explanation while remaining clear and succinct.


$ basename "/podcast/with/nasty'--c h a r$/and/stuff/track1.mp3"

track1.mp3


  $ path="/podcast/with/nasty'--c h a r$/and/stuff/track1.mp3"
  $ echo "${path##*/}"
  track1.mp3
  $ basename "${path}"
  track1.mp3
  $ echo "${path}" | awk -F/ '{print $NF}'
  track1.mp3


In zsh it would be

    echo "${uglypath:t}"


I hate everything there is to hate about CPAN. I also hate how Perl libraries are handled. I have to have like 5 lines of code related to it in ~/.bashrc or ~/.bash_profile. Bleh. Perl libraries are always a problem when I want my program to run on other machines. No thank you.


I find this to be a weird statement. I've never really fought with "cpan" but have had huge fights with node and python packages.

The old school "cpan" command has not been recommended to be used for nearly a decade now. Maybe that is your issue?

Everyone just uses cpan-minus: http://cpanmin.us/ or "cpanm". It will install anything, if you have write access to the installed perl location, it will install globally, otherwise it installs packages into "~/perl5" then just do something like "export PERL5LIB=~/perl5/lib/perl5/" in your bashrc.

Or maybe you are talking about having to compile code in certain packages? A lot of stuff using ssl has C code which links against openssl, which is always a pain.

Not sure what other issues you might be alluding to... There are some "packaging" tools to try to bundle and pin libraries, kind of like venv for python. I've never messed with them though. We just compile the latest perl binary and install all our needed cpan modules globally. Never worry about pinning since nothing is ever updated on cpan anymore...


> The old school "cpan" command has not been recommended to be used for nearly a decade now. Maybe that is your issue?

I did not know that it was not supposed to be used, so that might explain my issues. It did work for me on my machine, but I had a hard time running my Perl script on a VPS, because I had to set everything up again and for some reason I ran into some errors.

I often ran into packages that failed to compile, too, yeah.

> Never worry about pinning since nothing is ever updated on cpan anymore...

The "nothing is ever updated" part makes it kind of sad. :/


Sheer as in "unmitigated" and shear as in sheep

But I agree with this, except for python


I notice you did not provide any details, depriving us of the ability to determine whether the shit show is PEBCAK or genuine.


A lot of people have a knee-jerk negative reaction to the name perl without understanding that perl chewed up and spit out this particular problem space. You probably don't want to develop your next webapp with a team of perl coders, but the language has its niche.


I've switched most of my projects to golang these days, but even as recently as a couple of years ago I was writing pretty big sites with Perl.

Using https://metacpan.org/pod/CGI::Application to provide the framework, along with similar small frameworks - e.g. a perl version of sinatra.

Golang managed to persuade people to write tests by making it easy and self-contained. I'd suggest that, with Test::More, Perl did something similar. Almost all the modules you'd find on CPAN are full of test cases.

There might well be things we can argue about with Perl, but it is definitely true that it has/had one of the biggest and most consistent sets of extensions out there. CPAN has held up pretty well compared to some of the later alternatives (such as node modules, ruby gems, etc). One reason that I was able to work with Perl so easily was because I could find Stripe integrations, and similar, with ease.


I mean, I worked in the .com era with perl coders who made some pretty compelling products out of Perl, for the time, with some discipline, and TBH... it wasn't terrible, and when they "Did the right thing" and rewrote the whole thing in the new hotness (J2EE) it _was_ terrible.

Perl will let you hang yourself, and I hated it at the time, but it also got the job done.


I think Perl needs a real pipeline syntax, among other things: http://www.oilshell.org/blog/2018/01/28.html#are-you-reinven...

That said, Oil is somewhat influenced by Perl: https://www.oilshell.org/release/0.8.3/doc/language-influenc...

Also note this part of the original blog post, Python/JS/Ruby vs Perl/PHP: http://www.oilshell.org/blog/2020/10/osh-features.html#oil-l...

I recently bought a copy of the Camel book and noticed the exact same thing that Steve Yegge pointed out here:

> Perl's references are basically pointers. As in, C-style pointers. You know. Addresses. Machine addresses. What in the flip-flop are machine addresses doing in a "very high-level language (VHLL)", might you ask? Well, gosh, what they're doing is taking up about 30% of the space in all Perl documentation worldwide.

https://sites.google.com/site/steveyegge2/ancient-languages-...

This is really a huge, unnecessary wart in the language.

The other wart I pointed out in the post is that Perl 5 does dynamic parsing of its own code (parsing that depends on the value of variables), despite the book complaining about the same issue in shell. In contrast, Raku and Oil are statically parsed.

----

I have a lot of respect for Perl, but there's a reason that Raku exists (and again from the FAQ: Raku and Python 3 are both worse shell-like languages than their predecessors.)


> Perl's references are basically pointers. ... This is really a huge, unnecessary wart in the language.

Call me crazy :), but I count this as a pro for perl! It makes thinking about data structures in perl very "regular" for me.. there's a few rules to learn, but then I find it simple to construct (or de-construct) nested data structures while applying those rules. As opposed to other scripting languages, where the difference between "value" and "reference" frequently seems to be more hidden from sight...


> In contrast, Raku and Oil are statically parsed.

I don't know about Oil, and if I understand your use of "statically parsed" correctly, Raku is not statically parsed. A "use" statement can affect how Raku parses source code from there on, see e.g. the OO::Monitors module that adds a "monitor" keyword.

But even simpler, adding an operator changes the grammar. For instance, adding a postfix ! operator for factorial can be as simple as:

    sub postfix:<!>(\value) { [*] 1 .. value }
    say 5!   # 120


OK interesting, I guess the question is if "use" requires running (not just parsing) arbitrary code? I guess if it's like Python's "import", it does.

Changing the grammar could still be considered static parsing, as long as the change doesn't depend on the values of variables at runtime, e.g. your argv array or something.

I recall Larry Wall saying that Perl 5 was at times confused about the language it was parsing, and the goal was to fix that in Perl 6. I don't have a lot of experience with it, but yeah that claim could be wrong, or at least un-nuanced.


The process of exporting symbols is done in an EXPORT subroutine that can be provided by the module developer. This subroutine is supposed to return a Map of symbol names and what they refer to. This Map can be constructed depending on external factors such as an argv array or an environment variable, although I have yet to see this in the wild.

So I guess one could say that Raku parsing is usually static, but it does not need to be.


I find that Ruby combines the best features of Perl and Python in that respect. It makes shell-like scripts really convenient and easy while still providing the means to add structure.


Both Ruby and Python are too slow for me, and neither is Lispy enough.


Slow in what regard, for what use case?


Performance is not a factor in shell scripts where the majority of the work is done by C-programs and the script merely shuttles parameters to the desired places.

It's not as if bash were fast, mind you. Lisps already exist.


nim!

It has full macros, even!

and it's super, super, fast


Nim's not Lispy either, and macros alone are not enough. To make macro use feel natural because they're not much different than using the core language, the language needs to be homoiconic, which to my knowledge Nim is not.

In any case, macros are not what draw me to Lisp, but its unparalleled combination of ease of use, readability, and power... along with 70 years of development and tooling.

So I could just use a Lisp or Scheme, which I do. They can be plenty fast too (see Chicken, for instance, which -- like Nim -- compiles down to C).


It's lispy in what it can do, not in having a rat's nest of parens, true.

I'd love a concrete example of a macro you could express in lisp and not in nim.

But its macros absolutely operate on a syntax tree - refer to https://nim-lang.org/docs/tut3.html#introduction-the-syntax-...


I agree. Generally I find the approach to have a few tools that work well in their specific niches to be superior to having one tool for all jobs.

I use a shell for shell stuff, I use a scripting language for most other stuff, and I might use a systems level compiled language (or a scripting language that calls into a compiled library) for more performance specific needs. If you're already within a specific area and only need to venture into the other for a minimal aspect of the current project, it can be useful to stick with what you're in, but you quickly reach the point where it's better to choose a better tool because of diminishing returns from using a tool for something it's not good for.

Maybe the hammer in your hand is fine for prying a single board off a fence or wall, but if you're going to be doing it to ten or twenty boards, walking over to the shed to get the crowbar will save a lot of time and effort in the end.


> the approach to have a few tools that work well in their specific niches to be superior to having one tool for all jobs.

Exactly, that is the point of shell.

Shell scripts are meant to invoke Python or Ruby programs.

sibling comment: https://news.ycombinator.com/item?id=24875932

earlier comment: https://news.ycombinator.com/item?id=24083764


Have you tried xonsh? It's lovely. If you have python, you can install xonsh. It's packaged in apt, brew, conda, and probably more.

It's pretty seamless to flow from bashlike pipelines to pythonic imports and functions.

Autocomplete out of the box is better than bash, comparable to fish or ipython.

https://xon.sh/


If your shell scripts are self-contained pieces of logic, you can use any language to write it. If you want to leverage external programs available in the shell, shell scripting is still the best glue language.


Python has a REPL but it isn't sufficient to be a shell. Therefore, I can't write a program in the Python REPL just as I'm doing stuff normally and copy it into a file for further editing and polishing. I have to port whatever I did in the shell to whatever other language I'm using, which is more porting than I want to do for most shell scripts, especially ones which are mostly pipelines.

Heck, Python isn't even a good language for one-liners. Not saying it should be, but it further demonstrates that I can't use Python live the way I can use zsh live.


How is python not a good language for one liners? One liners are literally the pythonic way.


By one-liners I meant things like Perl one-liners:

    perl -pe 's{<b>([^<]+)</b>}{<em>\1</em>}g' < foo.html > out.html
Python would never let you get away with treating it like an improved sed.
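
For comparison, the closest Python equivalent is a small script rather than a flag, which is exactly the point; a rough sketch (the file names are arbitrary):

    # Roughly the same substitution as the perl -pe one-liner above;
    # run as `python3 bold2em.py < foo.html > out.html`.
    import re
    import sys

    for line in sys.stdin:
        sys.stdout.write(re.sub(r'<b>([^<]+)</b>', r'<em>\1</em>', line))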


(author here) I do use Python. In fact I wrote something like 30K lines of Python for Oil.

The question isn't shell OR Python.

My shell scripts call Python scripts, many of which I wrote myself. That is working as intended.

They also call C programs, C++ programs and R programs. And put JavaScript programs in various places.

----

I guess this is a huge misconception about shell that I have to write a blog post about.

https://news.ycombinator.com/item?id=24083764

I'm often asked this about Oil [1]: Why do you want to write programs in shell?

That's not the idea of shell. The idea is that I write programs in Python, JavaScript, R, and C++ regularly, and about 10 different DSLs (SQL, HTML, etc.) And I work on systems written by others, consisting of even more languages.

I need a language to glue them together. A language to express build automation and describe deployed systems. Most big systems consist of more than one language.

Shell is the best language for that, but it's also old and crufty, with poor implementations.

When you program in shell, gcc, git, pip, npm, markdown, rsync, diff, perf, strace, etc. are part of your "standard library".

-----

If you want some concrete examples, look in the Oil repo. There are dozens of shell scripts that invoke custom tools in Python, R, and C++.

https://github.com/oilshell/oil/tree/master/benchmarks

For example, to generate this report on how much we speed up Python by translating it to C++: https://www.oilshell.org/release/0.8.3/benchmarks.wwz/mycpp-...

The tables are manipulated by R code, and shell/Python/CommonMark generates the HTML.

-----

Another example is that the release page is essentially a huge shell script: https://www.oilshell.org/release/0.8.3/

This page and all the linked pages are generated by: https://github.com/oilshell/oil/blob/master/devtools/release...

------

If you don't understand why shell should call Python and C, I recommend reading The Art of Unix Programming

http://www.catb.org/~esr/writings/taoup/html/

https://www.amazon.com/UNIX-Programming-Addison-Wesley-Profe...


> When you program in shell, gcc, git, pip, npm, markdown, rsync, diff, perf, strace, etc. are part of your "standard library".

I don't think this is a good characterization at all... Shell does not have git functionality (for example) built in. That is a dependency just like it would be for a python or rust project.


What I mean is that shell speaks paths, pipes, and processes natively, and you can creatively use tools that are "just there". Example: http://www.oilshell.org/blog/2017/09/19.html

It's an analogy, not a precise statement. It could be made more precise by using Oil as the center of a container-based "semi-distro", i.e. a distro that does everything that's not hardware related.

I guess a little like the complement to what CoreOS was doing (or is?). (This is a project I've been thinking about for awhile; anyone should feel free to contact me if they've done something like that or have ideas. It's related to the dev tools problems described in the blog post.)


> I'm sorry to say that for me, shell scripting has lost. All my scripts are written in Python, with a thin shell script wrapper to launch them.

I've seen, and still see, plenty of Python scripts launched by a shell script.

9 out of 10 times, Python just causes more problems than it solves, and problems that are trivial to solve with plain old shell scripts.

The only reason Python "won" is because we see a constant inflow of rookie developers who so far have used practically only Python, and that's pretty much the only thing they know and the only tool of their trade. They see all problems as nails, and thus insist on using their little hammer all the time without even looking at the current infrastructure or even the toolbox.

I have seen shell scripts launch Python scripts. I have seen powershell scripts launch Python scripts. I have seen npm launch Python scripts. Hell, I have seen Python scripts launch Python scripts by firing up a Python interpreter as a child process.

Python did not win. Python outnumbered. It did so because people who don't know better happened to know Python.

Don't let that be the reason why you make terrible technical decisions.


This is ridiculous. Python won scripting because it offers a sane way to do sequential, shell-ish things, without having to wade through "man bash" or searching stack overflow for the umpteenth time about the syntax to do something that should be trivial but is anything but.

Saying that younger devs only know python is like a FORTRAN engineer in the 90s saying young devs only know java. No one needs to apologize for growing up learning better mature readable shit.

The number of gotchas and tricky nonsense in bash could (and probably does) fill books (array indexing, string comparison, quoting, toggling 'set -e', many more). I don't doubt that there are clever grey beards that are wizards that know the arcana. That doesn't mean arcane should be what you build an engineering culture around.


> This is ridiculous. Python won scripting because it offers a sane way to do sequential, shell-ish things, without having to wade through "man bash" or searching stack overflow for the umpteenth time about the syntax to do something that should be trivial but is anything but.

You've inadvertently supported exactly my point.

You just advocate for the lazy way out. You know Python, so that's what you use and nothing more. God forbid you check out a reference. I mean, your laziness leads you to believe that having docs.python.org on speed dial for a dozen different major or backwards-incompatible releases is OK, but oh, God forbid you look up a single man page of a tool that has existed since the beginning of time.

Ridiculous, and all of this just because you believe you know Python, and that's all you have to offer.

Shell scripts are standard, omnipresent, reliable, and readily available infrastructure. There's no way around it. There's no excuse, least of all laziness and refusal to learn, which in this field is outright incompetence. The only reason someone refuses to use shell scripts for this sort of job is dereliction of duty and outright incompetence, and that is not winning in anyone's books.


So you say we should choose the hard way, because the easy way is lazy?

Also, your point can be turned around - you could use Python, which produces readable and maintainable code, but instead you choose the easy, lazy way of using bash, which you already know really well.


Cross-platform Powershell (specifically with Azure though) has become oddly enjoyable.


I'm curious what scripting tasks you do with python? I've found bash to be more than enough for everything I'd like to script, except for maybe stuff with heavy json processing (still doable with `jq`)


Anything with a non-trivial program flow, e.g. "do thing A, B, or C, and optionally mixin arguments D and E, with caching in between the shared steps."

Lots of deploy scripts look like this.


Many things that talk to APIs and send/receive JSON. Also, I greatly prefer argparse.

jq is nice but also a system dependency.
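
As a rough sketch of what that looks like with only the standard library (the URL is a placeholder):

    import argparse
    import json
    import urllib.request

    # argparse for flags, urllib + json instead of curl | jq.
    parser = argparse.ArgumentParser()
    parser.add_argument("--url", default="https://example.com/api/items")  # placeholder
    args = parser.parse_args()

    with urllib.request.urlopen(args.url) as resp:
        items = json.load(resp)
    print(len(items), "items")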


I've actually started to transition my shell scripts to eLisp for better integration in to Emacs and eshell.

As a Lisp, eLisp is not the greatest, but I'd still much rather use it than Python.

I also don't want to sit and twiddle my thumbs while a Python script takes its sweet time in loading. Slow startup time is the kiss of death for most shell scripts.


Python startup time is slow for you?? It's basically instantaneous on every machine I've ever used, unless there are some seriously poorly written imports.


You find Emacs to have a faster startup time than Python?


I'm not the original poster, but judging from the answer it looks as though they'd already have an Emacs session open and can execute it without any extra startup-time; in which case the answer is yes, because there's no startup time.


Doesn't TCL (and tclsh) handle these top four features quite elegantly? I understand why people shudder at TCL, but they are mixing up the elegance of the language with the dysfunction of the situation it was pulled in to address.

Yes, when you start gluing random stuff together it gets ugly. This is true in real-life too. That's the situation.

TCL, from the "commands and strings" angle, handles this quite elegantly with a kind of Haskell-like power. And you can just use it as your shell, because it is designed to run commands! And if you are on macOS or Linux, there's a good chance you already have it installed. So go give it a try.

I think perl comes at this from "the other end," making a general-use language into a shell-convenient package, but I don't have enough experience on it to comment. Nonetheless, I suspect the horrors of perl are somewhat similar in nature: Too many blackbox systems that must be glued together.


TCL has a very simple model of a shell "REPL": interpret string as expression and evaluate.

In contrast, a Unix shell has job control, stderr, pipes, etc.


I’m struggling to understand how you don’t get these features with tcl / tclsh.

You get job control, familiar access to your file system and the usual conventions around command execution of binaries in your PATH (like ps, kill, etc), you have stderr, you have pipes...

...but you also have a language that is immediately polyglot because of the way it uses strings. Each string reference is analogous to a file-descriptor; a function pointer that can be redirected and executed in whatever context, while also feeling like native context scopes in any languages you might want to use to interpret/compile them due to their use of the {} delimiter.


But you can't connect programs together via pipes from within TCL, except by doing a terrible hand-rolled implementation that does half the things that bash gives you for free. And arguably, the ability to connect programs together like that is the most powerful feature of any modern shell.


`open |myProgram` returns a file-descriptor for a pipe (that you can explicitly open for read, write, or read/write), and `exec ls |grep {myPattern}` will pipe the output of ls into grep...okay nevermind I do see your point.

    p1 | p2 | p3 > file
is certainly nicer for interactive use, although I would argue that a hand-rolled pipe operator to give you the same in tcl comes with some pleasant fp benefits, and it’s only a few lines to define it to get the above, or supply alternative syntax like

| {p1} {p2} {p3} {[open file w]}

or if you want to also define the redirect,

| {p1} {p2} {p3} {> file}

where each portion of the pipe can be much more clearly manipulated in-stream, and with whatever interpreter, before returning the pipeable.

This gets especially nice for gluing when you want to run scripts in multiple interpreters but they are short and you want to maintain the context for readability between multiple execution contexts:

    | {node -e {
        console.log("start")
      }}
      {DSL1 {
        MyDSLScriptHere
      }}
      {grep {whatever}}
      {> file}
Sometimes the extra context hurts readability, but sometimes it’s invaluable.


OSH is probably a good incremental improvement over bash, but I also enjoy using the significantly more tradition-breaking Powershell with its object oriented nature.

It feels a lot more like programming and actually gives you useful suggestions right inside the terminal!

On Linux the auto completion behavior is luckily less obnoxious than on Windows and it doesn't have the multi second startup delay either.


As a software developer, not a sysadmin, I have real trouble with Powershell. They didn't tradition-break enough and it's like some bastard child of .NET and Bash. I think most people have an easier time understanding https://www.cs-script.net.

It's powerful but the syntax and semantics are maddening.

It's going to be entrenched in the Windows world for the next 20 years and prevent anything better from coming along.


Yeah, I had trouble with PowerShell’s auto-magic array handling more than once. I see how it can be convenient in the REPL, but it’s just terrible when writing scripts.

Your array logic returned just one element in this folder? Why, we’re unwrapping that!


I've been using Powershell for a few years now, there are some minor warts, but those can fit in a single page. I found it rather nice for shell scripting.


I use and recommend powershell as well but only because it is best in class at interacting with Windows components. I think there are better general solutions.


I should turn this into a FAQ, but PowerShell is natural on Windows, where the OS provides objects (either via the .NET VM, or COM and .DLLs, etc.)

A Unix shell like bash or Oil is natural on Unix, where the OS uses text files. And in distributed systems where data is JSON, YAML, XML, protobuf, msgpack, etc. not objects.

So basically shell is a "situated" language, and the ease of accomplishing any given task depends a lot on the environment.

Windows is more tightly coupled and trying to provide something nice for you. Unix is messier but doesn't limit you, and it's what basically all big systems are made of these days. A major strength of shell is to glue things together that nobody thought should be glued together.

---

Good example here from Paul Bucheit: http://www.oilshell.org/blog/2020/01/simplest-explanation.ht...

http://paulbuchheit.blogspot.com/2009/01/communicating-with-...

> However, we needed a way for Gmail to make money, and Sanjeev Singh kept talking about using relevant ads, even though it was obviously a "bad idea". I remained skeptical, but thought that it might be a fun experiment, so I connected to that ads database (I assure you, random engineers can no longer do this!), copied out all of the ads+keywords, and did a little bit of sorting and filtering with some UNIX SHELL COMMANDS

> I then hacked up the "adult content" classifier that Matt Cutts and I had written for safe-search, linked that into the Gmail prototype, and then loaded the ads data into the classifier. My change to the classifier (which completely broke its original functionality, but this was a separate code branch) changed it from classifying pages as "adult", to classifying them according to which ad was most relevant. The resulting ad was then displayed in a little box on our Gmail prototype ui. The code was rather ugly and hackish, but more importantly, it only took a few hours to write!

----

Also, use whatever's best for you, but Unix people tend to hate PowerShell:

https://medium.com/@octskyward/the-woes-of-powershell-8737e5...

> PowerShell feels like it was built by people who had heard about command lines a long time ago and tried to recreate one based on stories passed down through generations of their ancestors

The fact that they cargo-culted operators like -eq, -le, -lt, while not maintaining compatibility is just silly to me. In Oil it's "x == y" for string equality.


> I connected to that ads database (I assure you, random engineers can no longer do this!), copied out all of the ads+keywords, and did a little bit of sorting and filtering with some UNIX SHELL COMMANDS

One of PowerShell's nifty tricks includes being able to walk a SQL DB like it's a file system; see https://docs.microsoft.com/en-us/sql/powershell/navigate-sql...

PowerShell is insanely flexible.


There's nothing inherent about an object-based design that limits what you can glue together. Yes, it might require a bit more effort to transform object formats, but today's untyped shell also creates problems that don't exist in a typed environment.


> JSON, YAML, XML, protobuf, msgpack, etc. not objects.

JavaScript Object Notation doesn't map well to objects?


An object should have identity, state and behaviour. JSON only encodes state.


On the other hand, for interoperability, passing around behavior is horrible. It's either a security risk or a compatibility (forward/backward) risk or both.

We should try to pass around just state and keep objects strictly for code.


Right, I'm not really familiar with PowerShell, but my understanding is that the objects in object pipelines are literally .NET objects with methods on them.

So the entire shell script is confined to the .NET VM?

In that case, I would hesitate to even call it a shell in the traditional sense.

Shell has a kind of code <-> data <-> code <-> data architecture, i.e. programs in different languages processing standard language-independent data formats (lines of text, JSON, HTML, QSN, etc.)

It's more like functional programming, where functions stand alone. (And note Oil is in a very OO style, because it deals with significant program state, so I'm not against objects. Right tool for the right job.)

-----

Other questions:

What if I want to pass some data to R and plot it? Or throw away some outliers with a little formula? I now have to figure out how to serialize those objects. Or do I have to write an R interpreter for the .NET VM? :)

What about splitting the pipeline over two different machines? I can do "ssh user@host find / -type f" trivially in shell. And I do this in practice, e.g. in Oil's continuous build: http://travis-ci.oilshell.org/jobs/

"Distributed objects" have proven to be a bad idea. "Real" (large scale, deployed) distributed systems are architected more like Unix than Windows.

The point of shell is to integrate disparate tools, so if there are some privileged tools in the .NET VM, and then some other tools that require a lot of work to get at, then that misses the point IMO.


I think you need to go actually spend some time with powershell before you make more assumptions about it and its behavior.

> Shell has a kind of code <-> data <-> code <-> data architecture, i.e. programs in different languages processing standard language-independent data formats (lines of text, JSON, HTML, QSN, etc.)

Powershell natively understands these with builtins. You can convert from and to json, xml, html, csv, and excel.

> What if I want to pass some data to R and plot it?

Just pipe it to your R program like you would any other shell.


No, have you seen what kind of mishmash of a schema most systems output? They'd be back to text processing anyway, just making sure it returns the expected values.


Hadn't heard this story. Wouldn't they already have had some sort of ad relevance engine for Search?


> feels a lot more like programming

When I want to program a script, I can use a scripting programming language like Python with its object oriented nature.

When I want to interactively command my computer I can use a shell like bash or zsh.

Do one thing and do it well.


There's an enormous grey area in-between them though, and something that bridges that area can be insanely powerful. Making it easy to go from doing something by hand to fully automating it creates value. It's one of the main strengths of a command line interface.


I wouldn’t call it enormous — Python can be used as an interactive shell (there’s even a toy “operating system” that uses it as the only text shell), and traditional shells can write (simpler) versions of software that’s normally made in Python https://news.ycombinator.com/item?id=23643096


To clarify: I didn't mean the tools, they do indeed overlap in functionality. I meant the activities you referred to, "program my computer" and "interactively command my computer".


> Do one thing and do it well.

That's how we got into today's mess.

Do the things people want in a coherent, cohesive, well-thought-out way - don't combine N independent programs that can barely talk to each other with stringly typing and ad-hoc parsing...


Sounds like you’d prefer Python.


I don't think you'll find universal agreement that python is "coherent, cohesive, well thought out"...


Well, I've done Python for decades :-)

But I would be OK with

(a) shell programs that talk to each other through pipes with something structured -- e.g., depending on the program, something like a (binary) pandas dataframe, a parsable tree (e.g. JSON-like), or CSV output and input -- as opposed to the random text people try to parse with cut -f, regexes and so on (a minimal sketch of this is below the list)

(b) shell programs that share the same flags for the same things (not e.g. -o/--output for output in one program and -w/--write-file in another) and only specialize in the behavior they need...

and several other things besides...
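
To make (a) concrete, a minimal sketch of consuming structured output instead of scraping text; pip is just a convenient example of a program that can already emit JSON:

    import json
    import subprocess

    # No cut -f or regexes: the producer emits JSON and we parse it as JSON.
    out = subprocess.run(["pip", "list", "--format=json"],
                         check=True, capture_output=True).stdout
    for pkg in json.loads(out):
        print(pkg["name"], pkg["version"])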


Agreed... PowerShell is actually interesting and it feels a bit more like you're inside a shell-like REPL rather than just a fancy shell.


It's interesting until you run up against speed limitations both left and right. I still use it for some work and enjoy it, but it is by far the slowest technology I've used.


Believe it or not, you have to disable the progress bar to speed some things up. Downloads alone are at least 10x faster with the progress bar disabled.


File reads and writes are also glacially slow with all the common cmdlets. Even if you use .NET in your code and make it far more verbose, it CRAWLS.

Someone on here previously showed a trick to get some speedups, but it is still slow.


Are there any transpile to bash languages?

I like the idea of a language that allows for a better experience in writing shell scripts -- and produces shell scripts with fewer bugs on the output side.

But I wonder if a transpile approach wouldn't be better in the long run


(author here) Yeah, I've been asked that before, and there are some things you can fix with a transpiler, but some things you can't.

That's why the tagline to Oil is now "our upgrade path from bash to a better language AND RUNTIME".

The shell runtime needs to be fixed too, e.g. the error handling mentioned in the blog post and that I follow up on here: https://news.ycombinator.com/item?id=24872986

You might be able to compile that to some really long bash, but I don't think it works in general, because editing deployed scripts to fix stuff is useful (yes in production, but also on your own machine. Think of all the shell that people put on it, like virtualenv, nix-shell, rustup, etc.)

You don't want to end up with the JavaScript problem: you have a dynamic language, but ALSO A BUILD PROCESS, which is basically the worst of both worlds.


So is the normal usage path to start using osh for interactive use and then use osh in a bash compatible mode to write scripts that are less buggy but can still be run on bash?

Or is it more common to write scripts that tend to then require the osh runtime exist in whatever context they are deployed to?


Right now, I use osh interactively in a very bash compatible mode.

Then at the top of my scripts I put this if I still want to run with bash:

    shopt -s strict:all 2>/dev/null || true
Or this if I don't need to run with bash anymore:

    shopt --set oil:basic
Example: https://github.com/oilshell/oil/blob/master/test/spec.sh#L9

docs in progress: https://www.oilshell.org/release/0.8.3/doc/oil-options.html

Having the OSH runtime exist everywhere is something we should work on: https://github.com/oilshell/oil/issues/463


What do you think produces the traditional ./configure scripts that you run to configure an open source project prior to building it? Did some human write these ./configure scripts manually?


M4 doesn't really satisfy:

> better experience in writing

It will happily transform a small mistake into garbage. Then you have to debug both the transformed result and the original script to match up what you intended.


Autoconf isn’t flexible enough for general purpose scripting though, is it? So not really relevant I think.


Tangentially, you can use Dhall to get many types for Bash. You can have your options/configuration typed, then have Bash run a loop over it or whatever you need to do.


Two ergonomic issues in POSIX/bash not mentioned that I’d like to see a new shell improve on:

- splitting pipes to separate flows (e.g one flow for stderr, one for stdout). Think of it as a graph.

- native parallelization; I have to look in man every other time I use parallel.


Yes, absolutely! That's been on the table for a long time, and we even have a nascent xargs from a contributor (that has languished -- I need help!)

Let's discuss on the issue if interested: https://github.com/oilshell/oil/issues/843

Oil is easy to prototype (it's just Python code), so anyone who wants to make this feature their own should feel free to dive in :)

A lot of people have ideas, which may or may not be implementable with fork(), wait(), pipe() and dup(). Anything that can be done safely with those syscalls is basically viable.

The syntax is generally the easy part compared to the runtime execution. Oil has first-class blocks already which will help a lot.

----

Related: I also want to fix redirect and here doc syntax (in a compatible way):

https://github.com/oilshell/oil/issues/841

https://github.com/oilshell/oil/issues/832


The first is interesting. I have a hard time imagining what that syntax would look like. You'd need a way to refer to your graph nodes, so I don't think you would get much better than what named pipes (aka FIFOs) already provide.

Totally agree about GNU parallel. The syntax is not badly-designed IMO, but I don't use it very frequently and it's different from everything else so I often get it wrong. Using it as a drop-in replacement for xargs is easy enough though.


> native parallelization; I have to look in man every other time I use parallel.

Can relate, the syntax is very strange, at least to me. And I wonder where it came from.


> - splitting pipes to separate flows (e.g one flow for stderr, one for stdout). Think of it as a graph.

Could you elaborate a little on what you mean by this? bash already has the ability to redirect STDERR and STDOUT independently, and when you introduce named pipes into the mix things can get really fancy (probably too fancy, this is about where your script stops being a script and starts being a program written in perhaps the most inconvenient language available.)


I imagine an easy/intuitive way to set up FIFOs for stdout/stderr with redirection, maybe? A way to say: for this following section, I want stderr to go through "|sort -n|logger -t DEBUG -f -" and standard out to go through (... other pipeline)?


command 2> >(sort -n|logger -t DEBUG -f -) | other pipeline


I am very sympathetic to this. I like the concept of a shell (there's a place for light scripting that doesn't involve python or perl or ruby), and appreciate what bash can do, but after encountering enough of the "gotchas" and just ugly behavior, I'm not really motivated to master it. The biggest reason to learn it seems to be the fact that everybody uses it and has done so for years.

We need something akin to Jupyter notebooks for unix.


Yes, I would like for that to happen:

Shell as an engine for TUI or GUI https://github.com/oilshell/oil/issues/738

C++ code should be embeddable in another program https://github.com/oilshell/oil/issues/822

I think you can already use bash in Jupyter, but the integration is necessarily kind of coarse.

I don't know exactly what a good integration is, but I'm looking for people interested in UIs to help me figure it out. At the very least, you can use Oil's parser to provide completions:

http://www.oilshell.org/blog/tags.html?tag=interactive-shell...

(which is a separate integration point from running commands)

Also, I mentioned in the blog post that there's some interest from maintainers in combining the fish shell and Oil. However someone has to do that work, and there's a lot of it!

https://github.com/oilshell/oil/wiki/Where-To-Send-Feedback

----

Also, I use shell scripts as a productivity tool like a Jupyter notebook:

http://www.oilshell.org/blog/2020/02/good-parts-sketch.html#...

I think we need to come up with a clever name for it -- some people call it "Taskfile", "go script", and I call it "run.sh".

The basic idea is that I don't remember how to do things with computers, because there are too many things to do.

I just remember WHERE I WROTE IT DOWN in a shell script. Some people write their notes in text files; I write them in executable shell scripts.


> I don't know exactly what a good integration is, but I'm looking for people interested in UIs to help me figure it out.

Ignoring the language bit, I'm quite impressed with both AutoIt and PowerShell GUI integration. They really allow you to create script-like utilities.


The biggest reason to master the shell (I think) is that you will discover a shocking number of things you can quickly accomplish by piping between the standard Unix binaries (essentially the standard library).


IMO as soon as you feel the need to create a file for your shell commands, it’s time to use another language. Bash is great for quick jobs like running ffmpeg on every file in a folder or counting the lines in a file but it sucks as a programming language.


It sucks as a programming language because it's a shell. No, I've written rather large scripts that were robust but had no need to be written in a programming language. If you're gluing programs together, shell scripts are literally designed for that. If you're doing lots of arithmetic and huge amounts of variable logic then, yes, you should switch to a programming language. Each tool has its place, but I've seen people with this attitude spend a day writing a program in Python to solve a problem we could do with a couple of shell functions in half an hour.


^ and the Python script likely has many more bugs.

One of the best things about gluing together battle tested Unix programs is that you'll only have bugs in your glue, not in the tricky logic that's more error prone.


I haven't tried it personally, but there is a notebook-style terminal that might interest you, aptly-named Shell Notebook.

https://shellnotebook.com/


There are some good ideas in here, but `--qsn` is described[1] like this:

> Print filenames ONE PER LINE. If a name contains a newline or other special char, it's QSN-encoded like 'multi-line \n name with NUL \0 byte'

This is a bad idea. There is a solution which already works. Just terminate every filename with NUL and you're done. Trying to make something "sorta kinda human readable" is a mistake when dealing with what is effectively arbitrary byte sequences which can contain anything except for NUL. Consistency makes the consuming code much simpler.

That said, QSN is a great idea for human input.

[1] https://www.oilshell.org/blog/2020/10/osh-features.html#safe...


The paragraph right below that mentions that Oil has "read -0", which consumes the find -print0 input.

I also link to my Git Log in HTML post [1] from 3 years ago, which is ENTIRELY about the NUL byte solution :)

-----

Shell scripts can use both formats, but the advantage to QSN is that it preserves the line-based nature of shell.

Say I want to use wc -l, awk, or grep. Then the QSN-lines format is better than the NUL format.

Also, you can transmit a series of, say, SHA256 checksums in binary format (or some other binary format) with QSN-lines.

Even entire JPG files, audio clips, wasm files, whatever. It's "8-bit clean" (and so are Oil strings).

[1] https://www.oilshell.org/blog/2017/09/29.html
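To make the wc -l / grep point concrete: assuming some command that prints one QSN-encoded filename per line (call it emit-qsn here, a hypothetical stand-in rather than a real tool), ordinary line-oriented tools keep working:

    emit-qsn * | wc -l         # one line per file, even if a name embeds a newline
    emit-qsn * | grep '\.jpg'  # line-oriented filtering stays safe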


> The paragraph right below that mentions that Oil has "read -0", which consumes the find -print0 input.

Yeah, I read that ("read -0", for the record, is an excellent idea). QSN for filenames is still a bad idea. Your wc, awk, grep etc. commands will now have to decode the stream after splitting it. Taking it to the extreme, it's like mixing JSON and XML (or CSV and TSV) because some things work better in one or the other.

Right now, even in Bash, I can use newline-terminated tokens and accept that newlines are going to screw up everything (which is fine if I control the filenames), or I can use NUL-terminated tokens and force my code to handle absolutely all possible inputs. No extra parsing library necessary.

As long as filenames can contain any character except slash and NUL, terminating them by NUL is a simple solution (at least when using GNU tools).
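For reference, the NUL-terminated version needs nothing beyond existing (GNU) tools; the /backup/ destination is just an example:

    find . -type f -print0 | xargs -0 cp -t /backup/

    # or consume it in a loop:
    find . -type f -print0 |
      while IFS= read -r -d '' f; do
        printf 'got: %q\n' "$f"
      done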

> Shell scripts can use both formats, but the advantage to QSN is that it preserves the line-based nature of shell.

Except when your tokens can contain newlines, which we're stuck with for the foreseeable future.

> Say I want to use wc -l, awk, or grep. Then the QSN-lines format is better than the NUL format.

Can't agree, for the reasons above.

> Also, you can transmit a series of say SHA256 checksums in binary format with QSN-lines, or some other binary format.

> Even entire JPG files, audio clips, wasm files, whatever. It's "8-bit clean" (and so are Oil strings).

That's a completely different problem set. You could use existing simple encodings like base64. "Why use QSN over base64" would be an interesting blog post.

In any case, I admire the courage to try to improve the state of the art! I would absolutely love to see something better than POSIX/Bash as the baseline. At this point I suspect we'll need to ignore those and go for something radically different like PowerShell to regain sanity.


A bunch of points:

(1) Except when your tokens can contain newlines, which we're stuck with for the foreseeable future. -- not sure what you mean here, because QSN strings are defined not to contain literal newlines. They're escaped like '\n'.

The invariant of QSN is: EVERY BYTE STRING, including those with newlines and nuls, can be represented on a single line. I guess I should put this in the documentation.

(2) A QSN decoder is very easy to write. For example, here's a ~6 line regex that validates all of QSN:

https://github.com/oilshell/oil/blob/master/qsn_/qsn.py#L498

True, you need like ~20 lines of code to decode it, but that's very easy too. You can also make a QSN decoder from a JSON string decoder. It's basically changing your tests and moving a few statements around.

(And yes, QSN is a regular language [1])

(3) Oil should grow [2] an awk-like dialect [3] that understands QSN and QTSV.

(4) Although you also don't have to decode it for it to be useful.

1. I can use wc -l on a stream of QSN strings

2. If I know the strings are QSN-encoded, I can search for NUL bytes with fgrep '\0'

It's basically like the UTF-8 philosophy. ASCII is valid utf-8. QSN lines are lines of text.

(5) base64 is bad for humans at the terminal, because it makes everything unreadable. QSN preserves all printable ASCII and unicode.

-----

I don't think it's going to solve all problems, but it will solve some. It's there if you need it. In typical Unix style, the solutions will be heterogeneous. You can absolutely use the NUL format in Oil, and it now has support for it.

[1] http://www.oilshell.org/blog/2020/07/eggex-theory.html

[2] That is, if I get help :)

[3] http://www.oilshell.org/blog/tags.html?tag=awk#awk


> (1) Except when your tokens can contain newlines, which we're stuck with for the foreseeable future. -- not sure what you mean here, because QSN strings are defined not to contain literal newlines. They're escaped like '\n'.

The tokens in question are filenames, which are (and will continue to be) allowed to contain newline characters. Sorry that wasn't clear. This was never meant to be a critique of QSN, I'm just saying QSN is unnecessary and unnecessarily complex for between-process communication:

- In case of filenames, just use NUL. It already works, is supported by a bunch of tools, and will never have the overhead of encoding/decoding.

- In case of arbitrary binary streams, just send the stream unaltered. Why encode/decode it? What's the benefit? In the very rare case that I'm actually manually inspecting a stream of bytes (debugger/printf) there's always some easy way to encode those bytes for readability, such as `printf '%q\n'`.

> (2) A QSN decoder is very easy to write. For example, here's a ~6 line regex that validates all of QSN:

It would take me at least a day to write enough test cases to ensure that that regex in fact does what it says. But the point is moot: encoding and decoding should only happen at the interface with a human, not with another process.

> (3) Oil should grow [2] an awk-like dialect [3] that understands QSN and QTSV.

I couldn't comment; I don't use awk unless I absolutely have to. Once I need to use awk a shell script is no longer the best tool for the job, except for quick once-only processing.

> (4) Although you also don't have to decode it for it to be useful.

> 1. I can use wc -l on a stream of QSN strings

I can already use `wc --files0-from=-` on a stream of NUL-separated tokens.

> 2. If I know the strings are QSN-encoded, I can search for NUL bytes with fgrep '\0'

You can already `grep` for NUL bytes[1].

> (5) base64 is bad for humans at the terminal, because it makes everything unreadable. QSN preserves all printable ASCII and unicode.

But filenames are neither ASCII nor Unicode - they are bytes, which can include 0x80 through 0xFF (not valid ASCII) and things like 0xFF (not valid UTF-8).

I still can't see a single use case where I'd want inter-process communication to use QSN. Human input, sure, I'd love something more convenient than `$''`. Human output, also sure, humans can't read NUL characters or tell the difference between spaces and tabs after all.

[1] https://superuser.com/a/612336/2259


If you've ever used JSON between processes, then QSN is the exact same idea. JSON strings are also encoded (but they can't represent byte strings).

There are probably people who never use JSON, and that's fine, but there are a lot more who do! The goal is absolutely for tools like grep to adopt QSN support. [1]

Another argument is that in networking framing, you have a few choices:

(1) delimiter-based (NUL bytes)

(2) escaping (QSN with \)

(3) length prefixed (netstrings)

So Oil supports #1 and #2 now. Oil doesn't support netstrings but it actually does make sense, and avoids encoding/decoding. Though QSN encoding can be made extremely fast, just like it's been done with JSON [2].
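Roughly, for a 5-byte payload like "hello", the three framings look like:

    #   delimiter-based:   hello<NUL>
    #   escaping (QSN):    'hello'       (special bytes written out as \n, \0, \xff, ...)
    #   length-prefixed:   5:hello,      (netstring)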

[1] GNU grep actually has the ASCII/binary detection problem that was brought up on HN a while ago. Honestly it would be worth making a "slow correct UTF-8/QSN grep" that avoids this.

https://unix.stackexchange.com/questions/19907/what-makes-gr...

[2] https://github.com/simdjson/simdjson


> If you've ever used JSON between processes, then QSN is the exact same idea. JSON strings are also encoded (but they can't represent byte strings).

I don't see how they are anything like the same idea. JSON is an object serialization format. These objects may contain strings, and the encoding of those strings unfortunately is not binary-complete. But most importantly objects have structure. QSN is a binary string encoding format, and has no structure outside of the sequence of bytes. In any case I'm not sure why you keep bringing up JSON.

On the other hand, if it wasn't for the fact that JSON is so strongly tied to JavaScript, QSN might've been a good string encoding format for it.

> There are probably people who never use JSON, and that's fine, but there are a lot more who do! The goal is absolutely for tools like grep to adopt QSN support. [1]

I know what the words mean (and I've been using JSON and grep for years), but I don't understand what that sentence means.

> Another argument is that in networking framing, you have a few choices:

I don't know what "networking framing" is. If you mean actual network packet structure, I don't expect you'll be able to convince a single network engineer that QSN is a better choice than netstrings.

In any case, I'm going to have to shut down this thread now. I don't expect anyone else is reading this, I'm not learning anything new, and I can't convince you that QSN is a bad idea for IPC.


I think having both is a good idea. Sometimes you might get your list of path names from a here-document, for example.


For sure we need both human- and machine-readable stream formats, but I don't really see how QSN has any advantage in that space over `printf '%q\n'` and NUL-terminated strings, respectively.


I actually considered that, but it doesn't work.

https://github.com/oilshell/oil/wiki/Shell-Almost-Has-a-JSON...

Single quoted strings in shell can't represent arbitrary strings either.

And the %q format is actually different from the format ${x@Q} emits. QSN is a well-specified format.
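Concretely, on a recent bash the two produce different quotings of the same string, something like:

    $ x='a b'
    $ printf '%q\n' "$x"
    a\ b
    $ echo "${x@Q}"
    'a b'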


I love the direction this wants to go in. So many of these are reasons why we've banned new shell scripts on our projects and instead make python scripts.


Are there ever good reasons to choose a shell script over a Python/Ruby/etc script?


Working with large numbers of files or analyzing the content of large files. Not only will you be faster gluing commands together with pipes, but tools like find, grep, sed, awk, cut and so on will also perform much, much better than interpreted languages.

Also, I'd never prefer using the stdlib of Python/Ruby/etc for system commands like cp/chmod etc. It takes way more boilerplate code than a one-liner.


You can without too much work build a Python DSL which makes stuff like executing those oneliners as easy as x("cp", "foo", "bar"). And then you never have to worry about quoting command arguments again.


Certainly not impossible but I don’t often have to deal with large amounts of files in circumstances where the performance hit matters to anyone. And while I know sed, awk, etc are powerful, I don’t know them & I’d have to learn how to use them, whereas I do know Ruby & Python.

At the very best this is an edge case where one would have reason to choose shell scripts over scripting languages.



> It won't work. It would be like trying to convince people who are paid to write PHP not to write PHP. Many people have wasted breath on that, but important sites like Wikipedia are still written in hundreds of thousands of lines of PHP.

If they're gonna keep writing bash, why would a new shell language help?

> Even if a new line of shell never gets written, there will still be a huge installed base of shell scripts that you may need to understand (e.g. when they don't work).

If the old scripts are in bash, why would a new shell language help?

> Shell is still the best tool for many jobs. Most new "cloud" projects rely on Linux system images, in VMs or containers, and shell is the ideal language for creating such images.

What's good about shell for this task?


Because you can run your bash scripts with Oil. The tagline is:

It's our upgrade path from bash to a better language and runtime. [1]

It's basically the same as JS -> TypeScript, or PHP -> Hack. It's a saner language (and runtime) that runs existing code.

-----

Shell is good for creating Unix systems because of the tools it provides. Ones that deal with the file system and heterogeneous processes (i.e. stuff you didn't write in different languages).

It's hard to explain, but if you work in that area, you'll very quickly see it. You could also do something like Linux From Scratch [2] and it will be very clear why shell is used.

[1] http://www.oilshell.org/

[2] http://www.linuxfromscratch.org/


Seems like a big part of why shells are so much better for certain tasks is a failing of more general-purpose languages: working with the filesystem, spawning processes, and orchestrating I/O streams. Shells could almost be considered DSLs for these things, plus some session / state management.

So I think a good question is this: why can't we make these things equally easy to do in a more general purpose language?


Because on the shell side you want simplicity and on the programming side you want control. Shell pretty much has to default to "foo" meaning "execute command foo from $PATH" or almost every line will have useless overhead. GP language needs to make that explicit or it will be a footgun where you don't know what's a command / variable / function. And that's before we get into how the file descriptors / redirects are handled.

One shell which tries to merge those is ipython with the sh profile https://ipython.readthedocs.io/en/stable/interactive/shell.h...


TCL, Powershell, and Rebol are probably the closest I've seen in this area.


Ok well #1 does not apply to me because I don't write a lot of shell. Ditto #2. So it would be better to understand your case for #3, when is shell the best tool for the job when writing a new script? Why?


Python is a bit clumsy by default when the primary purpose is to run other programs. But, it is easy to write a "run()" wrapper function that works more like a shell. There are modules like "sh" as well to smooth over the bumps.


Just in case people are looking for something really new in a shell: besides Oil, check out fish shell. The killer feature for me is its autocompletion -- it's basically psychic.


Fish is amazing as an interactive shell. I've had it as my default for a couple years now. Oil isn't the only game in town for object shells, there is also NuShell and Elvish which follow the fish model of abandoning POSIX compatibility for ease of use. I think OP had mentioned it elsewhere on this thread, but I've found object shells mostly useful for scripting that involves a lot of calling and processing the output of programs that produce structured data. I mostly use PowerShell for this, though Nushell/Elvish/Oil seem to be actually usable now. I'm pretty excited that the Unix community seems to finally be moving on from defending the old unstructured text model of program interaction to embracing the benefits of object shells. Whatever you think of its implementation, PowerShell is a great idea.


can't recommend fish enough, the autocompletions and the intelligent history scrolling absolutely made the difference to me.

probably an unpopular opinion but I really like fish scripting more than bash, the syntax is way more intuitive and I don't find myself missing bash at all.


Yes, fish and Oil are very complementary (see the note about fish in the blog post)


How about Powershell

https://github.com/PowerShell/PowerShell

I can't call myself a fan of powershell but if everyone switched I'd get used to it.

sh and bash both seem like they should die in a fire. They are full of foot guns that end up costing millions in breaches and lost data. The space thing even bit Apple back in the day: they had an OS upgrade script that ended up deleting your entire hard drive if there was a space in the volume name.


> sh and bash both seem like they should die in a fire. They are full of foot guns

Right, that is the point of Oil. The post describes 4 footguns that you can now avoid. Moreover, you can run your existing shell scripts with Oil first, and gradually move away from the dangerous style.

Rewriting even 500 lines of shell in Python is not a pleasant task (even if you think Python is better for the task).


PowerShell is pretty damn awesome once you get the hang of it. The way objects are passed via pipe allows for really cool things like get-foo | where {$_.bar -eq 1} and you can serialize the output of most commands with get-foo | convertto-json. I use this feature all the time. Get-ADObject | convertto-json will give you a dump of your entire AD.


PowerShell is a bit better, but it's got its own disadvantages:

Despite the consistent naming, I can't correctly remember the module names.

Running Remove-NetFirewallRule without arguments deletes all the firewall rules despite the "-All" flag not being set.

The learning curve is too steep, or it's too complex to learn on the job: it takes me at least half an hour to write semi-complex commands, despite the number of hours I've already struggled writing commands with it.

And it still sucks as a scripting language.


> And it still sucks as a scripting language.

Why do you think that? I find PowerShell quite neat and more consistent than what bash and the like have to offer.


It does "resume on error" by default.

It got script languages issues: The last line of your huge script may be syntaxically invalid. Here you start thinking "why I didn't wrote it in my favorite statically compiled language"

Yes, it's still more consistent than what bash and alike has to offer, but it's still behind if you compare it to any language that you can write to a file.

The only advantage it got against them, it's the huge API on windows.

Sadly I wanted to use Powershell functions from C#, tought it would be easy because Powershell use .NET Core, but it's far from easy.


I usually avoid scripting in bash or sh and use Python instead on the Linux side. But yeah, PowerShell is a pretty solid shell and seems more intuitive than bash/sh. The object-oriented part is pretty awesome and lets you introspect objects easily so you know what kind of properties and methods are available.


The Simple Word Evaluation on its own is a significant improvement.

bash and posix sh will continue to have their place where portability is necessary, but this seems like a better day-to-day tool.
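For anyone who hasn't been bitten by it, a minimal illustration of the footgun that Simple Word Evaluation removes (bash behavior shown; per the Oil docs, the unquoted form behaves like the quoted one under simple_word_eval):

    file='my file.txt'
    rm $file     # bash word-splits: rm gets two args, 'my' and 'file.txt'
    rm "$file"   # what you meant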


New shell or not, I hope we see updates to existing shells to do better error handling too.


(author here) Yes, I actually proposed that in the comments [1]. Someone should add command_sub_errexit to bash:

    echo $(date %x)  # can you tell what's wrong here?
    echo 'script should fail before this'

    local d=$(date %x)
    echo 'script should fail before this'
And process_sub_fail:

    diff <(sort left.txt) <(sort /oops/error)
    echo 'script should fail before this'
As mentioned in the comment, bash doesn't even wait() on process subs!

The 'run' builtin to fix the "if myfunc" problem should also be adopted by other shells.

POSIX basically calcified broken language semantics and needs to be fixed.

I also document stuff like shopt -s simple_word_eval for other shell implementers: https://www.oilshell.org/release/0.8.3/doc/simple-word-eval....

-----

BTW Oil's code is very short (see comment), and I'm looking for help :) Let me know if you can't get a bin/osh working in 1 to 5 minutes. (If you have a Debian/Ubuntu-ish machine it should take about 1 minute.)

This is a pure Python program that you can quickly modify/prototype, and then for the release it's translated into C++ for a 30-50x speedup [2].

https://github.com/oilshell/oil/wiki/Contributing

[1] https://lobste.rs/s/qfiki1/four_features_justify_new_unix_sh...

[2] http://www.oilshell.org/blog/2020/01/parser-benchmarks.html


I still have high hopes for Lush, the Lisp Universal SHell[0].

[0]: http://lush.sourceforge.net/


Is this shell actually written in python2 in the year 2020?

(see: https://github.com/oilshell/oil/tree/master/Python-2.7.13 )


Some fun idiosyncrasy going on

> It's written in Python, so the code is short and easy to change. But we automatically translate it to C++ with custom tools, to make it fast and small. The deployed executable doesn't depend on Python.

Love it ;)


These things might justify making a non-Bourne shell, but they don't justify making a greenfield shell: the fish shell already has all these features, and since we all benefit by reducing shell fragmentation, I think the author ought to work on hacking on fish more before making yet another shell.


Shall I declare what you "ought to work on" in your free time too?


fish and Oil are very complementary (see the note about fish in the blog post)


Fish is incompatible with bash...


Every shell that changes how quoting works is incompatible with bash


There are different degrees of breakage and you should read his blog.

One of the modes is compatible with 99% of the bash scripts used out in the wild.


Yes, you're right. A closer examination of Oil yields something much more interesting than I imagined.



