Hacker News

I am always impressed how powerful and efficient shell scripts are but the syntax is just a killer. Is there a strong reason why all shells have this crazy syntax? Is it not possible to create a shell with a more expressive syntax like Java, C# or whatever?



I've thought about this a lot, but assuming you want to be pleasant for interactive use, you're subject to a tremendous number of constraints. Just to start with:

    rm foo bar
    rm -rf *
must be invocations of the rm command with the appropriate arguments.

You need bare strings, you need variable interpolation, you need redirects.

I think there's some room to break a few idioms and make something nicer, but it really is a hard problem. The choice to break things has to be based on how common an idiom is, how painful it is to not use it, and how much benefit you can get.
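A quick sketch of why those constraints make shell syntax what it is (filenames here are made up for illustration): bare words are arguments, `$var` interpolates, and `>` redirects, with no parentheses, commas, or quotes needed for the common case.

```shell
dir=$(mktemp -d)                        # scratch space for the demo
printf 'hello\n' > "$dir/greeting.txt"  # > redirects stdout into a file
name=greeting.txt                       # bare string: no quotes required
cat "$dir/$name"                        # $name interpolates; prints "hello"
```

A Java-like call syntax would turn that last line into something like `cat("greeting.txt")`, which is exactly the interactive friction described above.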


I think you've hit the nail on the head there. I've been experimenting with writing my own shell, and the original syntax was a lot more like JavaScript, but having to encapsulate parameters in parentheses, comma-delimited and quoted, was a massive pain in the arse. So I dropped the requirement for quotation marks, then made commas optional, then dropped using periods to pipe (a la methods in OOP), and eventually ended up rewriting the entire parser to follow a more traditional POSIX syntax. As it turns out, as ugly as Bash et al are, the syntax actually makes a lot of sense for writing one-liners in an interactive shell (as you stated yourself).

So where I decided to deviate was areas I felt could be enhanced without breaking the POSIX syntax too significantly:

* shell configuration,

* error handling (something traditional shells are appallingly bad at)

* and support for complex data structures (e.g. so you can grep through items in a minified JSON array as smartly as you can through a traditional stream of lines).
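To illustrate what that last point means using only traditional tools (a fragile sketch, for illustration only - real JSON can contain commas inside values, which breaks this):

```shell
# A minified JSON array is one long line, opaque to line-oriented grep.
# Splitting on commas crudely gives grep one element per line:
printf '[{"name":"alice"},{"name":"bob"}]\n' \
    | tr ',' '\n' \
    | grep bob
```

A shell that natively understands structured data can do this without the lossy text munging.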

I always respected the power of POSIX shells even before embarking on my pet project. However, I never quite appreciated just how sane its ugly syntax was until I attempted to replace it with something more readable. I mean, sure, there are still some specific areas I really don't agree with, but that can always be argued as personal preference.

As an aside, one thing I didn't quite appreciate until writing my shell is just how much shells have in common with functional programming. Yes, I know it's a far cry from LISP machines; but if you take away the variable interpolation and environmental variables, then you're left with a functional pipeline that takes something from STDIN and writes it to STDOUT and STDERR, and does so in a multi-process, concurrent workflow by default. Weirdly, this still makes POSIX shells more efficient for some types of data processing than writing a monolithic routine in a more powerful (or should that be "more descriptive"?) programming language.
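That pipeline model is easy to see in a sketch: each stage below is a separate process, started concurrently, consuming stdin and producing stdout, much like composed functions.

```shell
# Three concurrent processes composed like functions: head . sort . printf
printf '3\n1\n2\n' | sort -n | head -n 2
```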

So there is some surprising elegance amongst all that ugliness.


On the data processing with bash part, Joyent Manta always looked interesting[0]. Reading about data processing via bash makes it sound pretty fun.

[0] https://github.com/joyent/manta/blob/master/README.md


Why don't we have something similar to vi modes to jump between different 'interpretations'? I believe scsh had something to that effect, using chars to switch modes a la quote/unquote.

Maybe something based on Color Forth's ideas?


Because that wouldn't work for copying and pasting code, nor for writing shell scripts. If you're going down that route then really you need a printable character to prefix the command line (or a similar solution).

For what it's worth though, most shells (including my own one) do support switching between different text entry / hotkey modes - such as vi - even if the syntax is consistent.


So how is your shell coming along? Are you planning to release it as open source, or at all? I remember you mentioning it a few times before on HN.


Pretty well, thank you. I now use it every day as my primary shell, but the documentation isn't where I'd want it yet, so I can imagine it would trip new users up in areas where I've chosen to break from POSIX / "Bashisms". There are also a few areas of subtle bugs I'm still running into, but that is to be expected with anything on this kind of scale.

It is already open source, but I'm the only contributor currently. Which is fine, as it's a personal project anyway. So anything beyond that is a bonus.

The readline API I wrote for it is pretty nifty though. I'm thinking of spinning that off into its own repo, since I've been a little disappointed with the existing readline APIs for Go; other people might genuinely benefit from it even if they're not particularly interested in a semi-Bash-compatible alternative $SHELL.


Cool. Good luck with it. Interesting about the readline stuff. Can you share the link to the shell project? I'd like to look at the code, with a view to learning something about the techniques involved, as command-line tools and shells are an interest of mine.


Thank you.

The source can be found at https://github.com/lmorg/murex

Happy to discuss any questions or comments you might have on it.


Thanks. Will check out the code over a period, and ask you if I have any questions or comments.


jq is pretty neat for dealing with JSON in a shell script.


jq is a great tool, but I wanted something more seamless than piping data to jq with a jq script quoted as a parameter. I wanted something where the shell itself could fragment and understand structured data (and not just JSON, but YAML, TOML, HCL, CSV, XML, etc. - though XML is proving a great deal more troublesome than I'd hoped).

I'm not suggesting the work I'm doing is any better than jq though - there are a lot of areas where jq will run circles around my shell. I see it more as different design goals but with a fairly large area of overlap.



Yeah. Personally I hate it. But it's what Terraform uses so sometimes it's handy having another tool to pull bits from Terraform scripts.


In OSH I use the technique of "lexer modes" to handle all the cases you mention:

http://www.oilshell.org/blog/2017/12/17.html

I have a few more blog posts on that topic I haven't published, but hopefully that gives the idea.

I probably used to agree with you (if I'm understanding correctly; it's not entirely clear what the conflict you see is). But after writing the parser in this style, I don't think it's too big a deal.

I plan to use the same technique to parse the Oil language, which is a new language without the legacy. The command syntax is largely the same, but compound constructs like function, if, while/for, etc. are different.

The parser is regularly tested on over a million lines of shell:

http://www.oilshell.org/blog/2017/11/10.html

Note that Python, JavaScript, and Swift all need something like this too, now that they have arbitrary code inside string literals, just like shell:

    print(f'three = {1 + 2}')  # Python f-strings
    
    console.log(`three = ${1 + 2}`)  # JavaScript

    println("three = \(1 + 2)")  # Swift
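The shell equivalents of those interpolations, for comparison (arithmetic expansion and command substitution, both inside ordinary double quotes):

```shell
echo "three = $((1 + 2))"                 # arithmetic expansion in a string
echo "upper = $(printf abc | tr a-z A-Z)" # arbitrary code via $(...)
```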


I call it "failure by success". Unix is so incredibly successful that we are bound by decisions made 40 years ago. There is an unbroken chain from the Thompson shell circa 1972 to modern day bash.

The language wasn't designed; it was accreted over decades.

Looking at the difficulty of the Python 2 -> 3 transition might give you an idea of how hard it is to make incompatible changes, so that's what we're stuck with.

It's a little bit like asking why C has so many warts -- e.g. the insecure standard library, "holes" in the type system like silently converting pointers to bools.

But it's easier to "clean up" compiled languages than interpreted languages, and it was somewhat done with C89, C99, etc. You can have different compilation modes, and the user is more likely to fix things that a compile error flags.

With something like shell, you don't have a good chance to surface errors and clean up corner cases. It just evolved over time until it was out of control.

I'm working on fixing that with my Oil project: http://www.oilshell.org/blog/2018/01/28.html


> Is there a strong reason why all shells have this crazy syntax?

Stephen Bourne was a fan of Algol.

The source to the shell is… interesting. Via a bunch of preprocessor macros, the (C) code looks like:

        IF !letter(*cp)
        THEN    return(FALSE);
        ELSE    WHILE *++cp
                DO IF !alphanum(*cp)
                   THEN return(FALSE);
                   FI
                OD
        FI
        return(TRUE);



We have a little hubris too. I spent years talking about how dumb Perl was. Then I learned Awk, Bash, and Unix, and saw that Perl was a natural and evolutionary power-up with a massive script archive. Perl wasn't dumb; I just didn't understand the design decisions. Bash and many dynamic languages are similar. On another note, when I use Java/C# (languages I have little experience in), I can't help but notice the lack of expressivity. I mean, it takes half a page of code to read a text file and print all lines that say "hello". In the Linux shell this could be:

    cat filename | grep -i "hello"

Someone else will probably point out a better way. It is also only a couple of lines of Python.


> Someone else will probably point out a better way. It is also only a couple of lines of Python.

    grep -i "hello" filename # ;)

> I always can't help but notice the lack of expressivity. I mean it takes 1/2 page of code to read a text file and print all lines that say "hello"

I think you hit the nail on the head there. I've used just about every build system known to man, and we're currently using Cake at work, which uses C# as a scripting language. Everything is 10 times more verbose than the equivalent shell/makefile would be. Things like concatenating SQL files are 30 lines of code instead of one-liners.
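For comparison, the shell version of that SQL-concatenation task really is a one-liner (the filenames below are made up for the demo):

```shell
# Throwaway input files so the example is self-contained:
dir=$(mktemp -d)
printf 'CREATE TABLE a (id INT);\n' > "$dir/001.sql"
printf 'CREATE TABLE b (id INT);\n' > "$dir/002.sql"

# The actual task - glob, concatenate, done:
cat "$dir"/*.sql > "$dir/combined.sql"
```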


It's just

    grep -i hello filename


Haha, thanks! I always forget :)


It takes half a page because you get to choose things like buffering. Error handling is also much nicer.


scsh (https://github.com/scheme/scsh). Plus, it has the best Acknowledgements section of any technical documentation ever (https://scsh.net/docu/html/man.html).


I was glad to look up his CV and see that he's still around. I'm not sure that 90s grumpy-nerd humor has aged well into 2018, but I do appreciate the throwback to a different time.


Alternatively, is there any reason today to not just use a language like Python/Ruby/JS where shell languages would have been used?


Unlike (ba)sh, those languages are not a DSL for manipulating processes and files. (Ba)sh is more accurately framed as an ancient DSL than as a general-purpose language that has now been exposed as subpar.


A shell script is designed with file operators, environment variables, file handle redirection, process control, pipes/ipc, etc as first class primitives. It's just far easier to write a shell script if all you need to do is manipulate some executable programs.


Their interaction with processes is very awkward compared to shells or even Tcl.


Number one reason is availability. Do all of your current and future targets have the right version of Ruby installed? What if you have a mix of python 2.7 and 3.6?


It really depends on target environment. At work? Sure, every single machine comes out of provisioning with Python 2.7 installed. For something that gets distributed to be used by other people? No way I'd trust anything but POSIX, and that's pushing it some days.


I'm strongly considering making it a point to do just that: include Ruby in all the images I can, just to not have to use Bash. For one-liners it's fine, but I don't get ten lines into a Bash script before missing scripting-language features. However, if someone were to try to talk me out of what they consider to be an absurd notion, I would be prepared to take their criticism seriously.


I have multiple ~10 line POSIX shell scripts in my dotfiles. I also have a bunch each of Perl and Python ones [1]. I'd rather not rewrite my shell scripts in either of those languages, or in Ruby (which I like and use a bit).

When scripting, I find general-purpose languages are better when you have structured data (i.e. some JSON coming in) or you want to use libraries, and shell scripting better when you're wiring up multiple programs to work together, or starting up a program after setting up the environment for it.

IIRC, the FreeBSD sh(1) man page was a good entry point for writing POSIX-compliant, thus quite portable, shell scripts.

[1] Compare the ones here: https://github.com/cadadr/configuration/tree/master/bin


Shell is like a LISP with fewer parentheses. And it's inside the text/files/executables environment.


http://ammonite.io/#ScalaScripts - this might look crazy (probably because it is), but it's there, it works, it has static typing and a nice scripting library ( http://ammonite.io/#Ammonite-Ops ), and it can be used as The Shell too.

And now that JigSaw landed with JDK9 and with single file running coming in JDK11(?) and with precompiling bytecode to native, it becomes easier and easier to deploy good quality code as quick and fast scripts, tools, utilities, etc.

Yes, of course, I still use bash, because it's there. But habits are easy to form and very hard to break. (Why re-form an old habit into a new one?)


I will give Microsoft credit: PowerShell is a very pleasant shell language to use. There are a few quirks, but I encounter them far more rarely than in POSIX shells.

That said, my brain does still think in POSIX scripting, so it always takes me a little while to get back in the swing of Powershell.


I gave up on PowerShell when I found out that the return value of a function is the console output. How s....d is that?


Not stupid. Imagine if you wrote:

    get-content file.txt
and the result was the file content visible on screen. Then you were happy with your code and you put it in a function for reuse:

    function test {
        get-content file.txt
    }
and now there's no output. The PowerShell pipeline is not stdout; get-content doesn't display anything on screen - only the output formatters at the end of the pipeline do. If functions had no output by default, you would see nothing, which would be annoying behaviour in the other direction. In PS, braces {} make anonymous script blocks, and they're used often, e.g. in code like

    # square the first 5 numbers
    1..5 | foreach { $_ * $_ }

    # rename some files
    Get-ChildItem | where { $_.Name -notmatch '[0-9]' } | Rename-Item -NewName { $_.Name + "2" }
That would be really annoying if you always had to 'return' results out of script blocks. And it would be annoying if script blocks in functions behaved differently from script blocks in filter scripts or calculated properties.
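POSIX shells actually share this model, which may make it feel less alien: a shell function's "result" is whatever it writes to stdout, captured with command substitution, while `return` only sets the numeric exit status.

```shell
# A function's output *is* its value, much as in PowerShell;
# $(...) captures it, and `return` only sets the exit status:
get_greeting() {
    printf 'hello\n'
    return 0            # exit status, not the "value"
}
msg=$(get_greeting)
echo "$msg"             # prints "hello"
```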


Yeah that's fair.

You can work around it, but you really shouldn't have to.


The common workaround is to pass in a dictionary that's then used for return values. Workable, but seriously? Is that the best they could do?


Do you have any examples? This doesn't sound familiar.

Dictionaries are often used for return values, as they can be cast to PSCustomObject for a quick way to make objects for the pipeline. But in that use case they aren't passed in, and that's not anything to do with the `return` keyword or all output from all commands becoming function output.


That is... not the way I generally work around it or see it.

Normally you just use Write-Host when you need it and otherwise just collect return values inside the function by assigning them to variables.

It's basically solved if you never call a command without storing the return value somewhere. Which is not optimal in a scripting language where you just want to get things done, but it's not the end of the world.


My biggest issue with PowerShell is the lack of a good, web-indexed, resource for exposing what SHOULD be expected in a normal operating environment. It makes looking for features that are expected in a standard library just not worth the effort. By comparison every other language I like programming in actually has a very strong set of documentation in an easily found repository.


What do you mean by "a normal operating environment"?

Different versions of Windows came with different versions of PowerShell. Some can be patched to have the newer language features (Windows Management Framework 5.1 can be installed on Windows 7), but that won't bring in all the same cmdlets as Windows 10 has, because there aren't the required internals in Windows 7.

If you have different things installed (Hyper-V, ActiveDirectory, any big Windows role/feature) you'll have different modules available, and if you install things like RSAT (Remote Server Administration Tools) then you'll get server management cmdlets on workstation operating systems.

It's not so much like you download Python 3.6 and get one standard library everywhere, its origins are more in "companies managing their own Windows server estate", so your environment is not standard, it's whatever environment you have.

The standard library, though, is mostly .Net, so depending on your Windows and PS version, it's ".Net Framework 4.5" (or 3.5 or etc.) for .Net framework library features accessed directly from PS.

https://github.com/powershell/powershell is PowerShell 6 / Core, which is a lot more like downloading Python 3.6 - there is everything a PSv6 environment will have, open source on Github and documentation in https://github.com/PowerShell/PowerShell-Docs


I was going to agree entirely as I had been mucking with Powershell ISE and was surprised that it didn't have their normal standard of documentation, but I did find an okay reference online.[1]

Looking more, it's kind of bad, because they break the sections in the reference up by the modules, and it's utterly mysterious what the modules are for. I mean, they all seem to be for doing incredibly obscure stuff in Windows, which I'm sure is useful, but I can't figure out obvious things like the syntax or types.

This may be because the intended audience is liable to just copy and paste stuff and hammer away at it until it works...

[1]: https://docs.microsoft.com/en-us/powershell/


Over/under on that link working in five years' time?


That is definitely true. Whenever I see a link to MSDN I assume it will probably be dead or redirect me to the homepage.


The fish shell was primarily created to be a shell language with a nicer syntax. I tried it for a few months a few years ago, and while the syntax is nicer, it misses a lot, and I mean a lot, of features. Adding those features would have, of course, complicated the syntax and gotten us back to something like zsh, so... I've been back on zsh ever since.

Really, the only complaint I could make is why choose something silly like `fi` or `esac` to end compound statements, but such an issue is too superficial to abandon everything else that comes with these shells.


Out of curiosity, what features were you missing in fish? I've been using it for a while and haven't noticed anything, but I probably don't do heavy shell scripting. I've heard of plenty of people choosing zsh instead of fish, but I started on fish and never felt the need to switch, so I'm kind of wondering what I might be missing out on.


This could get very long, but to keep it short:

Substitution via `=()`, `<()`, and `>()`. `=()` saves a process's output in a temporary file that I don't have to manage explicitly, and substitutes itself with the path. An example of use is `viewnior =(maim -s)`, which shows in an image viewer a window that I select; `diff -u <(...) <(...)` (or `cmp` or `comm` instead of `diff`) compares the output of multiple processes without having to save them to files; `tee >(...) >(...) | ...` feeds the output of one command to multiple commands without needing to save files.
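For anyone unfamiliar, `<(...)` is worth a concrete sketch (bash/zsh only, not POSIX sh): each substitution becomes a `/dev/fd/N` path, so commands that expect filenames can read command output directly.

```shell
# Compare two command outputs without temp files; diff exits non-zero
# when the inputs differ, so || reports it without aborting the script:
diff <(printf 'a\nb\n') <(printf 'a\nc\n') || echo "outputs differ"
```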

Subshells. I've used them interactively, and I prefer not to have quoting troubles from doing `fish -c '...'` in my commands.

Extended globbing. It's far more concise and less error prone than using a combination of `find` and text processing tools.
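A rough bash approximation of that (a pale imitation of zsh's glob qualifiers, and it assumes bash >= 4 for `globstar`):

```shell
shopt -s globstar                       # enable ** recursive matching
dir=$(mktemp -d)
mkdir -p "$dir/a/b"
touch "$dir/a/b/x.txt" "$dir/top.txt"
printf '%s\n' "$dir"/**/*.txt           # matches at any depth, no find(1)
```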

Dynamic directories. ~some_project/ for me expands to ~/work-for/client/some_client/some_project/ and works great with completion.

What attracted me the most to fish was the ability to work with multi-line commands as a single command in the history. However, I've seen that zsh has better support for this.

What killed fish for me back then was that its pipes were fake. This is something they've since fixed, but at the time, instead of running all commands concurrently with their inputs and outputs linked, fish would (I think) run the first command, save its output to a file, and then run the second command with that file as input. The behaviour I saw, at least, was that I wouldn't see any output until the first command finished, and some of the commands I wanted to run took very long to finish, or were never supposed to finish before I'd seen some of the output. I'm talking about watching file changes under a directory and manipulating the presentation of the output (e.g. `inotifywait -m ... | awk ...`), or looking for specific events in the system log as they happened (e.g. `journalctl -f | grep ...`), or looking for specific system calls of a never-ending process (e.g. `strace -fe trace=file -p $pid | grep ...`). That made fish pipelines useless for me.
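The difference is easy to demonstrate: with real pipes the downstream command sees data as it's produced, so even a never-ending producer works fine.

```shell
# yes(1) never terminates; this only works because the pipe streams:
# head reads two lines and exits, and yes is then killed by SIGPIPE.
yes | head -n 2
```

Under a buffered run-then-feed model, this pipeline would hang forever waiting for `yes` to finish.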

Anyway I wrote a more lengthy post once describing the differences between bash and zsh that I liked:

https://news.ycombinator.com/item?id=16963856

However, there must be more differences between fish and zsh. I can think of history syntax, right now. I can get an argument from any command I've ever typed based on a substring by doing !?substring?%. If I find myself calling two commands in succession multiple times, I can combine them in a single command without retyping them by doing `!-2; !!`. Then I only have 1 command to re-run next time instead of 2. If I typed a command and realize I want to include the last argument of the previous command, I can type `!$` to include it.


>Is there a strong reason why all shells have this crazy syntax

For most things no. They could just as well have proper hashmap and array structures for example, instead of the clusterfuck they have.
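To be fair, bash does have both these days (indexed arrays for ages, associative arrays since bash 4), but the syntax illustrates the complaint; this sketch assumes bash, not POSIX sh:

```shell
declare -A ages=([alice]=30 [bob]=25)   # associative array
names=(alice bob)                       # indexed array
echo "${ages[alice]}"                   # element access needs ${...} braces
echo "${#names[@]}"                     # length: three layers of sigils
```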

Or how about exceptions and try/catch (instead of traps and the like).

Or the side effects of the bizarro [ command (it's a command, not just the syntax for an opening bracket).
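For readers who haven't run into this: `[` really is a command name (historically even a binary on disk, often at /usr/bin/[), which is why the surrounding spaces are mandatory and `[x -eq 1]` is a syntax trap.

```shell
command -v [             # '[' resolves like any other command (a builtin in bash)
[ 1 -lt 2 ] && echo yes  # spaces around '[' and ']' are required
```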

And tons of other things besides -- e.g. better argument parsing.

Better ways to add auto-completion instead of the monstrosities of bash completion and the like.


> (...) but the syntax is just a killer

What's so bad about the syntax? I find it rather beautiful. Can you give a (simple) example where you find the syntax ugly, and how would you prefer it to be?


> more expressive syntax like Java, C#

Really? Those are your best examples? The answer is Python in any case.



