Goto in Bash (2012) (bobcopeland.com)
93 points by davidcollantes on Aug 3, 2023 | 58 comments



This reminds me of the Thompson shell goto, which is an external(!) program that messes with its parent's file descriptors looking for `: label`.

See manpage at https://etsh.nl/man/_goto.1.html and source code at https://github.com/eunuchs/tsh/blob/master/goto.c.

See, history can give you a more inert syntax. And maybe a new way of thinking about how to make this thing... I would love to have a more robust version to do C-style goto cleanups.


:(1) was also an external program at least in v2 unix


Since he mentions this being for a work thing, how I've handled a similar situation (long running shell thingies) is:

Break up the script into steps, and place them in order in a dot-d style folder, like 1.foo.sh, 2.bar.sh, 3.baz.sh. Have a separate _context.sh which supplies the upfront variables or whatever are needed for each step. Ensure that each step can run on its own with just the context sourced. Then have a run.sh in the folder above that sources the context and the steps in order.

Now, that said, typically these kinds of things are dataflow or even buildsystem-adjacent, so I've also experimented with doing them in CMake (intermediate products managed with add_custom_target) or as Nix derivations (intermediate products are input-addressed in the Nix store), but I'd love to be aware of other tools that are better for expressing this kind of thing.


I'd write run.sh like this: Put the names of the files from that dot-d style folder into some ".cc" file (if it doesn't exist yet); then loop over the lines in the ".cc" file, executing each line if it doesn't start with '#' and then prepending '#' with sed to the just-executed line. After the loop, delete the ".cc" file (maybe with a diagnostic message if all lines in it were #-prefixed). Maybe throw in some traps for reliable SIGINT handling as well.
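
A minimal sketch of that runner, for what it's worth. The function name `run_steps`, the `[0-9]*.sh` naming pattern, and running each step with `sh` from inside the steps folder are assumptions made for illustration; the ".cc" progress file is as described above. It rewrites the progress file through a temp file rather than `sed -i`, and leaves out the SIGINT traps.

```shell
#!/usr/bin/env bash
# Sketch: resumable step runner driven by a ".cc" progress file.
# Assumptions (hypothetical): function name run_steps, steps named
# like 1.foo.sh, each step run with sh from inside the steps folder.

run_steps() {
    local dir=$1 progress=$1/.cc line n=0 tmp

    # First run: record the step files in order.
    [ -f "$progress" ] || (cd "$dir" && ls -- [0-9]*.sh) > "$progress"

    # Read the list on fd 3 so steps keep their normal stdin.
    while IFS= read -r line <&3; do
        n=$((n + 1))
        case $line in '#'*) continue ;; esac      # already done
        (cd "$dir" && sh "./$line") || return 1   # stop; progress is kept
        # Mark line n as done by prefixing '#'.
        tmp=$(mktemp)
        sed "${n}s/^/#/" "$progress" > "$tmp" && mv "$tmp" "$progress"
    done 3< "$progress"

    rm -f "$progress"   # every step ran: forget the progress
}
```

Calling `run_steps ./steps.d` again after a failure skips the lines already marked with '#' and retries the step that failed.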


Use the source![0]

1) Use bash PS4 to show the line number currently executing (perhaps prefix the start of each line with #).

    PS4='# ${LINENO}: ' bash -x script

2) Log to a script transaction file.

   Unset PS4 before a function and reset it after the function call, to restart at the function call instead of restarting within the function.

   Trap signals/errors and dump variables to a state file (via the env or set command).

3) Check for the 'transaction file' on restart of the script; if it exists,

   a) restore the saved state file settings;

   b) pick up where to start via sdiff of the original script and the transaction file (aka the first line not starting with #):

    sdiff -v ^'# script transaction_file | sh -

[0] : https://stackoverflow.com/questions/17804007/how-to-show-lin...


I would just use Make, or write a checkpoint file for each step, and wrap them all in if statements so they don't run if their checkpoint is there.

Or, if I were doing it from scratch, I would not have a bash script that takes days to run, that sounds like some Real Programming, and I'd do it in Python, where it's easier to manage complexity, in case there was some more fine grained caching that could be done, maybe even between runs.

Maybe I need to make another coffee or something but I really don't understand at all why they wanted Goto for this.


Hah, I've done something a lot like this. The numbers make the order nice and obvious.

You can also use flag files to track progress and resume where you left off.

reset.sh:

    #! /bin/sh
    
    [ -d DONE ] || mkdir DONE
    find DONE -type f -delete
run-all.sh:

    #! /bin/sh
    
    for s in step-*.sh
    do
        [ -f "DONE/$s" ] && { echo "*** (skipping $s)"; continue; }
        echo "*** running $s"
        ./"$s" && touch "DONE/$s" || { echo "*** $s failed!"; break; }
    done


What's the motivation for this complexity over, say, a more modern programming language that can handle cases like this with ease?


Having it run on (basically) any linux machine without downloading extra languages.

I can see it being useful in certain situations, but I would think most use cases would benefit most from a better language.


I feel like there has got to be a better common denominator if you're willing to stick with just Linux.

Yesterday I was trying to track down why a bash script that worked on pretty much every platform I'd tried and across a bunch of different bash versions was not working on a new operating system. Turns out the script was relying on "non-standard" behavior in tr(1). Definitely not the first thing you think of when you see bash complaining about undefined variables.

A bit further back I had similar fun with FreeBSD's installer now that they've ripped perl5 out of the base system. The installer is, of course, an unholy mix of C and sh.


Well, there is Perl, which is what actually used to be used for "can we do this in something that is not shell but also portable?" kinds of scripts for quite some time. But then again, it's Perl.


Why is this a thing still? Who is out there saying "We want you to develop an app but we won't let you use Python"?

Granted, I really wish there was a python-lts that didn't break stdlib stuff every few years, but it seems like everything remotely modern breaks compatibility constantly...


Not many people are saying that, but there are a few cases where it could make sense. Like if you want to set up something on an embedded machine with no/restricted internet, then bash scripts will be easier.

Some people also like not having to do extra installs to run something simple.

And some scripts are simple enough that it's just easier to use bash.

But I agree, these cases are pretty obscure and most of the time the benefits of using a real language are worth the investment.


He writes "prepare to cringe" and he is not wrong. As far as I understand, this technique implements GOTO by reading the source code (again) into memory, filtering out everything before the target label and evaluating the remaining source code. I think this doesn't preserve state, variables etc. so not really a GOTO. But interesting technique.

edited for clarity


> I think this doesn't preserve state, variables etc.

Why wouldn't it? It's calling eval, not exec.
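
A small sketch of why state survives, with the script text held in a variable instead of being re-read from `$0`. The sample text, the label `foo`, and the variable names are made up for illustration; this is not the article's exact function.

```shell
#!/usr/bin/env bash
# Sketch: "jump" by filtering script text and eval-ing the remainder.
# The script text and the label "foo" are hypothetical.
script='x=101
foo:
x=${x:-10}
result="x is $x"'

x=100   # state set before the "jump"

# Keep only what follows the label, drop label lines, then eval the
# rest in the current shell. No new process is started, so x=100 is
# still visible to the evaluated code.
rest=$(printf '%s\n' "$script" | sed -n '/foo:/,$ p' | grep -v ':$')
eval "$rest"
echo "$result"
```

The `x=101` line sits before the label, so it is filtered out, and `${x:-10}` picks up the value assigned before the eval. With `exec` the shell process would be replaced, and unexported variables would be gone.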


The more frightening part (at least for me) is the reddit thread that points out this is how windows batch files implement goto, and shows how to see it happen.


If that's the scariest thing you know about Windows batch files, you've lived a blameless life.


I’m just a babe in the woods


Does that mean using jumpto to jump to a label above that invocation results in redeclaration of the same code?

Seems janky even by the low standards of the author.


It's a funny trick, but you could probably also use setjmp and longjmp with ctypes.sh :-)

https://github.com/taviso/ctypes.sh


Great heavens! I assumed at first that your "dlopen" was a separate executable, but you are implying you allow calling arbitrary C functions within the memory space of the bash process itself!?


Yes, exactly... :)

You can also create callbacks (i.e. function pointers to bash code) that you can pass to qsort() or bsearch()... or pthread_create? That last one was a joke, I mean, you probably could but I don't know what would happen - I don't think bash is reentrant :)


Wow, I thought the original article was awful though good, then I saw the lseek goto, which seems much worse, and then I see this dlopen for bash and find it's the worst. But impossibly cool.

Did you have a use case for this, or was it just for fun?


Instead of:

   sed -n "/$label:/{:a;n;p;ba};"
I think it's more idiomatic to do:

   sed -n "/$label:/,$ p"
Or even:

   sed "0,/$label:/ d"
Which deletes the label itself, so you don't need the subsequent `grep -v ':$'`, but then you also aren't allowed to put any statements on the same line as the label.
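
A quick comparison of the range form and the delete-through-label form on a made-up script. The sample text and the label `mid` are assumptions; the delete form is written here with the `0,/regexp/` address, which is a GNU sed extension.

```shell
#!/usr/bin/env bash
# Compare two ways of cutting a script down to what follows a label.
# The sample text and the label name "mid" are hypothetical.
label=mid
script='x=100
mid:
x=101
foo:
echo "$x"'

# Range address: print from the label line through the last line ($).
from_label=$(printf '%s\n' "$script" | sed -n "/$label:/,\$ p")

# GNU sed only: delete from the start through the first label match,
# so the label line itself is dropped as well.
after_label=$(printf '%s\n' "$script" | sed "0,/$label:/ d")

printf '%s\n' "--- from label:" "$from_label" \
              "--- after label:" "$after_label"
```

The first result still starts with the `mid:` line, which is why a later `grep -v ':$'` is needed; the second starts at `x=101`. Later labels such as `foo:` remain in both.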


Honestly, I love this.

Precisely because there's too much gatekeeping in programming and learning etc. Give people sharp knives and let them break things


I do LOTS in bash, and I have never missed Goto, haven't even thought about it. And BASIC was my first language...


BAT files writing other BAT files was often the only way to get scripting workflows done on primordial Windows versions, but chaos typically ensued if you attempted to rewrite the file you were presently running, as it appears that cmd didn't do anything fancy like read the whole file into memory before processing it (presumably due to the draconian memory limits people laboured under back then)


This is going off of memory as I'm not at my computer. Bash does something similar. It will read and execute the script line by line so if it's modified before bash gets to that line then weird things can happen. However, functions need to be read completely so the trick is to create a main function and call it at the end of the script.


> but chaos typically ensued if you attempted to rewrite the file you were presently running, as it appears that cmd didn't do anything fancy like read the whole file into memory before processing it

Neither does bash! That’s why you should always wrap things in a function and have the entry point of the script be the end of the file.
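
The layout being described looks roughly like this; the step bodies are placeholders.

```shell
#!/usr/bin/env bash
# Bash parses a whole function definition before running it, so once
# main is defined, editing the file on disk no longer changes what
# the function does. The step bodies are placeholders.

main() {
    echo "step 1"
    echo "step 2"
}

# Entry point is the last line of the file; "$@" forwards arguments.
main "$@"
```

A common extra precaution is to write the last line as `main "$@"; exit`, so that the whole line is parsed before `main` runs and anything appended to the file afterwards is never read.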


> It runs sed on itself to strip out any parts of the script that shouldn’t run, and then evals it all.

How I have done this is:

1. Put all the steps in functions.

2. Have a main function that calls all the other functions in order.

3. If given an argument, the main function skips all the functions up to that one.
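
Those three steps can be sketched as follows. The step names `fetch`, `build`, and `publish` and their bodies are placeholders, not anything from the article.

```shell
#!/usr/bin/env bash
# Sketch: steps as functions, with an optional "start from" argument.
# Step names and bodies are hypothetical.

fetch()   { echo "fetching"; }
build()   { echo "building"; }
publish() { echo "publishing"; }

main() {
    local start=${1:-} step skipping=0
    [ -n "$start" ] && skipping=1
    for step in fetch build publish; do
        if [ "$skipping" -eq 1 ] && [ "$step" != "$start" ]; then
            continue              # still before the requested step
        fi
        skipping=0
        "$step" || return 1       # stop at the first failure
    done
}

main "$@"
```

Running the script with `build` as its argument skips `fetch` and continues from there. Note that in this sketch an unknown step name silently runs nothing.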


Or just use make, and after a failure, clear the error and rerun the command to continue where you left off.

Make is one of the most versatile pipeline tools out there.


Yup I've done that too which also gives you free parallelization and as a bonus if the rules don't have a natural product you can always touch a sentinel file in each rule so that make can just pick up where it left off.

TIMTOWTDI.


If you're going to do that, you might as well go the whole hog and implement INTERCAL's "come from"[1] for maximal evil.

[1] https://en.wikipedia.org/wiki/COMEFROM


Oh, but we have that construct in our modern high-level languages already, it's just customarily been called "catch block" instead.


Hah! But COME FROM is more insidious - no amount of following the chain of calls will reveal that something is lurking elsewhere in the program ready to redirect the control flow


The author mentions in a note that bash was complaining and that they might put the labels in comments to dodge the issue. They might also be able to change the label format to `: label`. `:` just returns true regardless of what parameters you pass it, so it could still look "labelish" without having to use an actual comment.
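
A tiny demonstration that such lines are valid code; the label names are made up.

```shell
#!/usr/bin/env bash
# ":" is the shell's null command: it ignores its arguments and
# returns success, so these "labels" are ordinary commands.
: start
x=100
: mid
status=$?   # exit status of the ": mid" line
echo "x=$x status=$status"
```

One caveat: the arguments to `:` are still expanded by the shell, so labels should be plain words without `$` or backticks.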


That's one of the few places where it would be appropriate to store the current execution point somewhere in /var/cache or /var/lock and write the script so that it would look there at launch and dispatch accordingly.


"GOTO is considered harmful". This kind of thinking just makes me want to run off screaming into a paper bag. Language developers are, in many ways, the modern aristocrats telling us what we can't or should not do. I still miss assembly language.


GOTO originally allowed one to jump across functions, into the middle of the loops, etc.

And nobody stops you from writing in assembly! You'll just spend more time on writing (and debugging) your programs but sure, go ahead, it's not like your time on Earth is finite.


Fwiw, in C, switch allows you to jump into the middle of loops


In C, goto allows you to jump into the middle of a loop [1].

[1] https://stackoverflow.com/questions/6021942/c-c-goto-into-th...


It feels pretty broadly accepted that overuse (or even moderate use) of goto becomes a nightmare for readability. Plenty of modern languages still support labels on loops so that you can break out of multiple layers, which is the only main case I can think of where a goto would be handy.


I created a GOTO mechanism in rxjs once (typescript) but I felt I was doing the wrong thing


"Language developers are, in many ways, the modern aristocrats telling us what we can't or should not do."

This is offensive nonsense and breathtaking entitlement. They're providing you free tools to try to help you, usually not even being paid for their work.

Edit to add: I just got copypasta trolled, didn't I ...


What's the original copypasta?


I have mostly done short things in bash that do not need much flow control, loops, etc.; for most programming I use C instead, which does have goto, and it is sometimes useful even though the other flow controls are usually better. My own programming language designs do have a goto command because it is sometimes useful; the Free Hero Mesh programming language has both goto and gosub, in addition to the structured flow controls (if/then, begin/while/repeat, etc).

The way it is done in that article doesn't seem like the best way to do it; modifying the shell itself to seek within the shell script file seems better in my opinion, or perhaps using the "enable" command to add your own built-in "goto" command (although I don't know if the API supports that). Another message here mentions an external program messing with the file descriptors, but does bash use a consistent file descriptor number for the shell script file, so that this would work?


That's neat but doesn't `case` support fallthrough? So I expect you could just put your script in one big `case` statement and skip to the branch you need.


I didn't think so but it turns out it was added with Bash 4.0 (released Feb 2009). Instead of terminating the case with ";;", you terminate it with ";&" for fall through.

https://git.savannah.gnu.org/cgit/bash.git/tree/NEWS#n1267
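
A sketch of the one-big-`case` idea using `;&`. The function name `run_from` and the step names are placeholders, and it needs bash 4.0 or later.

```shell
#!/usr/bin/env bash
# Sketch: resume at a named step using case fall-through (bash >= 4.0).
# The function name and step names are hypothetical.

run_from() {
    case ${1:-start} in
        start)
            echo "start"
            ;&                  # fall through into the next branch
        mid)
            echo "mid"
            ;&
        end)
            echo "end"
            ;;                  # normal terminator: stop here
        *)
            echo "unknown step: $1" >&2
            return 1
            ;;
    esac
}

run_from "$@"
```

Calling `run_from mid` runs the `mid` and `end` branches only. Unlike the sed trick, this can only jump forward into a branch once per call.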


Make first argument your step/label. Each step can have its own command line. Exec yourself. Accomplishes same thing without being scary.


And I did cringe, and then I thought it looked kinda fun. It would literally never have occurred to me in a million years to try to start a shell script halfway through - so trapped am I in the paradigm of the familiar.

As for the script that takes several days and often breaks half way through... sounds like what Makefiles are for to me.


Thanks I hate it


Some men just want to watch the world burn.


This is silly. Whatever can be done with this approach can be better written with just functions.


after many years, I learned that usually shell scripts are only good for the most basic of uses. here is a similar program in Go that doesn't require any hacks:

    package main
    import "os"
    
    func main() {
       var x int
       switch len(os.Args) {
       case 1: goto start
       default:
          switch os.Args[1] {
          case "foo": goto foo
          case "mid": goto mid
          }
       }
       start:
       x = 100
       goto foo
    
       mid:
       x = 101
       println("This is not printed!")
    
       foo:
       if x == 0 { x = 10 }
       println("x is", x)
    }


Does this also work inside if and while blocks?


This is the best worst thing I’ve ever seen


This was my thought exactly. I am impressed, terrified, and fully expect an intern at a company somewhere to use this to implement recursion.


what have you done…



