The classic Unix horror story (wsu.edu)
80 points by nar on Aug 20, 2010 | 53 comments



I've never actually been bitten by rm (or done anything remotely this badass) but in college I did have a bad Unix moment:

Late at night, I was tarring up a finished project right before emailing it to my professor. I ran:

    tar -cvf *.cpp *.h something.tar
instead of:

    tar -cvf something.tar *.cpp *.h
It happily wrote most of my project into a tar file on top of the first source file, then complained that something.tar didn't exist. I overwrote "array.cpp" or something like that, and had to rewrite it (this was before I used source control for anything, and a large part of why I learned to use source control).

Writing this comment, I had to look up the order of the parameters to tar; I still can't remember. Stupid tar.


> Writing this comment, I had to look up the order of the parameters to tar; I still can't remember. Stupid tar.

The -f (--file) flag takes an argument. So you could do:

    tar -cf *.cpp -f something.tar

It is easier to remember the order if you know what the flags mean.


It's true, and I'll probably remember it now. But since I use the -f flag every single time, it would make a lot more sense for tar to say "you asked me to archive a file that doesn't exist and ends in .tar; I bet that's what you meant the -f parameter to be" and fix it for me.


I'd rather not have my tools perform magical heuristics to try to figure out what I meant to do.

The "-f <filename>" flag is perfectly clear, and useful in all modes (c, x, etc). If you remember "tar czf" as a magical incantation, then you're missing something vital. When you understand what each argument means there is no ambiguity. I don't mean to sound condescending, but in my entire life I've never encountered the problem you described.


It is too late to edit my above comment but I should point out that there is a typo there. That should be:

    tar -c *.cpp -f something.tar


> Writing this comment, I had to look up the order of the parameters to tar; I still can't remember. Stupid tar

    ; tar c *.cpp *.h >something.tar
    ; tar c dir | gzip >something-else.tgz
Problem solved! Also:

    ; tar c dir | gzip | nc -q0 -lp 4000 &
    ; [another computer]
    ; nc $otherhost 4000 | gunzip | tar x


Man, that is scary. These days, with Dropbox, GitHub, and Time Machine, I rarely worry.


I think personal experiences with shell command typo disasters and hard drive failures are probably the most common, and most effective reasons why people start using version control and backups religiously. The Unix command-line in particular is dangerous: incredibly powerful when used right, but one slight mistake and you cut off a finger.

Fortunately, they are only virtual fingers and we can grow them back on demand with adequate preparation, unlike real ones. :)


Version control is also fantastic because you can think a lot less about making drastic edits. Gone are the days when I would keep bits of code commented out "just in case" I needed it back at some point.


Mostly, although I take issue with "never leave commented code around; that's what version control is for" as an absolute. If something isn't adequately replaced and is likely to save someone time in the future, it's better to keep it commented out than to remove it, since future devs may never know it even existed.


Is there a "social news sites classics" directory yet? I would love to see reposts of oldies no longer justified due to a good directory of these things, prominently linked.

Pretty please do not create such a directory in this comments thread.


http://catb.org/~esr/jargon/html/ The Jargon File would probably be the best index of the so-called "classics", although it hasn't been updated in years.


Too narrow. I was referring to stories that crop up on social news sites over and over, not UNIX arcana.


Man, how many times is this story going to be submitted?


As often as it's still badass.


I get goosebumps every time I read it, and I read it every time it is submitted.

You'll know that "head-in-hands" feeling if you've done it before, and it is quite the experience.

That's why this story will draw you back every time.


I might have misunderstood the story, but if this happened to me today on my gear I would simply turn it all off, connect that half-erased hard disk to another machine, mount it, and pull all the data off. And only after a lot of googling to see if some of the lost files could be "undeleted".

Salvaging the disk by recreating a minimalist system on it is heroic and hacky, but was that the only way?


When the story mentions "Alternatively, we could get the boot tape out and rebuild the root filesystem" and "VAX", then "a lot of Googling" was out of the question.

I'm too young for the VAX era, but was there a single hard disk that could be easily swapped? And if they had another machine (which the Ethernet comment suggests they did), would it have had spare connectors? And free disk space? How fast did files copy back then?

If so, they'd still be faced with shutting both down (knowing they couldn't start this one up again, and what was the procedure for shutting down / starting the other?), then being without both while they concocted a bodge recovery, then telling everyone to use the other machine and how to find their work - assuming it could take that many extra users and they had enough terminals for that.

That's sounding like a day's downtime of two systems and several days of people disruption followed by more disruption when they had to move users back.


Back in the dinosaur days I worked with DG minis running AOS/VS, a Multics descendant. From time to time an operator would run the equivalent of cd /; rm -rf, halting everything. The neat thing about the delete command was that it would zap the PMGR (peripherals manager) quite early on; unable to communicate with the disks, the delete would fail. Generally one could then locate an older version of :PMGR to rename, then recover. Failing that, one could boot off a "systape" and recover.

Not as hard-core as the recovery through emacs, but greatly welcome under stress.


I don't agree with the following comment:

"Great programmers spend very little of their time writing code – at least code that ends up in the final product. Programmers who spend much of their time writing code are too lazy, too ignorant, or too arrogant to find existing solutions to old problems. "

I mean so... who's writing the code then? Who's writing the original code? I wouldn't call a clever programmer a great programmer. Just clever. And sometimes clever can get you into trouble. But just sometimes :)


They write code for things for which they can't find an existing solution. It's not that they don't write any code; they just don't write very much, and what they do write is good enough to be found and used by others.


My terrible UNIX, or really SunOS/Solaris moments have been:

  o typing 'halt' into the wrong xterm, shutting down well planners
  o and likely on the same network, jacking up the NIS+ on Solaris
It's been a constant itch that I never did manage to figure out what I was doing wrong that kept NIS+ from restarting correctly.


My marquee move is to shut down the ethernet interface of a box I'm SSHed into.


I used iptables to block all incoming traffic. While SSHed into the server. Thankfully, Linode's out of band console works (slow and painful, but it worked).


I did this as a result of following a (poorly-conceived) iptables tutorial.

'Step one: configure iptables to deny traffic by default: iptables -P INPUT DROP'

There's also the time that I learned why you should think twice before setting a default policy of DROP or REJECT - iptables -F will clear all rules that allow you in, but not the default policy that keeps you out.
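
Roughly, the safer ordering (from memory, so double-check it before trying it on a remote box) is to allow yourself in before flipping the default policy, and to reset the policy before flushing:

    # let established connections and SSH in *before* tightening the default policy
    iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
    iptables -A INPUT -p tcp --dport 22 -j ACCEPT
    iptables -P INPUT DROP

    # and when clearing things out, reset the policy first so -F can't lock you out
    iptables -P INPUT ACCEPT
    iptables -F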


I've started using CSF instead of raw iptables, since by default it adds the IP you installed it from to its whitelist. Along with its "testing" mode, which clears the iptables rules every 5 minutes by default, it's pretty hard to lock yourself out with it.


funny but that sounds like a rite of passage when learning how to configure iptables :)


kill -9 0, forgetting of course that I'd su'ed. Long drive to the colo.


I still remember my "rm -fr /" episode after almost 20 years. It was slightly different, something related to chmod'ing some files under /etc on an AIX server:

   $ cd /etc
   $ chmod 600 some_file * # instead of some_file.*
oops... Thank God for boot diskettes.
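
One habit that helps with this kind of stray-space typo is making the shell show you the expansion before running the destructive command, e.g.:

    $ echo chmod 600 some_file.*    # prints the expanded command without running it
    $ ls -d some_file.*             # or just check what the glob actually matches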


I was playing with FUSE, and had managed to mount a loopback filesystem. And then, for some ill-considered reason, I tried to delete the directory that I had mounted it inside.

I managed to Ctrl-C it before it ate any files in /home, but I still had some nasty cleanup work to get the computer back in working order again. Thank god for LiveCDs.


In the same spirit, I once removed a chroot tree, forgetting I had bind-mounted /etc into it.

It took me half a day to recover from that.

Another one, even more stupid, was restoring a "backup" of /var that had been made without the proper permissions. I noticed the problem only after a lot of weird errors started cropping up. This one I did not recover from - after a few hours, I decided that reinstalling Linux on my machine would be faster and more reliable.


I've found it's useful to look through the man pages of commonly-used tools for an option to prevent traversing filesystems.
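
For instance (these are the GNU spellings, and the paths are just placeholders; other systems may differ):

    rm -rf --one-file-system /some/chroot     # GNU rm: don't descend into other mounts
    find /some/tree -xdev -name '*.tmp'       # find: stay on the starting filesystem
    tar --one-file-system -cf backup.tar /    # GNU tar: skip other mounted filesystems
    du -x /some/tree                          # du: count only this filesystem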


Oh, how many of us (including myself) have experienced such things, and yet we're unwilling to admit that there is something fundamentally wrong with tools that just do what we tell them to.


They have their uses. When the Unix user group at my college sent out invitations to a social event, they concluded with "... and if you don't want to come, log on to Unix and type rm -r *".


I'd say this is a classic Unix success story, told in the context of an almost horror. It shows how a serious understanding of Unix-foo can save the day when someone makes a mistake.


While on a team of students at a hacking competition, I was VNC'ed into an owned Windows Server 2003 box, trying to get to the network settings properties menu, when my VNC session lagged and I clicked at the same moment my mouse reached the "Disable" option for the interface.

Yeah ... I didn't live that one down, lost my team some points.


I've seen that done on live remote Windows servers at work - lag causing someone to hit "disable" on the network connection, or "bridge" on the network card, or "shutdown" instead of "restart".

One reason I really prefer text commands: the sequence is preserved, so they can be typed and "queued" to execute in order when they arrive.


This reminds me of the fateful day when I finally stopped working as root all the time.

That happened right after typing

    cp backup.tar.gz /dev/sda
That was fun. Also, back then I had no idea how easy it would have been to at least get the contents of that backup file back using dd.
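
Something along these lines, assuming the tarball landed byte-for-byte at the start of the device (the sizes here are made up):

    # cp wrote backup.tar.gz starting at offset 0 of /dev/sda, so read back
    # at least that many bytes; gzip stops at the end of its own stream
    dd if=/dev/sda of=recovered.tar.gz bs=1M count=200   # 200 MiB: comfortably bigger than the backup
    tar tzf recovered.tar.gz                             # should list the contents despite trailing junk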


The Unix Haters' Handbook is worth looking up. I think it is out of print but the PDF is available for download.

My favourite from USENET is cleaning out .o files after compiling C code, fat-fingering the SHIFT key, and instead of typing

    % rm -f *.o

typing

    % rm -f *>o

which gives you an empty directory and a zero-length file called "o".

ah yes....


This makes Unix's orthogonality really clear. I'm fascinated that the system was still able to run after taking that kind of a hit. Imagine what would happen to Windows (immediately) if half the system (including key system binaries) were destroyed.


I'm going to guess that if they're in use you won't be allowed to delete them, and then you'd be in a similar situation - things like Notepad would be missing, and various programs won't start due to missing dependencies.

In fact, we can see from Youtube:

Ubuntu: http://www.youtube.com/watch?v=D4fzInlyYQo

Windows XP: http://www.youtube.com/watch?v=0aSo8-VDS8E

XP appears to hold up slightly better - it doesn't ruin the fonts, and it pops up a System File Protection dialog indicating it has noticed the broken system files and asking for an install CD to recover them from.


My overall point is that Windows (in my experience) breaks at the slightest registry change or missing DLL or config file. Linux/Unix try to be as loosely coupled as possible, which means you can still run your in-memory gnuemacs (as in the article) even when your /bin directory (presumably with supporting binaries for emacs) gets wiped out.

That kind of design takes a lot of thought and, as far as I know, doesn't hold true in DOS/Windows.


My personal favourite is "rm -rf * .swp" instead of "rm -rf *.swp" or such.


One time I did something like

$ chown -R terra_t.terra_t /

on a production system. Fortunately I was able to recover the permissions of most of the system files from the rpm database, and do the rest by hand...
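
If I remember right, the popt aliases for that look roughly like this (exact spellings vary between rpm versions, so treat it as a sketch):

    # reset owners/groups first (chown can clear setuid bits), then the modes,
    # all from the values recorded in the rpm database
    rpm -a --setugids
    rpm -a --setperms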


An old story but still shocking.

One should be careful making such damage threats, though; you may end up with a male appendage replacing the rabbit's foot on someone's keychain.


Thank god Windows has the stupid and annoying file locking!


Windows has its own share of del /s or format c: stories. Once I did del /s * instead of del /s *~ to clean up temp files.


frequent automated offsite backups are your friend


I once did the old "rm -rf *" in my home directory in college. It wasn't a system I used a lot, so I wasn't missing much, but I went ahead and asked for a restore from tape anyways. I didn't hear anything for 3 days. Finally I caught our head sysadmin in the hallway and asked about it. He said, "the good news is: we're getting a new backup system." The bad news? "Your files are at about gig 7 of the 6 gig tape."

Even this week I've reminded people at work that if you don't test your backup (and restore) system, then you don't have a backup system.


Yes, but too bad that frequent offsite automated backups were still at least a decade off in 1986. At the very least, the network bandwidth just simply wasn't there. Heck, at this stage of the game, 1Mbps was considered to be blazing fast on a LAN. Even Internet backbones were 56k.


i know for a fact that backups were possible in 86 and that's all that would be needed to turn an accidental deletion into a non-disaster

but you're right, networking speed/bandwidth has gotten better since then, but that's irrelevant and I never claimed it hadn't. my point was that there is a simple, well-known solution/palliative for this, so no need for this kind of drama going forward.


the article clearly describes how they used backups to turn an accidental deletion into a non-disaster, in 1986.


yes, and it's funny how your statement is consistent with my assertion that backups existed in 86, and that backups are a good thing. i love "arguments on the Internet" sometimes! :)



