
War story time. Long ago, I worked for an interesting company that insisted on running its entire business on Linux desktops, all the way back in 1999-2002. Imagine running StarOffice/OpenOffice, Thunderbird, Netscape Navigator, etc., for your entire business back in 2000, including your executive team, marketing teams, everyone, most of whom had never even heard of Linux before.

Anyway, this being Linux, everyone's home directory was mounted on NFS. All our builds were standardized with a tool called SystemImager, which we could use to push out updates to everyone's desktop whenever we wanted. If there was a new version of KDE, we could pretty easily push that change out.

Sometimes it was convenient for me to work on updates to these images by chrooting into a directory containing the "image," which was really just an rsync tree. And sometimes, when updating these images, it was convenient to mount our NFS home directories in this chroot environment, so I could access things like an archive I had just downloaded on my own desktop.

And eventually we had lots of different images, and the old ones were using up a lot of disk space, so I decided to free up some space by removing the old ones. These were fairly large images with lots of small files, and this was before SSDs were a thing, so it made sense that deleting them was taking a while, and I stepped out to grab something to eat.

As I was eating lunch, I started getting the tech support escalations. But this wasn't that unusual, our users routinely had problems with the environment we had provided. They hated it, because it was in many ways terrible, and they made sure we knew it. So I wasn't terribly alarmed. I didn't think any major changes had been made, so I didn't hurry back.

By the time I leisurely returned from lunch, half the NFS home directories for our users were gone, along with all their documents, emails, bookmarks, or whatever else. Suddenly it hit me what had happened: at some point, perhaps months earlier, I had left our NFS home directories mounted within one of these image chroots. And now I had sudo rm -rf'd it.

We had backups, but they were on tape, and it took several days to restore, with about a day of data loss.




That sinking feeling and cold panic when you realise what you've done. God that is horrible.


My favorite version is when that UPDATE or DELETE SQL query that you expected to finish instantly takes a few seconds before giving you your cursor back.


If someone just gave me a tool to show the expected wall time of a query before actually running it, I would be quite happy. I would not even need much accuracy; anything within one order of magnitude would be useful, and even within two orders of magnitude I would use it occasionally.
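Not quite the estimator being asked for, but a common workaround (a minimal sketch, assuming PostgreSQL, psql, and a hypothetical orders table) is to run the statement inside a transaction, note the time psql reports, and roll it back. You still pay the wall time once, but you can back out of the change:

  -- in psql: \timing makes it print "Time: ... ms" after each statement
  \timing on
  BEGIN;
  -- hypothetical statement; it really executes, and holds its locks
  -- until the transaction ends
  DELETE FROM orders WHERE created_at < '2015-01-01';
  ROLLBACK;  -- undo it; EXPLAIN would show the planner's cost estimate
             -- without executing, but cost units don't map cleanly to time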



Nobody has ever been able to give me a function from query cost to wall time with any accuracy.


You probably knew this already, and there are probably better solutions if you're not in the manual sysadmin world, but after I did that on a personal machine a few decades ago (I think it was?), I got in the habit of using `--one-file-system` when doing major recursive rm operations that weren't meant to cross filesystems. Or `find -xdev … -delete` for anything more selective.
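For anyone who hasn't used those flags, a quick sketch of both forms (the paths and pattern are made up):

  # GNU rm: during a recursive delete, skip anything that lives on a
  # different filesystem than the argument (NFS mounts, bind mounts, ...)
  rm -rf --one-file-system /var/lib/images/old-image

  # GNU find: -xdev stops descending at filesystem boundaries
  find /var/lib/images/old-image -xdev -name '*.rpm' -delete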


It seems better to alias rm to "rm --one-file-system", assuming major cross-filesystem deletes aren't something you do so often that they need to be as ergonomic as possible.
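A minimal sketch of that alias (bash; it only affects interactive shells that read your rc file, not scripts):

  # in ~/.bashrc: recursive deletes stop at filesystem boundaries by default
  alias rm='rm --one-file-system'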


Similar story, except we were using an NFS appliance that took hourly snapshots. As soon as we figured out what was happening, we had the storage team save off the latest snapshot. It was 1TB of data (a lot for the time) and took a week for us to restore.


A lot of companies still work in a similar fashion to what you described, maybe with root squashed, but it's still very possible for something like that to happen nowadays!
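For reference, root squash is the NFS export option that maps a client's root user to an unprivileged user on the server, so a stray rm -rf run as root on a workstation can only remove what that unprivileged user could. A sketch of an /etc/exports line, with a hypothetical path and network:

  # on the NFS server: root_squash (the default) maps client root to the
  # anonymous user; no_root_squash disables that protection
  /home  192.168.0.0/24(rw,sync,root_squash)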

I remember someone hit a bug with docker exec --rm years ago where it started deleting some NFS files that it shouldn't...


This reminds me of a time when a colleague and I were investigating some persistent D-State processes that were occurring when container processes were being exec-ed.

Once on the box, we wanted to create a container with utilities in the fs, but didn't want to download an image tarball or dig through the rootfs layer directories for one to use, so we just bind-mounted the host root onto another directory, beside the config file we were using.

This worked like a charm. Until we rm -rf'd the config directory and deleted host root in the process.
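For anyone picturing the failure mode, a minimal sketch of that kind of setup (the paths are hypothetical): a recursive delete of the config directory happily descends into the bind mount, and therefore into the host filesystem, which is exactly what the --one-file-system flag mentioned upthread guards against:

  mkdir -p /etc/mycontainer/rootfs          # hypothetical config dir
  mount --bind / /etc/mycontainer/rootfs    # host root now visible here
  # later:
  #   rm -rf /etc/mycontainer    <- follows the bind mount into host root
  # unmount first (or use rm --one-file-system) before cleaning up:
  umount /etc/mycontainer/rootfs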

In our case, fortunately the consequences were minimal as all workloads were stateless. The container scheduler moved all the workloads to other hosts and the host scheduler noticed this VM wasn't responding any more and rolled a new one. The whole thing resolved itself in about 5 minutes with no interaction from us - so that was pretty neat.


That's a very sad war story, hope it turned out OK. Sorry you and the users had to go through that.


Oh man - this one is anxiety inducing. I feel like this would haunt me for years.



