You can have his fortitude in the face of all your files being deleted if you have a robust, automated backup policy.
One of my professors used to say that he should be able to destroy your laptop, buy you an equivalent new one, and you should be up and running again within a few hours. Hard drives fail all the time and computers get lost/damaged/stolen. Losing your home directory on a computer should be expected, and definitely not the end of the world.
I prefer the raid5 approach: I leave my junk strewn all over a sufficiently large number of computers, so odds are I'll still have a recent copy somewhere if my laptop explodes.
You might be interested in Syncthing [1] with the option to keep previous file versions enabled. I use it as one layer in my backup policy by having it on my computers+phone. If I take a picture on my phone, it is synced within minutes to my desktop at home. If I make a document on my laptop, I have a copy on my phone in case something happens to the laptop.
I've been using actual raid5 (via adaptec raid card) for years and very recently had one of my trusty 5TB HGST drives fail (after 3+ yrs of uptime).
Fortunately the rebuild worked, but there are so many horror stories of raid5 rebuilds NOT working it has me contemplating going back to simple mirroring.
If you are going to use hardware raid, please do make sure you have a spare raid controller, same firmware, same model. If not, you will be SOL when your controller dies.
RAID5 these days (with our very large disks) is basically asking for trouble - the odds of a second disk failing during the reconstruction are very high. But I guess you already know that!
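To put a rough number on that, here is a back-of-the-envelope sketch in Python. It only models unrecoverable read errors during the rebuild, not a whole second drive dying, and the inputs are assumptions on my part: a spec-sheet error rate of 1 per 1e14 bits read (a figure commonly quoted for consumer drives) and 5TB drives like the ones mentioned above. The function name is made up for illustration. Real drives often do better than the spec sheet, so treat the result as a pessimistic bound rather than a prediction.

```python
import math

def p_read_error_during_rebuild(surviving_drives, drive_tb, ure_per_bit=1e-14):
    """Chance of hitting at least one unrecoverable read error (URE)
    while reading every surviving drive in full during a RAID5 rebuild.

    ure_per_bit is an ASSUMED spec-sheet rate, not a measured one.
    """
    bits_read = surviving_drives * drive_tb * 1e12 * 8  # TB -> bits
    # 1 - (1 - p)^bits, written with log1p/expm1 to avoid rounding to zero
    return -math.expm1(bits_read * math.log1p(-ure_per_bit))

# A rebuild has to read all the surviving drives, so more drives means more risk.
for drives in (3, 5, 7):
    print(drives, "surviving drives:", round(p_read_error_during_rebuild(drives, 5), 2))
```

With three surviving 5TB drives this comes out to roughly 0.70 under those assumptions, and it only goes up as the array gets wider.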
I used to run with simple mirrored drives. One time the master drive experienced some corruption and wouldn't boot. In attempting to fix the issue I mixed up the identifiers of each drive and ended up hosing the mirror as well.
Now I use a more sophisticated RAID10 setup and really appreciate the way failed and replaced drives automatically rebuild themselves without my dumb interaction.
Mirrors are generally better because you get shorter resilver times and you don't stress all the disks in your pool when you resilver (which is why a lot of disk failures happen during RAID5 rebuilds -- the rest of the pool's disks are old). The downside is that it uses more disks for the same level of redundancy.
I would also mention it's probably a good idea to use something like ZFS rather than a hardware RAID card. In general, free software RAID is better audited and you don't need specific hardware for the recovery process. But also, ZFS has proper checksumming, so you won't have to worry about all sorts of silent corruption (which RAID blissfully ignores -- most implementations will just recompute the parity in case of a mismatch, which is the wrong thing to do (1-1/n) of the time).
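The (1-1/n) figure is easy to check with a toy stripe. This is only a sketch under my own assumptions: n counts every block in the stripe including parity, parity is a plain XOR, blocks are single integers, and silent corruption is equally likely to hit any block. The helper names are hypothetical.

```python
from functools import reduce
from operator import xor

def parity(blocks):
    """XOR parity over a list of integer 'blocks'."""
    return reduce(xor, blocks, 0)

def recompute_parity_is_correct(data, corrupted_index):
    """Corrupt one block of the stripe (data + parity), then 'repair' the
    mismatch by trusting the data and recomputing parity.
    Returns True if the data blocks still match the originals afterwards."""
    stripe = list(data) + [parity(data)]
    stripe[corrupted_index] ^= 0xFF          # silent bit flips in one block
    stripe[-1] = parity(stripe[:-1])         # the naive fix: overwrite parity
    return stripe[:-1] == list(data)

data = [0x11, 0x22, 0x33]                    # 3 data blocks + 1 parity, so n = 4
n = len(data) + 1
outcomes = [recompute_parity_is_correct(data, i) for i in range(n)]
wrong_fraction = outcomes.count(False) / n
print(outcomes, wrong_fraction)
```

Recomputing parity only gives the right answer when the parity block itself was the one that got corrupted. In every other case it quietly blesses the bad data, which is where a per-block checksum earns its keep.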
You probably want to bump that to RAID 6 at least. Rebuild times on arrays that big can be long, especially if it takes a while to get a replacement drive. Plus most RAID controllers can protect against bit rot when in RAID 6 because they can determine whether bits got flipped. On RAID 5, you don't have that option.
RAID is not a replacement for backups. In the case with Steam the top level comment mentioned, all files writable by the user were deleted (including mounted drives in /media). RAID might protect you against hardware failure, but you also have to consider software (bugs/hackers/ransomware).
The author of that issue said they had backups, on an external drive... that Steam also wiped.
You can say 'robust' but there are limits of what I think one can reasonably expect from a user. That they had backups at all is not necessarily the norm.
Right, that's why I said robust. But I'd push back against the idea that a normal user shouldn't be expected to have backups that are safe in such an event.
If you want to guard against rogue software (or clumsy fingers in the terminal), you'll probably need to have remote backups. It sounds like copies in the cloud saved this user, and it's not unrealistic to suggest users back up to the cloud (I think many already do with OneDrive/iCloud/Dropbox). If you're a Linux user who likes to tinker, you can set up a Raspberry Pi with a hard drive attached and use restic over SFTP (or any of the other numerous choices).
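For the tinkerers, the restic-over-SFTP part could look something like the sketch below. The user, hostname, repository path and source directory are all placeholders I made up, and the Python wrapper is just one way to script it. The restic pieces it relies on are the `-r` repository flag, the `sftp:user@host:/path` repository syntax, and the `init`, `backup` and `snapshots` commands.

```python
import subprocess

def restic_cmd(repo, *args):
    """Build a restic command line for the given repository."""
    return ["restic", "-r", repo, *args]

def backup_plan(user, host, repo_path, sources, first_run=False):
    """Return the restic commands for one backup run over SFTP.
    All arguments are placeholders; adjust to your own Pi and paths."""
    repo = f"sftp:{user}@{host}:{repo_path}"
    plan = []
    if first_run:
        # 'init' creates the repository; only needed once
        plan.append(restic_cmd(repo, "init"))
    plan.append(restic_cmd(repo, "backup", *sources))
    plan.append(restic_cmd(repo, "snapshots"))  # list what is stored
    return plan

def run_plan(plan):
    """Actually execute the commands (needs restic installed and SSH access)."""
    for cmd in plan:
        subprocess.run(cmd, check=True)

# Hypothetical setup: a Pi on the local network with a disk mounted at /mnt/backup
plan = backup_plan("pi", "raspberrypi.local", "/mnt/backup/restic-repo",
                   ["/home/me/Documents"], first_run=True)
for cmd in plan:
    print(" ".join(cmd))
```

Calling `run_plan(plan)` from a cron job or systemd timer would make it automatic. restic will ask for the repository password unless you supply it some other way, for example via the `RESTIC_PASSWORD_FILE` environment variable.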
> I'd push back against the idea that a normal user shouldn't be expected to have backups that are safe in such an event.
That is fair. I was commenting from the perspective of what is rather than what should be. Alongside making software as safe as possible, we should also be encouraging and expecting people to do this.
Cloud is not backup! Rogue software can erase your cloud data too. You can't call Google or Dropbox and ask for last week's version of your cloud files.
Not sure if rsync can do local encryption these days? So I guess 'for the paranoid' (as Tarsnap's tagline is), Tarsnap with write-only keys might be better.
I used to set the append-only extended file attribute on important files.
No one can delete the files afterwards, not even root, without removing the attribute first.
It worked well with Mercurial, which is also append-only. So I could commit as usual to my Mercurial repository, but it could not be deleted.
Ironically, I stopped doing that, because it messed my backups up. When running the backups as root, some tools would add the flag to the backed-up files, but then later backups as non-root could not replace the files with new versions.
Today, with GitHub + Google Drive + Steam or whatever flavors, all you should be limited by is download speed. I wipe my hdd every 6 months or so just to get a fresh feeling of no random junk. The biggest chore is downloading all the dev environments for all the programming languages one uses at the same time.
> One of my professors used to say that he should be able to destroy your laptop, buy you an equivalent new one, and you should be up and running again within a few hours.
Like the sibling comment, I switched to a mental model where a file that isn't backed up doesn't exist. So anything on my machines is indeed in a Nextcloud or Resilio share. Files outside of that might as well be in /tmp.
Secondly, treat all machines as cattle. No customization, unless done programmatically or repeated easily, and absolutely essential.
In practice, I have a tiny dir in a Resilio share from which I bootstrap. It contains some cfg files/dirs, a bashrc, passwords and share keys, and some written notes for customizations that are impossible or unreliable to automate. In my notes you will find for instance a package list for fresh installs, notes on which Firefox extensions to install, how to configure software that can only be configured reliably through its GUI (and I test such instructions before I consider the tool ready for use), a zip of my Thunderbird profile, and so on.
I started this way of thinking when my dad accidentally formatted my drive in 2000, and it has been bulletproof since. When I use new software, I do not consider it usable until I have made a note or cfg backup in my 'bootstrap' share. I do not rely on my memory and I do not rely on a particular machine for anything, and it costs barely any effort other than being disciplined in never customizing a machine without recording and testing how to repeat that someday. If that is too much work, then I will not use that software; apparently it's not worth it.
I use Syncthing [1] and restic [2]. I use restic to back up the important things to several backends, including a home server (Raspberry Pi equivalent) and cloud storage. If you worry about getting your software/operating system up and running in the same way as before, you can use a declarative operating system such as NixOS [3].
I do have everything backed up but I doubt I would exhibit anywhere near that equanimity. I get annoyed when a kernel upgrade leaves me with a low res login screen.
https://github.com/valvesoftware/steam-for-linux/issues/3671
Featuring probably the most unflappable human in the history of time as a bug reporter. One day I hope to have the fortitude this man does.