
Not relying on storage is too general a piece of advice. I've never seen in practice a backup system that would survive your filesystem silently corrupting your data at random.

(Yeah, I know they exist. I've just never worked anywhere big enough to actually use them.)




You don't need anything big; it's just a matter of saving a hash of each file, then occasionally rehashing the files and comparing. If a hash is different, fetch the file from another location and update the local copy.

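I use git annex, which has this built in; just run "git annex fsck" and it'll verify the files and flag any damaged ones so they can be fetched again from another copy. ("git annex repair" is for fixing the git repository itself.)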
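
For anyone who wants to see the hash-and-compare idea without git annex, here is a minimal sketch in Python. It is not how git annex does it; the function names, the choice of SHA-256, and the idea of a second directory acting as the "other location" are all assumptions made for illustration.

```python
import hashlib
import shutil
from pathlib import Path


def file_hash(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def record_hashes(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its current hash."""
    return {
        str(p.relative_to(root)): file_hash(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_and_repair(root: Path, mirror: Path, manifest: dict[str, str]) -> list[str]:
    """Rehash files under root; replace any mismatch with the mirror's copy.

    `mirror` is a hypothetical second location holding the same tree.
    Returns the relative paths that were repaired.
    """
    repaired = []
    for rel, expected in manifest.items():
        local = root / rel
        if local.is_file() and file_hash(local) == expected:
            continue  # local copy is intact
        replacement = mirror / rel
        # Only trust the mirror copy if it still matches the saved hash.
        if replacement.is_file() and file_hash(replacement) == expected:
            local.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(replacement, local)
            repaired.append(rel)
    return repaired
```

Run record_hashes once while the data is known to be good, store the result somewhere safe, and call verify_and_repair from a periodic job. Files that are bad in both locations are left alone and not reported as repaired, so a real setup would want to log those separately.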


Forgot to add something relevant to the other commenter's claim about needing to be "big" or whatever. I was one of the last holdouts of 1 logical function = 1 piece of physical hardware, as I liked to customize the software for each function (esp. the security TCB). On a related note, having lots of redundant systems for protecting files takes all kinds of servers and is expensive, right? Especially with all that hashing and crypto?

They're called VIA Artigos:

http://www.ebay.com/sch/items/?_nkw=via+artigo&_sacat=&_ex_k...

Stumbled onto them while looking for accelerated crypto & cheaper x86 boxes. So, 6x Artigos at $300 each (back then), 3x UPSes at $50 each, 2x network switches at $50 each, and 12x cables at $5 each = a very high availability solution for 100GB+ of data for $2,110 plus tax. Breaking 1TB just took an extra grand or so. At that time, getting a similar level of storage with only two-way redundancy from "bargain" server vendors was more like $10,000-20,000.
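
The total is easy to check. A quick Python tally, using only the quantities and prices quoted above (the variable names are just for illustration):

```python
# Parts list from the comment: (item, quantity, unit price in USD)
parts = [
    ("VIA Artigo", 6, 300),
    ("UPS", 3, 50),
    ("network switch", 2, 50),
    ("cable", 12, 5),
]

# Sum quantity * unit price over every line item.
total = sum(qty * price for _, qty, price in parts)
print(total)  # 2110
```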

Just gotta invest wisely and your problem turns from "costs too much" to "wow, I'm sinking a lot of time into this project." The latter is more fun, at least. :)


You nailed it. And there were typically utilities to do it on every 'NIX box I encountered, along with scripts to mostly automate the process. :)

Then there was my evolved scheme of using diversity, where I had different OSes or filesystems with stuff handled at the application layer. More complex to set up, but it decreases the chance that any one set of code is going to mess you up. Safety or security through diversity is a powerful technique. It's harder to do today with so much code reuse: things might only be different on the surface, with the same bugs lurking underneath. All these different web security schemes using the same OpenSSL library are a good example.



