
Why is "written to hard disk" considered to be "durable"?

Isn't that just a higher likelihood of "durability"?




This is actually a fair and valid point. On all but the most carefully configured systems, the standard configuration of basically all SQL databases does not preclude committed-data loss during a power failure, because of the hard disk's write cache. See for example http://www.postgresql.org/docs/8.3/static/wal-reliability.ht...:

    When the operating system sends a write request to the disk
    hardware, there is little it can do to make sure the data has
    arrived at a truly non-volatile storage area. Rather, it is the
    administrator's responsibility to be sure that all storage
    components ensure data integrity. Avoid disk controllers that have
    non-battery-backed write caches. At the drive level, disable
    write-back caching if the drive cannot guarantee the data will be
    written before shutdown.
Which almost nobody does, because performance falls through the floor unless you're running a $500 battery-backed RAID card (don't even mention software RAID).
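For what it's worth, the drive-level setting the docs mention can be inspected and changed on Linux with hdparm (a sketch; `/dev/sda` is a placeholder device, and this needs root):

```shell
# Show whether the drive's write-back cache is currently enabled
hdparm -W /dev/sda

# Disable the write-back cache, trading throughput for durability
hdparm -W 0 /dev/sda
```

Note the setting may not survive a power cycle on all drives, so it's often applied from a boot script.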

The SQLite documentation goes into further detail about the atomicity of writing a single 512-byte sector during a power failure (on old drives, on 4096-byte-sector drives, and on SSDs). Few people seem to account for any of this stuff, yet they sleep soundly at night regardless.
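To make the OS-level write path concrete: this is roughly what a database does at commit time (an illustrative Python sketch, not from the thread). Even here, fsync only pushes data as far as the device; a drive with write-back caching enabled may acknowledge from volatile cache:

```python
import os
import tempfile

# Write one record and fsync it, as a database does at commit time.
path = os.path.join(tempfile.mkdtemp(), "journal")
fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
try:
    os.write(fd, b"committed record\n")
    # fsync flushes OS buffers down to the device -- but if the drive's
    # write-back cache is enabled, the data can still be sitting in
    # volatile cache when fsync returns.
    os.fsync(fd)
finally:
    os.close(fd)
```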


While backup power might be expensive in a hard drive situation, it looks like even cheap SSDs are available with power loss protection. And as far as I'm aware you're better off using SSDs for databases anyway.


Note that writing data to a single disk (or SAN array, or RAID controller) really isn't durable either, even if the data does actually get to the disk and isn't in a write cache somewhere.

What if that disk crashes, or the SAN array breaks and kills all the data, or the data center burns down?


Since it's impossible to completely avoid catastrophic hardware failure in the real world, obviously that can't be the threshold for durability. But losing data to anything short of hardware failure means you're non-durable. Software crashes can't lose data. Kernel panics can't lose data. Power outages can't lose data.


> Isn't that just a higher likelihood of "durability"?

It's much, much, much more likely that you'll see a memory failure (a crash anywhere from a single thread up to the whole machine) than a hard drive failure. I don't mind them claiming that as durability.


Well, you have to come up with some way to measure it. How about "time I expect this data to be retrievable"?

So you can measure memory and process durability in hours to days, and spinning disk durability in years. I don't know how long SSDs last yet.

Say (just making up numbers) you expect your servers to run 10 days without a reboot or crash, and your disks to last 5 years before failing. That means writing to memory is about 0.005479 times as durable. So it's not great, but it's not zero either.
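The arithmetic behind that figure, with the same made-up numbers:

```python
DAYS_PER_YEAR = 365

memory_lifetime_days = 10                # expected uptime between crashes/reboots
disk_lifetime_days = 5 * DAYS_PER_YEAR   # expected disk lifetime

ratio = memory_lifetime_days / disk_lifetime_days
print(f"memory is {ratio:.6f} times as durable as disk")
```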

An interesting thought experiment: how many machines replicating data only in RAM does it take to be more durable than one machine with a spinning disk?
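One naive way to attack that thought experiment: assume each RAM-only node fails independently, and the data is lost only if every replica fails in the same window. With the made-up numbers above (one crash per ~10 days per node, one disk failure per ~5 years), find the smallest replica count whose joint failure probability beats the single disk's:

```python
p_node = 1 / 10        # made-up: a RAM node crashes about once per 10 days
p_disk = 1 / (5 * 365) # made-up: a disk fails about once per 5 years

# Smallest n where losing all n independent replicas at once is less
# likely than losing the one disk.
n = 1
while p_node ** n >= p_disk:
    n += 1
print(n)  # 4 replicas under this toy model
```

The toy model ignores re-replication time and, more importantly, correlated failures: a data-center-wide power outage takes out every RAM replica simultaneously, which is why spreading replicas across independent failure domains matters more than the raw count.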





