Postgres is entirely designed around the idea of durability, which dictates that even in extreme events like a crash or power loss, committed transactions should stay committed.
Although if I had $100 for every time I've had DB corruption in Postgres over the years...
That being said, since 9.4 (or maybe 9.5) these incidents have mostly stopped happening and it's been remarkably stable.
Hi, very curious under what situations you had data corruption. I use postgres a lot and haven't experienced (or noticed) this. I'm wondering if I'm using it differently than you or if I've been lucky =]
Most of our cases have been standby servers with corrupted WAL files, which we only discovered when the master had a physical disk crash or similar.
After a few tough lessons like that, we put monitoring in place to verify that the standby and/or PITR servers are actually usable, e.g. PITR servers must have uninterrupted WAL sequences.
Most Postgres corruption (on a master) actually happens due to dodgy hardware, but we don't really use bad hardware. In cases where we used VMs we've suspected something fishy in VMware itself. In one case we suspected the SAN that the VMware host was using.
So mostly not Postgres' fault. There were however some known replication bugs in the 9.x versions that were quite nasty. I think those bugs have now all or mostly been fixed, so 9.6 and 10 (beta) onwards are extremely stable.
But the risk of HW issues still remains the biggest reason for Postgres corruption.
I cannot emphasize enough how important backups are (both standby servers and PITR) and also to verify that these are indeed valid for that day when you're going to need to use them.
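For anyone wondering what an "uninterrupted WAL sequence" check might look like, here is a rough Python sketch, not what we actually run. It assumes the default 16 MB segment size and PostgreSQL 9.3 or later (so 256 segments per log file number), and it only compares segments within the same timeline. The function names are made up for illustration.

```python
import re

# WAL segment files are named with 24 uppercase hex digits:
# 8 for the timeline, 8 for the log number, 8 for the segment number.
WAL_NAME = re.compile(r"^[0-9A-F]{24}$")

# Assumption: default 16 MB segments on PostgreSQL 9.3+, i.e. 0x100 segments per log.
SEGMENTS_PER_LOG = 0x100


def wal_position(name):
    """Return (timeline, linear segment number) for a WAL segment file name."""
    timeline = int(name[0:8], 16)
    log = int(name[8:16], 16)
    seg = int(name[16:24], 16)
    return timeline, log * SEGMENTS_PER_LOG + seg


def find_wal_gaps(names):
    """Return (previous, next) name pairs where a segment is missing in between.

    Files that are not plain segments (.history, .backup, .partial) are ignored.
    Pairs that straddle a timeline switch are not compared.
    """
    # Fixed-width uppercase hex sorts the same lexically and numerically.
    segments = sorted(n for n in names if WAL_NAME.match(n))
    gaps = []
    for prev, cur in zip(segments, segments[1:]):
        prev_tl, prev_no = wal_position(prev)
        cur_tl, cur_no = wal_position(cur)
        if prev_tl == cur_tl and cur_no != prev_no + 1:
            gaps.append((prev, cur))
    return gaps
```

You would feed it something like `os.listdir()` of your WAL archive directory and alert when the result is non-empty. It says nothing about whether the files themselves are intact, so it complements an actual test restore and does not replace one.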
Great response, thank you. Seems mostly replication-related, which admittedly I haven't delved into much. I've used postgres to hold a lot of data, but haven't gotten to the point where it needed to scale out yet. This is great to know for when that starts to be a concern. I'll definitely heed your advice on backups (already do them, but could put more automation into verifying they work).