It's particularly nice as it allows point-in-time recovery - you can tell PostgreSQL to replay WALs up to a given point, so if, say, corruption happened because of some code bug, you could replay the database to the minute before it.
The backup process has two parts:
* archive WAL segments as they come in - PostgreSQL has a hook (archive_command) that runs a program of your choice on each finished WAL segment, so you just point it at whatever you want to use for backing them up (sketched below)
* start a backup from PostgreSQL - that makes it stop writing to the DB files and write only to WALs - then just copy the database directory files and tell PostgreSQL to resume writing to them. No need for anything fancy like filesystem snapshots.
So just copying files really.
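Concretely, the WAL-archiving half is just configuration (a minimal sketch - the /mnt/backup paths are made-up examples):

```
# postgresql.conf - continuous WAL archiving
wal_level = replica        # 'archive' on versions before 9.6
archive_mode = on
archive_command = 'test ! -f /mnt/backup/wal/%f && cp %p /mnt/backup/wal/%f'
```

The base-backup half can be the low-level dance described above, or just pg_basebackup, which does it for you (a sketch - the data directory path is assumed; on PostgreSQL 15+ the low-level calls are pg_backup_start()/pg_backup_stop() and must run in a single session):

```
# Easy way: pg_basebackup does start-backup / copy / stop-backup itself
pg_basebackup -D /mnt/backup/base -Ft -z -P

# Low-level way (exclusive mode, PostgreSQL 14 and older):
psql -c "SELECT pg_start_backup('nightly');"
rsync -a /var/lib/postgresql/data/ /mnt/backup/base/
psql -c "SELECT pg_stop_backup();"
```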
Restore is just restoring the above and feeding it WALs up to the chosen point.
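On PostgreSQL 12+ a point-in-time restore looks roughly like this (a sketch - the paths and target timestamp are made up; older versions put these settings in recovery.conf instead):

```
# postgresql.conf on the restored copy - where the archived
# WALs live, and where to stop replaying:
restore_command = 'cp /mnt/backup/wal/%f %p'
recovery_target_time = '2024-05-01 13:37:00'   # the minute before the bug

# then create the signal file and start the server; it replays
# archived WAL up to the target and pauses there:
touch /var/lib/postgresql/data/recovery.signal
```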
We also run a slave, so the master being shot would only kill running transactions. Fancier setups I've seen also run a "delayed slave" - a slave that replays non-current WAL, so it basically presents a view of the database from, say, 1 hour or 1 day ago. That way, if something fucks up the DB, you already have a server that is running; you just need to replay WALs to the chosen point.
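The delayed slave is a stock PostgreSQL feature - on a streaming replica it's a single setting (a sketch, assuming the replica is already set up):

```
# postgresql.conf on the standby - hold each WAL record back for an
# hour before applying it, so the replica shows the database as of
# one hour ago:
recovery_min_apply_delay = '1h'
```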
> I don't personally believe in backups that are never verified.
We ended up making a backup job lottery: pick a random job out of the system and send an email to the ticketing system - "hey, admin, restore this job for testing". So far it has worked.
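Mechanically it doesn't need to be more than something like this (a hypothetical sketch - the job list path and the ticketing address are invented):

```
#!/bin/sh
# Backup-restore lottery: pick one job at random and
# file a restore-test ticket for it.
job=$(shuf -n 1 /etc/backup/jobs.list)
echo "hey admin, restore backup job '$job' for testing" \
  | mail -s "backup restore lottery: $job" tickets@example.com
```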
One system also has an indirectly tested restore, as the production database is routinely anonymized and fed to the dev server.
We've also baked it into the automation we use for deploying stuff, so for most things not backing up is harder than backing up. Still, accidents have happened...
Yes, and that's how I know you've never worked in the game industry. A lot of companies use Perforce, a source control system that also holds assets and metadata, which you also have to back up and be able to query.
I'm a bit green when it comes to DB backups. Do you have anything I can read on this approach? It sounds really promising, but I don't think I understand it.