Secrets of PostgreSQL Performance (DjangoCon)

emmett · on Sept 21, 2011

I wish this wasn't a video. When someone gives a talk, rather than just posting the video, won't someone at least post the slides? Or better yet, a transcript?

simonw · on Sept 22, 2011

We have the video + links to slides and a liveblog on the Lanyrd page: http://lanyrd.com/2011/djangocon-us/shbqg/

apu · on Sept 22, 2011

Future feature for jtv? ;-)

alnayyir · on Sept 21, 2011

Your wish is my command sir. I am hard of hearing so I try to transcribe talks. This one is more abbreviated than my last one.

Secrets of Postgres Performance

Measure Everything Before and After

Do not do Dumb Things

Cache Everything

Watch your query counts

Tuning Parameters

shared_buffers: set to 25% of available ram, move up/down 5% to find sweet spot

effective_cache_size: planning hint that tells PG how much RAM it can expect for OS disk cache

work_mem: per process amount of ORDER BY space

Other Parameters

wal_buffers: Set to 16MB and forget it

checkpoint_segments: Increase to at least 10

maintenance_work_mem: 50MB for every GB of RAM

synchronous_commit: turn off with data loss risks

Disks. Lots of disks.

Not all disks are created equal. Faster disks make a huge difference.

Configuration of disks matters a lot. RAID-5 is bad. RAID-1+0 is good.

Separate out the WAL onto its own disk. Can 4x write performance.

Worry about CPU after your disk/RAID/filesystem configurations are more or less optimal.

Use pgbouncer to pool connections

Tablespaces

Consider putting archive or legacy data onto slower drives/volumes

Separate your data and indices onto different volumes if possible

This applies even on SANs

Best of both worlds with Rackspace and Cloud Connect

EBS volumes and Software RAID is best but scary on AWS.

Choice of filesystem matters. No journaled filesystems for your WAL, it is a journal.

Use XFS for maximum performance.

USE noatime!

Avoid queries in loops.

Count() is slow

tobtoh · on Sept 21, 2011

Awesome - thanks!

And look at that - a 35 minute video is compressed into 1 minute of reading and I bet, thanks to the summary, I obtained 75% of the relevant info that I would have got if I had watched the video in it's entirety.

alnayyir · on Sept 21, 2011

You're welcome, hope it helped.

I really despise non-textual media.

zobzu · on Sept 22, 2011

I can hear and see perfectly. And I love concise text data. Long videos, audio or long documents with a lot of blabla sucks. Thank you.

SkyMarshal · on Sept 22, 2011

Man you are awesome, thank you. I hate video for these kinds of things, it's like watching paint dry when I could have speed-read through a transcript in a tenth the time it takes to watch a video.

I don't know if I'm an edge case here, but if not there might be some kind of opportunity there.

plasma · on Sept 22, 2011

I also hate videos; they are too slow to get to the point.

With text, at least I can speed read and ignore stuff that is not important to me.

alnayyir · on Sept 22, 2011

That occurred to me earlier. There's a lot of people like us (I think) who really just want to accumulate as much "80/20" information as possible and don't want multi-media.

Call it, "The wiki for getting to the fucking point".

aaronbrethorst · on Sept 22, 2011

Tragically, gettothefuckingpoint.com is taken by a squatter. What a pity.

gojomo · on Sept 22, 2011

Workin' on it.

bmurphy · on Sept 22, 2011

Very nice, thank you. Mostly all of this is good advice and correlates nicely with my experience.

One thing though:

"EBS volumes and Software RAID is best but scary on AWS"

I've managed an EBS RAID10 database for a few years now. I wouldn't touch this with a 10 foot pole.

Do yourself a favor, set up an m1.xlarge (or bigger) instance, put the ephemeral drives in a RAID0 and mirror across multiple machines using hot-standby, slony, londiste, or some other tool. You'll be much happier, your system will perform much better, and you'll have a failover strategy in place.

sandGorgon · on Sept 22, 2011

I dont understand this - is there a difference between an explicitly allocated EBS volume and the ephemeral volumes of an EBS-backed instance ?

Or are you focusing on the RAID10 part - but then everywhere RAID10 is touted to be the best RAID solution (right balance between performance and safety)

bmurphy · on Sept 22, 2011

EBS is a shared networked SAN. The performance characteristics of it are not that great and even worse, highly variable. The last thing you want to be running your database on is a system where the performance characteristics vary greatly throughout the day and you have no control over it.

The ephemeral drives are drives directly attached to the server and to the best of my knowledge are not a shared resource. Their performance characteristics are highly consistent, but if your server goes down all data on those drivers are lost.

EBS sounds nice in theory, but by going to EBS RAID you throw away most of its benefits (such as snapshotting) and take on it's worst aspects.

alnayyir · on Sept 22, 2011

Ephemeral volumes are shared, just not nearly as variable.

joevandyk · on Sept 22, 2011

Yes, there is a difference between EBS volumes and ephemeral volumes of an EBS-backed instance.

jdunck · on Sept 22, 2011

And what happens when the AZ goes down? Don't you lose data then? Or do you mean multiple machines across AZ/region?

joevandyk · on Sept 22, 2011

You'll want the mirror in a different AZ. Or you could also put it in a different region.

Use Wal-E to push wal logs to s3.

bmurphy · on Sept 22, 2011

Absolutely. You should always be spreading your data across multiple availability zones and where feasible across multiple data centers and S3 is a great place to store your wal logs. We do the same thing.

johnpaulett · on Sept 22, 2011

Slides available as PDF: http://media.revsys.com/talks/djangocon/2011/secrets-of-post...

spudlyo · on Sept 22, 2011

I'm having a really hard time refraining from writing a really snarky "Switch to MySQL" type response, not because I think that MySQL outperforms PostgreSQL, but because there is always somebody on every MySQL article discussion who does this. Ok, that's off my chest, sorry to rant.

I enjoyed this video, even though I'm not a PostgreSQL guy, as it has a lot of good generic info. The advice about XFS and noatime was right on the money, although I don't think he strongly stated enough why you don't want to run EBS volumes over software RAID-0. In my experience this is a really bad idea, because just one of those EBS volumes getting picking up a laggy await (happens pretty frequently with volumes I've seen in the wild) will drag down the performance of the entire array. Also I'm told that the old "RAID-5 is always bad" notion is considerably more nuanced nowadays, and that many of the earlier OLTP write performance problems with RAID-5 have been largely mitigated with modern RAID controllers.

jpitz · on Sept 22, 2011

  " Also I'm told that the old "RAID-5 is always bad" notion is considerably more nuanced nowadays, and that many of the earlier OLTP write performance problems with RAID-5 have been largely mitigated with modern RAID controllers."

Not to be snarky, but do you have a source for that? RAID5 has a pretty big intrinsic write penalty.

jvdongen · on Sept 22, 2011

"I'm having a really hard time refraining from writing a really snarky "Switch to MySQL" type response, not because I think that MySQL outperforms PostgreSQL, but because there is always somebody on every MySQL article discussion who does this. Ok, that's off my chest, sorry to rant."

I can think of good reasons to switch from mysql to postgres though, and I've a hard time coming up with reasons to switch from postgres to mysql (which is something else all together than choosing mysql over postgres in the first place - although even there, for serious deployments, the only real reasons I can see are in the "it's what we know" department).

Long time Postgres user so I may be a little biased ;-), but still.

lflux · on Sept 22, 2011

This gave me the idea to try http://pgfouine.projects.postgresql.org he mentioned on my logs, and I'm already finding a bunch of stuff to fix after a quick run.

toisanji · on Sept 21, 2011

Anyone know of an available transcript?

alnayyir · on Sept 21, 2011

Yes, see my response to emmett.