I think anyone who came here because they are interested in the subject of the OP link might be interested in watching these videos.
Though disclaimer of course: These videos told me things I didn’t know about scaling PostgreSQL, because there is a lot I don’t know about that. But if you already know a lot about that then of course these videos might not be so interesting to you.
This is the thing I love about PostgreSQL, you can find out everything about the database. Oracle is ridiculous - they have something of the range of over a thousand hidden parameters, query tuning their CBO is somewhat of a black art and can change between point releases, and leads to articles like the following:
Oracle also has serious quality issues. The last patch set I applied had the wrong binaries for Oracle Text. With what they charge, you wouldn't expect that. I've never had such a thing happen with my Postgresql instances. I just don't see how to justify starting out with Oracle today.
Don’t take anything seriously on dba-oracle.com - Burleson is well known in the Oracle community as an expert in self-promotion but his technical skill is mediocre at best.
Jonathan Lewis, Guy Harrison and Craig “the Hammer” Shallahammer are infinitely more experienced and insightful.
Right. There are many Oracle internals experts who are presenting at conferences, they are also quite approachable. I'd especially recommend Tanel Poder, several of his troubleshooting sessions are available on youtube.
That's sometimes unavoidable. If there's a clear bug somewhere, fixing it will sometimes also affect queries/plans that weren't visibly affected by the bug. That's especially the case around costing issues - some queries might have regressed noticeably due to a bug since the last major version, but fixing it might affect other queries negatively (in a minor version). Postgres/we try to avoid that, but sometimes that's the most sensible way forward (and I assume the same is true for mysql etc).
This series is now called Microsoft SQL Server Internals [1]. You can get the latest version - Microsoft SQL Server 2012 Internals (Developer Reference)- from Amazon [2].
As a person who have started in SQL server, I really like the succinct clarity of Postgresql functions and its lightweight installation on the server. Thanks for sharing, this could really help me in the future.
Aside: do you still need to use a cache service when using Postgres? Our Django web app forgoes any caching because “Postgres is fast enough” and “has its own cache”.
Assuming you mean application level cache you can't really generalize it. In general you don't want to cache anything if you can because cache invalidation is a really hard problem.
I read a comment like this and just desperately want to look at the query log to see what’s being run. Followed by explaining a few offenders to see what the planner is working with. You can get really deep into optimising your dB, but often there are a lot of low hanging fruit to work with that you can make “good enough” in no time at all.
That really depends on what you're caching. Sometimes e.g. you might want to have local caches on application servers, not because the database would be too slow or couldn't take the load, but because the roundtrip time to the database make page loads too slow if you hit the DB every time.
Or you have some complex ranking/relationship/aggregation calculations - it might computationally be infeasible for the database, as hard as people worked on its efficiency, to calculate such a query on each request...
Depends on what you're using the cache for. Postgres is definitely fast enough for reads of a K/V pair at a scale your master node can support for most use cases, but there are cases that you can't suit with postgres alone at-scale, e.g:
1. Needing a consistent (meaning reading off replicas is not fresh enough) data source for reads and writes that happens every request to very high-request site. Think sessions, api rate limiting.
http://www.interdb.jp/blog/pgsql/pg_pacemaker_01/
Which in turn lead me to
https://github.com/gocardless/our-postgresql-setup
And in turn to a fifteen minute talk titled “Zero-downtime Postgres upgrades”
https://www.youtube.com/watch?v=SAkNBiZzEX8
Which was interesting and informative IMO.
After watching that one I was lead further via the YouTube suggested videos to a 45 minute talk titled “Tuning PostgreSQL for High Write Loads”
https://www.youtube.com/watch?v=xrMbzHdPLKM
Which I liked a lot as well.
I think anyone who came here because they are interested in the subject of the OP link might be interested in watching these videos.
Though disclaimer of course: These videos told me things I didn’t know about scaling PostgreSQL, because there is a lot I don’t know about that. But if you already know a lot about that then of course these videos might not be so interesting to you.