There is one thing I'm curious about and have never found a good answer to: why is there no VACUUM equivalent in MySQL or MS SQL? I guess VACUUM is necessary because of the way Postgres stores data? How do those other RDBMSs clean up indexes?
When you update a row in a SQL database, you need to keep the old row around until the transaction is committed.
In Postgres, a new row is written, and both the new and old coexist in the table. Even after the transaction is committed, the old row stays in the table. VACUUM scans the table, finds rows that are no longer needed, and marks them as okay to overwrite.
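A quick sketch of what that looks like in practice (the table and column names are just made up for illustration); the old versions show up as "dead tuples" in the stats views until VACUUM gets to them:

  -- Every UPDATE leaves the old row version behind in the table.
  CREATE TABLE vacuum_demo (id int PRIMARY KEY, val int);
  INSERT INTO vacuum_demo SELECT g, 0 FROM generate_series(1, 100000) AS g;
  UPDATE vacuum_demo SET val = 1;

  -- The old versions are counted as dead tuples (the stats are updated
  -- asynchronously, so the counter can lag a little).
  SELECT n_live_tup, n_dead_tup
  FROM pg_stat_user_tables
  WHERE relname = 'vacuum_demo';

  -- VACUUM marks the space used by the dead versions as reusable;
  -- it does not shrink the file on disk (VACUUM FULL does that).
  VACUUM vacuum_demo;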
In MySQL, the row is overwritten in the table immediately, and then the old row is temporarily written to a separate area (the UNDO Segment), so the old record is available if needed. Since old rows don't accumulate in the main table, there's no need for VACUUM.
Your point about MyRocks is a good one. It's also true that MyRocks takes a similar approach to Postgres when it comes to implementing the higher isolation levels. It wouldn't make sense to add InnoDB style next-key locks to MyRocks, either.
Systems that closely adhere to traditional 2PL designs such as InnoDB or DB2 end up with tight coupling between recovery, concurrency control, and storage. This has many consequences, both good and bad. It's no coincidence that Postgres can support quite a variety of index access methods, including support for transactional full text search. You get the transactional stuff "as standard" with the Postgres approach to versioned storage. There is no need to bake concurrency control into each and every index access method.
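To make that concrete, here's a rough sketch of the kind of thing that "just works" (names invented for illustration): a GIN index over a tsvector expression participates in transactions like any other index, because visibility is decided by the row versions in the heap rather than by locking logic baked into the index access method:

  -- Full text search via the GIN index access method.
  CREATE TABLE docs (id serial PRIMARY KEY, body text);
  CREATE INDEX docs_fts_idx ON docs USING gin (to_tsvector('english', body));

  -- Ordinary transactional statements; no index-specific locking protocol needed.
  BEGIN;
  INSERT INTO docs (body) VALUES ('vacuum reclaims dead row versions');
  SELECT id FROM docs
  WHERE to_tsvector('english', body) @@ to_tsquery('english', 'vacuum');
  COMMIT;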
Thank you for your answer, this makes sense! I guess that there must be some tradeoff, right? What I mean is that while in MySQL you don't need to vacuum, perhaps updates would be slightly slower (due to the need to write into the UNDO segment)?
Recovery in Postgres is simplified and sped up considerably by the approach taken to MVCC. The same is true of transaction rollback. SQL Server users (on recent SQL Server versions) can optionally get a similar trade-off by enabling constant time recovery (CTR), which is called Accelerated Database Recovery (ADR) in the MS marketing material. See the paper:
SQL Server has something called "background cleanup" when this feature is enabled, which takes place in a separate process called "the cleanup process" - just like VACUUM. The paper itself explicitly acknowledges the similarity. All of the advantages the paper claims for CTR/ADR are also advantages for Postgres.
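For anyone who wants to try it, ADR is a per-database setting. This is the T-SQL as I remember it from the docs (SQL Server 2019+; the database name is a placeholder), so double-check against your version:

  -- Enable Accelerated Database Recovery for one database.
  ALTER DATABASE [YourDatabase]
      SET ACCELERATED_DATABASE_RECOVERY = ON;

  -- Check which databases have it enabled.
  SELECT name, is_accelerated_database_recovery_on
  FROM sys.databases;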
It's also true that both traditional SQL Server (SS with ADR/CTR disabled) and MySQL have background cleanup processes of their own for indexes, but that is a little different. It's about asynchronously reclaiming space for index entries that were already delete-marked.
Conceptually you still need to clean up the old row versions, regardless of where they're located. InnoDB has purge threads to clean up undo log entries that are older than the oldest active transaction. If too many of them accumulate, such as due to a long-running transaction preventing purge progress, it increasingly harms performance. This can be a major problem in MySQL for workloads which mix a high volume of quick UPDATE and/or DELETE operations concurrently with other slower/longer transactions.
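If you want to see whether this is happening to you, the backlog that purge hasn't worked through yet is exposed as the "history list length". A rough sketch (the metric and table names are as I remember them from MySQL 5.7/8.0; verify against your version):

  -- Undo records awaiting purge ("History list length", also visible
  -- in the output of SHOW ENGINE INNODB STATUS).
  SELECT name, `count`
  FROM information_schema.INNODB_METRICS
  WHERE name = 'trx_rseg_history_len';

  -- Long-running transactions that can hold purge back.
  SELECT trx_id, trx_started, trx_mysql_thread_id
  FROM information_schema.INNODB_TRX
  ORDER BY trx_started
  LIMIT 5;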
I think this might provide the additional context you're looking for (and your suspicion seems spot on!):
> This means more writes to do on updates and makes access to old row versions quite a lot slower, but gets rid of the need for asynchronous vacuum and means you don't have table bloat issues. Instead you can have huge rollback segments or run out of space for rollback.
“In MySQL, the row is overwritten in the table immediately, and then the old row is temporarily written to a separate area (the UNDO Segment)”
I would hope and expect MySQL does that in reverse order, as “yeah, we have that row in memory” isn’t a good answer to “are you sure you can roll back that change if needed?”
It would be great if we stopped adding features now. Are there any pieces of software that have started doing this? Why are we perpetually in this loop of feature creep?
Like a painter, you must know when to stop painting. This is a real dilemma all painters face, especially when working with watercolor. If you've ever tried painting, you know exactly the problem: how do you know you're done? It applies to writing and music as well, but it is more evident in the visual arts than in any other field.
I wish there was a Postgres branch that took a previous version and then just applied optimizations and bugfixes. No more.
> It would be great if we stopped adding features now... I wish there was a Postgres branch that took a previous version and then just applied optimizations and bugfixes. No more.
To be fair, quoting from the article:
> There are no big new features in Postgres 13, but there are a lot of small but important incremental improvements. Let's take a look.
But also, in general, yes there are pieces of software that do this - most recently Moment.js[1]. There was some discussion earlier this week[2].
Actually, Postgres does not add any performance features to point releases at all. Point releases are only for bug fixes. You may be thinking of MySQL.
Occasionally, there are cases that could be argued to be exceptions to the general rule. But that's a hard argument to make -- everything committed to a back branch is officially a bug fix. Things like optimizer regression fixes are "performance enhancements" in a certain sense, but are nevertheless justified as bug fixes.
I've reopened your comment upthread for editing. Please don't post off-topic complaints like this, though. The guidelines ask you not to go on about downvotes and also not to use all caps for emphasis.
It's not as if Postgres is compromising on quality to add features; the project is still very good at keeping Postgres stable even at the impressive rate at which new features are added.
And while there are cases where I might agree with the general sentiment, I strongly disagree with this for Postgres. The new features are important and useful. Postgres is not a very specialized tool, it is a general purpose database that is used in many different ways. It isn't just done and feature-complete, there's still a lot of potential to improve it for various use cases.
Seriously, any analogy would have been better than an artistic painting one. They could have said "like a goatherd, you need to know when to stop feeding your goats" and been closer to the mark.
Suppose they had done that 10 years ago; would we be better off? Don't you want to know where Postgres will be with 20 more years of innovation?
I'm afraid that if Postgres stopped adding features now, in 10 years it would no longer be relevant in software development. Most developers would probably jump ship to a more innovative database.
The world changes, which means requirements change. What was once good enough will be insufficient in the future if you don't update it.
Well, there are branches that just apply bug fixes. We have five years of support for our (yearly) releases.
Optimizations kind of are features (sometimes kinda bugs too). Why should those be singled out vs other features?
The primary "bottlenecks" using postgres are different for different people. For you it may be performance. For others it's easier administration. For others it's SQL level query capabilities. Etc.
You need to RTFA. Also, database development is absolutely nothing like painting. "Postgres 13 features and enhancements :: Making a watercolor painting" is Top Ten Worst HN Analogies material.
Why? There's lots of features that Postgres doesn't have, but could, which would fit perfectly well into the current offering. Right off the top of my head, the built-in bitemporal support recently added to MariaDB would be welcome in Postgres and would have no negative impact on any existing functionality. "Stop adding features" is sensible when adding features degrades existing features or the experience thereof; it makes no sense at all when it has no negative consequences.
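For reference, roughly what that looks like in MariaDB (table and column names invented; syntax from memory, so check the MariaDB docs): system versioning covers the transaction-time dimension, an application-time period covers the valid-time dimension, and combining the two is what makes a table bitemporal.

  -- Bitemporal table: application-time period plus system versioning.
  CREATE TABLE room_rates (
      room_id    INT,
      rate       DECIMAL(10,2),
      valid_from DATE,
      valid_to   DATE,
      PERIOD FOR valid_period (valid_from, valid_to)
  ) WITH SYSTEM VERSIONING;

  -- Query the table as it looked at a past point in system time.
  SELECT * FROM room_rates
  FOR SYSTEM_TIME AS OF TIMESTAMP '2020-01-01 00:00:00';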
Some people need or benefit from things that don't interest you. If that's the case with a RDBMS, why not just... not upgrade?
If you go through the changelog, v13 as a release is mostly about improving existing features, but I don't think pg is at the point yet where it could "stop" adding features. It almost stopped 8-10 years back, and that is when MySQL overtook it completely. I believe it is still the best open source SQL engine, but features like native partitioning and JSON support that have been added in the last few years were absolutely needed to stay competitive. And now that they're there, they need to be improved and made stable.