> Its whole purpose in life is to automatically detect the version of PostgreSQL used in the existing PostgreSQL data directory, and automatically upgrade it (if needed) to the latest version of PostgreSQL.
In a small startup...
* If the data is mission-critical and constantly changing, PostgreSQL is a rare infra thing for which I'd use a managed service like AWS RDS, rather than just Debian Stable on EC2 or my own containers. The first time I used RDS, the company couldn't afford to lose an hour of data (it could destroy confidence in an enterprise customer's pilot project), and without RDS, I didn't have the time/resources to be nine-9s confident that we could do PITR if we ever needed to.
* If the data is less-critical or permits easy sufficient backups, and I don't mind a 0-2 year-old stable version of PG, I'd probably just use whatever PG version Debian Stable has locked in. And just hold my breath during Debian security updates to that version.
(Although I think I saw AWS has a cheaper entry level of pricing for RDS now, which I'll have to look into next time I have a concrete need. AWS pricing varies from no-brainers to lunacy, depending on specifics, and specifics can be tweaked with costs in mind.)
RDS is very useful for companies which are big enough to employ, say, 2 programmers, but still too small to employ a DBA.
The hard part of running a database, in my experience, isn't setting up or running it. The hard part isn't even configuring backups.
The hard part is noticing that your backups have been broken for years, before you actually need to restore from them. Yes, yes, you know how to do this correctly. But you delegated it to the sysadmin, the sysadmin subtly broke the backup scripts, the scripts have been silently doing nothing for 18 months, and then the sysadmin got a new job.
This is the main value proposition of RDS: Your data will be backed up, your backups will be restorable, and most of your normal admin tasks can be performed by pushing a button.
Exactly, on the backups. There was off-the-shelf open source PITR software that I looked at and could've configured, but I couldn't justify spending all the time to test that setup, given that (supposedly) RDS was rock-solid turn-key. There were other engineering and ops things that needed my time more.
In my startup (until our exit) I ran Postgres on my own, with streaming backups, daily backup tests (automatic restores and checks), offsite backups, and (slow) upgrades.
Data wasn't mission critical and customers could live with 5min downtimes for upgrades once a year.
No problem for years. When we had hosted Mongo, we had more problems.
If I have the money I'll use a managed database. But running Postgres yourself in a startup up to $10m ARR / ~1TB of data doesn't seem like a problem.
A script creates a database instance from a container, restores the backup into it, makes some checks (e.g. table sizes above some value, the audit table contains data up to the backup point, etc.), and sends out an email that everything looks OK.
Unzipping and decrypting the backup also confirms that encryption and compression worked, and that you didn't end up with a 0-byte backup file (because of a permissions issue, etc.).
The same goes, of course, for snapshots and backups in the cloud. I have several clients who had backup problems because of misconfiguration in AWS/GCP.
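As a concrete (hypothetical) example of that kind of nightly check, assuming plain SQL dumps and a throwaway container -- the table name, paths, addresses and the mail command here are all illustrative:

# Spin up a throwaway instance and restore last night's dump into it
docker run -d --name restore-test -e POSTGRES_PASSWORD=test postgres:15-alpine
sleep 15
gunzip -c /backups/latest.sql.gz | docker exec -i restore-test psql -U postgres
# Sanity check: the audit table must contain rows up to the backup point
ROWS=$(docker exec restore-test psql -U postgres -tAc \
    "SELECT count(*) FROM audit_log WHERE created_at > now() - interval '1 day'")
if [ "$ROWS" -gt 0 ]; then
    echo "Backup restore check OK" | mail -s "backup check" ops@example.com
else
    echo "Backup restore check FAILED" | mail -s "backup check" ops@example.com
fi
docker rm -f restore-test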
Super. I have been thinking of doing something similar for my production DB systems as well, but was struggling to come up with an efficient way to test restoration. This gives me a starting point.
You will be very happy when restoration testing fails. Better to find out in a test than - as some of my clients have - when restoring a backup in an emergency.
While true, you'd think something as important as a DB would be something worthy of creating some kind of nightly automated testing/verification of a restore procedure (or multiple).
In my experience bootstrapping a startup, RDS is really expensive. Definitely a nice thing, but if you're a one-person show trying to get to ramen profitable, RDS might not really make sense. If you are VC backed then go ahead!
Many things. Here's a list off the top of my head:
MySQL RDS acts like it's just regular MySQL minus SUPER privilege. So it will not accept various important inputs. For example, GTID state will be rejected. Even just pre-GTID log sequence numbers are barely supported. Various SQL SECURITY things are rejected. The suggestions for importing data using any of these features vary from scary nonsense (just ignore GTID numbers!) to absurd hacks (run sed on your mysqldump output to remove SQL SECURITY!).
The docs basically don't acknowledge that GTID matters in a RDS-to-or-from-non-RDS setup. The suggestions don't seem like they deserve to work. (Azure at least has some documentation for GTID, but it involves using fancy barely-documented APIs just to import your data.)
Replication will just break if you accidentally use a feature that RDS can't handle.
For something that could easily cost hundreds to thousands of dollars per month, I expected to be able to run an unmodified mysqldump and have RDS accept the output, process it correctly, and take my money. Nope, didn't happen.
I personally feel like upgrading a database should be an explicit admin process, and isn't something I want my db container entrypoint automagically handling.
This container came about for the Redash project (https://github.com/getredash/redash), which had been stuck on PostgreSQL 9.5 (!) for years.
Moving to a newer PostgreSQL version is easy enough for new installations, but deploying that kind of change to an existing userbase isn't so pretty.
For people familiar with the command line, PostgreSQL, and Docker, it's no big deal.
But a large number of Redash deployments seem to have been done by people not skilled in those things. "We deployed it from the Digital Ocean droplet / AWS image / (etc)".
For those situations, something that takes care of the database upgrade process automatically is the better approach. :)
I disagree. It is (or was, last time I tried a year or two ago) hard to upgrade Postgres because you need both the old and new binaries. The package manager tries its best to allow just one copy. You end up copying from one image to another, then running into tiny breaking differences.
I agree it should be explicitly invoked and not automated. For something almost everyone needs to do, it sure is a hard task.
You should create both dirs first, then check that they really exist, and only then move the files. Shit happens and it's better to be safe than sorry.
Also, I would replace ../old and ../new with $OLD and $NEW to have a bit less clutter in the next block with the explicit upgrade calls, i.e.:
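Something along these lines (just a sketch of the idea, not the actual script):

OLD="$PGDATA/old"
NEW="$PGDATA/new"
mkdir "$OLD" "$NEW"
# check both dirs really exist before touching any data
[ -d "$OLD" ] && [ -d "$NEW" ] || exit 1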
Nah. The concept of verifying that the mkdirs worked is sound and also easy to do. They just need doing individually in the script, as doing both in the same spot would muck up the wildcard a bit later on.
Using variable names better ($OLD, $NEW) is a good idea too. Should cut down any potential typo risk as well. :)
If you prefer to work with the dirs under pgdata (which makes sense...), you can just make a list of the files first (maybe even write it to a file, or even write a batch file which would move them) and use that list for moving the data from pgdata to old. This saves an unnecessary move.
But it's specifically to do wildcard expansion of the quoted string, and the shell interpreter is happy with it. I'm open to suggestions for improvements though. :)
---
This is confusing to me:
> ... you can just make the list of files (maybe even write it to the file, maybe even write a batch file which would move them) and use it for moving the data from pgdata to old. This saves an unnecessary move.
I'm not understanding what you're meaning here.
I understand the "make a list of files" bit, but I'm not grokking why doing that is an improvement, and I'm not seeing where there's an unnecessary move that could be eliminated?
The pg_upgrade process is pretty much:
1. Initialise a fresh data directory using the new PostgreSQL version
2. Run pg_upgrade, pointing at both the old and new data directories
3. Start the database using the new data directory
For the "automatic upgrade container" purposes, we need to do everything under the single mount point so the "--link" option to pg_upgrade is effective and uses hard links.
Thus the "move things into an 'old' directory" first, then the "move the converted 'new' data into place" bit afterwards.
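As a rough shell sketch of those steps (the binary paths for the old version here are assumptions, not the container's actual layout):

OLD="$PGDATA/old"
NEW="$PGDATA/new"
mkdir "$OLD" "$NEW"
# ...move the existing cluster files from $PGDATA into $OLD here,
# keeping everything under one mount point so --link can hard-link...
# 1. Initialise a fresh data directory with the new binaries
initdb -D "$NEW"
# 2. Run pg_upgrade, pointing at both data dirs, in place via hard links
pg_upgrade --link \
    --old-datadir "$OLD" --new-datadir "$NEW" \
    --old-bindir /usr/local/pg-old/bin --new-bindir /usr/local/bin
# 3. Move the upgraded cluster into place, then start PostgreSQL as usual
mv "$NEW"/* "$PGDATA"/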
If this were PowerShell, I would just get the list of files in $PGDATA, create the folders, and then move the files in the list, i.e.:
$files = Get-ChildItem $PGDATA
try {
    # -ItemType Directory is needed, otherwise New-Item creates plain files
    New-Item "$PGDATA/old" -ItemType Directory -ErrorAction Stop
    New-Item "$PGDATA/new" -ItemType Directory -ErrorAction Stop
}
catch {
    throw "Failed to create the necessary dirs"
}
try {
    $files | Move-Item -Destination "$PGDATA/old" -ErrorAction Stop
}
catch {
    # throw "Failed to move pg_data, your databases are now borked, good luck"
    # Roll back: move whatever landed in 'old' back into $PGDATA
    Get-ChildItem "$PGDATA/old" | Move-Item -Destination $PGDATA
}
It's way more streamlined, and if creating the dirs fails (especially `new`) then it fails before moving any data. And you don't need to `set +e` in this part.
I tried to replicate this in Linux and... it's a mess.
`find` includes the directory in the list, `ls -1` does the thing but bash stores its output as a single string, redirecting it to a file gets that file included in the list... I even tried xargs, but quickly abandoned the idea. Though if you can create the redirected output file in some other place than $PGDATA (`/tmp` perhaps?) then the `ls -1` trick would work.
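For what it's worth, bash can be coaxed into it (a sketch, assuming GNU find/mv, and that old/new get created under $PGDATA first):

# Capture the existing entries in an array *before* creating old/new,
# so the new dirs (and no temp file) ever end up in the list
files=("$PGDATA"/*)
mkdir "$PGDATA/old" "$PGDATA/new" || exit 1
mv -- "${files[@]}" "$PGDATA/old/"

# Alternative: enumerate only the top-level entries, excluding the fresh dirs
find "$PGDATA" -mindepth 1 -maxdepth 1 ! -name old ! -name new \
    -exec mv -t "$PGDATA/old/" -- {} +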
This is a very confusing title and project at first glance. It took me a minute to figure out what you’re doing…
But just to be clear — the point of this is to migrate an existing Postgres DB from the original version to a newer version? This is about the data volume itself, not automatically upgrading the Postgres binary when a new version is released?
So, you’d basically pass a flag that says --migrate-db when starting the container to kick-start changing the data on disk? So when you start a new Postgres 15 container, you could pass it a volume with Postgres 13 data and it would auto-update the on-disk DB data.
Postgres docs do use the word upgrade, for example:
> pg_upgrade (formerly called pg_migrator) allows data stored in PostgreSQL data files to be upgraded to a later PostgreSQL major version without the data dump/restore typically required for major version upgrades
Right, but you’re not actually automatically migrating a Postgres 13 program to Postgres 15 in a running container. That was what I first thought when I read the title.
You’re using the word “upgrade” differently than how I normally think about it when talking about version changes. You’re talking about data-on-disk, not just the program version.
(I realize that when talking about Postgres, the two are linked, but that’s not the case for most programs)
I've had issues in the past with PostgreSQL as a backend for NextCloud (all running in docker) and blindly pulling the latest PostgreSQL and then wondering why it didn't work when the major version jumped. (It's easy enough to fix once you figure out why it's not running - just do a manual export from the previous version and import the data into the newer version).
However, does this container automatically backup the data before upgrading in case you discover that the newer version isn't compatible with whatever is using it?
> However, does this container automatically backup the data before upgrading ...
Nope. This container is the official Docker Postgres 15.3-alpine3.18 image, plus older versions of PostgreSQL compiled into it, and some pg_upgrade scripting added to the Docker entrypoint script to run the upgrade before starting PostgreSQL.
It goes out of its way to use the "--link" option when running pg_upgrade, to upgrade in-place and therefore avoid making an additional copy of the data.
That being said, this is a pretty new project (about a week old on GitHub), and having some support for making an (optional) backup isn't a bad idea.
I'll have to think on a good way to make that work. Probably needs to check for some environment variable as a toggle or something, for the people who want it... (unsure yet).
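A rough sketch of what that might look like in the entrypoint script (the PGAUTO_BACKUP variable name is made up here, not an existing option of the container):

# Hypothetical opt-in backup before running pg_upgrade
if [ "${PGAUTO_BACKUP:-no}" = "yes" ]; then
    BACKUP_FILE="/var/lib/postgresql/pre-upgrade-$(date +%Y%m%d-%H%M%S).tar.gz"
    tar -czf "$BACKUP_FILE" -C "$PGDATA" .
    echo "Pre-upgrade backup written to $BACKUP_FILE"
fi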
> Recover: This command is used to recover a delayed replica to a specified point in time between now and the database's delay amount. The time can be given in the YYYY-mm-ddTHH:MM format. The latest keyword can also be used to recover the database up to the latest transaction available.
> The solution architecture chosen here is a 'Traditional backup with WAL streaming' architecture implementation (Backup via rsync/SSH + WAL streaming). This is chosen as it provides incremental backup/restore & a bunch of other features.
I think following MariaDB's lead is a good idea: they have an environment variable to enable automatic upgrades, and it creates a backup before every upgrade.
Works like a charm.
Just to be clear, this isn't for actually running postgres to serve your database, right? This is just more or less a tool to upgrade your data files to a new postgres version?
It does both. As part of the startup process it checks whether the data files are for an old version, and upgrades them first (if needed).
Then it starts PostgreSQL 15.3 as per normal.
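In shell terms, that check boils down to something like this (a sketch rather than the container's actual entrypoint code; PG_VERSION is the file PostgreSQL keeps in its data directory):

# PG_VERSION holds the major version of the cluster that owns $PGDATA
if [ -s "$PGDATA/PG_VERSION" ] && [ "$(cat "$PGDATA/PG_VERSION")" != "15" ]; then
    echo "Old data directory detected, running pg_upgrade first..."
    # ... pg_upgrade --link steps happen here ...
fi
exec docker-entrypoint.sh postgres   # then hand off to the normal startup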
It's intended as a drop-in replacement for existing (alpine based) PostgreSQL containers.
That being said, it's still pretty new so don't use it on production data until you've tested it first (etc).
--
Btw, it uses the "--link" option when it runs pg_upgrade so should be reasonably suitable even for databases of a fairly large size. That option means it processes the database files "in place" to avoid needing to make a 2nd copy.
At the moment it only processes a single data directory, so if you have multiple data directories then it's not yet suitable.
We have been maintaining a manual migration script for our Docker users for the DB part, while the app part (Django) already does migrations automatically, so making the DB part more built-in makes a lot of sense.
Our case doesn't really need live/fully auto, just an easy mode for admins during upgrade cycles, and our scripts were pretty generic, so a project like this makes a lot of sense. There are a few modes we support - local / same-server, cross-node, etc - so I'm curious.
useful, but it's surprising that docker has not standardized upgrade / migration
the closest thing to this is helm which like everything in the k8s ecosystem is either a useful tool or a single feature wrapped in a cursed configuration language, depending on your perspective
not hard to imagine a single command to snapshot the volume, boot the new version, run a migration command, revert if something fails
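something along these lines, say (purely illustrative: the names are made up, and the "snapshot" is just a copy, since plain docker has no native volume snapshots):

# "snapshot" the data volume by copying it into a scratch volume
docker run --rm -v pgdata:/from -v pgdata_snapshot:/to alpine sh -c 'cp -a /from/. /to/'
# boot the new version against the live volume and let it run its migration
docker run -d --name db-new -v pgdata:/var/lib/postgresql/data my-db-image:new
# if the migration fails, copy the snapshot back and start the old image again
# docker run --rm -v pgdata_snapshot:/from -v pgdata:/to alpine sh -c 'rm -rf /to/* && cp -a /from/. /to/'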
So we come full circle, it seems. Apt update/upgrade already does this and has been doing so for a few years now. So if I drop Docker and just run a container with Debian, I get this with a few years of testing behind it.
This one seems like a good proof of concept, but appears to need running as a separate task for upgrading rather than being an "automatic upgrade before starting the database server" thing?
But I was bitten in the rear by Wallabag the other day. Didn't work anymore. Ended up having to do a doctrine upgrade. Then it all worked. Same concept I guess. No automatic upgrade does allow you to pause and evaluate what you're doing, and take backups.
Now I say it allows you. Whether you (or I) do so... after all, that is what automatic backups are for...
I've just pushed an initial build with a tag for PostgreSQL 15:
pgautoupgrade/pgautoupgrade:15-alpine3.8-v1
The "-v1" text fragment on the end is a version number because I'm still improving the scripting for the upgrade part. So that'll likely increment over time (-v2, -v3, -v4, etc) until it stabilises.
The way you handle persistence is by using storage volumes mounted from the outside. Don't put the data in the container, only the software, which can then be replaced. Then managing it is similar to other environments. For updating, you don't replace rpm/deb packages and restart; you replace the container using the same volume, which should trigger the DBMS-specific update routine. Similarly, you do backups much as you would outside the container world (depending on the DBMS and your exact choice, from filesystem snapshots to a logical backup (dump) or something more or less smart in between).
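For example (volume and image names here are just placeholders):

# The data lives in the named volume "pgdata", not inside the container
docker run -d --name pg -e POSTGRES_PASSWORD=change-me \
    -v pgdata:/var/lib/postgresql/data postgres:13-alpine
# Upgrading means replacing the container while reusing the same volume
docker rm -f pg
docker run -d --name pg -e POSTGRES_PASSWORD=change-me \
    -v pgdata:/var/lib/postgresql/data pgautoupgrade/pgautoupgrade:15-alpine3.8-v1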
No, not the same volume, a volume. Each container can mount an arbitrary amount of volumes. It scales as well as your machine scales.
A container is nothing but a process with restrictions on which filesystem subtrees it can see, what resources (CPU, memory) it may use and which networks it can access, with some tooling to manage self-contained images of directory structures.
> I've not done stuff with kubernetes yet though, so I have no idea how it's done there.
Essentially the same, except that K8s gives you a wide variety of storage backend integrations (Storage classes + storage providers) which can attach "anything" (local volumes on the node, NFS, NAS, Cloud Volumes, ...) depending on your local environment and needs.
A lot of people running on-prem k8s clusters have block storage. When I worked on OpenShift it wasn't uncommon for people to run databases in the cluster, backed by their block storage.
If you're running in the cloud, say on AWS EKS, it makes sense to use in-cluster databases for development environments, and only use RDS DBs for production/integration to save on hosting costs.
There is a huge push for doing that. Whether it is, in the abstract, the right thing can be questioned, but many IT departments have decided to standardize on Kubernetes for all datacenter management and push that way, and in some environments (5G networking) it's part of the specified stack.
Saving it on one host's local filesystem doesn't feel particularly production-ready. There is a distributed store system for Kubernetes called "Longhorn" that I've heard good things about, but I haven't really looked into it much myself. I just run a pair of VMs with a manual primary/replica setup and have never needed to fail over to the replica yet, but I can imagine some sort of fully orchestrated container solution in the future.
I'm just pointing out how it's commonly done. Of course people add things like replication, distributed filesystems, (etc) to the mix to suit their needs. :)
Yep, it seems like the most common answer is “pay exorbitant prices to your cloud provider for a managed SQL database”, but we’ve managed to save a chunk of money running it on our own. I’ve always said that between three engineers (me being one of them), we can form one competent DBA, but our needs are also pretty modest.
This is the opposite. The container is not mutable, but the datastore is. This alleviates the need to run Postgres X and Postgres X+1 to export from X and import into X+1.
For the uninitiated: Fairly certain this can only handle minor version upgrades... which is generally as easy as moving your data folder beside the new binary and starting postgres (minor version upgrades are already trivial).
Major versions still require a pg_dump (or a scary migration using postgres' anemic logical replication) unless some advancement has happened on the postgres side I'm unaware of.
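For reference, the dump/restore route for a major version bump is roughly this (host names are placeholders):

# Classic major-version upgrade via dump and restore
pg_dumpall -h old-db-host -U postgres > dump.sql
psql -h new-db-host -U postgres -f dump.sql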