Hacker News new | past | comments | ask | show | jobs | submit login
Implementing system-versioned tables in Postgres (hypirion.com)
5 points by ben_s 11 months ago | hide | past | favorite | 1 comment



Good write up. As I've written my own temporal table library for Postgres that also runs on managed databases like AWS Aurora, here are my observations:

If you main table inherits from a history table, you can see a complete history of every entry. It also never requires triggers on insert, only update and delete. A major drawback however is that Postgres table inheritance uses the same mechanism as partitioning, so you can use one or the other but not both.

Instead of inheritance, you could keep them completely separate, which is what I ended up with and created set-returning functions of the same name as the main table with a timestamp range as a parameter. The function performs a UNION ALL to merge active and historical results. So instead of

    FROM mytable
you use

    FROM mytable('2023-03-01 00:00:00Z', '2023-04-01 00:00:00Z')
This has made temporal queries much easier to understand and maintain.

Another option put forth was to save the timestamp range and tableoid in the history but have historical column data stored as jsonb. This allows history to be shared with all tables and is more forgiving of table column changes. It's also obviously slower and less storage efficient. The temporal access functions end up calling jsonb_populate_record(...), which is extremely handy but again not as efficient as a twinned history structure. However unless you are making a lot of temporal queries with strict SLAs, the jsonb record solution is often good enough, especially with judicious use of partial indexes. Watch out for dependencies on tableoid though; the same table name created by anything other than a dump/restore will not have the same tableoid, so a surrogate id and mapping table may be preferred.

One major difference between my implementation and the author's is that my history table is never updated. Ever. It is always append-only. This is a side effect of the initial insert not being recorded in the history. It also means the active table requires at least a last_modified column that is updated by trigger.

As you mentioned, adding and removing columns from the active table is a difficult and complex proposition. In fairness though, neither MariaDB or MS SQL Server handle this situation either. Both just tell you to detach the history, make your table changes, and start a new history. This is where the jsonb history option when rolling your own in Postgres really shines.

Not covered in the spec or in most temporal implementations in Postgres, it can be VERY useful to record the user you made the changes. Rather than record the role—though this may be all most systems have—I've tended to create application users that are passed in from JWT claims as local config variables. I got the idea from Postgraphile (https://www.graphile.org/postgraphile/jwt-guide/). At first I added this to every history table along with the start and end timestamps but later realized the user never changes mid-transaction. I switched to a model where the transaction id and user are stored in their own table. Rows in Postgres already store the transaction id. Unfortunately this suffers from the same dump/restore problem that tableoid suffers from, so you may want a transaction id surrogate instead depending on your needs. You will also need to track the delete event separately from the last state of your active table record. The user who deleted it may very well be different from the last person to insert/update it. This was handled with a start and end timestamp that were the same, aka a point in time.

Finally, it's worth exploring event triggers. Managing all of this manually is a huge PITA. Better to watch for CREATE TABLE events and make the history setup automatically. Better for consistency and for dev sanity. I'm usually not a big fan of dynamic SQL, but this was an area I never regretted utilizing that particular Postgres sledgehammer.

Temporal support is a massive undertaking if you want to go beyond the basics. It also imparted even more respect for the Postgres developers for such a powerful and expressive platform to work on.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: