> temporality will become important as well because it matters how information changes over time. Current systems are not able to account for this because of the large overhead of storing extracted information
This quote happens to be in the context of video frames but I think temporal indexing in general is widely under-utilised, both for providing consistent queries (i.e. the database-as-a-value) and for integration of late-arriving data (i.e. bitemporality). It seems particularly relevant when considering how best to synchronise information across the "omnipresent Internet of Things" (not to mention inter-planetary devices, which also get a mention in the post!).
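To make the bitemporal point concrete, here is a minimal sketch using SQL Server's system-versioned tables (dbo.Readings and all of its columns are invented for illustration): the engine maintains transaction time via the SYSTEM_TIME period, while valid time is modelled as ordinary columns, so a late-arriving correction can be recorded now yet stamped with the period during which it was actually true:

    CREATE TABLE dbo.Readings (
        SensorId  INT       NOT NULL,
        ValidFrom DATETIME2 NOT NULL,  -- valid time: when the reading held in the real world
        ValidTo   DATETIME2 NOT NULL,
        Value     FLOAT     NOT NULL,
        SysStart  DATETIME2 GENERATED ALWAYS AS ROW START NOT NULL,  -- transaction time,
        SysEnd    DATETIME2 GENERATED ALWAYS AS ROW END   NOT NULL,  -- maintained by the engine
        PERIOD FOR SYSTEM_TIME (SysStart, SysEnd),
        CONSTRAINT PK_Readings PRIMARY KEY (SensorId, ValidFrom)
    )
    WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.ReadingsHistory));

    -- "As of what we knew on 10 June, what was sensor 7's value at noon on 3 June?"
    SELECT Value
    FROM dbo.Readings
        FOR SYSTEM_TIME AS OF '2019-06-10T00:00:00'  -- transaction time
    WHERE SensorId = 7
      AND ValidFrom <= '2019-06-03T12:00:00'         -- valid time
      AND ValidTo   >  '2019-06-03T12:00:00';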
Unfortunately, Temporal Tables (TT) in the latest ISO SQL spec and in MS SQL Server have issues, at least in their current implementations.
First off, there's no history of schema changes. While you can easily go from a NOT NULL to a NULL column, you can't go from NULL to NOT NULL. This is a deal-breaker when using TT for line-of-business (LoB) data, where schema changes happen somewhat regularly. TT should have been designed with schema versioning from the start.
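For example (a minimal sketch; dbo.Orders, its Status column, and dbo.OrdersHistory are hypothetical names), widening is allowed but tightening is blocked by existing history:

    -- Assume dbo.Orders is system-versioned with history table dbo.OrdersHistory.
    ALTER TABLE dbo.Orders ALTER COLUMN Status NVARCHAR(20) NULL;     -- NOT NULL -> NULL: works, propagates to history
    ALTER TABLE dbo.Orders ALTER COLUMN Status NVARCHAR(20) NOT NULL; -- NULL -> NOT NULL: fails if any current
                                                                      -- or historical row holds a NULL
    -- Backfilling those NULLs means ALTER TABLE dbo.Orders SET (SYSTEM_VERSIONING = OFF)
    -- first, because the history table is read-only while versioning is on.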
The second main issue is the continued lack of tooling and ORM support for TT. While EF6 is compatible with TT (it will safely ignore the SYSTEM_TIME period columns), it doesn't let you directly query the history table.
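So to read history you end up dropping down to raw SQL yourself, e.g. (again with the hypothetical dbo.Orders):

    -- Current row plus every historical version, oldest first:
    SELECT Id, Status, SysStartTime, SysEndTime
    FROM dbo.Orders
        FOR SYSTEM_TIME ALL
    WHERE Id = 42
    ORDER BY SysStartTime;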
Third - any UPDATE statement, even when it doesn't actually change any data, causes a new history row to be added, including a full copy of any nvarchar(max) values. Copy-on-write does not seem to be used. That's a huge waste of space.
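This is easy to demonstrate (sketch; a Notes nvarchar(max) column is assumed on the hypothetical dbo.Orders):

    -- No values actually change, yet a complete copy of the old row,
    -- Notes blob included, lands in dbo.OrdersHistory:
    UPDATE dbo.Orders
    SET Status = Status,
        Notes  = Notes
    WHERE Id = 42;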
Finally, you cannot exclude columns from temporal tracking - so if you have frequently-updated columns containing inconsequential data (e.g. display sort-order columns) then you’ll end up with history table spam.
I don't know why the SQL Server team invests massive amounts of engineering effort into features like these when other major priorities exist - like modernising T-SQL's syntax to be less verbose (parameterised object identifiers and optional predicate clauses, please!) or adding support for array column types, which would greatly improve the platform.
I also don't see why a history table needs to duplicate the transaction timestamps on every row.
Other data models with schema-after-write semantics (for instance trees or graphs) don't have this problem of having to store schema changes.
As I'm working on a temporal storage system myself in my spare time (as well as during my student days at the University of Konstanz), I'd say that you should instead index each revision, storing the timestamp of the transaction commit in, for instance, a RevisionRootPage, which is the root and main entry point for deserialising a revision. Unchanged pages should be shared amongst the different revisions.
Furthermore, due to the fast random access of flash drives, and even parallel I/O with PCIe SSDs, you can store page fragments instead of whole pages. The SSD might fetch too much information due to misalignment, but I'd say that access to persistent storage will also become more fine-grained in the future (byte-level granularity instead of block-level, or smaller block sizes -- I think even 512 bytes is not that unusual nowadays).
Storing not only the incremental or differential changes to a database page, but also the changes that fall out of a sliding window, makes intermediate full dumps of pages unnecessary: write and read peaks don't occur, and a predictable number of page fragments can be read to reconstruct a page in memory, preferably fetching the fragments in parallel where possible.
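Here's a toy relational sketch of that sliding-window idea (purely illustrative - the table, @N, and the layout are invented, and a real page store would of course not be relational): each revision writes one fragment per touched page, containing the changed records plus any record whose newest version would otherwise fall out of the N-revision window, so the latest version of every record is guaranteed to sit within the last N fragments:

    -- One row per (page, revision, record) version written.
    CREATE TABLE dbo.PageFragments (
        PageNo   INT            NOT NULL,
        Revision INT            NOT NULL,
        RecordId INT            NOT NULL,
        Payload  VARBINARY(MAX) NOT NULL,
        PRIMARY KEY (PageNo, Revision, RecordId)
    );

    -- Reconstruct page @PageNo as of revision @Rev by reading at most @N fragments,
    -- keeping the newest version of each record. (Deletions would need tombstone
    -- records; omitted here.)
    DECLARE @PageNo INT = 1, @Rev INT = 100, @N INT = 4;

    SELECT RecordId, Payload
    FROM (
        SELECT RecordId, Payload,
               ROW_NUMBER() OVER (PARTITION BY RecordId ORDER BY Revision DESC) AS rn
        FROM dbo.PageFragments
        WHERE PageNo = @PageNo
          AND Revision BETWEEN @Rev - @N + 1 AND @Rev
    ) AS latest
    WHERE rn = 1;

Because the window bounds the number of fragments, both reads and writes stay flat instead of spiking at full-dump revisions.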
Linux support for SQL Server is my canary in the coal mine.
It makes zero sense.
Meanwhile, the Windows version, which is used for 99.999% of installs, still does not support Vista-era APIs like elliptic-curve certificates because of legacy code nobody has touched in 10+ years.
There's a crazy amount of low-hanging fruit in that product that just isn't being addressed.