> The web request/response model makes ACID pointless. The whole point of ACID is that if two users try to update something at the same time, you won't lose either of their updates. No web app implements that: if user A loads the edit page and then user B loads the edit page and makes an edit, their update is going to be lost when A saves the page. (Or maybe the webapp implements its own locking, but it won't - and can't - use database transactions for it, because there's no way to know whether the user is still editing or has closed the page).
That's a problem that is solved by CRDTs and the like, not by any form of data store. But it's just one use of web frontends. Banking doesn't use that, and neither does a large chunk of the world's applications.
> Exactly! MySQL did none of the things that "good" databases do, and yet MySQL-based companies were far more successful. Conclusion: the things that "good" databases do aren't actually that valuable.
Conclusion: early web startups were successful irrespective of what data store they had, which is irrelevant to today's internet. It's also judging on the wrong criteria: those companies weren't "MySQL-based"; their business models disintermediated and undercut existing ones. Correlation is not causation. Companies that ran on Sun/Solaris were successful as well.
SQL is not intended as a machine-to-machine interface; it's a language for expressing relational queries. Prepared statements and plan caching remove that "3x as long parsing" overhead.
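For example (PostgreSQL syntax, made-up table): the statement is parsed and planned once, then executed repeatedly with no further parsing cost.

```sql
-- Parse and plan once...
PREPARE get_user (bigint) AS
    SELECT id, name FROM users WHERE id = $1;

-- ...then execute as often as you like without re-parsing.
EXECUTE get_user(42);
EXECUTE get_user(43);
```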
> Application programmers know to make sure every thread obtains locks in the same order...
Evidence suggests the opposite, thus the need for things like Rust's borrow checker. Studies have shown that programmers can't work out what the race conditions and locking requirements are.
> But SQL is a very poor interface, and database-level ACID is worse than useless in today's (web-based) architectures.
SQL is a crap language, agreed. But your definition of "web-based" architectures seems to have been picked to mean the CRDT-related stream of changes. Which, by the way, could be implemented in an RDBMS by writing the CRDTs as transactions and enforcing the ordering with the appropriate rollbacks and commits.
> That's a problem that is solved by CRDTs and the like, not any form of data store.
But once you accept that you have to use CRDTs or similar, traditional-RDBMS ACID becomes pointless, and you can and should use a datastore that makes better tradeoffs.
> But it's just one use of web frontends. Banking doesn't use that, neither does a large chunk of the world's applications.
Banking doesn't do that, but it doesn't do RDBMS-level transactions either. People cargo-cult the idea that you should use an ACID database, but I'm not sure I've ever seen a web application that actually made use of the database's transaction support. Fundamentally I think it's probably impossible: people say things like "if you get a deadlock, just retry", but if it's something you can retry then it didn't need to be a transaction in the first place.
> Studies have shown that programmers can't work out what the race conditions and locking requirements are.
Which is exactly the problem with using RDBMS transactions! It's hard enough to figure out deadlocks when you can see the lock statements in the code - it's impossible when you don't even have tooling that can tell you which locks will be held by which parts of the code before you run it.
> But your definition of "web-based" architectures seems to have been picked to mean the CRDT-related stream of changes.
The thing about the web is that it makes everything a distributed system. Even if you only have one server, the fact that your clients are accessing it over a network means you have all the problems of distributed systems and are better off using a system that's designed to deal with them.
> Which, by the way, could be implemented in an RDBMS by writing the CRDTs as transactions and enforcing the ordering with the appropriate rollbacks and commits.
You wouldn't want to do a DB-level rollback or commit: the whole point of using a CRDT is that even when two writes conflict you can still commit them both to durable storage straight away and resolve the conflict asynchronously. So the DB-level transactions don't help you.
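To make the contrast concrete, here's roughly what "commit everything, merge later" looks like if you sketch it on a relational store (hypothetical schema, PostgreSQL syntax) - note that the transaction machinery contributes nothing to the conflict handling:

```sql
-- Every write is stored as a "sibling" row; nothing is ever rolled back.
CREATE TABLE doc_siblings (
    doc_id     bigint      NOT NULL,
    replica_id text        NOT NULL,  -- which client/replica wrote it
    body       jsonb       NOT NULL,
    written_at timestamptz NOT NULL DEFAULT now(),
    PRIMARY KEY (doc_id, replica_id, written_at)
);

-- Reading fetches all siblings and merges them in application code
-- (or by asking the user), asynchronously and outside any transaction.
SELECT replica_id, body, written_at
FROM doc_siblings
WHERE doc_id = $1;
```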
You can implement a perfectly good event sourcing architecture on top of an RDBMS by effectively using the tables as queues - I worked on a system that did exactly that back in 2010, and it worked very well. But at that point you're not really getting much value from the database: its transaction support only gives you ways to shoot yourself in the foot, its flexible query system likewise, and its schema support is better than nothing but not by much. There's a lot of really impressive engineering under the hood, but the parts that would be useful to you aren't actually exposed where you can use them (e.g. the MVCC implementation could be really useful, but the database goes to great lengths to pretend it isn't there and that there's only ever one version of your table at a time; e.g. the indexing implementation is often very good, but you don't have enough control over which indices get built and when). There are better options these days.
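For the curious, a minimal sketch of the tables-as-queues idea (PostgreSQL syntax, made-up schema - not the actual 2010 system, which predated conveniences like SKIP LOCKED):

```sql
-- Events are appended once and consumed in insertion order.
CREATE TABLE events (
    id         bigserial   PRIMARY KEY,
    topic      text        NOT NULL,
    payload    jsonb       NOT NULL,
    created_at timestamptz NOT NULL DEFAULT now(),
    processed  boolean     NOT NULL DEFAULT false
);

-- A consumer claims the oldest unprocessed event; SKIP LOCKED lets
-- several consumers poll concurrently without blocking each other.
SELECT id, payload
FROM events
WHERE processed = false
ORDER BY id
LIMIT 1
FOR UPDATE SKIP LOCKED;

-- Once the event has been handled:
UPDATE events SET processed = true WHERE id = $1;
```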
There appears to be some confusion in the terminology here:
- Database transactions are not meant to prevent the "user A and B are editing object X at the same time" problem.
- Database transactions ensure that when you ask the database to modify X, it either happens in its entirety or it doesn't. They're not meant to prevent concurrent edits; they're meant to keep the data in the database consistent (a minimal sketch follows this list).
- CRDTs allow user A and B to edit object X concurrently, because it's mathematically proven that two instances of the CRDT can be merged later without conflict.
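A minimal sketch of that second point (hypothetical accounts table): the two updates succeed or fail as a unit, but nothing here prevents two users from editing the same account concurrently.

```sql
BEGIN;
-- Atomicity: either both updates apply, or (on error/ROLLBACK) neither does.
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;
COMMIT;
```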
You also don't need to start locking things yourself if all you want is to prevent user A from overwriting user B's changes. Add a "version" column to your table and, when the user submits a change, include this version number in the change request. Then, upon the database UPDATE, check whether the version in the DB still matches expectations. If it doesn't, reject the UPDATE and report back to the user what happened. You could apply this logic in triggers, which would guarantee that no matter the source of the UPDATE statements, older versions can never overwrite newer ones.
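Concretely, the guarded UPDATE would look something like this (hypothetical articles table; the $n placeholders are bind parameters):

```sql
-- $1 = new body, $2 = row id, $3 = the version the user originally loaded.
UPDATE articles
SET body    = $1,
    version = version + 1
WHERE id = $2
  AND version = $3;
-- Zero rows affected means someone saved a newer version first:
-- reject the change and tell the user, instead of silently overwriting.
```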
> - Database transactions ensure that when you ask the database to modify X, it either happens in its entirety or it doesn't. They're not meant to prevent concurrent edits; they're meant to keep the data in the database consistent.
It keeps the database consistent. But fundamentally that's only worth something if you can use it to make things consistent for the user.
> - CRDTs allow user A and B to edit object X concurrently, because it's mathematically proven that two instances of the CRDT can be merged later without conflict.
Right. But the way that works in practice is that your datastore needs to store both user A's edit and user B's edit and then handle the convergence, and a datastore that's actually designed for distributed use, like Riak or Cassandra, gives you the tools for doing that. An RDBMS simply can't, because it's built around maintaining the illusion that there's a single version of each row (even when it's actually using MVCC behind the scenes). Yes, you can build your own CRDTs "in userspace", but you have to do things like use two tables to represent a 2P-set (two columns would be bad enough, but no, SQL databases don't have collection column types, so you really do have to create two extra tables for each set property you have), at which point the database isn't giving you any help.
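To spell out the two-tables point, a hypothetical 2P-set of tags built "in userspace" might look like this - all of the set semantics live in your queries, with no help from the database:

```sql
-- One table records additions, the other removals; in a 2P-set an
-- element, once removed, can never be re-added.
CREATE TABLE tag_adds    (doc_id bigint, tag text, PRIMARY KEY (doc_id, tag));
CREATE TABLE tag_removes (doc_id bigint, tag text, PRIMARY KEY (doc_id, tag));

-- Current membership: everything added and never removed.
SELECT a.tag
FROM tag_adds a
LEFT JOIN tag_removes r ON r.doc_id = a.doc_id AND r.tag = a.tag
WHERE a.doc_id = $1
  AND r.tag IS NULL;

-- Merging another replica's state is just a union of both tables
-- (INSERT ... ON CONFLICT DO NOTHING on each), and the result converges.
```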
> Add a "version" column to your table and, when the user submits a change, include this version number in the change request. Then, upon the database UPDATE, check whether the version in the DB still matches expectations. If it doesn't, reject the UPDATE and report back to the user what happened. You could apply this logic in triggers, which would guarantee that no matter the source of the UPDATE statements, older versions can never overwrite newer ones.
That's hacky and manual - you're effectively duplicating something that the database itself is doing at a lower level, so you could do it much more efficiently with a datastore that just exposed that to you. And it still doesn't really give the behaviour you want: you don't want to refuse to store the incompatible change, you want to save both versions and somehow merge them in the future (possibly by asking the user to do so). An approach that destroys data by default is not ideal.
Yes, databases are Turing-complete and you can build whatever you want in them, but if you don't go with the grain of what the tools and built-in features expect then you're giving up all the advantages of using that kind of database in the first place. You're better off picking a datastore that has first-class support for being used in the kind of way you want to use it.