That's a different problem from your previous examples, and it's the same problem whether it's ACID or not: there's always an opportunity for the user to see the item as being in stock, then request to order it, and discover it's out of stock. Your "recovery" from the case where there's nothing in stock doesn't need to happen in a transaction (and can't meaningfully do so); you just tell the user their order failed, and again you do that the same way whether it's ACID or not.



I see what you mean, but with ACID you can tell the user right away if the order was successfully placed. With eventual consistency you don't know how long it'll take so you either show them a loading spinner that takes an indeterminate amount of time, or what?

Not saying that the spinner solution is wrong (as it definitely scales better), but the immediate feedback is something you lose with eventual consistency, by definition, right?


> I see what you mean, but with ACID you can tell the user right away if the order was successfully placed. With eventual consistency you don't know how long it'll take so you either show them a loading spinner that takes an indeterminate amount of time, or what?

It's doing the same thing either way though; in either world you can choose to wait for the order to be processed, or not. In an event sourcing system you'll commit the event quickly and then wait potentially arbitrarily long for the result to appear downstream; in an ACID RDBMS you'll wait potentially arbitrarily long for your commit to execute (and maybe, if you're lucky, your database has a deadlock detector, but what are you going to do if it tells you you hit a deadlock? 100% of the time I've seen the answer is "backoff and retry").
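For what it's worth, that retry loop usually ends up as something like this minimal sketch (Python; TransientTxnError is a hypothetical stand-in for whatever deadlock or serialization-failure exception your driver actually raises):

    import random
    import time

    class TransientTxnError(Exception):
        """Stand-in for a driver's deadlock / serialization-failure error."""

    def with_retry(run_txn, max_attempts=5):
        # Run a transactional callable, backing off and retrying when the
        # database reports a transient conflict such as a deadlock.
        for attempt in range(max_attempts):
            try:
                return run_txn()
            except TransientTxnError:
                if attempt == max_attempts - 1:
                    raise
                # Exponential backoff with a little jitter before retrying.
                time.sleep(0.05 * (2 ** attempt) + random.random() * 0.05)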


> in an ACID RDBMS you'll wait potentially arbitrarily long for your commit to execute (and maybe, if you're lucky, your database has a deadlock detector, but what are you going to do if it tells you you hit a deadlock? 100% of the time I've seen the answer is "backoff and retry").

I don't agree with the conclusions from this. re: deadlocks, these can be prevented as they are only possible in certain situations, and have mitigations (keep transactions short, never acquire the same locks in a different order, ...). For something "simple" like atomically decrementing an inventory count and then inserting a new row into an orders table, deadlocks (or locking problems at all) are not possible. Of course, as you make your system more complex it becomes more likely, and that's a very fair argument for why ACID systems won't scale as well in general.
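For concreteness, a minimal sketch of that "simple" case, using Python's built-in sqlite3 and an assumed inventory/orders schema; the conditional UPDATE and the INSERT commit or roll back together, and the stock check in the WHERE clause is what prevents overselling:

    import sqlite3

    def place_order(conn: sqlite3.Connection, item_id: int, qty: int) -> bool:
        # Decrement stock and record the order in one transaction.
        with conn:  # commits on normal exit, rolls back on exception
            cur = conn.execute(
                "UPDATE inventory SET stock = stock - ? "
                "WHERE item_id = ? AND stock >= ?",
                (qty, item_id, qty),
            )
            if cur.rowcount == 0:
                return False  # not enough stock; nothing was changed
            conn.execute(
                "INSERT INTO orders (item_id, qty) VALUES (?, ?)",
                (item_id, qty),
            )
        return True

Since the UPDATE simply matches no rows when stock is insufficient, the failure path needs no compensating action; the user just gets an immediate "out of stock".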

But I would still maintain that, even if you have a commit that times out, it does so in an atomic way. It takes at most your "statement timeout" (should be a few seconds probably), and then you can (in deterministic time) show the user an error message. This is still an improvement for the user experience over showing a "your order has been placed" message and then later cancelling it due to overconsumption of inventory.
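Roughly what that bounded wait looks like, assuming PostgreSQL with psycopg2 (SET LOCAL scopes the timeout to the current transaction, and QueryCanceledError is what the driver raises when the server cancels the statement):

    import psycopg2
    from psycopg2.extensions import QueryCanceledError

    def run_bounded(conn, do_work):
        # Cap how long the user can be left waiting: if a statement exceeds
        # the timeout, the server cancels it and the transaction rolls back,
        # so we can show a definite error message instead of hanging.
        try:
            with conn, conn.cursor() as cur:
                cur.execute("SET LOCAL statement_timeout = '5s'")
                do_work(cur)
            return "order placed"
        except QueryCanceledError:
            # The with-block already rolled the transaction back.
            return "could not place your order, please try again"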

I appreciate your replies by the way! I haven't been convinced yet that eventual consistency can provide an equally good experience as ACID for this use case, but you've made me think about things in a new way.


> re: deadlocks, these can be prevented as they are only possible in certain situations, and have mitigations (keep transactions short, never acquire the same locks in a different order, ...).

In principle yes, but this relies on human vigilance; as far as I know there's no automatic checker that can reliably tell you whether your queries have the possibility of deadlocking. Do you review every query before it gets run? And when you miss a case like acquiring locks in the wrong order, it can be weeks or months before it actually bites you.
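To be concrete about what's being relied on: the convention is that every transaction touches rows in the same fixed order, roughly like the sketch below (Python/sqlite3, same assumed schema), and nothing but review and discipline enforces the "every".

    import sqlite3

    def reserve_items(conn: sqlite3.Connection, items: dict) -> None:
        # items maps item_id -> qty. Updating rows in ascending item_id
        # order in every transaction is what prevents lock-order deadlocks;
        # a single code path that sorts differently reintroduces them.
        with conn:
            for item_id in sorted(items):
                conn.execute(
                    "UPDATE inventory SET stock = stock - ? WHERE item_id = ?",
                    (items[item_id], item_id),
                )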

> But I would still maintain that, even if you have a commit that times out, it does so in an atomic way. It takes at most your "statement timeout" (should be a few seconds probably), and then you can (in deterministic time) show the user an error message. This is still an improvement for the user experience over showing a "your order has been placed" message and then later cancelling it due to overconsumption of inventory.

One of the most fun ways I've seen an SQL system break: user navigates to a page, gets a timeout in their browser; 23 days later the database falls over.

(the page initiated a query for 2 years' worth of data, the database server chugged away through its indices for 23 days and then started trying to stream all the data back).

I agree that it's good to have that kind of fallback behaviour. In the system I currently work on we have something like that: if a complex process doesn't get an event for over 1 second (most likely because the thing computing it broke, but it could also just be slow), a simple consumer downstream emits a cancel event (and passes everything else through otherwise), and we take that stream as canonical (rough sketch at the end of this comment).

Having something like that by default is a good thing, and one of the things SQL databases do right is that they're a lot more request-response, whereas event sourcing systems can be a bit "shouting into the void"; I'd like a system with better support for that kind of case. That said, I think in a lot of cases the SQL defaults aren't great: I don't think I've ever used a database with a good default timeout setting, and the way SQL databases treat any validation failure as "drop the data on the floor" is rarely what you want in practice.
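As a toy sketch (not the real system; the event shape and queue plumbing are made up), that watchdog consumer amounts to:

    import queue

    def next_event_or_cancel(upstream: queue.Queue, timeout: float = 1.0) -> dict:
        # Pass the upstream event through if it arrives in time; otherwise
        # emit a cancel event. The stream built from these results is what
        # downstream treats as canonical.
        try:
            return upstream.get(timeout=timeout)
        except queue.Empty:
            return {"type": "cancel", "reason": "no event within timeout"}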


There is a difference between the theoretical worst case and practical experience. Whether the direct-update experience or the event sourcing experience is better on this point depends on a lot of factors (many of which are dimensions of scale). Neither is categorically ideal; you've got to understand the particular application.

Of course, when using an ACID RDBMS, you can also very trivially use a CQRS/event-sourcing approach for some flows (append-only event log tables plus a separate process applying them to query tables that other clients only read) and plain direct updates for other flows.
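A minimal sketch of that hybrid, with made-up table names and Python's built-in sqlite3 standing in for the RDBMS: writers append to an event log table, and a separate applier process folds new events into a query table that other clients only read (a one-row applier_state table tracks progress):

    import json
    import sqlite3
    import time

    def apply_new_events(conn: sqlite3.Connection) -> None:
        # One pass of the applier: fold unapplied events from the
        # append-only log into the read-side table, then advance the
        # bookmark, all in one transaction so a crash can't apply twice.
        with conn:
            last = conn.execute(
                "SELECT last_event_id FROM applier_state"  # single-row table
            ).fetchone()[0]
            rows = conn.execute(
                "SELECT event_id, payload FROM order_events "
                "WHERE event_id > ? ORDER BY event_id",
                (last,),
            ).fetchall()
            for event_id, payload in rows:
                event = json.loads(payload)  # payload assumed to be JSON text
                conn.execute(
                    "UPDATE inventory SET stock = stock - ? WHERE item_id = ?",
                    (event["qty"], event["item_id"]),
                )
                conn.execute(
                    "UPDATE applier_state SET last_event_id = ?", (event_id,)
                )

    def run_applier(conn: sqlite3.Connection, poll_interval: float = 0.5) -> None:
        # Runs as its own process/loop, separate from the request handlers.
        while True:
            apply_new_events(conn)
            time.sleep(poll_interval)

Other flows can keep writing to inventory directly; the trade-off is that the query table lags the log by however far behind the applier is.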


> when using an ACID RDBMS, you can also very trivially use a CQRS/event-sourcing approach for some flows (append-only event log tables plus a separate process applying them to query tables that other clients only read) and plain direct updates for other flows.

Sort of. You can't opt out of transactionality (transaction isolation settings are often global) and you can't defer index updates except by making your indices "manual".

The first company I worked for essentially built an event sourcing system on top of an RDBMS. We still had the production system brought down by an analyst leaving a transaction open for weeks (we weren't big enough to have a separate analytics datastore at that point). They should've known to enable autocommit (even though their queries were read-only!), but it's still a footgun that was only there because of the ACID RDBMS.


I'm no expert, but my recent conclusion from writing an ordering system is that customers can instruct their banks to roll back many payments a month or even a year after the fact.

So actual ACID-style transactional consistency would require a year. After basic checks, just say you'll give them what they want and accept payment, since you'll need failure recovery anyway for lack of delivery, refunds, acts of God, etc.

So: accept and plan for failure.


The idea I've seen repeated here a lot is to aim for strong consistency internal to your system, even though the real world/things external to your system will be eventually consistent.

So the inventory <-> create order interactions are internal and can be strongly consistent, while the realities of the banking system are external and will be eventually consistent.

Just because you can't make everything strongly consistent doesn't mean you shouldn't bother at all; it can still eliminate entire classes of errors (in this case, errors around inventory being over-consumed and orders later being cancelled are completely avoided with ACID).

edit: I'm also no expert when it comes to distributed systems. Most of my experience is with transactional systems, which I know quite well; distributed systems are still something of a mystery to me. I'm very open to new ideas around them and don't have a ton of confidence in my comments here.


A lot of things have improved in distributed databases as of late. There are distributed databases today (my co. provides one) that can offer strict serializability (the strongest transactional guarantee) for hyperscale applications. Tolerating "eventual consistency" is largely a choice, often forced on you because your database or its implementation (NoSQL or traditional RDBMS, depending on the case) does not provide strict transaction guarantees. There are third parties (https://jepsen.io/analyses is a great source) that can evaluate and validate such claims. A lot of the anecdotal information in this thread is no longer accurate or applicable. Yes, there are databases (like Fauna) that don't compromise on the strongest transactional guarantees, and yes, there are more widely used databases (MongoDB, DynamoDB, and many others) that either cannot make those guarantees or whose implementation by the service provider cannot demonstrate them. Happy to provide more info if interested.



