> Category (1) can always be written in terms of conditional logic on the data as we did above.
is a bold and unjustified statement.
If any downstream system uses 2PC, then the upstream system cannot use this new scheme.
The other category seems to have just been hand-waved over (we designed the system to never crash in a way which loses the ability to perform transactions, so we don't worry about this scenario).
Think of it this way: category (1) consists of transaction aborts that depend on the state of the data. In that case, you can explicitly include conditional logic on that data state inside the transaction itself. The mechanism for accomplishing category (2) is described in the section entitled: "Removing system-induced aborts". I agree it is non-trivial, but we've done it multiple times in my lab.
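To make the category (1) point concrete, here's a minimal sketch (all names are illustrative, not from any actual system): instead of aborting when a data-dependent check fails, the transaction always commits deterministically, and the conditional decides whether it has an effect.

```python
def transfer(db, src, dst, amount):
    # The data-dependent condition is explicit in the transaction logic,
    # so there is no data-dependent abort: the transaction always commits,
    # possibly as a logical no-op.
    balance = db[src]
    if balance >= amount:
        db[src] = balance - amount
        db[dst] = db.get(dst, 0) + amount
        return "applied"
    return "no-op"  # logical failure, but the transaction still commits

db = {"alice": 100}
print(transfer(db, "alice", "bob", 60))  # applied
print(transfer(db, "alice", "bob", 60))  # no-op: insufficient funds
```

Every replica executing this same logic against the same input reaches the same decision, which is what makes the outcome deterministic.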
I'm trying to understand your proposed system (as a non CS person). Would this be an accurate (basic) description of it?
- You keep a deterministic log of all data inputs / transactions that should take place (a write-ahead log of sorts) that all workers can refer to.
- Each worker remembers the position in the log that it has processed so far.
- If a worker crashes during a transaction it can simply pick up at the position it was at before (possibly after doing some local cleanup) and replay the transactions.
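The three bullets above can be sketched roughly as follows (a toy model with made-up operations, not any real system's code): a shared deterministic log, a worker that remembers its position, and replay from that position after a crash.

```python
# Deterministic input log shared by all workers (illustrative operations).
LOG = [("set", "x", 1), ("incr", "x", 5), ("set", "y", 7)]

class Worker:
    def __init__(self):
        self.state = {}
        self.applied = 0  # position in the log processed so far

    def apply(self, op):
        kind, key, val = op
        if kind == "set":
            self.state[key] = val
        elif kind == "incr":
            self.state[key] = self.state.get(key, 0) + val

    def run(self, log):
        # Resume from the remembered position and apply entries in order.
        while self.applied < len(log):
            self.apply(log[self.applied])
            self.applied += 1

w = Worker()
w.run(LOG[:2])  # worker processes two entries, then "crashes" mid-run
# After restart, it picks up at the remembered position and replays the rest.
w.run(LOG)
print(w.state)  # {'x': 6, 'y': 7}
```

Because the log order is fixed and every operation is deterministic, replaying from the saved position yields exactly the state the worker would have reached without the crash.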
What you said is one reasonable way to accomplish what is stated in the post. There are other alternatives (described in the links from the post), but your way would work fine.
It would be hand-waving if we didn't actually do it :)
Calvin uses a locking scheme with deadlock avoidance. PVW uses a MVCC scheme with no locking (and therefore no deadlock). Fauna uses an OCC scheme with no deadlock and deterministic validation.
Not trying to say you can't do it - I'm sure I'm just not informed enough.
However, I don't see how MVCC could fix a multi-worker issue that would cause category (1) aborts in your scenario.
With MVCC, if another worker concurrently modifies a record (say 'Y'), I continue to read the pre-modification value once I've read it. So my value for Y may be stale between the time I check that it's greater than 0 and the time I set X to 42. My constraint check was invalid.
At this point you either have a transaction that can't commit despite your guarantee that it can (because my conditional passed!), or an 'eventual consistency' model where the inconsistency is reconciled outside the scope of the original change (and in that model you wouldn't use 2PC anyway).
The assumption is that the data partitions are disjoint. Each worker controls its own data and therefore controls what value the other workers see. So the worker is responsible for making sure the other workers read the correct version.
My assumption reading the article was that each transaction is assigned a unique id. Then a worker could ask another worker for "y for/as of transaction 42".
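That "y for/as of transaction 42" idea can be sketched like this (a hypothetical illustration, not the article's actual protocol): each partition-owning worker keeps versions of its records keyed by the transaction id that wrote them, so another worker can ask for the value visible to a specific transaction.

```python
class PartitionWorker:
    def __init__(self):
        # key -> list of (txn_id, value), in ascending txn_id order
        self.versions = {}

    def write(self, txn_id, key, value):
        self.versions.setdefault(key, []).append((txn_id, value))

    def read_as_of(self, key, txn_id):
        # Return the latest value written at or before txn_id, so every
        # worker sees the same Y for transaction 42 regardless of timing.
        result = None
        for tid, value in self.versions.get(key, []):
            if tid <= txn_id:
                result = value
            else:
                break
        return result

p = PartitionWorker()
p.write(40, "y", 5)
p.write(43, "y", 0)
print(p.read_as_of("y", 42))  # 5: the version visible to transaction 42
print(p.read_as_of("y", 44))  # 0: transaction 44 sees the later write
```

Since transaction ids come from the deterministic log order, the owning worker can always answer "as of txn N" consistently, which is how it controls what value the other workers see.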