Hacker News new | past | comments | ask | show | jobs | submit login

> that's not true in general

Can you elaborate what do you mean by this? I was arguing that it’s possible as the original argument was “this is not possible in a real system and is only theoretical”.

> provide no meaningful consistency guarantees to users, because they allows "committed" writes to be lost

If I set the (shared) value to green and you set it to blue, what is the expected observed value? What if you set it to green and I set it blue, what is the observed value? More importantly, what is the consistency that was lost?




so there are multiple threads of conversation here

first, there is no straightforward way to module "a value" as a CRDT, precisely for the reason you point out: how to resolve conflicts between concurrent updates is not well-defined

concretely, if you set v=green and I set v=blue, then the expected observed value is undefined, without additional information

there are various ways to model values as CRDTs, each has different properties, the simplest way to model a value as a CRDT is with the LWW conflict resolution strategy, but this approach loses information

example: say your v=green is the last writer and wins, then I will observe (in my local replicas) that v=blue for a period of time until replication resolves the conflict, and sets (in my local replicas) v=green. when v changes from blue to green, the whole idea that v ever was blue is lost. after replication resolves the conflict, v was never blue. it was only ever green. but there was a period of time where i observed v as blue, right? that observation was invalid. that's a problem. consistency was lost there.

--

second

> almost any data structure can be turned into a (op-based) CRDT.

yes, in theory. but op-based CRDTs only work if message delivery between nodes is reliable. and no real-world network provides this property (while maintaining availability).


> there was a period of time where i observed v as blue, right? that observation was invalid.

Not invalid. The observation was correct at that time and place, meaning, in the partition that it was observed. This is the P in AP.

It seems to me that you’re talking about and arguing for synchronization, that is, consensus about the values (=state), which takes the system to CP as opposed to AP.

> there is no straightforward way to module "a value"

I would recommend to look into the “Register” (CR)data structure.


there is no single definition of a crdt "register"

i know this because i've implemented many versions of crdt "registers", with different properties, at scale

there are lww registers and multi-value registers, the former provides deterministic (but arbitrary) conflict resolution which loses information, the latter provides deterministic and non-arbitrary conflict resolution without losing information but with a more complex API

> The observation was correct at that time and place, meaning, in the partition that it was observed. This is the P in AP.

this is not what partition means

partition means that different subsets of nodes in a single system have different views on reality

if v=blue was true only during a partition, and is no longer represented in the history of v when the partition heals, then this value is not legal, violates serializability, is incorrect, etc.

https://jepsen.io/consistency#consistency-models

this has nothing to do with synchronization


> there are lww registers and multi-value registers

Use a single value register then in place where I said “register”.

> partition means that different subsets of nodes in a single system have different views on reality

Partition means that nodes of a (single) network are (temporarily) not connected.

In the example discussed here, blue and green were in different partitions thus had “different view” to the state. Once synchronized, their view on the state is consistent and both observe both values, blue being the latest state.

> is no longer represented in the history of v when the partition heals

Please review the above discussion again and the original article. You keep saying this but it’s shown in both that it’s is not true.


"last writer wins" does not preserve state from losing (earlier) writers/writes

after synchronization, all nodes share a consistent view of state (yes) but that state has no history, it only contains the net final value (blue) as the singular and true state

this is not complicated, i think you're out of your element




Join us for AI Startup School this June 16-17 in San Francisco!

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: