Why Vector Clocks are Easy

roder · on Jan 29, 2010

Justin (CTO of Basho) gave a phenomenal presentation (keynote worthy) at NoSQL East.

Vector clocks are easy and once he explained it in the presentation it "clicked" for me.

You can watch the whole presentation here, but if you want to get just the bit on Vector Clocks, fast forward to 28m

https://nosqleast.com/2009/#speaker/sheehy

But seriously, I highly recommend watching the whole presentation. These guys are brilliant.

tptacek · on Jan 29, 2010

Is this basically the same idea as Lamport's logical timestamps?

HenryR · on Jan 30, 2010

Yes, this is essentially the same idea. The problem with Lamport clocks is that every two pairs of events can be ordered - since a < b or b < a or a = b for all integers a and b. This means that two events that are not causally related can't be identified. Lamport clocks just give you an ordering on events that preserves the causal relationship, but they also introduce unnecessary structure.

Vector clocks, at the cost of more storage, allow you to detect whenever two messages are not causally related. To be concrete, if I send you a message, and you then send one to someone else, your message is causally related to mine because its possible that whatever was in my message influenced the message you sent. However, if I send my message to a third party, it no longer causally precedes yours because you couldn't have reacted to what I sent.

Vector clocks will preserve this lack of ordering because the clock of my message will be 10, and the clock of your message will be 01. 10 is neither 'less than' or 'greater than' 01 under the ordering of vector timestamps, so we can conclude that the messages are not causally related.

This is important because sometimes we care whether two different updates to the same data are in any way ordered - because if so we know which one was 'latest' and therefore the 'current' one. Dynamo uses vector clocks like this: if two non causally related message arrive to update a value, the system is able to detect a conflict and take some remedial action.

mbrubeck · on Jan 29, 2010

It solves the same problem in almost the same way, but a multi-dimensional vector clock preserves more information than a one-dimensional Lamport clock. For example, Dynamo nodes that receive conflicting changes can compare the vector timestamp of each change to determine their relationship (e.g. find the common ancestor for a three-way merge).

justinsheehy · on Jan 29, 2010

It's more like Mattern's expansion of Lamport's logical time.

(a lattice, not a total ordering)

evgen · on Jan 29, 2010

Lamport timestamps sync events between two systems, vector clocks extend the sync to N systems by using an array of these logical timestamps.

mbrubeck · on Jan 29, 2010

That's not true. Lamport clocks work across any number of nodes/processes. The figures in Lamport's paper all show examples using three processes:

http://docs.google.com/viewer?a=v&q=cache:IJxXjuFmdHEJ:c...

justinsheehy · on Jan 29, 2010

Lamport's logical clocks work across multiple nodes, but they can lose information about causality.

Hence Mattern's extensions to Lamport's work, which introduce the idea of vector clocks.

Raphael · on Jan 29, 2010

Cool, it's like the data is surfing a web made of people (or inanimate actors), carrying a browsing history with it.

DLWormwood · on Jan 30, 2010

Okay, this article made me feel stupid, given that I've been outside of a CS department for over a decade.

A brief Googling/Wiki-ing explains that they are used to solve certain concurrency problems, but without further explanation. Can somebody give me a practical, real world example of a problem this kind of strategy is supposed to help solve? I feel like I'm missing out on something...

agazso · on Jan 30, 2010

Suppose you have a cluster of database servers in different datacenters, and somehow the connection between the datacenters go down. If you want to ensure high availability and allow your users to reach and modify the data while there is no connection, then you have to mark each modifications somehow so that later when the connection is up and the servers synchronize they are automatically able to merge most of the changes and find out conflicting cases.

That problem is solved by vector clocks and the most prominent application of them is Amazon's Dynamo system. See also eventual consistency for more information.

strlen · on Jan 30, 2010

They're used (amongst many other applications) by Amazon-dynamo patterned storage systems, such as Dynomite, Riak or Voldemort. My presentation from November < http://behemoth.strlen.net/~alex/Voldemort_NoSQL_Oakland.ppt > explains vector clocks and their use in a storage system, but in my opinion the parent article does so even better (I wasn't too good at OmniGraffle and only used PPT out of expediency).

</Shameless plug>

You should definitely read the Dynamo paper < http://www.allthingsdistributed.com/2007/10/amazons_dynamo.h... > as it's just plain interesting (although, it's by no means a "blueprint" to building a storage system). Here's my attempt at cliff notes to the Dynamo paper, explaining the application of a Vector clock:

Suppose you have a system where you want to achieve N-way replication of data, but don't want to lose availability for writes by going to two-phase commit. So in this case, you do this: * Write your data synchronously to W nodes

* Perform a background (async) write to N - W nodes

* When reading, instead read from R nodes. Now, if R + W > N, you can now be sure that at least one of the R nodes will have the most up to date version of the data, achieving a very basic "read your writes" consistency, without having to write to all of the nodes.

Problem is, how do you tell which is the node holding the truth? That's what vector clocks are for.

Given a value X with clock A and and a value Y with clock B, you can tell whether X was written because Y was read (implying X was written after Y), whether Y was written because of X was read (implying X was written before Y) or if the two are independent of each other. The latter scenario occurs when nodes holding X and Y become partitioned from each other (i.e. network link going down) - but some subset of clients is able to talk to X, another is able to talk to Y and another to both. In this case you can't tell which came as a result of which, but you can see that the scenario has happened and attempt to reconcile the values.

The simplest example of this reconciliation is Amazon's shopping cart (powered by Dynamo), where if two node had causally unrelated versions of the shopping cart it would make sense just to merge the two versions (combining the items).