Getting to know logical clocks by implementing them (brunocalza.me)
81 points by brunoac on July 2, 2021 | 19 comments



At point #3, I think you meant "then a→c".


Thank you :)


It seems like the article is providing a theoretical definition of a clock that allows causal order to be determined. But at the top it says

> Physical clocks can't capture causality.

Does a 'physical clock' with sufficiently accurate inter-node synchronisation and sufficiently good time resolution not meet the 'logical clock' definition?


> Does a ‘physical clock’ with sufficiently accurate inter-node synchronisation and sufficiently good time resolution not meet the ‘logical clock’ definition?

No. A logical clock is logically guaranteed to capture causality (I would prefer “bound”, but “capture” is the article’s term). A physical clock with sufficient synchronization and resolution may or may not happen to do so, and (unless I misunderstand) there is no way to verify that it does, regardless of how good the clock is.


In practice though, physical clocks can actually provide such good bounds that you can start to not care about the distinction. That’s why you see Google using GPS clocks for their global database to really great results. NTP doesn’t cut it. Atomic clocks could be even better but they have distribution challenges that GPS drastically cuts down on.


Ok, that's a good point, the word 'capture' might have been the issue, I would have said that the logical clock idea guarantees causal order then. I guess that is what the article is saying.


Yes, but what does the synchronisation? That's the essence of these algorithms: how to effectively create a single clock that is (acceptably) synchronised between nodes. Here are some lecture notes on the problem (and solutions):

https://cseweb.ucsd.edu/classes/sp16/cse291-e/applications/l...


As it says in the notes that you linked to:

> In distributed systems, it is frequently unnecessary to know when something happened in a global context. Instead, it is only necessary to know the order of events. And often times it is only necessary to know the order of certain events, such as those that are visible to a particular host.

They are talking about logical clock implementations as a way to avoid having to build a potentially expensive synchronised 'physical clock' system with sufficient accuracy and security for the application.

The notes also talk about ways to get higher accuracy synchronisation.


I think parent meant that instead of using an auto-incremented integer for the local clocks, one could just use local time (which can also be expressed as an integer). As I mentioned in the other comment, I think that would break the guarantee that order can be inferred at the local level.


If you have a server in Chicago, and another in Denver, news of any event takes about 0.005 seconds to reach the other. Thus causality between the nodes depends on the location of the observer. If an observer halfway between sees simultaneous events, each end will see the other's event happening after its own local event.

Thus, physical clocks, no matter how well synchronized, will never agree everywhere.
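
For anyone wanting to sanity-check the 0.005 second figure above, here is a rough back-of-the-envelope sketch. The distance (about 1,480 km great-circle) and the two-thirds-of-c fibre speed are my own approximations, not numbers from the comment, and the function name is just illustrative.

```python
# Rough check of the one-way latency between Chicago and Denver.
# Assumptions (not from the thread): ~1,480 km great-circle distance,
# and light in optical fibre travelling at roughly 2/3 of c.
SPEED_OF_LIGHT_KM_S = 299_792.458
CHICAGO_DENVER_KM = 1_480.0  # approximate, assumed


def one_way_delay_seconds(distance_km: float,
                          speed_km_s: float = SPEED_OF_LIGHT_KM_S) -> float:
    """Time for a signal to cover distance_km at speed_km_s."""
    return distance_km / speed_km_s


# Lower bound: a signal moving at the vacuum speed of light.
vacuum_delay = one_way_delay_seconds(CHICAGO_DENVER_KM)

# More realistic: the same path through fibre at about two thirds of c.
fibre_delay = one_way_delay_seconds(CHICAGO_DENVER_KM,
                                    SPEED_OF_LIGHT_KM_S * 2 / 3)
```

Under those assumptions the vacuum figure comes out just under 5 ms, which matches the comment, and a fibre link would be somewhat slower.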


The happens-before relation that they use in this article (and in the whole field) does take this into account. It’s only defined in terms of local events (i.e. on the same node) and the time when messages are sent/received between nodes.

A perfectly synchronized physical clock totally works here.


In physics, even perfect clocks can only be perfectly synchronized for a moment in time, from an arbitrary observation point. They will always drift apart, because no two points in the universe have the exact same stress-energy tensor across time.

Clocks have improved to the point where a 2 centimeter difference in height in the same room results in an observable difference in time. If you take into account the gravity of the Sun and Moon, no two points on Earth experience the same amount of gravity over time.

Now, if you have a logical clock, you might be able to pull consistency out of this, but the real world doesn't work that way, unless you lower the observation quanta to hours, minutes, seconds, on Earth only, and use UTC which isn't consistent across time.


You can implement a vector clock with physical time: instead of a Lamport clock per node entry, make the nodes exchange messages with their physical timestamp, and use that as the value for the different entries. As long as the physical clock of any given node advances monotonically (this can be configured for NTP), you'll be safe.
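
A minimal sketch of what that could look like, assuming each node's clock source returns strictly increasing integers. The class name, the `now` callable and the `happened_before` helper are all hypothetical names made up for illustration; the readings are fed in by hand so the example is deterministic.

```python
from typing import Callable, Dict


class PhysicalVectorClock:
    """Vector clock whose entries are physical timestamps, not counters."""

    def __init__(self, node_id: str, now: Callable[[], int]):
        self.node_id = node_id
        # Assumption: `now` returns strictly increasing integers on this node.
        self.now = now
        self.entries: Dict[str, int] = {}

    def tick(self) -> Dict[str, int]:
        # Local event or send: refresh only this node's own entry.
        self.entries[self.node_id] = self.now()
        return dict(self.entries)

    def receive(self, remote: Dict[str, int]) -> Dict[str, int]:
        # Merge by element-wise maximum, then stamp the receive itself.
        for node, ts in remote.items():
            self.entries[node] = max(self.entries.get(node, 0), ts)
        return self.tick()


def happened_before(a: Dict[str, int], b: Dict[str, int]) -> bool:
    """True if a <= b in every entry and a < b in at least one."""
    nodes = set(a) | set(b)
    return (all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
            and any(a.get(n, 0) < b.get(n, 0) for n in nodes))


# Node "b" runs behind node "a", yet the ordering still works out,
# because entries from different nodes are never compared with each other.
readings_a = iter([100, 105])
readings_b = iter([90, 95])
a = PhysicalVectorClock("a", lambda: next(readings_a))
b = PhysicalVectorClock("b", lambda: next(readings_b))

sent = a.tick()             # {"a": 100}
local_b = b.tick()          # {"b": 90}
received = b.receive(sent)  # {"a": 100, "b": 95}
```

Here `sent` happened before `received`, while `sent` and `local_b` come out as concurrent even though b's wall clock reads lower. Each entry is only ever compared against earlier readings from the same node, which is why per-node monotonicity is the only requirement.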


I think leap seconds and other time strangeness would prevent physical clocks from being totally reliable. AFAIK time is now implemented as being always monotonic, but you could get the same timestamp for multiple actions which are actually executed sequentially. Incrementing the local logical clock on every action is guaranteed to be safe.
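
To make the duplicate-timestamp concern concrete, here is a tiny sketch. The millisecond readings are invented for illustration, and `strictly_increasing` is just a helper I made up.

```python
import itertools
from typing import List


def strictly_increasing(values: List[int]) -> bool:
    """True if every value is smaller than the one that follows it."""
    return all(x < y for x, y in zip(values, values[1:]))


# Hypothetical coarse clock readings for three actions executed in sequence:
# monotonic (never goes backwards), but the first two share a timestamp.
coarse_readings = [1000, 1000, 1001]

# A local logical clock that is incremented on every action instead.
counter = itertools.count(1)
logical_stamps = [next(counter) for _ in coarse_readings]  # [1, 2, 3]
```

The coarse readings never go backwards but cannot tell the first two actions apart, while the counter gives each action a distinct, ordered stamp.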


> ...leap seconds and other time strangeness would prevent physical clocks from being totally reliable

That's why we have TAI[0]

[0]https://en.wikipedia.org/wiki/International_Atomic_Time


Neat, TIL. You can still have multiple actions happen within the same time increment with multithreading.


Just because A occurs before B in « wall clock » time does not mean they are causally related.


But neither does a logical clock. The implication only goes one way: if A happens before B, then the logical clock of A has to be smaller than that of B. The other way does not have to hold.
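
A small sketch of the one-way implication, using a bare-bones Lamport clock. This is a simplified illustration with made-up node names, not the article's implementation.

```python
class LamportClock:
    """Minimal Lamport clock: one integer counter per node."""

    def __init__(self) -> None:
        self.time = 0

    def tick(self) -> int:
        # Local event: advance the counter by one.
        self.time += 1
        return self.time

    def send(self) -> int:
        # Sending is an event; the returned value travels with the message.
        return self.tick()

    def receive(self, msg_time: int) -> int:
        # Jump past both the local time and the sender's timestamp.
        self.time = max(self.time, msg_time) + 1
        return self.time


p, q, r = LamportClock(), LamportClock(), LamportClock()

a = p.tick()      # 1: local event on p
b = p.send()      # 2: p sends a message to q
c = q.receive(b)  # 3: q receives it, so a -> b -> c

x = r.tick()      # 1: event on r, which never talks to p or q
```

Causality implies ordered timestamps here (`a < b < c`), but the converse fails: `x < c` even though `x` and `c` are causally unrelated.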


For Lamport Clocks the implication is only one way. But for Vector Clocks it is both ways.
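
A sketch of why vector clocks give the implication in both directions, using plain dicts as clocks. The function names are mine and the scenario is invented for illustration.

```python
from typing import Dict

VectorClock = Dict[str, int]


def tick(vc: VectorClock, node: str) -> VectorClock:
    """Return a copy of vc with node's own entry incremented."""
    out = dict(vc)
    out[node] = out.get(node, 0) + 1
    return out


def merge(local: VectorClock, remote: VectorClock, node: str) -> VectorClock:
    """Receive: element-wise max of both clocks, then tick the receiver."""
    nodes = set(local) | set(remote)
    merged = {n: max(local.get(n, 0), remote.get(n, 0)) for n in nodes}
    return tick(merged, node)


def less_than(a: VectorClock, b: VectorClock) -> bool:
    """a < b: no entry of a exceeds b, and at least one is strictly smaller."""
    nodes = set(a) | set(b)
    return (all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
            and any(a.get(n, 0) < b.get(n, 0) for n in nodes))


def concurrent(a: VectorClock, b: VectorClock) -> bool:
    """Neither clock is less than the other (intended for distinct events)."""
    return not less_than(a, b) and not less_than(b, a)


p1 = tick({}, "p")         # {"p": 1}: local event on p
p2 = tick(p1, "p")         # {"p": 2}: p sends a message to q
q1 = merge({}, p2, "q")    # {"p": 2, "q": 1}: q receives it
r1 = tick({}, "r")         # {"r": 1}: unrelated event on r
```

With these clocks, `less_than(p1, q1)` holds exactly because `p1` causally precedes `q1`, and `r1` compares as concurrent with `q1` rather than "earlier", which is the distinction a single Lamport integer cannot express.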



