Cloudant on Google's Spanner (cloudant.com)
69 points by mlmilleratmit on Sept 23, 2012 | 13 comments



Here's an excellent talk on Spanner by one of the developers, Alex Lloyd: http://vimeo.com/43759726


Spanner is quite impressive, and I remember wondering if any of the cloud hosting companies would attempt to match it. The problem of synchronizing atomic clocks and GPS is itself not trivial (remember the faster-than-light neutrinos at CERN?), but there are plenty of other challenges too.


Well, the CERN clocks were operating at a much tighter error tolerance than Spanner's 5 ms.


It seems this becomes another tunable parameter. The concurrent transactional behavior just depends on the characteristics of the time synchronization.

This also, unfortunately, introduces a new failure mode ("bird pooped on GPS antenna" exception) that has to be handled. At worst it would revert to using plain NTP servers. But it is just something new that possibly didn't have to be considered previously.


If GPS is unavailable, it reverts to using atomic clocks. As those gradually drift out of sync, the system slows down until the problem is fixed. NTP is not used, at least in Spanner.
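
To make the "system slows down" point concrete, here is a minimal sketch of how the wait before a commit becomes visible could grow as clocks drift. It is not Spanner's code: the linear drift model, the default numbers, and the names `uncertainty`, `tt_now`, and `commit_wait` are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TTInterval:
    """A time interval assumed to contain the true time."""
    earliest: float
    latest: float


def uncertainty(seconds_since_sync, base_epsilon=0.001, drift_rate=200e-6):
    # Assumed model: uncertainty starts at base_epsilon right after a good
    # sync and grows linearly while the local clock free-runs.
    return base_epsilon + drift_rate * seconds_since_sync


def tt_now(local_time, seconds_since_sync):
    # Widen the local clock reading by the current uncertainty on both sides.
    eps = uncertainty(seconds_since_sync)
    return TTInterval(local_time - eps, local_time + eps)


def commit_wait(local_time, seconds_since_sync):
    # Pick the commit timestamp at the top of the interval, then wait until
    # even the earliest possible true time has passed it. With a constant
    # uncertainty during the wait, that is twice the uncertainty.
    now = tt_now(local_time, seconds_since_sync)
    commit_ts = now.latest
    return commit_ts - now.earliest


if __name__ == "__main__":
    for idle in (0, 30, 300):
        wait_ms = commit_wait(100.0, idle) * 1000
        print(f"{idle:>4} s since sync: wait about {wait_ms:.1f} ms")
```

Under these assumed numbers, the longer a machine goes without a trustworthy time source, the longer every write has to wait, which is the slowdown described above.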


I don't know that deploying precise timing is as big a blocker as one might think[1]. Precise timing is commonly deployed in telecom networks for TDM & RF signal synchronization. There is even a standard for distributing time sync over an Ethernet network (IEEE 1588). This has become a big deal as cell site backhaul transitions from SONET to Ethernet. On the Wireless ISP side there are companies that make low-cost (sub-$200) timing devices to synchronize wireless transmission to reduce self-interference, although that is based around 1PPS edge triggering instead of time-of-day clocking.

[1] Of course this assumes you have control over the physical environment and can install a GPS antenna in your datacenter. This rules out 3rd party clouds.
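
For readers unfamiliar with IEEE 1588, the core arithmetic is small. The sketch below shows the usual two-way exchange calculation under the assumption that the network delay is the same in both directions; the function name `ptp_offset_and_delay` and the example numbers are made up for illustration and are not taken from any particular PTP implementation.

```python
def ptp_offset_and_delay(t1, t2, t3, t4):
    """Estimate slave clock offset and one-way delay from four timestamps.

    t1: master clock time when the sync message is sent
    t2: slave clock time when the sync message is received
    t3: slave clock time when the delay request is sent
    t4: master clock time when the delay request is received

    Assumes a symmetric path: the same delay in both directions.
    """
    forward = t2 - t1   # delay + offset
    backward = t4 - t3  # delay - offset
    offset = (forward - backward) / 2
    delay = (forward + backward) / 2
    return offset, delay


if __name__ == "__main__":
    # Hypothetical exchange: slave runs 5 units ahead, one-way delay is 2.
    offset, delay = ptp_offset_and_delay(t1=100, t2=107, t3=110, t4=107)
    print(offset, delay)
```

Any asymmetry between the two directions shows up directly as offset error, which is one reason hardware timestamping in the NIC and switches matters.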


I thought about this too. But the timing plane is not typically available at the application level in telecom systems, nor would it be distributed to millions of different hosts, so it's not quite the same thing.


True. So if I wanted to do it, how would I do it?

a) Adopt NICs / order servers with proper chipset to support it: http://www.oreganosystems.at/?page_id=71

b) Broadcom has IEEE 1588 support in their reference designs, so it wouldn't be too hard to bake into custom switches & server mobos: http://www.broadcom.com/press/release.php?id=s535919 The interesting thing about that press release is that in 2010 they listed the uses as "service provider, data center and smart grid power control networks". Service provider & smart grid are the usual suspects as far as users of timing go. Who other than Google would need that kind of timing setup in a datacenter? Financial trading?


I think that Finance is indeed the obvious candidate. In neutrino experiments we've gotten pretty good at synchronized clocks with nanosecond resolution. Although (to be fair) we cheat and get some very useful information out of the beam itself, and neutrinos luckily don't travel on optical fibers. I'd be interested to research what it would actually take to provide nanosecond synchronization of clocks in at least a dozen data centers globally, distributed to millions of servers.


Spanner appears to raise the bar on globally distributed DBs, in the process making it desirable to a number of application developers.

Which raises the question of whether Google will make Spanner available to 3rd parties... (hope so!)


The difference between NoSQL and RDBMS is getting thinner and thinner with each progressive development. For most businesses Spanner already covers all their needs. Going forward I'm anticipating db management systems to just have a switch, "prefer consistency" or "favour partitioning", with the core functionality being the same.


I find it interesting that several groups are converging on similar decisions. CouchDB, Spanner and Datomic have versioned data, with all versions immutable. All support some manner of offline operations. They do have different ways to resolve conflicts: manually, time-based and explicitly, respectively.
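
As a rough illustration of the shared idea (versioned data where old versions are never modified), here is a toy append-only store with reads "as of" a timestamp. It is only a sketch: the class `VersionedStore` and its methods are hypothetical and do not reflect the API of CouchDB, Spanner, or Datomic.

```python
import bisect
from collections import defaultdict


class VersionedStore:
    """Toy multi-version store: writes append, nothing is overwritten."""

    def __init__(self):
        self._timestamps = defaultdict(list)  # key -> sorted timestamps
        self._values = defaultdict(list)      # key -> values, same order

    def put(self, key, timestamp, value):
        stamps = self._timestamps[key]
        # Require increasing timestamps per key so history stays ordered.
        if stamps and timestamp <= stamps[-1]:
            raise ValueError("timestamps must increase per key")
        stamps.append(timestamp)
        self._values[key].append(value)

    def get(self, key, as_of):
        # Return the newest version written at or before `as_of`.
        stamps = self._timestamps.get(key, [])
        i = bisect.bisect_right(stamps, as_of)
        return self._values[key][i - 1] if i else None


if __name__ == "__main__":
    store = VersionedStore()
    store.put("doc", 1, "draft")
    store.put("doc", 5, "final")
    print(store.get("doc", as_of=3))
    print(store.get("doc", as_of=5))
```

Because earlier versions stay in place, a reader can ask for a consistent view at any past timestamp without blocking writers.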


Clearly Spanner is of interest, as every single NoSQL and DB startup has been blogging and making statements about it.



