I had the pleasure of helping to build and manage these facilities, both hardware and software, for five years. It's nice to see some of Google's real innovations reach the public eye. Some of the smartest folks I ever worked with at the company built absolutely mind-blowing tech that the outside never has the opportunity to see or appreciate.
In fact, while much of the content in the article has been written about before, it's still probably 2-3 years or more behind where Google actually is. I left in 2010 and didn't read about anything I had not experienced.
Reminds me of when MSN Search spent a billion dollars (or whatever was reported) saying they had more pages than Google. Google simply updated the number of pages indexed after Microsoft was done huffing and puffing.
It was pretty funny at the time but the lesson wasn't lost on me: With competition, get ahead, stay ahead, and have things already done and implemented so you can announce big accomplishments when it's strategic for you.
Yes. You can get to the top by competing with others; you stay on top by competing with yourself.
There's a great graph in "Toyota Kata" that shows per-worker productivity of major car companies for the last several decades. They all rise together for the early part of the graph. In the 60s, the American car companies level off; Toyota keeps growing. They focused on continuous improvement, while American car companies floundered.
The really interesting part of this to me is that it's rooted in a philosophical difference. Toyota was started and run by engineers. The American car companies gave birth to the MBA approach to business. Engineers naturally seek improvement; MBAs seek profit.
Google is one of the few major companies with a philosophical background like Toyota's. It's run by nerds. Their goal isn't to increase shareholder value; it's to build great stuff and organize the world's information. Like Toyota, by following their vision, they have generated vast profits and dominated their industry.
One of the details I find highly relevant to software and, particularly, operations is rejecting the mindset of dealing with failure by finding someone to blame, and instead changing the system so that one person can't inadvertently cause a failure. I see this a lot with massive ops runbooks that require humans to repeatedly perform complex tasks without mistakes, rather than automating those tasks and regularly testing the automation.
If you ever get to be around someone who works at a car insurance company, ask them about failure rates for cars. Japanese cars (and Toyota notably) are among the lowest; that is, they are among the most reliable cars in the world by far.
I still wonder how they can manage to build affordable, reliable cars that last for years, while many expensive car makers have absurdly high failure rates.
One of the interesting reasons comes back to accounting.
Toyota's approach focuses on value from the customer perspective. So all defects are seen as waste, and are targeted for elimination.
The MBA approach just looks at P&L. Which is why they concealed the Pinto's tendency to explode; it was cheaper to pay the lawsuits than to fix people's fuel tanks. Never mind that many more people would die without the recall; that wasn't relevant to increasing shareholder value.
Another good example comes at the beginning of Bob Lutz's "Car Guys vs Bean Counters". Lutz, a car lover and an automotive exec for decades, once fixed a problem with transmission manufacturing. The problem was causing a lot of people's cars to die right after the warranty expired. He got yelled at because it blew a hole in their revenue projections; they were looking forward to a lot of highly profitable transmission repairs.
Toyota can make those reliable cars because they see every worker not just as a meat robot, but as a brain that should be engaged in eliminating waste. MBA thinking looks at slow order periods as a time to cut labor costs. Toyota, cognizant of how much they have invested in their workers, looks at them as a time for training, plant improvement, and other value-creating activities.
Even if this is taken as a fact, I don't see how it explains why American car companies level off. It's not like potential profit is bounded but potential improvement isn't.
The long-term value of a company is based on the amount of value they create for customers. The short-term value of the company depends on profit.
So, for example, an MBA can increase profits by cutting R&D. Or by cutting costs in a way that harms product quality. The company will do well for a while because it takes a while for things like reputation and mindshare to decline. You can hide the declines for longer by investing more in promotion.
The engineer-style approach, in contrast, is to focus on cutting waste rather than cost. This is a high art in the Toyota Production System.
I don't know that the leveling off is a necessary consequence of the MBA approach. But it's certainly what I've seen, and it makes some intuitive sense. Given that some ways to increase profit improve productivity and some harm it, it's plausible that all the cheap ways to improve productivity would be exhausted early in the MBA approach.
I also think the MBA approach can lead you into a local maximum that's pretty screwed up.
Thinking further, another factor may be that MBA thinking tends to be focused on external competition, while engineers tend to optimize regardless of competition. So the American car companies could have leveled off because their major competitors were doing equally well. By the time Toyota was an obvious threat, they were too far behind to even understand how they were being beaten.
Google is another good example. They didn't seek out data center "best practices". They radically bested the competition, proceeding one step at a time, with careful attention to what they needed. In an MBA analysis, that would be seen as spending a lot of money on risky R&D with no obvious ROI. In fact, they'd want to re-task all those expensive engineer brains to something more directly related to revenue. And probably drop the quality of the ops staff as a money-saving measure.
It's only when the years of patient engineer-style optimization add up to an insurmountable lead that it looks good in a B-school spreadsheet.
> In fact, while much of the content in the article has been written about before, it's still probably 2-3 years or more behind where Google actually is. I left in 2010 and didn't read about anything I had not experienced.
What if they just plateaued and didn't really go beyond what you had done when you were there, and this is totally accurate to today?
I can believe this in a heartbeat. I know that if Platforms and datacenter/cluster management innovation stopped, I'd see a mass exodus of my Googler friends (as well as a very noticeable change in Google's products).
There should be a guide of hacks for viewing articles on a single page. Yeah, Wired is easy, there's a link, but oftentimes it calls for a little PHP knowledge, or viewing in "print" mode. Or something random. Not always completely obvious, but a list of tricks to try might be handy...
It's a shame that heat is just dumped outside most of the time.
EDIT:
The article talks about Google's impressive technical achievements. But there's a lot of energy that's wasted in industry. I don't mean "used inefficiently" (although that's bad too); I mean actually wasted.
I used to work at a tiny electronic sub-contracting factory. The morning shift would arrive, turn on the air compressor (2 kW), the reflow ovens (10 kW and 12 kW), and the other machines (about 7 kW).
But they'd do that even if the machines were not going to be running. All those kilowatts were being used for no reason at all. And the machines are pretty inefficient anyway. (One of the owners thought powered machines looked more impressive. Energy costs were included in the rent, so there was no incentive to think about when the machines were on or off.)
Counting that waste across all the tiny factories in the world, and including all the waste in offices - it's quite a lot.
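For a rough sense of scale, here's a quick back-of-the-envelope estimate in Python, using the wattages above plus assumed idle hours and working days (my numbers, not the original poster's):

    # Order-of-magnitude estimate for one small factory, using the wattages
    # quoted above plus assumed idle time (4 idle hours per 8-hour shift,
    # 250 working days per year -- both assumptions).
    idle_kw = 2 + 10 + 12 + 7        # compressor + two reflow ovens + other machines
    idle_hours_per_day = 4           # assumption
    working_days_per_year = 250      # assumption
    kwh_wasted_per_year = idle_kw * idle_hours_per_day * working_days_per_year
    print(kwh_wasted_per_year)       # 31,000 kWh/year -- and that's a single tiny factory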
Low-grade heat (< 400 °C) is really difficult to do much with. If you happen to need it right at the spot where it is generated (basically heating buildings), great.
Otherwise you are pretty much out of luck. The efficiency of energy extraction from a heat engine is thermodynamically limited by the difference between the hot and cold sides. And trying to transport it any significant distance ends up being more trouble than it's worth, as pumping water gets energy intensive very quickly (and air has terrible heat capacity).
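To put rough numbers on that thermodynamic limit, here is a minimal Carnot-efficiency sketch in Python, with assumed temperatures (not figures from this thread):

    # Carnot limit: the best possible efficiency of any heat engine depends
    # only on the hot- and cold-side temperatures (in kelvin).
    def carnot_efficiency(t_hot_c, t_cold_c):
        t_hot = t_hot_c + 273.15
        t_cold = t_cold_c + 273.15
        return 1 - t_cold / t_hot

    # Assumed, illustrative temperatures:
    print(carnot_efficiency(400, 25))  # ~0.56 upper bound; real engines do far worse
    print(carnot_efficiency(45, 25))   # ~0.06 -- typical server exhaust is nearly useless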
It is a shame. However, it is surely much easier to reduce wasted heat by simply buying more modern, more efficient CPUs that will probably be more powerful.
Few office buildings require heat, even in cold-weather climates. The residual heat from lighting, office equipment, bodies, etc., generally has to be removed, not augmented.
Not always the case, but I can assure you that in California, office heating demands are very, very low.
I heard secondhand that this is even the case in Chicago in the winter. (In my apartment in a highrise there, I never ran my heat. The building was naturally 75 degrees. On some days, I even turned on the AC to get it down to a more comfortable 72!)
To me, broadly speaking, inefficiency in this context is a property of the core infrastructure: if you run an engine that is 20% efficient at converting gasoline to electrical power, then that is what you are stuck with until you replace or upgrade the equipment.
Waste is operating any equipment when not needed, regardless of its internal efficiency.
Another example: Using an incandescent light bulb instead of an LED is inefficient - but leaving either on when not needed is wasteful.
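Putting assumed wattages on that distinction (a sketch, not data from the thread):

    # Inefficiency: the extra power the worse device draws while doing useful work.
    # Waste: any power drawn when no work is needed at all.
    incandescent_w, led_w = 60, 9
    hours_needed, hours_left_on_unneeded = 4, 8    # both assumptions

    inefficiency_wh = (incandescent_w - led_w) * hours_needed   # 204 Wh: cost of the worse bulb
    waste_wh = led_w * hours_left_on_unneeded                   # 72 Wh: lost even with the better bulb
    print(inefficiency_wh, waste_wh)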
Understood, which is why Google, if it's going to all the effort of setting up open-door press days, is missing a huge opportunity by not employing highly visual metaphors and exhibits to demonstrate how truly amazing a modern data center is, not to mention reinforcing how critical it is for so much of what the average user does every day.
They should probably contract a couple of Disney Imagineers and do it right. They could benefit from the humanizing effects and even create a quirky, niche destination in the process. Hell, throw in an "Android Experience" showroom and it'd probably even have legitimate commercial value.
Just think how much wasted effort and embarrassment you could have saved Pixar, Disney/Disneyworld, Hollywood, Shakespeare, and J.K. Rowling if you'd been there to point this out to them.
I wonder why they've mirrored the image (the left side is quite clearly the right side flipped--take a look at the machine identifier labels). What's being hidden?
The blue LEDs in the picture you linked to indicate that the servers are running smoothly. [1] It's possible that some servers were faulty at the time the picture was taken. More likely, it's to make the image look perfect.
Hi Google Platform people. Very nice work. As you may know, Randall Munroe (of xkcd fame) has recently started a feature called "What If" on his site. I would like to pose a question to you along those lines:
What if Google was tasked with building an orbiting datacenter? How about a Dyson ring, or sphere? How would you do it?
If we were to use all matter in the solar system for commodity linux hardware, how much gmail storage would I get? How many flops? And what sorts of computation could you do on this monster?
Space would be a terrible environment for building a datacenter. The main goal of a datacenter is to make computation as cheap, as fast, and as reliable as possible. Having the datacenter orbit the Earth would not help us accomplish any of these goals.
First off, building a datacenter in space would not be cheap. It costs around $25,000 to send a kilogram of equipment into a geostationary orbit. [1] So let's assume we were to use a Dell PowerEdge C1100. Each server costs $14,000 and weighs 18 kg. [2] This means for each server sent into orbit, you could buy 32 extra ones on Earth.
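A quick check of that arithmetic, using only the figures quoted above:

    # Back-of-the-envelope launch economics.
    launch_cost_per_kg = 25_000     # USD per kg to geostationary orbit
    server_cost = 14_000            # USD, Dell PowerEdge C1100
    server_mass_kg = 18

    launch_cost = launch_cost_per_kg * server_mass_kg    # $450,000 just to lift one server
    extra_servers = launch_cost / server_cost            # ~32 servers for the same money
    print(launch_cost, round(extra_servers))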
Then, there is the issue of cooling. Although outer space is really cold, its vacuum prevents the heat generated by the machines from being dissipated quickly. Controlling the temperature of such a datacenter would be a very interesting engineering challenge.
And then how would you power this datacenter? Converting the excess heat back into electricity could be an interesting option. But most likely, it would need a lot of solar panels. This would make the datacenter cheap to run once built, but the upfront costs would be enormous.
And we haven't talked about speed and reliability yet. Since the signal would need to travel about 35,000 km from geostationary orbit to reach us, communications between Earth and the datacenter would have significant delays. Even at the speed of light, the minimum round-trip time would be about 250 milliseconds if we ignore all other possible sources of delay.
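That round-trip figure is easy to sanity-check (the exact geostationary altitude of 35,786 km is assumed; the comment above just says "about 35,000 km"):

    # Minimum round-trip light delay to geostationary orbit, ignoring routing,
    # queuing, and processing entirely.
    altitude_km = 35_786            # assumed geostationary altitude
    c_km_per_s = 299_792            # speed of light in vacuum
    round_trip_ms = 2 * altitude_km / c_km_per_s * 1000
    print(round(round_trip_ms))     # ~239 ms, before any other source of delay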
The hostile space weather would also make it pretty hard to run servers reliably. Radiation would destroy electronics, cause bits to flip randomly, and do all kinds of fun stuff to the equipment.
But... anyhow! Let's assume anyway that by some magical work of science and Google engineering, we figure out ways to manufacture a datacenter directly in space for almost nothing by mining the Moon, discover some amazing thermoelectric generators with near-100% efficiency, and build space shields that block almost all radiation.
So, back to our previous example: a high-performance PowerEdge gives us up to about 300 GFLOPS of computing power, 192 GB of RAM, and 12 TB of storage.
Now if we were to convert the total mass of the Moon (7.34767309 × 10²² kg) into one monstrous datacenter, this would give us about 4.0 × 10²¹ servers. That's a whopping 1.2 billion yottaFLOPS (or, put differently, 1.2 × 10³³ FLOPS) of compute madness, 0.8 billion yottabytes of RAM, and 49 billion yottabytes of storage. This monster would consume the equivalent of about 1% of the Sun's total power output.
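Those numbers reproduce from the per-server specs above; the only figure I had to assume is power draw per server (roughly 1 kW each, which is what makes the "1% of the Sun" line work out):

    # Reproducing the moon-datacenter estimate. Power per server is an
    # assumption (~1 kW each); everything else comes from the figures above.
    moon_mass_kg   = 7.34767309e22
    server_mass_kg = 18
    gflops, ram_gb, storage_tb = 300, 192, 12

    servers = moon_mass_kg / server_mass_kg           # ~4.1e21 servers
    flops   = servers * gflops * 1e9                  # ~1.2e33 FLOPS (1.2 billion yottaFLOPS)
    ram_yb  = servers * ram_gb  * 1e9  / 1e24         # ~0.8e9 yottabytes of RAM
    disk_yb = servers * storage_tb * 1e12 / 1e24      # ~49e9 yottabytes of storage

    watts_per_server = 1_000                          # assumption, not from the thread
    sun_output_w     = 3.8e26
    print(servers * watts_per_server / sun_output_w)  # ~0.011, i.e. about 1% of the Sun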
Thanks for playing along! But realize that one of the great reasons to put a datacenter in space is physical security. Another reason would be unparalleled data connectivity to the entire planet. But yes, it is a very harsh environment, and waste heat is difficult to dissipate. And of course launch costs are very high. The real reason I asked is because I think it's bloody good fun (and figured the Google folks would get a kick out of it).
Some follow-up questions: let's assume that we need to move 10^10 yottabytes from the MoonPC to the Earth. How do we do it? What's the fastest we could do it without transferring so much heat that it melts either end of the connection?
A heatsink works because there is some sort of medium that absorbs heat from the sink and carries it away. On Earth, we use air for this, sometimes with the help of a fan.
If you stick a heatsink on equipment in space, there's no air to move the heat away, since space is mostly empty. You'll bleed off some through infrared radiation, but that's not going to be enough.
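A rough Stefan-Boltzmann estimate shows why: radiating into vacuum moves surprisingly little heat per unit area. The panel size, emissivity, and temperature below are assumptions, and incoming solar radiation is ignored:

    # How much heat can a radiator panel dump in vacuum by radiation alone?
    SIGMA = 5.67e-8                               # Stefan-Boltzmann constant, W / (m^2 K^4)
    area_m2, emissivity, temp_k = 1.0, 0.9, 330   # assumed: 1 m^2 panel at ~57 C
    watts = emissivity * SIGMA * area_m2 * temp_k**4
    print(round(watts))                           # ~600 W -- so a 1 MW datacenter would need
                                                  # on the order of 1,700 m^2 of radiator area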
It's pretty amusing that the corporate firewall is making your system far less secure. There have most likely been hundreds of security issues fixed since Chrome 15.
Google should one-up Amazon and get into the Datacenter As A Service market. Service segments: normal cages (I'd rather lease cages from Colorful Pipes, Inc than Equinix), pay-n-go turnkey same-hardware in 3 georedundant locations, and lease-by-rack in multiples of 10 pre-populated racks (racks specified as compute-only or storage-only with 10G interconnects between racks).
It's doubtful that they could directly compete as is. Amazon has done well with their services because they eat their own dog food. From everything I read Bezos basically forced them to build this system and consume it for Amazon's own needs. Google has never taken this approach with their APIs and the difference shows very clearly when you consume these products.
I assume you're referencing stevey's rant, and if so you're conflating two issues. Steve was talking about the use of APIs between services at the application level, not the API for the datacenter/cloud as a platform.
I think cr is talking about how you get "baby bigtable" and "baby cloudscale" that are copies of internal services, but not what core platforms use themselves.
EC2 is relevant today, but it will become less and less relevant as the Platform-as-a-Service offerings (S3, SimpleDB, App Engine, Heroku, etc.) get better. There will come a point where fewer and fewer companies actually use VMs directly. Seems like a reasonable play for Google to sit this round out and focus on winning the next one.
How did Google do this time? Pretty well. Despite the outages in the corporate network, executive chair Eric Schmidt was able to run a scheduled global all-hands meeting. The imaginary demonstrators were placated by imaginary pizza.
How does one decide what will placate imaginary demonstrators? Who calls them off?
For the purposes of tests like that, they probably just wanted to see that "reasonable action was taken", which will (hand-waving) probably take care of most instances of that type. In the event of real demonstrators, it would just be the opening salvo of damage control, but it's too hard to predict how a crowd of angry people would react past the first move.
I'm starting to get annoyed with the "a power efficiency of 2 is the standard in datacenters" line. My servers are hosted in a datacenter with a global efficiency of 1.15, proven over more than a year in operation. Announcing that Google is doing 1.2 is simply announcing something wrong, and I suppose Google is very happy with this number being provided to the press. It means that some competitor will use it as "Google is the best, they do 1.2, we are at 1.3, we are not too bad", whereas I bet Google is now near 1.1 or less (they operate without cooling in Belgium, for example).
You're right -- those PUE numbers from the article were talking about their PUE at the time. Google's 2012 average PUE across all facilities was 1.12/1.13 with a minimum PUE of 1.09/1.10.
Also, Google puts enormous care into the process of calculating PUE, since it's something of a black art and if you aren't careful you'll leave out some aspect of your operation that will mislead you into thinking your PUE is lower than it is.
PUE 2 probably was the standard when Google started building their DCs. There's a wide range between enterprisey DCs that consider going from 2 to 1.8 an epic win and clouds/hosters who are getting below 1.3. Google's PUE data since 2008 is published at http://www.google.com/about/datacenters/efficiency/internal/...
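For anyone unfamiliar with the metric, PUE is just total facility power divided by the power that reaches the IT equipment; the numbers below are illustrative, not from the thread:

    # PUE (power usage effectiveness) = total facility power / IT equipment power.
    total_facility_mw = 10.0     # assumed: everything the site draws from the grid
    it_equipment_mw   = 8.0      # assumed: what actually reaches servers/network/storage
    pue = total_facility_mw / it_equipment_mw
    print(pue)                   # 1.25 -- 0.25 W of cooling/distribution overhead per watt of compute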
It is unfortunate (for the rest of us) that datacenter tech is such a competitive advantage for Google. If they were able to share their breakthroughs more readily with others, imagine how much less than the current "1.5% of all power globally" datacenters could be using.
Officially announcing things that "everybody knows" already can still make a difference. It means that you can ask Google executives about those things in public appearances and they'll at least acknowledge the question even if they refuse to answer substantially.