> including unfortunately the tooling we usually use to communicate across the c...

boulos · on June 2, 2019

Edit: and I agree!

I’m not in SRE so I don’t bother with all the backup modes (direct IRC channel, phone lines, “pagers” with backup numbers). I don’t think the networking SRE folks are as impacted in their direct communication, but they are (obviously) not able to get the word out as easily.

Still, it seems reasonable to me that, for most outages, we use tooling that relies on “the network is fine overall”, to optimize for the common case.

Note: the status dashboard now correctly highlights (Edit: with a banner at the top) that multiple things are impacted because Networking. The Networking outage is the root cause.

marksomnian · on June 2, 2019

> the status dashboard now correctly highlights that multiple things are impacted because Networking.

This column of green checkmarks begs to differ: https://i.imgur.com/2TPD9e9.png

pm90 · on June 2, 2019

This is a person who's trying to help out while on vacation...can we try being more thankful, and not nitpick everything they say?

boulos · on June 2, 2019

Thanks! I’ll leave this here as evidence that I should rightfully reduce my days off by 1 :).

boulos · on June 2, 2019

The banner at the top. Sorry if that wasn’t clear.

seltzered_ · on June 2, 2019

While not exactly Google Cloud, the G Suite dashboard seems accurate: https://www.google.com/appsstatus#hl=en&v=status

TimothyBJacobs · on June 2, 2019

For me, at least, that was showing as all green for at least 30 minutes.

Twirrim · on June 2, 2019

AWS experienced a major outage a few years ago that couldn't be communicated to customers because it took out all the components needed to update the status board. One of those obvious-in-hindsight situations.

Not long after that incident, they migrated it to something that couldn't be affected by any outage. I imagine Google will probably do the same thing after this :)

flurdy · on June 3, 2019

The status page is the kind of thing you expect to be hosted on a competitor network. It is not dogfooding but it is sensible.

Reminds me of when I was working with a telecoms company. It was a large multinational company and the second largest network in the country I was in at the time.

I was surprised when I noticed all the senior execs were carrying two phones, the second of which had a mobile number on the main competitor (i.e. the largest network). After a while, I realised that it made sense: when the shit really hit the fan, they could still be reached even if our network had a total outage.

techslave · on June 3, 2019

> Not long after that incident, they migrated it to something that couldn't be affected by any outage.

Like the black box on an airplane, if it has 100% uptime why don’t they build the whole thing out of that? ;)

captn3m0 · on June 2, 2019

Was just reading it, they made their status page multi-region.

k_bx · on June 2, 2019

Even more irony: Google+ is shown as working fine: https://i.imgur.com/52ACuiY.png

chupasaurus · on June 2, 2019

G+ is alive and well for G Suite subscribers, just not for general users.