Gmail is having an outage (outage.report)
316 points by gdeglin on March 13, 2019 | 138 comments



It isn't just Gmail: Google Music, Google Play, Google Drive, and others are also impacted. Seems to be a large-scale Google outage of some kind.


Maybe unrelated but a large portion of the Google Cloud web Console was down yesterday for hours.


Google photos is absolutely struggling to load my jpegs right now, though I'm not sure how separate that service is from Drive, if at all.


I would bet they're both using the same storage layer (Google Cloud Storage?)


I have also had problems with the Firebase virtual device testing service and the Google Play Console API since yesterday. I keep getting 500 responses; I don't know if it's related, though.


Add YouTube to the list.


YouTube seems to be in this amazing degraded state for me where videos play just fine, but it isn't showing me ads.


That state is called “using ublock”.


Hah I had the opposite. All my ads were playing fine but not the actual videos.


I was in the completely messed-up scenario. YouTube loads ads fine, but after playing them, it was not able to load the actual content.


Youtube was giving me 500s in ways that have never occurred before.


The root of this is definitely GCS. We noticed some keys in a GCS bucket go inaccessible since like 19:15-19:20ish PDT, and then we noticed increased timeouts/503's ramping up at 19:30 and plateauing at 19:45. But error rate for us seems to have gone back to nominal levels since 20:10... but the keys that originally went unavailable are still returning 503's.

Fun times.


GCS might be relying on the same system as Gmail/YouTube does, so it’s not necessarily GCS itself. Spanner?


If Google's GCP marketing material is to be believed, you are truly on their infrastructure on GCP, so if GCP is having issues so are Google's main services.


It will be very interesting to see what timing issues might be involved if this is indeed due to Spanner.


Blobstore


Fitting that something like this happens when we celebrate 30 years of the decentralized world wide web!


Only (some) Google services are affected. My email still works, as it's hosted elsewhere.


Email is working for me, but attachments are failing. Opening up the console and it has a message about a database connection loss.


Living for the postmortem on this one.


As far as I know, there wasn't a public postmortem for the Oct 16, 2018 YouTube outage. Based on that, I'm not expecting one for this outage.


YouTube ain't a G Suite service with an SLA. So this is much more likely to have a public report. Whether it's detailed or not is another thing. Also, my guess is that it's due to an underlying GAE or GCS issue.


"Red" incidents on the GCP status dashboard, like this one, will typically have public incident reports.

Disclaimer: I used to work in GCP Support, but no longer do.


Sorry, Bill tripped on the server cable


lol. email outage is the new dog ate my homework


Of course this would happen while I'm trying to submit my thesis at the last minute. https://twitter.com/mholt6/status/1105703745143205889


Good thing https://send.firefox.com/ was launched today :)


Except I'd still have to send the link in our email thread!


I found that emails seem to work, but attachments don't. So an email with a send link might be exactly the thing. Good luck!


Maybe text the link?


I don't have their numbers :(


Time to register for a competitor email service that doesn't use google infrastructure. Live.com, maybe?


Google seems to acknowledge the reports now. https://www.google.com/appsstatus#hl=en&v=status


https://status.cloud.google.com/

https://status.cloud.google.com/incident/appengine/19007

" Elevated error rate with Google App Engine Blobstore API and App Engine Version Deployment

Incident began at 2019-03-12 19:49 (all times are US/Pacific).

Mitigation work is currently underway by our Engineering Team. We will provide another status update by Tuesday, 2019-03-12 20:45 US/Pacific with current details. "


Status page is down for me.


This is one of those poorly designed setups where the status page is hosted on the very same infrastructure that is suffering the outage. I have seen this pattern across cloud providers.


That page is not very color blind friendly, incidentally. The information is distinguished only by color.


Hit that feedback button in the bottom corner.


You're kidding, right? Sending feedback to Google? I mean isn't that a bit like talking to a tortoise?


In general, maybe, but they take accessibility concerns very seriously in my experience.


Aren’t the icons all different?


No, they're all circles but in different colors.

EDIT: you might be looking at the Google Cloud status, not the Google services status in the first post.


Not for me - 'available' is a checkmark, 'service disruption' is an exclamation point, and 'unavailable' is an x.

(I notice though that they're images, not text icons, which isn't ideal.)


You're talking about the Google Cloud status page which is in the 2nd post: https://status.cloud.google.com/

This thread is about Google Services linked in the top post which only shows circles: https://www.google.com/appsstatus#hl=en&v=status


Weird. Clearly Google is serving up different versions to different people.

To me they're 3 different colors of circles, no symbols whatsoever.


It's more likely that your browser just isn't loading the CSS background images for some reason. If you inspect the circles, do you still see the background-image declaration?



You might be talking about the page in the first reply? Not in the post that someone was responding to?

I see all mid-toned circles with no icons (Chrome, Australia).


[flagged]


I mean, playing board games is kind of a bitch sometimes as well. I was trying to play Ticket to Ride on Monday and that was exciting.

Good thing I chose this in the same way people choose not to eat certain foods, right?


Propose solutions.


Honestly, it's not hard. Just don't design everything around color. In Ticket to Ride, the grey lines have a dot or shape in the middle whereas the black ones do not. I can see that dot, but it's small. If it were larger, the problem would be solved.


?!? because people choose to be colorblind? I was not aware of this!


there's no such thing as free will so choice is a red herring anyway. You are what you are, whether you 'chose' to be is pointless to talk about.


Cute way of getting out of having to own the fact that you’re an asshole.


Please elaborate.


The orange dots are much darker and the green ones are much lighter.

I'd assume anyone who is colorblind can still distinguish between the two based on brightness and not color?

Is there any reason to think that's not the case?
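
One rough way to check the brightness question is to compare the relative luminance of the two dot colors, using the sRGB formula from the WCAG contrast definition. This is only a sketch: the RGB values below are placeholders, not the real colors on the status page (you would have to sample those from the images yourself), and relative luminance models typical vision, not any specific type of color vision deficiency.

```python
def _linearize(channel_8bit):
    """Convert one 8-bit sRGB channel (0-255) to linear light (0.0-1.0)."""
    c = channel_8bit / 255.0
    if c <= 0.04045:
        return c / 12.92
    return ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb):
    """Relative luminance of an (R, G, B) tuple: 0.0 is black, about 1.0 is white."""
    r, g, b = (_linearize(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(rgb_a, rgb_b):
    """WCAG-style contrast ratio between two colors, from 1.0 (same) up to 21.0."""
    la = relative_luminance(rgb_a)
    lb = relative_luminance(rgb_b)
    lighter, darker = max(la, lb), min(la, lb)
    return (lighter + 0.05) / (darker + 0.05)


if __name__ == "__main__":
    # Placeholder colors (assumptions, NOT sampled from the Google status page).
    orange_dot = (255, 153, 0)
    green_dot = (0, 200, 83)
    print("orange luminance:", relative_luminance(orange_dot))
    print("green luminance:", relative_luminance(green_dot))
    print("contrast ratio:", contrast_ratio(orange_dot, green_dot))
```

A contrast ratio close to 1.0 would mean the two dots differ mostly in hue rather than brightness, in which case brightness alone would not be a reliable cue.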


Heck, I'm not colorblind and I had trouble seeing the difference between "service disruption" and "service outage" at first in the legend at the bottom of the page.

It wasn't until I zoomed in on them that I could see that one was orange and the other red. Once I saw them zoomed, I could then identify which was which at normal size on the status part of the page.

BTW, the orange circle is actually a span whose class is "aad-yellow-circle", and whose CSS loads the colored circle from the file yellow_circle.png.

This suggests that at one time they intended it to be a yellow circle, not an orange circle [1]. I wonder why they switched from yellow to orange?

[1] Actually, RGB to name sites suggest that it is neon carrot.


If the user had e.g. red/green colorblindness, that wouldn't help. Google's made a nice tradeoff for this application, though, and used differently-shaped icons (checkmark vs. exclamation point) as well.

Edit: looks like the icons are served as images. Google should probably consider making them text icons instead to mitigate loading problems.

FYI, Toptal makes a helpful tool for quickly checking live pages with colorblindness filters: https://www.toptal.com/designers/colorfilter/


You're looking at the wrong page. This thread is about https://www.google.com/appsstatus#hl=en&v=status


Ahhh, got it now, thanks!


Mercury, god of communication, retrograde is kicking it up a notch this year.


Google forgot to charge their energy crystals?


I'm getting a ton of 503 Service Unavailable errors from the Google Drive API. I can't wait to see what caused such a large outage.


Seems to me like a location-specific breakdown, as all the services I tried (Gmail, Photos, and YouTube) are working for me.


I thought outages weren't supposed to be posted here? Every time I've posted one about Facebook, it's been flagged and removed.


I was trying to send an email and kept getting "Oops there was a problem" errors, but it appears to be working now.


First time this has happened to me. Historic moment. Might be time to get less completely reliant on google?


AND we are back


May I ask - why the downvote? I know it was a blip, but still, a reminder for a moment that most of life relies on them. And some of my data is not replicated elsewhere... that is changing.


I personally downvoted because of how a single incident is being extrapolated into a new trend by you, and that feels unhelpful/fear mongering.


Definitely having problems in Melbourne, Australia


In Adelaide also. Things I'm experiencing:

  can't reliably download attachments in Gmail
  messages not getting through in Hangouts
  constant "Oops" auto-save modals when typing Gmail messages


Brisbane too - Google Photos would sync about the time you wrote this comment.


GMail's working fine for me (8min later) in Melbourne, Australia.


Happy 30th birthday, Internet. Enjoy your consolidated cloud infrastructure.


Google Photos too. Unable to download images :(


Google Maps (specifically Street View) is showing up as a black screen for me as well.


Can confirm wrt Drive. I was just at work trying to upload files (and one-deep folders) and barely anything completed. I finally gave up.

Our edge/gateway is in metro SEA, fwiw.


If anyone is interested: Memorystore for Redis is experiencing intermittent failures during instance creation. Existing instances are not affected.


I had trouble replying to an email but it worked once I removed the images inside the quoted signature. Makes me think this is file-related.


Noticed this a few minutes ago. It’s pretty rare for Google to have an outage, don’t they have massive replication and failover?


Youtube was down for a few hours late last year.


YouTube seems down too:

https://outage.report/youtube


Down in Calgary still (10:24pm MST). Was going to do work but not confident my email drafts will save.


Is having to load two tabs of gmail to get my mail to actually appear also a type of outage?


So this is why we couldn't send emails last night (9pm EST). Watched YouTube instead :)


Probably caused by a typo in the configuration for the new region they just launched ;)


Attachments are definitely down for me. Can't view received images.


In CA.

Colaboratory Research is experiencing issues loading and saving notebooks in Google Drive.

Sad :(.


Can't attach files to drafts nor send emails from Philippines


Works fine in Illinois.


Down in Singapore


Fine here in CA.


I'm in Mountain View, receiving/sending email works but downloading full size attachments does not for me personally.


I just emailed some folks images. Got a few warnings of something going wrong, so I then checked in my outbox. It showed that the images went, but there was a message that the "virus scanners were down, download at your own risk." Or some such. Seems to be back to working now, so I just assumed it was a time delay.

Edit: Just saw it again. Full message is: Gmail virus scanners are temporarily unavailable – The attached files haven't been scanned for viruses. Download these files at your own risk.

First time I've seen something like this, actually kind of glad they had this failure case coded for. :)


I've been seeing that for the last couple of weeks, off and on.


My guess is there is always a time delay of some sort on files. The transient "something went wrong" errors were surprising.


That error happens all the time on files it can't handle; .tgz files come to mind.


Also Mountain View, but had issues sending/receiving as well. I'm on Comcast, you?


I am using both a T-Mobile hotspot and Sonic.net DSL.


And in Montreal, Canada too


All Ok in Sydney


Half of my colleagues had issues in Sydney. OK in Singapore apart from images attaching to docs in drive


Having issues in Sydney :)


same - I'm in Sydney - no issues (so far).

Perhaps it's user based, rather than location based?


Seems to be. I'm on the same network as others not having any issues.


Happy that I was sleeping at that time.


Having problems in Budapest, Hungary.


What's the tech scene in Budapest like?


He told you already, Hungry.


it's funny, not sure why you are downvoted


Ah gmail. Where spam goes to die.


No issues here in the Midwest.


Same, at least not yet. Chicago area here. It does seem slower than usual, but might just be my imagination.


Fine here in Kerala, India.


Street view is iffy as well


Is Gmail on Kubernetes?


Google is not using Kubernetes internally but its predecessor, "Borg". Kubernetes was built from scratch around the same ideas, but it is not used internally.

https://www.quora.com/How-are-Borg-and-Kubernetes-different


The general sentiment seems to be that Kubernetes is better than Borg in almost all ways.

The only thing keeping Google on Borg is the massive amount of work to migrate.


I think it may be a touch beyond Kubernetes.


Working fine in Austin TX


drive, maps, photos all having issues as well


My bad, guys. I'm running out of storage space on my Google account and I think I might have pushed it over.


On Gmail or on Drive? :-)


Would you believe I have over 10GB of old emails sitting around?


Of course :-)


Isn't it shared between services?


Well that's why I asked which one, otherwise it would've obviously been Gmail...


[flagged]


Why the negativity? Mistakes happen.


Because it's largely unnecessary. If people were more willing to push for decentralized alternatives then world-wide outages would be a thing of the past, short of the entire internet itself going down, in which case you probably have bigger problems to worry about than loss of access to [x].

In my opinion it's not a question of if, but when we'll have our first disgruntled higher level employee manage to effectively destroy massive amounts of information and redundancies. As people become more and more dependent upon centralized services, the impact of this first 'digital nuke' will only continue to grow.

In some ways the internet feels like we have created a machine that can take us anywhere in the universe, but we've decided that instead we'll use it to take us strictly between New York and LA - occasionally with some 'wild' idea to take us to London.


Maybe think of it more as an opportunity to remember how we've let ourselves be lulled by this behemoth. Google itself is pretty much a spof for most of the world. If Google is down, so is business.

Of course, the report afterwards will be very interesting, while we pick apart the reasons on what and how it failed.


Probably just a Google employee waking up and becoming sentient again, running around pulling out as many cables as possible to save humanity. It should be up and running again when they catch and reset him.


There are enough email alternatives.


I am not affected by this outage but I get a bit angry when I think about how many people are. In a better world it would affect much fewer people.


That's pretty fundamentally stupid.


Why?


Elizabeth Warren is winning :(


SHE'S HACKED THE SYSTEM



