Google Services Experiencing Disruptions (google.com)
313 points by coderintherye on Sept 25, 2020 | 235 comments




Why is the Google Status dashboard slower than HackerNews at reporting the downtime?

To get ahead of the cynics — it would not serve the least generous of Google's objectives to be less than transparent about downtime — people figuring it out while the dashboard is green looks much worse.


Honestly, incident management is hard. It is a different skill set to straight up SRE management, and all the training in the world won't get you ready. Believe me it's true in AWS too, true in a lot of places.

It's complicated, and the human factor is massive in an unpredicted scenario.

I'm an IM myself, and I regularly have to make the call regarding status updates. I've also seen how other companies do it.

One of my favourites was a big IT infra company sending a notification about an outage 6 hours after the event, after we had opened and resolved a ticket with them. It was actually pretty useful, but it made me laugh all the same.


The time you mark the dashboard as “red” is when the clock starts ticking against your SLA and error budget.


That makes no sense. The dashboard is for current best-effort knowledge. SLA and budget are for actual outages.


Not true.

A lot of customers judge the severity of an issue based on the service dashboard. That's why the dashboard makes everything look rosier than reality.

Every time it says "some customers may be experiencing elevated latency", it usually means "all customers are seeing all requests timeout, and the only ones who aren't are the ones not using the service right now".

Yes seriously - when they say "<5% of users are having trouble accessing Gmail" they are calculating that based on the percentage of all Gmail accounts which have seen an error. Obviously the vast majority of accounts are inactive at any one time, so aren't seeing anything...
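
To make the denominator effect concrete, here is a minimal sketch. Every number in it is made up for illustration, and nothing here reflects how Google actually computes its figures; it only shows how the same error count reads very differently depending on whether you divide by all accounts or by the accounts active during the incident.

```python
def affected_share(accounts_with_errors: int, denominator: int) -> float:
    """Return the percentage of `denominator` accounts that saw an error."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return 100.0 * accounts_with_errors / denominator


# Hypothetical numbers, chosen only to show the effect.
TOTAL_ACCOUNTS = 1_000_000      # every account that exists
ACTIVE_ACCOUNTS = 50_000        # accounts making requests during the incident
ACCOUNTS_WITH_ERRORS = 40_000   # accounts that saw at least one error

share_of_all = affected_share(ACCOUNTS_WITH_ERRORS, TOTAL_ACCOUNTS)
share_of_active = affected_share(ACCOUNTS_WITH_ERRORS, ACTIVE_ACCOUNTS)

print(f"Share of all accounts:    {share_of_all:.1f}%")
print(f"Share of active accounts: {share_of_active:.1f}%")
```

With these invented inputs the first figure comes out under 5% while the second is 80%, which is the gap the comment above is describing.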


Properly investigating takes time. There are constant reports of something not working by customers so they need to confirm that it actually is internal and widespread first.

Also SLAs and service credits are tied to these official notices which causes even more delay before status updates are approved.


When something this broadly painful happens at a company of Google's scale, you typically have something like a hundred people internally who all got paged.


There's no reason for service credits to be tied to the dashboard and not to determinations made after the investigation is over.


The SLA guarantees a certain level of no problems. The dashboard officially says there's a problem. This creates liability.


They should have a proactive monitoring system that updates the status page automatically when it detects a disruption. Especially when you account for how extremely hard it is to get human support from Google, because they automated it.


I wouldn't be surprised if, at Google's scale, something or other is disrupted all the time.

I think I'd prefer a status page that reflected that, or even no status page at all with an explanation, rather than a lying one, though.


Google has multiple internal alerting systems and people are aware of any disruption/delay/anomaly from the minute it happens.

The on-caller for the specific service decides how to proceed.

The status dashboard is something which will be (manually, yes by hand) updated by an operations employee, who is a couple of layers behind the actual SWE/SRE who gets the page.


They are just doing a dry run for when they finally close shop on Google Cloud...


They updated a few min after you posted: https://status.cloud.google.com/incident/zall/20010


One of our GKE clusters has been stuck "upgrading" for 12 hours. Totally stalled, no way to cancel, and auto-scaling is non-functional. Support seems pretty stumped by the issue and they haven't updated us since about an hour after I reported, so I wonder if this is part of a bigger issue?


Doubt it, you're probably just unlucky and have a cursed instance that is misconfigured in a very subtle way. Anecdote: I had an AWS instance that was stuck 'terminating' for 10 days in 2017 and finally got killed. Maybe the hypervisor finally timed out, or an SRE saw the anomaly.


Just guessing here, but I imagine that updates for service-wide statuses which are used on the scale that Google operates require manual vetting.

It is frustrating that these pages are not updated in real-time, but I do understand wanting to be sure before publishing a message to your entire userbase that you are experiencing disruptions.


> Just guessing here, but I imagine that updates for service-wide statuses which are used on the scale that Google operates require manual vetting.

Gee, you mean we just discovered that area where Google admits that either automation or machine learning does not actually work?!


It's usually less to do with whether automation works and more to do with having an internal party to hold responsible, when it comes to these things.

Google doesn't need an internal party to hold responsible for the concerns of free mail users or their own contractors who are called "content creators." In part because Gmail users and YouTubers don't lead Fortune 50 companies and play golf with Google execs like Google Cloud customers do.


What good is a dashboard if it doesn’t show actual status?


What does 'actual status' mean though? If I'm running a website from a single server, the power goes out and the server goes down then it's kinda obvious. I don't think it's so obvious when you have hundreds of data centers serving traffic to billions of people. If one DC becomes inaccessible does that count as an outage? It'll sure feel like it to the affected users, but the vast majority can still use your service.

You need to draw a line in the sand somewhere, but whatever measure you choose is going to be somewhat arbitrary so I think it's reasonable to have a human make the final call (based on some known criteria).


> If one DC becomes inaccessible does that count as an outage? It'll sure feel like it to the affected users, but the vast majority can still use your service.

Then put that on the dashboard! It certainly is an outage for the affected users (by your scenario).

   - 999/1000 Data centers up
   - 100,000 Users without service
   - 6 paying customers without service (will get money back)
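
As a rough sketch of what publishing those numbers could look like, the snippet below formats a snapshot of counts into the three lines above. The `OutageSnapshot` type and its field names are hypothetical, and where the counts would come from is left out entirely.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class OutageSnapshot:
    """Hypothetical counts a status page could publish; names are illustrative."""
    datacenters_total: int
    datacenters_up: int
    users_without_service: int
    paying_customers_without_service: int


def render_status(snapshot: OutageSnapshot) -> List[str]:
    """Format the snapshot as dashboard lines like the ones suggested above."""
    return [
        f"{snapshot.datacenters_up}/{snapshot.datacenters_total} data centers up",
        f"{snapshot.users_without_service:,} users without service",
        f"{snapshot.paying_customers_without_service} paying customers without service",
    ]


# The example figures from the comment above.
snapshot = OutageSnapshot(
    datacenters_total=1000,
    datacenters_up=999,
    users_without_service=100_000,
    paying_customers_without_service=6,
)
status_lines = render_status(snapshot)
print("\n".join(status_lines))
```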


Except, sometimes a data center goes down and no one notices. So if a data center goes down, and no user around notices, is it really down?


Are there any large community run status dashboards? Seems like it is something that we can't trust companies to be honest about (due to SLAs).


Just check Hacker News, of course :)



Most of the time, it does!


This is why I can't run a company.

"Sometimes, it's probably almost approximately around what we think could be along the spectrum of known values." - Me, as CEO.


"Our business is not a coherent single entity and we don't have the best operational policies that try to remedy that so I won't make any promises about quality while still sounding like we're worth $1b"


"I apologize for having misspoke earlier. The correct quote is: '...while still sounding like we're maybe worth $1b.' We could be worth more than the disclosed amount; however, we aren't sure if or how often, if ever, someone may be counting that number. It's really up in the air on occasion at around this point." - Me, still CEO, now an award-winning CEO.


I am experiencing intermittent issues with GMail not loading new conversations in a timely fashion, and showing "something went wrong" notification messages.


I'm betting it's that something gnarly got checked in rather than service outage


I'll second that with something more specific. A bad update to their software defined routers or switches.


"A pool of servers that route traffic to application backends crashed"

https://mobile.twitter.com/uhoelzle/status/13093135569956618...

So, "kind of" confirmed, I think. Assuming a bad update is the most likely culprit for a whole pool of servers to all crash at the same time.


Can't something gnarly getting checked in cause a service outage?


There are definitely issues. GDocs functionality is intermittent. e.g. the ability to move a doc into a new folder or even see what folder it's in using the little folder icon at the top while editing has disappeared.

I've suggested to our team they copy/paste locally anything they're currently editing.


Side note: Hacker News is an IT guy's best friend. I came here for confirmation after staff started complaining about G Suite being down, and the disruption announcement is in the top spot. Much appreciated.


Twitter is good for this too, especially internationally.


Search is still up and snappy! I may be biased as an ex-search Xoogler, but Search's resilience is a testament to how battle-hardened a stack it is, considering GDocs and Gmail appear to be hard down.


As a present search Googler, it's more of a testament to how easy it is to run a service that doesn't need to accept transactional writes from users, contains little or no private data, and where >1% taint is acceptable.


Search ads require transactional writes. But actually, as far as I remember big outages at Google in the past have been things like "someone pushed the wrong router config" and not "our database got overloaded". Maybe search is just more conservative?


Search ads is a completely different system than search. They don't share tech stacks or reporting chains.


Do ads truly require transactional writes though, rather than eventually consistent log-append writes?


I have the same question. I suppose ads billing might have transactional requirements, but that's separate from search itself.


I worked at Bing/Yahoo.

There are two mechanisms helping keep search up when considering ads:

-> Search engine pages can be served w/ no ads, or even no personalization if components are down or lagging

-> Ad logs can be lost if necessary, since we'd under-bill in that case, which would be our own problem. That happened a small handful of times while I was there.

The billing system is indeed transactional, but has less need for 100% up-time and 0 latency in comparison to search. Not nearly as big of a deal if there's an hour delay on billing ads as compared to search being down for an hour.
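
The first mechanism can be sketched roughly like this. It is a toy illustration of the general pattern (give an optional component a time budget and fall back to nothing), not the actual Bing/Yahoo code; `fetch_results`, `fetch_ads`, and the budget value are all hypothetical stand-ins.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Shared pool so a slow ads call never blocks the page from returning.
_POOL = ThreadPoolExecutor(max_workers=4)


def fetch_results(query: str) -> list:
    """Stand-in for the core search backend (hypothetical)."""
    return [f"result for {query!r}"]


def fetch_ads(query: str, delay_s: float) -> list:
    """Stand-in for an ads backend that may be slow (hypothetical)."""
    time.sleep(delay_s)
    return [f"ad for {query!r}"]


def render_page(query: str, ads_delay_s: float, ads_budget_s: float = 0.2) -> dict:
    """Always return organic results; include ads only if they arrive in time."""
    results = fetch_results(query)
    future = _POOL.submit(fetch_ads, query, ads_delay_s)
    try:
        ads = future.result(timeout=ads_budget_s)
    except Exception:
        # Timeout or ads backend failure: serve the page without ads.
        ads = []
    return {"results": results, "ads": ads}


fast_page = render_page("status page", ads_delay_s=0.0)
slow_page = render_page("status page", ads_delay_s=0.6)
print(len(fast_page["ads"]), len(slow_page["ads"]))
```

The design choice is that only the organic results are on the critical path; anything that misses its budget is dropped for that request instead of failing the page.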


Glad to see at least some folks in Search understand this :)


Sorry, what is taint in this context?


Probably, missed elements


Yep, that's it.


No private data? Isn’t search personalized?


I can't really comment on that, except to say that the way it works is a much easier problem than how gmail works.


Big talk! Must be nice being able to serve 99% of the index and not have anyone notice. If the gmail threadlist comes up with 1 missing message everyone will lose their minds. And obviously if user authentication and homing is down then search can serve public docs but gmail etc just can't serve.


I was getting HTTP 500 errors when clicking the search results. The search itself was working, but I couldn't follow the links.

It annoys me a lot that Google doesn't use the "real" URL in their search result links.


Isn't search more or less static? Docs and mail change instantly and users expect instant feedback


Search handles probably an order of magnitude more changes per second than GMail+Docs combined


But if the index doesn't update you won't notice.


Yeah, search isn’t an OLTP system.


I pretty strongly doubt that the arrival rate of new docs at search is an order of magnitude greater than that at Gmail.


Offtopic, but it blows my mind seeing <1 min old StackOverflow comments indexed by Google search, the sheer amount of work going on behind the scenes to make that happen.


I'm pretty sure this only happens when the site is actively pushing updates to Google via their webmaster tools. I think your site's importance would also impact how fast Google indexes it.


Didn't realize this was a thing: https://developers.google.com/search/docs/guides/submit-URLs

Although I'm not sure if this is what stack exchange actually uses or if there's some private API for big partners.


Search results were either failing or taking a few minutes to load in Hong Kong this morning


Does search actually use the same platform as the rest of Google?


From what I understand, "mostly". Depends on what you mean by "platform". A couple key differences--search runs at a higher level of redundancy than most other services. And, importantly, it can run very well at reduced capacity.

For example, if you type a search in the homepage or address bar, you start getting suggestions instantaneously. Think about this... if Google decided to only give you suggestions every second keypress, or every third keypress, would you really notice?
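
Here is a minimal sketch of that idea, assuming a hypothetical policy where suggestions are requested only on every Nth keypress. This is not a description of how Google's suggestion service actually sheds load; the function names and the policy are invented for illustration.

```python
from typing import List


def should_fetch_suggestions(keypress_index: int, every_nth: int) -> bool:
    """Return True if this keypress should trigger a suggestion request.

    keypress_index is 1-based; every_nth=1 means every keypress.
    """
    if every_nth < 1:
        raise ValueError("every_nth must be at least 1")
    return keypress_index % every_nth == 0


def requests_sent(typed: str, every_nth: int) -> List[str]:
    """List the prefixes for which a suggestion request would be sent."""
    return [
        typed[:i]
        for i in range(1, len(typed) + 1)
        if should_fetch_suggestions(i, every_nth)
    ]


full_capacity = requests_sent("status", every_nth=1)
reduced_capacity = requests_sent("status", every_nth=2)
print(len(full_capacity), len(reduced_capacity))
```

For a six-character query, requesting on every second keypress halves the number of suggestion requests while the user still sees suggestions for the final prefix.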


Moreover, writes are hard, reads are easy, and everywhere that a write needs to happen when a user does a search can fail gracefully. If this search doesn’t end up in your search history, or doesn’t show an ad, or doesn’t contribute to your future personalization then oh well, you’ll never notice.


My entire Meet call just got dropped and can no longer connect. Lots of 502's.

The good news is that it happened on a Friday -- see you all next week!


It's still Thursday in some places :)


Not such good news if you were supposed to have a meeting in the morning with people in the UK. Due to timezones, it's had to be rescheduled to Tuesday morning.


Poor UK people working Saturdays :(


Yeah that was what initially got me to check status. Checked twitter and saw others with the issue. Tried to do voice only call but Google Voice was down as well.


Do these status dashboards have any more credibility left? Always showing green when services are down.


Trying to understand here: do you think companies wire these directly into their monitoring as a minimal status display, or run them as a communications dashboard that someone updates, in aggregate?


I imagine it's insanely difficult to write an automated status dashboard, since a problem can cause the service to not work while not having everything entirely broken.

The APIs tested by the automated system could all still be working. Probably the only way would be to have a constantly running VM testing everything via the web UI.
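
A rough sketch of what such an automated check might aggregate is below. The probe names and the green/orange/red thresholds are assumptions made up for this example, and the probes are plain callables standing in for real API calls or scripted web UI journeys.

```python
from typing import Callable, Dict

Probe = Callable[[], bool]


def run_probes(probes: Dict[str, Probe]) -> Dict[str, bool]:
    """Run each probe; treat an exception as a failed probe."""
    results = {}
    for name, probe in probes.items():
        try:
            results[name] = bool(probe())
        except Exception:
            results[name] = False
    return results


def overall_status(results: Dict[str, bool]) -> str:
    """Collapse probe results into a single colour (thresholds are arbitrary)."""
    if not results:
        return "unknown"
    passed = sum(results.values())
    if passed == len(results):
        return "green"
    if passed == 0:
        return "red"
    return "orange"


def broken_ui_journey() -> bool:
    """Hypothetical end-to-end web UI check that fails during the incident."""
    raise RuntimeError("page stuck on 'Reconnecting'")


# Hypothetical probes: the API-level checks pass while the UI journey fails.
probes = {
    "api_ping": lambda: True,
    "api_list_files": lambda: True,
    "web_ui_open_document": broken_ui_journey,
}
results = run_probes(probes)
print(results, overall_status(results))
```

The point of the example is the last probe: a dashboard fed only by the API checks would show green here, which is the failure mode described above.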


It's not that difficult, just a piece of work. Google has teams dedicated to just that kind of work and internally has the automated dashboards that update real-time. The confusion here is that the page linked is not it. This is some static HTML or such that is updated when a human has triaged the issue, determined and aggregated scope and got an OK from another human (or more, depending on the situation).


Of course. Green means the service is working for me. Red means the service is not working for me, and clearly not for anybody else either. Orange means it doesn't work for me, but you might have better luck.


Several services haven't been working for me for 20 minutes and the dashboard hasn't been updated. They're effectively useless.


A few months ago when all the auth services stopped working it took them a while to update the pages too.


Chat coming back up. Hulu was down, appears to be back up?

Edit: YouTube livestream is up, despite downdetector claim that it is down.

Docs, inside and outside of GSuite appears to be up.

1836 Pacific: Downdetector shows improvement across Google services.

1840 Pacific: Downdetector shows marked improvement across Google services. They don't have units, but ~80% improvement in many services over peak down-ness.

1852 Pacific: Continued improvement.

1854 going for a walk. Looks like things are generally headed for normality.


Hah I read those as years and was wondering if there was an invasion from the Pacific in 1854.


We're getting 502 responses to all of our calls to all Google APIs. There doesn't seem to be a correlating incident on either dashboard:

https://status.cloud.google.com/ https://www.google.com/appsstatus#hl=en&v=status


Wow, this is substantial.

All my mail sync is down, Analytics isn't loading, Sheets and Docs are all stuck in "Reconnecting". I wonder what could cause this.


https://twitter.com/uhoelzle/status/1309313556328841216?s=19

"a pool of servers that route traffic to application backends crashed"


One downside of training on TPUs is that by default they save models to GCE buckets. When buckets go down, there’s no way to save or load your models. It’s kind of funny: my model is training right now; I see it training right now; but it can’t save, so the progress will be wiped out if the TPU preempts or malfunctions.

Luckily they’ll fix the issue within 24h, which is when TPUs preempt. :)


The other downside of TPUs is that they are ridiculously overpriced for all workloads, they are basically GCP snake oil, and on a cost benefit basis there is not a single use case where it’s rational to choose TPUs, not even over other options hosted in GCP.


TPUs are more expensive per model train than regular GPUs? I thought the point was they were way faster, which makes up for the cost. Data scientists using GPUs usually can do the math on costs...


Even accounting for speed, TPUs are worse on a cost benefit basis. People (some of whom are data scientists) pay to use them because they are the shiny new toy and there’s entertainment in blogging about it or writing a walkthrough, not because they economically solve any problem.

I manage an enterprise machine learning team and we do tons of stuff on GCP. I’m dead serious: there is not a single use case where it makes sense to choose TPUs unless the main value you’re seeking is just the entertainment factor of using Google’s new thing.


There are a lot of reasons TPUs could be poorly matched for your workload, including model complexity (or lack thereof), the way your model is set up, the inputs to your models (if you're bottlenecked on IO, including host <-> TPU memory bandwidth, well, what can you do?), and how you were training them (including the evaluators used).

Since your post didn't actually include any details, I did a search and immediately found an article[1] where TPUs worked better for their particular use-case. I suspect I could find many such reports (and probably some opposite reports too).

It's unfortunate they didn't work for you, perhaps you should give them another shot with a different model. I'd recommend using Cloud's examples as a starting point.

[1] https://medium.com/bigdatarepublic/cost-comparison-of-deep-l...


That ... Doesn't seem likely? How big is your team? Any chance the person who evaluated TPUs made an error when setting up their test? I'm just a little skeptical that Google would have gone to all the trouble to design the things and that so many organizations would be spending so much money to use them if they weren't better than GPUs in some way.


You're almost certainly using the TPUs wrong. It's very easy to use them wrong, unfortunately.

When you use them right, a TPUv3-8 gets equivalent perf to a cluster of 8 V100s.

I was astounded. I trained StyleGAN 2 from scratch at 1024x1024 in 2.5 days. nvidia took 7 days for their official model. Granted, I used a v3-32, not a v3-8, but performance seems pretty similar.


Wow, you really like dealing with absolutes...


TPUs are pretty much the only thing in engineering I’ve ever encountered where literal absolutes actually apply.


Have you compared TPUs to learning on other cloud infra, or some other model?


Have you written about this in the past? This seems pretty absolute.


Google search is up. Google Mail, Calendar, Docs seem to be down.

(This is probably a testament to who pays the bills.)


It makes me wonder if it has anything to do with GCP. Pure speculation, but if it's in the cloud then they probably wouldn't hesitate to dogfood their own platform. But then it would be interesting since search still works. Maybe search was too high-value to risk having a hard dependency on whatever it is that is causing the outage.


Google has an internal cloud that they talk about sometimes. Borg, bigtable, spanner, etc. They run all their services on their internal cloud, including their public cloud, with the exception of Compute Engine, which is its own thing.


Compute Engine isn't really a separate thing so much as it's another thing built on the same infra. All the way down to the individual VMs, it runs on Google's production infrastructure and is scheduled by Borg (onto dedicated machine pools, although that sort of dedicated resource pooling is not unique to GCE). Once you get into VM networking things get a bit more exotic, although all of that still runs atop the same software defined network infrastructure as everything else (but, much like with Borg, it uses more features than many workloads at Google).

(I work on Compute Engine's virtualization infrastructure)


Thanks for the clarification! I had heard that Compute Engine ran on dedicated machines, but was under the impression that approach was unique to GCE.


There was interruption between GCP's monitoring & services on GCP. I also have my own Prometheus/Grafana stack in there and it kept chugging along fine, even though their DB graphs show severe dips in tx/sec in the time period in question.


Or who relies on per-user account data.


google search runs on aws


Incidentally, this affected Google's game streaming service Stadia, which is an interesting timing with Amazon's Luna announcement ;)


It's really funny reading this and seeing the lower-ranked comments saying "Time to switch!"

Yeah, you won't. Stop being hyperbolic. You're amazingly served by google and you'll forget this 20 minutes after service is restored.

It makes me wonder, what duration of outage is necessary for people to actually switch?


I tried Zoho for my last startup, since gsuite nuked the free tier.

That lasted 2 weeks. Being locked out of or requiring special considerations for the first calendar invite, or video conference, or collaborative document was a complete unnecessary pain. “But you can make a google account with your external email” barf.

It's sad because Zoho is trying so hard too. Imagine planning and working on all those equivalent products. lol


Nothing on the planet is worse than Zoho. I tried their email service, and even after I marked the same sender as a spammer more than 100 times, their creepy service was still not putting that sender's emails into the spam folder automatically. There were many other issues with their service. Don't use Zoho or you will regret it.


We are sorry that we gave you this perception. Our Spam filter is self-learning and the algorithm responds to user markings. If we have your account information or the headers of the sample mail, we will be able to look into it and identify what went wrong. We request you to write to support (at) zohomail (dot) com with those details and description about the other issues you faced. We will definitely address it for you.


Downvoted for: (a) Sounding like a canned response (b) "We are sorry that we gave you this perception" passive aggressive gaslighting and abdication of any responsibility to the user. Really, I've barely ever heard of zoho and had no opinion of them _until_ this response. Now zoho's on my fuck-them list for talking this way to a customer.

Speak like a human for fuck's sake.


Responding to "your spam detection is not working" with what is essentially spam has to be one of the most unintentionally funny things I've read on HN.


Hello! This was definitely not a canned response, but we can understand why it was seen that way. Tone is a really tricky thing through text, and we didn't mean to sound like a robot. We hope you won't mind if we try again:

We are very sorry that the spam filter is giving you such trouble. It has self-learning capabilities, but algorithms are not always perfect. If you could send us your account details and the headers for some of the emails, we will look into this right away. Please send us an email at support (at) zohomail (dot) com when you have a few minutes and we will dig into the details.


I appreciate y'all trying again. Here's some constructive feedback:

(1) Look at other responses from people directly responding to issues from customers here at HN. Look at how specific, authentic, and action-oriented those responses are. Then look at yours again.

(2) Notice how they don't use the royal "We." It's 2020, we're all working from home with small spots of clean behind us for zoom, the royal We has to die. Unless the speaker is actually a committee or Borg, please use I.

(3) "not always perfect" is garbage. It failed the user at its basic job. Say so. Any apology without admission isn't an apology. Actually, kill the entire sentence. Nobody cares about the capabilities of a system that doesn't work right.

(4) They've already said that they tried you and you failed. Explicitly ask for a chance to make it right. Why would they bother working with you again, if you don't even ask them? They'd be doing you a favor. Ask for the favor.

(5) Nit: The last two sentences redundantly ask for the customer to contact you. Combine into a single request for an email with (btw, you don't have to explicitly "send us an email" when you're giving them an email address to contact -- it's a side-ways way of talking down to the customer's intelligence) account "details" (Which details? The account name? Credentials? What?) and headers.


Hey, thanks for taking the time to offer such constructive feedback. It's really hard not to sound like an automaton at times, and I am grateful for your help.

As for the royal we...I think you are right. In 2020, with the world the way it is now, I think everyone can use more personalization and less of the anonymity provided by we. I take your point, and will do my best to implement it.

Again, thanks for your time, and for your passion. Have a great one!


A real human! Welcome! I'm glad to hear from _you_!

One last bit of unsolicited advice: your job is writing. This is the best writing book I've ever read that deals with the issues I saw in your writing: https://www.amazon.com/Writing-Well-Classic-Guide-Nonfiction...

You can tell the author knows what they're talking about because the book is a really easy read!


Your message is actually much more problematic than its parent, IMO.


How are you locked out of business by not using Google? Enough companies run solely on Microsoft products without any issues. Others use separate providers for Email and conferencing (e.g. Zoom). I don't see how not using Google is a problem when dealing with other companies.

Quite the opposite, it's still much harder to not use Excel if you exchange data with a lot of business partners.


"but you can use separate services for video conferencing, calendars, emails, look at all these other organizations that are less convenient and doing it just fine"

yeah I'm aware. if you have a choice on a new organization, low budget but still able to access good, fast and cheap, use Gsuite.


Or Office, same prices for similar features.


I'm not sure I understand. What do you mean by locked out and by special considerations?


Presumably they were still trying to collaborate with people inside Google's walled garden of documents, calendar events and videoconferences.


Which is everyone, or at least a very predictable friction.


Sorry to learn about your unpleasant experience with us. We request you to elaborate the lock out and the special considerations you are referring to, by writing an email to support (at) zohomail (dot) com. Having specific details along with your account information will help us to analyze and make necessary corrections and enhancements. Looking forward to hearing from you.


I think you are misunderstanding OP. At least I assume that Zoho will not be able to change Google's policies, otherwise that would truly be outstanding customer service.


For me you're not the problem, you just have no power here. Google services are more convenient to use with Google email accounts, they are more prevalent and there is no point in burdening an organization with another email provider if there is a choice.


We switched from Rackspace’s cloud to AWS due to terrible failure modes with their managed MySQL product. We’d have a db cluster go down and wait hours for them to fix several times a year for several years. Switched to Aurora and haven’t looked back.


The only Google domains I allow on my Pi-hole are youtube.com and a Google subdomain that serves the CSS for YouTube.

Nobody who can afford to pay for a reputable webmail provider should be on Gmail. If not for privacy, because its design is terrible.

If you use Google Chrome, or Youtube, or Google Docs, or Google DNS, or an Android device, you should not use Google Search. That's too much data to hand to one company. I use DDG, and when it's not good enough, I use Bing (and if I used Windows, I'd rule Bing out, too).

Aside from privacy considerations, Google search quality has been declining for at least a year. I want to skate to where the puck is going, not to where the puck is a mountain of ads followed by nonsense.

"I left Google" is going to be the 20's version of "I left Facebook."


> "I left Google" is going to be the 20's version of "I left Facebook."

So you mean it will be all the rage to talk about in tech forums but somehow the number of active users will continue to go up every quarter?


Yes, it will be almost exactly the same. The people leaving Google will be young users and techies. Google has plenty of smart employees, and they will eke out growth from the rest of the population.


The way you choose to live your life sounds like punishment.


Over the next decade, a lot of Google data will leak, piling up in giant torrents (literally) forever on the net. Photos and emails and search histories from millions of users.

Over the next decade, downloading and storing and processing such dumps of leaked data will continue to become quicker, and require smaller devices.

Over the next decade, mobile devices will shrink, resulting in smart glasses much more powerful than today's smart phones.

Put that all together...

It's 2030 and you're walking down the street. You pass by a stranger wearing Apple Glasses.

The stranger once spent an hour installing an "Intel on Everybody!" app, and downloaded 100 TB of random leaked data from a torrent site on everyone in the world.

As you enter the stranger's view, the glasses use the leaked data to facially identify you.

Now that the glasses know who you are, they automatically alert the wearer of the types of data they know about you.

The bored stranger occupies himself by reading your gmails from 2022, perusing compromising photos you saved, checking out your medical records, reading arguments you had here in the comments of HN, etc etc.

Maybe my understanding of technology and security is wrong somehow, but that future seems awfully plausible to me.

If the downside to minimizing my footprint is that I use DDG instead of Google, the upside is worth it.


I agree with your prognosis completely, and I think it will not just be Google, but all the big data troves out there, even NSA and such. It is just inevitable that on a long enough time scale, all data will either become public or fade away.

I think that there will be a new form of entertainment which will basically generate a soap opera-like experience from real lives based on aggregated camera data, recorded conversations, and the blanks filled in by guessing.

I've adopted a different strategy from you, however. I have accepted the new privacy terrain, and I lead my life as if someone is always watching, which means I try to be my best self at all times. :)


Alas, there exist photos of me wearing multicolored parachute pants back in my nightclubbing phase that I can never allow to come to light under any circumstances. So it's too late for me.


I do wonder which part of my premise readers find unrealistic?

- technology is getting more powerful and smaller

- large companies can be hacked

- leaked data is easily available on the net

- smart glasses will become more common

- apps can be made to facially id people using leaked data

Are any of those statements outlandish?


I find it hard to imagine Google being leaked, though. The sheer amount of data (drive files, gmail data, etc) Google has in its silos makes it almost impossible to copy anywhere else other than on Google's own servers or another giant's DCs. Even just the metadata of every GMail message is probably a few dozen exabytes at the least. Any sort of Google data leak that would allow for you to find a random person would require a persistent backdoor into their systems.

The only realistic attack on Google would be some targeted/filtered dump of celebrities or perhaps entire GSuite organizations' emails and files - and even then it would only make headlines, not cause any real trouble for Google outside of <1% of their regular user base ditching for protonmail or what have you.


  I find it hard to imagine Google being leaked
Not all their data, and not all at once, but a little here, a little there.

  The sheer amount of data
It won't be possible to collect all of Google's data. It will get easier to collect more and more of it though (as storage grows larger and the net becomes faster). I wouldn't be surprised if all the email Hotmail handled in the 90s could fit on a consumer 20TB drive from today.


> The sheer amount of data (drive files, gmail data, etc) Google has in its silos makes it almost impossible to copy anywhere else other than on Google's own servers or another giant's DCs

my first computer had a two Megabyte Harddisk - around twenty years later my Phone has a 256 Gigabyte MicroSD card

it's going to happen
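
To put a rough number on that: a back-of-the-envelope sketch using only the two capacities and the time span mentioned above. Decimal units (1 MB = 10^6 bytes) are an assumption, and this says nothing about whether the trend continues.

```python
import math

# Figures from the comment above; decimal units are an assumption.
first_disk_bytes = 2 * 10**6      # hard disk in the first computer: 2 MB
phone_card_bytes = 256 * 10**9    # microSD card in the phone: 256 GB
years_elapsed = 20                # "around twenty years"

growth_factor = phone_card_bytes / first_disk_bytes
doublings = math.log2(growth_factor)
years_per_doubling = years_elapsed / doublings

print(f"growth factor:      {growth_factor:,.0f}x")
print(f"doublings:          {doublings:.1f}")
print(f"years per doubling: {years_per_doubling:.2f}")
```

That works out to a 128,000x increase, or roughly 17 doublings, about one every 14 months.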


People also generate more and more data.


That's another great point, and it does mean that users always have less to worry about when it comes to recent data. But when it comes to a user's old data... well, you see where I'm going with that :)


>Are any of those statements outlandish?

websites that host the leaked celebrity nudes are torn down constantly. I can't imagine that you can upload that chunk of data or that app onto any platform without repercussions. On top of that, my tin-foil hat take is that internet access will eventually be completely de-anonymized. Social security or gov. ID to sign on.

It becomes much harder then to spread terabytes of personal information, especially since even in 2035 I can't imagine there being very many hundreds-of-terabytes datasets being passed around social circles.


  websites that host the leaked celebrity nudes are torn down constantly
That is true, yet surely most live on as torrents (and discoverable via hundreds of sketchy torrent websites that pop up and fizzle away)? Even before most people had broadband, nerds traded collections of pirated software via "sneaker net"

  internet access will eventually be completely de-anonymized. Social security or gov. ID to sign on.
That scenario seems possible to me, too, and I agree its outcome would make what I envisioned less likely. It's not guaranteed to happen though.


The (most) unrealistic part for me is probably Apple allowing such an app in their store...


Good catch! Apple would likely close off the most common avenues to provide such an app. There are nevertheless many ways a developer could make it available. Off the top of my head, one could distribute it as a web app, use an exploit to install it, or provide it as an Xcode project to compile and install directly.

It also is possible someone will force Apple to permit third parties to run app stores.

I should point out that while I chose Apple in my comment, there will be other brands with fewer restrictions.


It won't happen in that way, because reality likes to defy predictions, but that's the general direction we're heading to.

My take: it won't be something everybody does (unless provided by some service, which looks illegal at least under GDPR) but be prepared for extensive profiling by whoever really cares about doing it.


That's probably an accurate take. I agree, your average internet user isn't likely to configure their devices to do such things. It will be the domain of techies, oddballs, and assorted edge-cases (eg: someone who wants to avoid the law, mob, stalker, etc).


You have a very active imagination. What’s your favorite metal to make hats out of?



I get why that would be someone's reaction. Unfortunately there's no way for me to explain what concerns me about the way things are going that doesn't make me sound like a prat :(

As outlandish as my comment may come off, all the pieces (maybe aside from Google leaking data) leading to such a future are conventional takes. As far as big companies leaking data, it's not that it happens every day, so much as it is the cumulative nature of leaks: each leak adds to the pile.


And if entrusting everything about your life to one company ever does backfire, you still won't get to call this out, because then it will be 'blaming the victims' ;)

I don't know the likelihood of this all going dystopian, but I did realise a while back that it isn't hard to pick a not-google option early when a thing is new. Rather than attempting to go cold-turkey you can just stop growing more locked in. After a few years you notice you're not stuck any more.


How would you know? It's entirely possible that not using google is a fantastic choice.


Could be. Maybe the Amish are living the peak human experience. I'm not going to find out but you can.


I find Bing considerably closer to Google than a horse-and-buggy to an automobile. In some ways, Google is more like a horse-and-buggy, since, like an aging horse, Google Search increasingly ignores the user's instructions, and does its own thing instead.


Consider that the vast majority of users are not like you and they need help using the right search terms. Searching used to be an art and I vividly recall a lot of folks struggling to articulate in search query terms what they were looking for. Nowadays those types of people find what they're looking for at the cost of you not being able to treat it like a command line interface. I'm sure Google has the data to back up that it's a net gain. You're welcome to use whatever you'd like.


I see two ways in which Google Search is changing. Neither of them work for me.

It is true that, as you point out, the majority of Google users probably prefer a less literal algorithm. Do the ways Google Search is changing benefit people who aren't me?

One way that Search is changing does align with what the average user wants: it increasingly interprets results semantically (synonyms, contextual data, domain-specific knowledge, etc.) instead of just 'grepping' text content.

The other way Search is changing is user-hostile. Google is an established monopoly now, whose founders have ridden off into the sunset. As its competition and idealism have waned, so has its willingness to put search quality above profits and strategy.

It does not serve the user to include more paid content, to find excuses to personalize search, to cross-promote other Google products, to make Privacy settings inscrutable, to steer customers to AMP pages, and so on. Google has "actually, it's a feature!" rationalizations for these changes, I am sure. No matter!

If the only metric Google cared about were customer satisfaction, I would agree that their surveys, statistics, and A/B tests matter. Sadly, Google seems increasingly concerned with attributes that are at odds with their users: advertising revenue, marketing, etc.

In my judgment, even for normal users, Google Search's bad changes already outweigh the good. I reckon the bad incentives will remain in effect for years to come, too, which does not bode well for Google users.


Complete tangent, but it turns out I was wrong about horses: apparently they become more agreeable with age, not less.


Maybe not time to switch, but it shows the value in distributing your services across different providers.

Bit annoying to have your email, documents, cloud services, analytics, and video conferencing all go out at the same time, especially when everyone is working from home.

I was supposed to have a Meet call with someone from the UK this morning, I couldn't join the call and they couldn't email me (I could still email them though, as I don't use G Suite for my email).


Duration and frequency of outages. We switched to G Suite two years ago because it didn't really make sense for us to continue maintaining a Zimbra installation; it cost too much time. Gmail has been unavailable at least four times since switching, making it at least twice as unreliable as our old on-premise, self-managed mail server.


My guess is it's less about the duration and more about the consequences.

If our VPN was down a week we might not switch because we have an alternate workflow, but if our production servers were down for a few hours we'd switch.

Also, I missed one really important email and that's all it took to switch the hosting for that.


I was already slowly moving off google but then they locked me out of my account. At that point the only service I had been using was gmail.

It's easier to switch than you think.


> It makes me wonder, what duration of outage is necessary for people to actually switch?

If Google's security measures had an outage ...


I'm not sure if this is a reference, but this is relevant regardless: https://www.youtube.com/watch?v=y4GB_NDU43Q


On the bright side, their DNS servers didn't go down, otherwise I would have spent about an hour arguing with my ISP.


It's always good to memorize at least one IP address! Very similar to memorizing at least one phone number in this day and age ;)


My meet call dropped & hangouts chat is failing to connect. Maybe time to switch to Jitsi!


I was using Google Reader and suddenly it vanished? Is that the disruption?


Clever!


I've seen Google Login fail on multiple services including Trello


Was taking important notes using Google Keep yesterday. Then, all of a sudden, I couldn't take notes anymore. Couldn't type; it wouldn't accept input. Refreshed and got a 502. Then eventually I could take notes again. And then I ended up in a state where my new additions were DELETED.

And now I guess I am not using Google Keep anymore, which is sucky but probably good for me in the long run.


Maybe it's just me, but I try to never do any serious work in the browser, or to never write longer blocks of text in a browser app. And it's precisely for this reason: few applications implement measures for either the server or my internet going offline that work 100% of the time.


Firebase api calls don't seem to be working either.


Yeah, we're seeing `Authentication Expired` for our Firebase auth calls. Firebase static hosting still looks up.


Google image search is completely down for me


Wasn't Gmail down about 1 month ago? Jeez, my boss looks at me weird now for switching to G Suite 4 months ago.


Google Calendar has been spotty for me and Hangouts isn't working at all.


Judging by the tx/sec count on a database I have hosted in us-west2, looks like their network backplane went up & down a few times between 5:58pm and 6:21pm tonight


Actually, looks like it was something between their monitoring system & the DB. Per my prometheus stats, DB kept chugging along the whole time.


Google Voice was down for me earlier, as well as a couple of unresponsive Compute Engine instances (despite not showing as problematic on this list).


FWIW, the GMail imap interface in Apple Mail is still working alright for me. Can confirm the gmail web interface is completely borked.


I just had problems playing Google Music. I assumed it had something to do with service derating because of the move to YouTube music.


Not sure why Maps shows no disruption. It didn't work at all when I was driving home earlier.


I had issues signing into YouTubeTV.


YouTube TV not working on my phone.


Google Keep experienced a disruption yesterday too, which the dashboard does NOT show.


I was writing a longish note in google keep, and now keep.google.com gives me 502s.


Same here, it lost my note in Google Keep too, it was an important note too.

The dashboard neglects to mention Google Keep experienced a disruption too.


Sorry, I was doing some organizing on my gmail inbox with 100,000 unread messages.


All the diversity hires are working out it seems.


Google Drive down. 502


My Google Voice call was dropped about 5 minutes ago.


I'm also getting intermittent 502s in Gmail


Tesla, Cox and Google in the last 48hrs =/


Tinfoil theory:

This was a test of their kill-switch.


Looks like it might be fixed now?


I can access gmail just fine right now.


2/3 of my Google mail accounts are going up and down like a yoyo. Poor OSX doesn't know how to deal with it


Nest app is not working either.


We just had a hiccup on that too. Seems to be back now.


Sorry, it was me, guys. I was just cloning the quiche repo and I accidentally brought down the whole system.


Google Analytics down

Edit: Gmail Down


Yep analytics is experiencing issues for me.


I didn't notice any issues at all. Weird


gmail down


Side note: Google's status page, linked, breaks the back button. How the 'F can a simple status dashboard break something so simple?

(In Google's own browser, too.)


"Please don't complain about website formatting, back-button breakage, and similar annoyances. They're too common to be interesting. Exception: when the author is present. Then friendly feedback might be helpful."

https://news.ycombinator.com/newsguidelines.html


The culprit is that linking to:

  https://www.google.com/appsstatus
Then loads:

  https://www.google.com/appsstatus#hl=en&v=status
If you hit back, it goes to the URL without hash, but then instantly goes to the one with the hash.

Pretty annoying.

(If you paste the 2nd URL directly, you will not have problems.)


So they're using history.pushState() rather than history.replaceState().

I love the history API when it's used appropriately, but it is a double-edged sword that has to be used carefully. News sites injecting their homepage in history on an article sometimes make me wonder if it's worth it.
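
To make the difference concrete, here is a toy model of a tab's back/forward list. It is plain Python, not browser code. The class, function names, and the "script runs whenever the bare URL is shown" behaviour are assumptions for illustration; whether the status page really calls these APIs is a guess from the symptoms described above.

```python
class SessionHistory:
    """Toy model of a browser tab's back/forward list (not a real browser API)."""

    def __init__(self, first_url):
        self.entries = [first_url]
        self.index = 0

    @property
    def current(self):
        return self.entries[self.index]

    def push(self, url):
        # Like history.pushState(): drop forward entries, append a new one.
        del self.entries[self.index + 1:]
        self.entries.append(url)
        self.index += 1

    def replace(self, url):
        # Like history.replaceState(): overwrite the current entry in place.
        self.entries[self.index] = url

    def back(self):
        if self.index > 0:
            self.index -= 1
        return self.current


BARE = "https://www.google.com/appsstatus"
HASHED = BARE + "#hl=en&v=status"


def on_show(history, use_replace):
    # Hypothetical page script: whenever the bare URL is shown, add the hash.
    if history.current == BARE:
        if use_replace:
            history.replace(HASHED)
        else:
            history.push(HASHED)


def visit_status_page(history, use_replace):
    history.push(BARE)  # user follows the link to the bare URL
    on_show(history, use_replace)


def press_back(history, use_replace):
    history.back()
    on_show(history, use_replace)  # the script fires again if we land on BARE
    return history.current


trapped = SessionHistory("https://news.ycombinator.com/")
visit_status_page(trapped, use_replace=False)
after_back_push = press_back(trapped, use_replace=False)

free = SessionHistory("https://news.ycombinator.com/")
visit_status_page(free, use_replace=True)
after_back_replace = press_back(free, use_replace=True)

print("push:   ", after_back_push)     # bounces straight back to the hashed URL
print("replace:", after_back_replace)  # returns to the previous site
```

With push, the bare URL stays in the list as a trap: going back lands on it and the script immediately pushes the hashed URL again. With replace, the bare entry is overwritten, so back leaves the site.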


Of course it's worth it! You gained valuable insight on which sites utilise a particular dark pattern, and that they probably use others.

I don't bother clicking on TechCrunch articles for this reason.


I don't bother clicking TechCrunch because it redirects to https://guce.advertising.com/collectIdentifiers?sessionId=3_... uuid> and one of the privacy plugins on my Firefox stops me there.


It's a status page, doesn't seem to be a dark pattern intended to keep you engaged and garner clicks, more likely just badly written.

If anything they'd probably want you not visiting it more.


Incompetence, not malice. Who would dark pattern their status page to keep you on it longer—def an accident.


Why are there two comments back to back missing the fact they're obviously referring to this part of the comment:

> News sites injecting their homepage in history on an article sometimes make me wonder if it's worth it.

They even bring up a news site as an example.


I'm not going to assert actual malice, but I think that at Google's scale we can go further than incompetence to bare negligence.


How about canary?


Unfortunately GWT is deprecated so the chance of this getting fixed is slim. Instead bet on this status page getting deprecated and maybe rewritten by 2021.


GWT isn't officially deprecated, although it's not getting much attention.


For whatever reason, adblockers haven't figured out how to remove history spam.


I think a better option would be an immutable history, but pages that are loaded for less than 10 seconds grayed out in the list and pages that are inserted given a light pink background.


I'm not super concerned about history injection other than when blindly clicking the back button. I've used history injection in a legitimate use case requested by users (think master/detail when linking directly to a detail view; back goes back to the master list). I'm not sure how to unobtrusively notify of malicious history injection (like news sites) without downgrading legitimate uses.

Hopefully the answer isn't another permissions request :)


Both of those pollute your history, which sucks, majorly.

We really need an "update URI without side effects" call, at least for the #fragment.


Ironically, this increases their own page’s pagerank


How?


Page views are one of the variables of the page rank algorithm

If you keep going to the site over and over, then potentially it counts as new views (albeit not unique ones)

That said, Google probably exempts their own sites from PageRank since they, you know, own the search engine.


Google does not have insight into page views. They can only know if you clicked through to the site, not how many times you loaded that page.
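
For reference, PageRank as originally published is computed from the link graph alone, with no input for how often a page is loaded. Below is a minimal power-iteration sketch. The three-page graph is made up, 0.85 is the commonly cited damping factor, and nothing here claims to reflect what Google's production ranking does today.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict of page -> list of outbound links."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page starts each round with the "random jump" share.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outbound in links.items():
            if outbound:
                share = damping * rank[page] / len(outbound)
                for target in outbound:
                    new_rank[target] += share
            else:
                # A page with no outbound links spreads its rank evenly.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical three-page web, invented for illustration.
links = {
    "status": ["search"],
    "search": ["status", "blog"],
    "blog": ["search"],
}
ranks = pagerank(links)
for page, score in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(f"{page}: {score:.3f}")
```

The only input is who links to whom; reloading a page changes nothing in this calculation.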


and for sure they ignore that analytics GET


Yeah, I would have assumed they have signaling from use of GA, GTM, Fonts?


This is definitely a Chrome problem; Firefox works just fine.


WebKit and its forks: Safari, Chrome, and GNOME Web are affected, Firefox is not.


ironically, if this is true for all chrome users, the back button works just fine on firefox + windows.


Works in Chrome & Firefox for me on macOS Catalina but not in Safari.


Back button disrupted for Safari on iOS.


I am sure this is not the most important thing we can find to discuss relating to the featured article.


Why not?


It's things like this that keep me sane though: knowing that even huge companies go offline or break the back button.


I'm sure you know, you can right click or long press the back button (on desktop at least) to access the whole history for that. Still annoying.

https://i.imgur.com/CTxvExS.png


Trying to be way too fancy.


Nobody gets fired for buying AWS.



