There are several large cloud platforms like Azure, Google Cloud and AWS. Major sites run on these platforms, and have good reasons to do so.
It's like with airplane crashes vs car crashes. The chance that you get an accident with your car is significantly higher, but usually the airplane accident ends up on "BREAKING NEWS". It's just a matter of impact.
> The chance that you get an accident with your car is significantly higher
There's a difference between fatalities and accidents. Fatalities are surprisingly low. Plus, air travel is only significantly safer per-km, per-journey it's not at all safer [1].
> Air travel is only significantly safer per-km, per-journey it's not at all safer
Of course there are far more car journeys taken than plane ones, so the GP's point stands (and since the point was just to make an analogy, unless you disagree that plane crashes get far more news coverage than car crashes, I'm not sure what the point of disagreement was).
(Note that those numbers also come from the UK, 1990-2000. If you took the numbers from the US, the car fatalities would be significantly higher (per capita or per km), and if they were from the last decade, the air fatalities would be far lower.)
Agreed. But while this outage was global, saying "GCP went down globally" overstates it.
AWS doesn't have a comparable service to the global version of GCP's load balancer service, which is what went down. Other GCP services were only affected to the extent that they use global load balancers for ingress, which varies by service.
For most GCP services, the ingress method is under the user's control and a switch to regional load balancers in one or more regions (whether split through GeoDNS or through round-robin) would have been a workaround.
Admittedly one point of global load balancers is to be able to mitigate a lot of other outages... I guess the secondary lesson here is to keep a short TTL on top-level DNS entries which point to the load balancers, and ideally have two DNS providers in the mix too.
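To make the regional workaround a bit more concrete, here is a rough sketch of round-robin across regional endpoints that skips unhealthy ones. The hostnames, the `RoundRobinPicker` class and the `is_healthy` callback are all made up for illustration. In practice this selection would normally happen at the DNS layer (GeoDNS or round-robin records with a short TTL) rather than in application code.

```python
from itertools import cycle
from typing import Callable, Iterable, Optional

# Hypothetical regional load balancer hostnames (placeholders, not real endpoints).
REGIONAL_ENDPOINTS = [
    "lb.us-east.example.com",
    "lb.europe-west.example.com",
    "lb.asia-east.example.com",
]


class RoundRobinPicker:
    """Rotate through endpoints, skipping any that the health check rejects."""

    def __init__(self, endpoints: Iterable[str], is_healthy: Callable[[str], bool]):
        self._endpoints = list(endpoints)
        if not self._endpoints:
            raise ValueError("at least one endpoint is required")
        self._ring = cycle(self._endpoints)
        # is_healthy is an assumed, caller-supplied check (e.g. an HTTP probe).
        self._is_healthy = is_healthy

    def pick(self) -> Optional[str]:
        # Try each endpoint at most once per call so we never loop forever.
        for _ in range(len(self._endpoints)):
            candidate = next(self._ring)
            if self._is_healthy(candidate):
                return candidate
        # Every region failed its health check.
        return None
```

With one region marked down, successive `pick()` calls alternate between the remaining two, and `pick()` returns `None` only when every region fails its check. The same idea applies to the two-DNS-provider suggestion: no single control plane should be the only path to your ingress.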
> AWS doesn't have a comparable service to the global version of GCP's load balancer service
Intentionally, because customers relying on a service like this would then have regular global outages.
This is the third global outage GCP has had in less than a year. As far as I remember, AWS hasn't had one in many years (but too many people put everything in us-east-1, so a short single-region outage, which happens very seldom, seems to take down half the internet).
If you check the other comments you'll see that Spotify, Snapchat, Discord etc. were all affected. We're not talking about "sites" but any application or site built on Google's infrastructure, even partly.
Thanks for the information. I see now why I hadn't noticed. I don't Snapchat anymore, I switched to Google Music, and I haven't adopted Discord. I guess I'm not in with the cool crowd anymore!