Disclaimer: not a job-toting sysadmin quite yet, but here's my 2¢:
- Architectural SPOFs (single points of failure) need to be carefully weighed up in any design, and "ALL our files are on $single_provider" is one such huge red flag. Unfortunately these considerations are all too frequently drowned out by the ease of taking the path of least resistance.
For example, GitHub occasionally goes down, which breaks a remarkable amount of infrastructure: a huge number of people don't know how to use Git, do full clones from scratch each time, and have no idea how to work without a server (even though Git is built to work locally); CI systems tend to want green-field rebuilds, so they start from empty directory trees and do full clones on every build (I'm not sure any CI systems ship out-of-the-box Git caching); GitHub-powered authentication systems fall apart; etc. Kinda crazy, scary and really annoying, but yeah.
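The DIY version of that missing CI caching is a local bare mirror plus `--reference` clones, so Git objects are shared from disk instead of re-fetched from the server on every build. A minimal sketch (all paths are illustrative assumptions, and the "upstream" here is a throwaway local repo standing in for github.com):

```shell
set -e
rm -rf /tmp/git-cache-demo && mkdir -p /tmp/git-cache-demo && cd /tmp/git-cache-demo

# Stand-in for the remote upstream (normally a github.com URL):
# a tiny local repo with one commit.
git init -q upstream
git -C upstream -c user.email=ci@example.com -c user.name=ci \
    commit -q --allow-empty -m "initial"

# One-time setup: keep a bare mirror as the object cache.
git clone -q --mirror upstream cache.git

# Each CI build: clone via --reference so objects are borrowed from
# the local cache (via .git/objects/info/alternates) rather than
# downloaded again. Only new objects would hit the network.
git clone -q --reference cache.git upstream build1
git -C build1 log --oneline
```

A periodic `git -C cache.git fetch --all` keeps the mirror warm; builds then survive short upstream outages entirely from local objects.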
In terms of "nail in the coffin", that depends on a lot of factors: a subjective analysis of how much local catastrophe the incident caused; opinions about the provider's reaction to the issue, what they'll do to mitigate it, and perhaps how transparent they are about it; etc.
Ultimately, the Internet likes to pretend that AWS and cloud computing are basically rock-solid. Unfortunately they're not, and stuff goes down. There were some truly redundant architecture experiments in the 80s (for example, the Tandem NonStop, one of which was recently noted to have been running continuously for 24 years: https://news.ycombinator.com/item?id=13514909) but x86 never really went there, and hyperscale computing is built on a sped-up version of the same ideas that connect desktop computers together, so while there are lots of architectural optical illusions, well, stuff falls apart.
- Everyone in this thread is talking about Google Compute Engine, but it really depends on your usage patterns and requirements. GCE is pretty much the single major competitor to AWS, although the infrastructure is _completely_ different - different tools, different APIs, different pricing structure. The problem is that it's not like MySQL vs PostgreSQL or Ubuntu vs Debian; it's like SQL vs Redis, or Linux vs BSD. Both work great, but you basically have to do twice the integration work and map things manually. With that said, if you don't have particularly high resource usage, VPS or dedicated hosting may actually work out more cost-effective.
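To make the "map things manually" point concrete, here's roughly what booting one small VM looks like on each CLI. This is a hedged sketch: the image ID, instance name, and zone are placeholder assumptions, and flag choices are minimal rather than production-ready.

```shell
# AWS: boot a small EC2 instance (ami-xxxxxxxx is a placeholder AMI ID).
aws ec2 run-instances \
    --image-id ami-xxxxxxxx \
    --instance-type t3.micro \
    --count 1

# GCE equivalent: different CLI, different flag names, and a different
# machine-type taxonomy (t3.micro on AWS vs e2-micro on GCE).
gcloud compute instances create demo-vm \
    --machine-type=e2-micro \
    --zone=us-central1-a
```

Nothing translates one-to-one - images, machine types, networking, and IAM all have different models - which is why a migration between the two is integration work, not a find-and-replace.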
TL;DR: you go back to the SPOF problem, where _you_ have to foot the technical debt for the reliability level you want. Yay.