Why should he re-architect his system and introduce some Amazon-specific dependencies like S3 just so he can give Amazon money? How much development time and money will he spend trying to make his setup work on AWS? How many new bugs will he accidentally introduce in the process? How badly will moving to cloud storage via S3 affect his performance v. having the files on a local disk? When S3 outages like the one that happened two days ago occur, will his customers be understanding of his decision to move into the cloud for funsies?
It's really amazing that, when faced with clear evidence of the expense involved in moving to AWS, you suggest that he prepay for a year of service up-front and redesign his application just so Amazon's bill doesn't look so egregious anymore.
My experience aligns with his. We moved from racks with 20-something boxes to EC2 with 100+ instances. Our monthly bill was 80% of the cost of the hardware in our data center.
How did we solve this problem? Why, move to Docker and Kubernetes of course! Over a year of manpower has been devoted to that task. What kind of savage would ever return to bare metal in this enlightened age of expending millions of manhours redoing stuff that was already working perfectly well?
If you want to autoscale, autoscale on the cloud and keep your primary nodes on bare metal. There's no need to start forking over millions of extra dollars to cloud providers to host all of your infrastructure.
Cloud has some unique benefits, but it should be used for those unique benefits only. There's no reason everything has to be moved there.
I'm trying to compare costs in a better way than "lift app from here and move it there". Sure, you can do it that way, but you're moving an app which was written with your current architecture in mind. It's not surprising it will be more expensive after the move.
I'm not advocating a rewrite just to give Amazon money - you're creating straw man arguments here.
I'm just saying that if we're comparing different ways of hosting, we shouldn't pretend they work the same and have the same tradeoffs. Maybe a single-server approach is optimal. Maybe the cost would be lower if the service were split into various components. That's all part of a proper comparison, including the cost of getting from one architecture to the other.
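To put rough numbers on "including the cost of getting from one arch to the other": a back-of-the-envelope sketch, not anyone's real figures. The function name and every dollar amount below are made up for illustration.

```python
def breakeven_months(migration_cost, monthly_before, monthly_after):
    """Months until a one-off migration cost is repaid by lower monthly bills.

    Returns None when the new setup is not cheaper per month, since the
    migration then never pays for itself.
    """
    monthly_saving = monthly_before - monthly_after
    if monthly_saving <= 0:
        return None
    return migration_cost / monthly_saving


# Hypothetical: $60k of engineering time to re-architect,
# hosting bill drops from $10k/month to $5k/month.
print(breakeven_months(60_000, 10_000, 5_000))  # 12.0
```

The same function works in either direction (cloud to bare metal or the reverse); only the inputs change.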
Specific example:
> How badly will moving to cloud storage via S3 affect his performance v. having the files on a local disk?
Depends on what they do with the data and how the users interact with it. Maybe there's no data that could be migrated (all records need to be available in memory for the web app), maybe it would be slower but with the trade-off of running on a smaller instance, maybe it would improve performance considerably because file-like blobs are no longer stored in the database and users can fetch them faster from nearby CloudFront caches. There's no generic answer!
As for the outage... S3 was down for a few hours. And for many people it happened when they were sleeping. If your single server goes down in a data centre - what's your cost and time of recovery? S3 going down once a year still gives many companies better availability than they could ever achieve on their own.
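For a sense of scale, here's the arithmetic behind that availability claim. It's only a sketch: "a few hours" is taken as 4 hours, and the 24-hour figure for recovering a single failed server is purely an assumed number, not a measurement.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def availability(downtime_hours_per_year):
    """Fraction of the year a service is up, given total downtime in hours."""
    return 1 - downtime_hours_per_year / HOURS_PER_YEAR


# Assumed: one 4-hour outage per year.
print(f"{availability(4):.4%}")   # 99.9543%
# Assumed: one day per year lost to rebuilding a single server.
print(f"{availability(24):.4%}")  # 99.7260%
```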
If the need to reassess the architecture doesn't convince you this way, think about migrating from AWS to your own hardware. If you were using S3, SQS, Lambda and other services, are you going to plan for standing up highly available replacements for them on separate physical hosts? (Omg, we need so much hardware!) Or will you consider whether it can all be replaced with just Redis and cron if you have relatively little data?
>Why should he re-architect his system and introduce some Amazon-specific dependencies like S3 just so he can give Amazon money?
You could make the same argument going the other way though - why should GitLab spend money recruiting/hiring people, leasing space, setting up monitoring, etc, if their solution today works? Why should anyone re-architect their system to give $COMPANY money? When your bare metal's RAID controller craps out and you have to order another, will his customers be understanding of his decision to move to bare metal?
It doesn't make sense to factor in fixed costs of such a migration.
>You could make the same argument going the other way though - why should GitLab spend money recruiting/hiring people, leasing space, setting up monitoring, etc, if their solution today works? Why should anyone re-architect their system to give $COMPANY money?
Well, that reason would be, at a minimum, a halving of their hosting expenses.
It also doesn't necessarily take much refactoring to move either way. Even if you're heavily dependent on cloud storage, etc., you can access that from an external application server.
The person I replied to was suggesting that the parent refactor in order to make AWS costs less egregious without articulating any particular reason that the original commenter should do so.
> When your bare metal's RAID controller craps out and you have to order another, will his customers be understanding of his decision to move to bare metal?
His customers won't need to know, assuming he has a standby that can take over. Even if he doesn't, he can rush down and install one and move the disks over. With an AWS outage, you can't do anything but say "I hope Amazon fixes it soon". The bare metal equivalent is a power or connectivity loss at the DC, which is much rarer than AWS outages.