Unfortunately, the fight goes even further than that when you go against the cloud.
Last week I was at an event with the CTOs of many of the hottest startups in America. It was shocking how much money is wasted on the cloud because of inefficiencies, and how they simply don't care how much it costs.
I guess since they are not wasting their own money, they can always come up with the same excuse: developers are more expensive than infrastructure. Well... that argument starts to fall apart very quickly when a company spends six figures every month on AWS.
I'm on the other extreme. I run my company's stuff on ten $300 servers I bought on eBay in 2012 and put inside a soundproof rack in my office in NJ, with a 300 Mbps FIOS connection, using Cloudflare as a proxy/CDN. The servers run Proxmox for the private cloud and Ceph for storage. They all have SSDs, and some have Optane storage. In 6 years, there have been only 3 outages that weren't my fault. All at the cost of office rent ($1,000) + FIOS ($359) + Cloudflare, plus S3 for images and backups.
With my infrastructure, I can run 6k requests per minute on the main Rails app (+ Scala backend) at a 40ms response time, with plenty of resources to spare.
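For scale, a rough back-of-the-envelope reading of those numbers (my arithmetic, not a measurement):

    6,000 req/min / 60 ≈ 100 req/s
    100 req/s x 0.040 s/req ≈ 4 requests in flight on average (Little's law)

So it is on the order of a hundred requests per second, with only a handful in flight at any given moment, spread across those ten servers.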
Only a Rails developer would think 6k requests per minute with 40ms latency is reasonable with all that hardware. If you rewrote it, you would probably only need 1 server, but you will probably make an argument about how developer time is more valuable :)
I'm talking about a real application here, with hundreds of database and API calls on each web page load. I could rewrite the whole thing in Golang or Scala and it would be at least an order of magnitude faster. But then I would have to throw away all the business knowledge that has been built into the Rails app.
For instance, the slowest API call within that 40 ms is one that hits an Elasticsearch cluster with over 1 billion documents and is served by a Scala backend over Apache Thrift. There's a lot of caching, but still, the long tail and customization kill caching at the top level.
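To make the shape of that call concrete, here is a minimal Scala sketch (not the actual code; SearchBackend stands in for the Thrift-generated client interface, and all names are made up). It is just a TTL cache in front of the Thrift-backed search call; putting the user ID into the cache key is exactly what makes long-tail, personalized queries miss the cache and fall through to Elasticsearch.

    import scala.collection.concurrent.TrieMap

    // Stand-in for the Thrift-generated service interface (hypothetical names).
    trait SearchBackend {
      def search(query: String, userId: Long): Seq[String]  // returns document IDs
    }

    // Small TTL cache in front of the backend call.
    final class CachedSearch(backend: SearchBackend, ttlMillis: Long = 60000L) {
      private case class Entry(ids: Seq[String], storedAt: Long)
      private val cache = TrieMap.empty[String, Entry]

      def search(query: String, userId: Long): Seq[String] = {
        // Personalization goes into the key, so per-user queries rarely repeat:
        // this is the "long tail kills caching at the top level" problem.
        val key = s"$userId:$query"
        val now = System.currentTimeMillis()
        cache.get(key) match {
          case Some(e) if now - e.storedAt < ttlMillis => e.ids
          case _ =>
            val ids = backend.search(query, userId)  // Thrift RPC to the Scala/ES backend
            cache.put(key, Entry(ids, now))
            ids
        }
      }
    }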
It sounds very similar to the Rails frontend I helped replace at Twitter: no business logic was thrown away, and it was done without any loss in application fidelity. As a result we got roughly 10x fewer servers and about a tenth of the latency after porting it to the JVM. However, without the latency improvement, I don't think it would have been worth doing. Fewer servers just isn't as important as the developer time needed to make the change, as you just pointed out. In the same way, using the cloud to simplify things and reduce the effort spent on infrastructure is the main driver of adoption. There is clearly a crossover point, though, where you decide it is worth it. The CTOs you are speaking of are making that choice, and it probably isn't a silly excuse.
It's a matter of dosage. You might be talking about how another tech stack would perform better on this metric, but the price of that could be a company that is simply unable to ship anything in the time it can afford.
Swinging the conversation to the extreme dosages of either side doesn't produce interesting insight.
I'm not sure how many outages I could have avoided by using Heroku, but I guess at least a few.
One time I was using Docker for a 2 TB MongoDB instance and it messed up the iptables rules (Docker publishes container ports by inserting its own rules, which can leave a service exposed past the host firewall). I noticed everything was slow for a few days, until the database disappeared; when I logged in to check, there was a ransom note.
I flew from Boca Raton to NJ to recover the backup and audit whether that was the only breach. That was the longest outage.
Like Rome, this infrastructure was not built in a day. Adding Optane storage is more recent, for example, as is the remote KVM I added after I moved to Florida, which is easier to manage than dealing with multiple DRACs.
But I'm not against using the cloud. I'm actually very much in favor of it. What I'm against is waste.
In my case, being very conservative with my costs while still having a lot of resources available has allowed me to try, and keep trying, many different ideas in the search for product/market fit.
Are you using the Optane SSDs for Ceph, as you mentioned in your original comment? I'm curious what benefit you're seeing; if they're being used for something other than Ceph, would you mind commenting? We're looking to share best practices on how to take advantage of Optane SSDs with the community we're building over at acceleratewithoptane.com.
Anyone who thinks spending $10 instead of $1 is a good idea just so you can book the expense (presumably for a tax write-off, which might save you 30%... MAYBE?) needs to stay away from finances.
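Spelled out with the same numbers (illustrative only):

    spend $10, deduct it, save 30%  ->  net cost ≈ $10 - $3 = $7
    spend $1 with no deduction      ->  net cost  = $1

The write-off softens the blow; it doesn't turn spending into saving.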
You can typically take the full depreciation in the year you purchased the equipment under IRS Section 179, up to a limit that varies depending on which way the wind is blowing in Congress; for 2018 the limit is $1MM. Whether or not it's more beneficial for you to take the depreciation over time is a question for your accountant. Technically, if you later sell the equipment, you're supposed to recapture the revenue from the sale for tax purposes.
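A purely illustrative example (made-up numbers; the actual multi-year schedule depends on the method your accountant uses):

    $3,000 of servers at a 30% marginal rate:
      Section 179, expensed in year 1:          deduct $3,000 now    -> ~$900 of tax saved up front
      Spread over 5 years (straight-line here): deduct $600 per year -> ~$180 of tax saved each year

The total deduction is the same either way; Section 179 just front-loads it.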