I don't know how many hours the author has spent/wasted on this topic (and how many people are now wasting their time coming up with other solutions that in the end do NOT fix the real problem), but...
Quite frankly, I find this whole "look, I did a horrible hack, now let's see who can make the best worst horrible hack" thing quite stupid and silly.
GNU Make is free software released under a free license; in my opinion, instead of doing that crazy thing, the author could have just written a patch for GNU Make to make it export a "JOBS" environment variable to all its child processes.
Oh but yes, "I felt this feature was missing so I added it" is way way way less cool than "geez the gnu make folks are insane lollerplex they have no way to know how many jobs they're running".
> I don't know how many hours the author has spent/wasted
> on this topic
The author, John Graham-Cumming, has been extremely active on the GNU Make mailing list, is the author of the "GNU Make Standard Library" of supplemental functions for Make, has written at least 2 books on GNU Make, and has developed commercial products that integrate with GNU Make (namely, "Electric Make").
That is to say: He is one of the world experts on GNU Make.
I'm confident that he did this more "for fun" than because he actually wanted to get the number of jobs.
Do you know if this person hangs out on IRC? I asked this precise question 10 days ago in #emacs, and it was curious to see my question answered here, almost exactly the way I phrased it.
Ignoring the fact that I doubt that the original post was serious:
Of course you can patch the software, and in the long term that will of course be the best solution. But it can take a long time until that patch actually makes it into the main repo, and even longer until it is in the default `make` on all major distributions. So if you want to use this feature right now, writing a patch is useless for you, unless you only ever use it on your own machine.
The big advantage of the workaround is that it actually works with current versions of make.
Oh, boy. Take a chill pill, man! Is no one allowed to do anything for fun any more?
It took me about 15 minutes to come up with the solution. I wrote it up for fun because it illustrates some functionality of GNU make that people might not be aware of (order-only prerequisites, $(eval), and using $(call) in a recursive fashion).
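If you haven't bumped into those features before, here's a minimal toy sketch (nothing to do with the job-counting trick itself): a small Python wrapper that writes a throwaway Makefile using an order-only prerequisite and $(eval)/$(call), then runs GNU make on it. It assumes `make` is on your PATH; all the file and target names are made up.

    import pathlib
    import subprocess
    import tempfile

    # A throwaway Makefile exercising two of the features mentioned above:
    # an order-only prerequisite (after the '|') and $(eval)/$(call) used
    # to stamp out one rule per name in a list.
    MAKEFILE = """\
    # 'build' must exist before the copy runs, but its timestamp never
    # forces a rebuild of the target; that's what order-only means.
    build/out.txt: source.txt | build
    \tcp source.txt build/out.txt

    build:
    \tmkdir -p build

    # $(call) fills in the template, $(eval) turns the text into real rules.
    define stamp_rule
    $(1).stamp:
    \ttouch $(1).stamp
    endef

    $(foreach t,alpha beta,$(eval $(call stamp_rule,$(t))))

    .PHONY: all
    all: build/out.txt alpha.stamp beta.stamp
    """

    def main() -> None:
        # Run in a scratch directory; assumes GNU make is installed and on PATH.
        with tempfile.TemporaryDirectory() as tmp:
            workdir = pathlib.Path(tmp)
            (workdir / "Makefile").write_text(MAKEFILE)
            (workdir / "source.txt").write_text("hello\n")
            subprocess.run(["make", "all"], cwd=tmp, check=True)

    if __name__ == "__main__":
        main()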
I mean, not only were the NSA and similar organizations theoretically able to get a photo of you without your consent, they can now get an actual 3D reconstruction of your whole face.
1) you can't get new hardware delivered and get it up and running, all in under 10 minutes.
2) also, assuming the previously acquired hardware is not needed anymore, you can't just return it and say "I used it only for two days because I had a traffic spike, take this $200 and we're okay."
3) you can't programmatically install, configure, reinstall and reconfigure hardware configurations, networking and services on physical servers. At least, not as easily.
There are many others, but these are very valid points.
Of course, Amazon is not the solution to every problem you could ever have, but it still solves a great many of them.
1) OK, got me there (but SoftLayer does offer VMs too, which presumably have a faster turn-up time)
2) SL has hourly physical server rental now (turn-up is quoted at 20-30 minutes though)
3) SL has an API for ordering changes, and you can set up a script to run on first boot (and probably system images too); a rough sketch of the API style is below. What are you thinking of for network configuration? Really the only thing I've had to configure on SL is port speed (somewhat API accessible, but not if they need to drop in a 10G card/put you on a 10G rack, etc.), and disabling the private ports (API accessible, real-time changes).
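For a flavour of the API side, here's a rough sketch using the SoftLayer Python bindings (`pip install SoftLayer`). Treat it as illustrative of the SLAPI style rather than a tested recipe; it assumes credentials are already set up via ~/.softlayer or the SL_USERNAME / SL_API_KEY environment variables.

    import SoftLayer

    # Credentials come from ~/.softlayer or the SL_USERNAME / SL_API_KEY
    # environment variables.
    client = SoftLayer.create_client_from_env()

    # List the account's virtual guests (id, hostname, public IP only).
    guests = client.call(
        "Account", "getVirtualGuests",
        mask="id,hostname,primaryIpAddress",
    )
    for guest in guests:
        print(guest["id"], guest["hostname"], guest.get("primaryIpAddress"))

    # Per-guest actions work the same way; e.g. soft-reboot the first box.
    # (Port speed and private-network toggles are other Virtual_Guest calls.)
    if guests:
        client.call("Virtual_Guest", "rebootSoft", id=guests[0]["id"])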
There's not much need for a fancy article on a fancy website in order to understand a key concept of cloud computing:
Cloud computing offers you the great and awesome advantage of being able to instantly scale your application, replicate your data and basically just grow with your business volume, all without significant investment, delivery time, setup time, people time or maintenance, but it's expensive in the long run.
And this is OKAY, this is GREAT.
Once you're big enough that you know what your load is now and what it will likely be, and you know exactly what you need now and (approximately) what you're going to need in the near future, setting up your own datacenter is way, way more effective.
Amazon does not get free electricity, free servers and/or free people time.
Of course, you're paying for all of that, and you're also paying for Amazon's profit.
This is absolutely fine, as long as their service fits you.
But when you grow enough, put simply, your needs change. It's just that.
The reason it is extremely hard to engineer robust large-scale AWS cloud apps can be summarized under the umbrella of performance variance:
- machine latency varies more, and you can't control it
- network latency varies more
- storage latency varies more (S3, Redshift, etc.)
- machine outages are more frequent
where "more" can mean an order of magnitude more variation than on bare-metal deployments. I am not saying the performance is that much worse, only that it will vary unpredictably for a given instance. The interference is non-Gaussian and can happen in bursts, as opposed to easy-to-model-and-anticipate white noise.
It's a lot harder to engineer cloud-scale software to scale robustly and not degrade in latency when running on a large number of nodes. For example, see [1].
Most open-source cloud software does not come with these algorithms included, batteries and all, and it is not trivial to retrofit this kind of logic. Just being smart about load balancing won't cut it when, at any given moment, one of your nodes can become 10x slower than the others even though your code is sound and does not in fact slow down like that (a toy sketch of one mitigation follows this comment).
In fact, what you lose in AWS convenience and "free" maintenance, you gain in simpler RPC/messaging/fault-tolerance/storage infrastructure that can sometimes accommodate an order of magnitude more traffic or users per machine than if deployed in AWS.
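To make the load-balancing point concrete, here's a toy sketch of one standard mitigation, a hedged request: send the call to one replica, and if it hasn't answered within a small budget, duplicate it to a second replica and take whichever reply lands first. The replica call, names and timings here are entirely made up.

    import concurrent.futures as cf
    import random
    import time

    def call_replica(replica_id: int, request: str) -> str:
        # Stand-in for an RPC: usually fast, occasionally a 10x straggler.
        delay = 0.010 if random.random() > 0.05 else 0.100
        time.sleep(delay)
        return f"reply to {request!r} from replica {replica_id}"

    def hedged_request(request: str, replicas: list, hedge_after_s: float = 0.020) -> str:
        pool = cf.ThreadPoolExecutor(max_workers=len(replicas))
        futures = [pool.submit(call_replica, replicas[0], request)]
        done, _ = cf.wait(futures, timeout=hedge_after_s)
        if not done:
            # Primary is lagging: hedge with a backup request to a second replica.
            futures.append(pool.submit(call_replica, replicas[1], request))
            done, _ = cf.wait(futures, return_when=cf.FIRST_COMPLETED)
        result = done.pop().result()
        pool.shutdown(wait=False)  # don't block on the straggler
        return result

    if __name__ == "__main__":
        print(hedged_request("GET /item/42", replicas=[1, 2]))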
> There's not much need for a fancy article on a fancy website in order to understand a key concept of cloud computing:
I wish it were true, but plenty of companies are gripped by cloud fever. I've seen quite a few going down the route of charging into the cloud not because they've run the numbers and found it stacks up, but because they want to be in the cloud, and Amazon have some great marketing people.
It is interesting coming at it from the other side: I get hit up all the time to move my infrastructure to the cloud. I give them the specs and what I pay per month, and tell them that if they can beat it I'll switch. The closest anyone has come is about 4x the cost I'm currently paying.
But I note that I have a unique case that isn't covered by the cloud "architecture" (crawling the web and indexing it). I explain that it is "ok" if I am in the 10% not covered by their 90% solution, but sales folks never like to hear that.
Yeah, it's always fun when I tell someone that we're 100% self-hosted too. They never stop to think that traffic, load, and usage patterns on craigslist are pretty well understood at this point. Plus it's certainly cheaper than paying someone else to run all your hardware. You get to pick your peers, pick your hardware, and allocate as you see fit.
I've had many clients on AWS rack up six-figure monthly bills unnecessarily, simply because they didn't know how to design for cost. It's not a silver bullet: those penny fractions add up fast, and there are a lot of tricks to save money.
We run into this at FastMail all the time as well. For the one customer who needed "their own gear" we built out on SoftLayer for much the same reasons as the article - real hardware where you need it.
But our operations costs are so much lower than renting that we could replace all our hardware every year and still break even.
(yeah, we could get SoftLayer to refresh our hardware every year as well, but we don't need it refreshed that fast, and at the end we still own the hardware)
Oftentimes companies move because their own organization doesn't deliver, and the hope is that a cloud company will do a better job of it.
If you have a great team, I firmly believe hosting yourself is far, far, far less expensive.
If you have a terrible team, then cloud (hosting) is less expensive. Even if it were exactly the same cost, you're gaining by not having to keep a staff to run it, pay the costs of managing them, etc.
Most places don't have great teams. Insert random corporation here likely has a team that is a mess for whatever reasons happen in large companies.
In that case, the Cloud makes a ton of sense for them. They've already screwed up their own organization in some way, and this is a large reset button on the whole thing.
The cost of the systems is almost irrelevant; the time of qualified people is far more expensive. So it's not just the elasticity. You have to look at everything from the time perspective as well. What additional knowledge will I need? Will I need to learn load balancers, and how to make them highly available? Will I have to learn about SSL certificates and termination? Will I need to learn how to implement and operate a secure and highly available DNS service? How much is your time worth?
Our hosting costs are a fraction of a single worker's salary. Whether AWS is more expensive is essentially irrelevant. OP argues that it was not doing the job for them which is an entirely different issue.
Don't forget the quality of the work. Frankly, if I'm administering servers I'm not going to do as good a job of it as someone whose whole job is that and I'm not going to do as good a job as Azure or AWS or whoever else either. Platform as a service is the way to go.
Because if your sysadmins are all moonlighting developers, having them stop developing in order to do subpar systems work is more expensive than just paying the cloud premium.
From personal experience this is absolutely true. The initial cost to migrate and learn cloud practices is unbelievable, but once you have the process in place, running on Amazon CAN save some cost down the road, including the hours you need to replace hardware. You will build tools, or use existing tools, to create your infrastructure and operations process. A generous estimate of the time it takes to reach that level of maturity is 1.5 years. For an ever-growing business this cloud fever is acceptable.
You can definitely save cost by subscribing to reserved instances, but the downside is you have to put down money upfront, which is very hard for many small players out there.
But watch out if you run data-pipeline jobs: sometimes your so-called big data is really not that big. A few GBs of daily reports don't need to run on c3.xlarge instances; they can do just fine on a 24/7 m3.large instance. There was an article on HN a while ago about how one could run a custom report with shell commands on commodity hardware and get 100x the performance compared to running it on EMR. You can also consider running most of your jobs on premises; the network bandwidth in/out is probably going to be cheaper than running all of your jobs on EMR. Direct Connect is a great choice to boost connectivity stability and security. Go for it.
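As a toy illustration of the "a few GB is not big data" point: a single box can stream a multi-gigabyte file and aggregate it with nothing fancier than a dictionary, in constant memory. The file name and column layout below are made up.

    import collections
    import csv

    # Stream a few-GB CSV of events (hypothetical layout: user_id,amount)
    # and total the amount per user, one row at a time.
    def daily_report(path: str) -> dict:
        totals = collections.defaultdict(float)
        with open(path, newline="") as f:
            for row in csv.DictReader(f):
                totals[row["user_id"]] += float(row["amount"])
        return totals

    if __name__ == "__main__":
        for user, total in sorted(daily_report("events-2015-04-01.csv").items()):
            print(user, round(total, 2))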
Cloud is great for HA, because on Amazon you are encouraged to build multi-AZ and even multi-region. S3 is absolutely the de facto standard for object storage today, IMO; it's cheap and reliable. The learning curve for doing Amazon (or just about any cloud provider) properly is really steep. You can either end up like a site melting down during a Black Friday sale, or running like Netflix, with Chaos Monkey calmly enjoying its tea.
Running in the cloud is no different from running on-premises, except that you have to start all over again, because now you have to reconsider networking, security, monitoring, and practices.
"You can definitely save cost by subscribing to reserved instances, but the downside is you have to put down money upfront (...)"
This isn't required anymore.
AWS introduced new Reserved Instance options a few months ago, including a "no upfront" option which still gets you a ~40% discount over on-demand prices.
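Back-of-the-envelope version, with made-up hourly prices (check the current AWS price list before trusting the numbers; only the shape of the math matters):

    # Hypothetical prices: the point is the arithmetic, not the exact figures.
    ON_DEMAND_PER_HOUR = 0.140      # e.g. a mid-size instance, on demand
    NO_UPFRONT_RI_PER_HOUR = 0.084  # same instance, 1-year "no upfront" RI

    hours_per_year = 24 * 365
    on_demand_year = ON_DEMAND_PER_HOUR * hours_per_year
    ri_year = NO_UPFRONT_RI_PER_HOUR * hours_per_year

    print(f"on-demand:  ${on_demand_year:,.0f}/year")
    print(f"no-upfront: ${ri_year:,.0f}/year")
    print(f"discount:   {1 - ri_year / on_demand_year:.0%}")  # ~40% with these numbers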
My old company moved, after I left, from a dedicated system I built that provided less than 1s response times to a cloud system that averaged 8s response times, because the company focus changed from maximizing performance to maximizing sizzle buzzwords.
Sounds akin to the outsourcing binge that went on a decade (?!) or so back.
Back then it seemed most companies did it not because they had done the numbers, but because a few big names had done it, and so the others did it to piggyback on the stock market's attention.
In fact it is related to outsourcing. In the past, outsourced processing involved shipping drives of data overseas. The risk and schedule hit were big, and the offshore companies had crap hardware that usually cost a lot more than it cost in the US. That quieted down for a while. The new cloud stuff got these people interested in processing outsourcing again, only now the data is shipped to Google/Amazon/whoever and the outsourced employees use the cloud VMs to run the processing. I wonder how long before this mini-trend starts slowing down, when they finally figure out that management overhead and poor communication are in most cases the main problems with outsourcing.
That's because you need to use the features of your cloud.
Run lots of servers and shoot them when you don't need them (sketch below). That's why the cloud is better than anything else.
But most people just rent their xl3.large instance.
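A minimal sketch of "shoot them when you don't need them" with boto3, assuming an existing Auto Scaling group; the group name, region and capacities below are made up.

    import boto3

    autoscaling = boto3.client("autoscaling", region_name="us-east-1")

    def scale_to(desired: int, group: str = "web-asg") -> None:
        # Grow for a traffic spike, or shrink back afterwards; the ASG
        # launches or terminates the surplus instances for you.
        autoscaling.set_desired_capacity(
            AutoScalingGroupName=group,
            DesiredCapacity=desired,
            HonorCooldown=False,
        )

    if __name__ == "__main__":
        scale_to(12)  # spike
        # ... later, when the spike has passed ...
        scale_to(3)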
If only it were that simple... But every second relatively young engineer I interview points out "But Netflix! Look how well the cloud works for them!", and once again I need to explain that Netflix spends millions in engineering resources to handle EC2 issues, and we are not (almost nobody is) Netflix (yet?).
Not sure what you mean by "datacenter". If you mean using co-location or similar, it probably works out to a small price difference. If you start with an empty room or an empty piece of ground, it might cost you a lot to build everything and hire people to manage it all. Remember that the big cloud providers get a scale benefit where a lot of virtual computers share a small operations team. I do think that physical servers are usually faster than cloud computers, meaning you might be able to use fewer of them to compensate for the price difference.
I see there are already some contrarian-because-I-feel-like-it replies, but this is a really succinct way to put it. Of course there are exceptions and edge cases, but EC2 is all about elasticity. If that doesn't appeal to you, don't touch EC2. If you aren't using the "E" in EC2, there are far cheaper, more performant alternatives.
On a side note, I feel like EC2 is simultaneously the best and worst service on AWS.
Amazon is not known for posting operating profits, because they plow funds back into new research and businesses. But they most definitely operate with gross margins on everything they sell.
Update: in FY2014, they sold $89BN ($70.1BN in products, $18.9BN in services), and their cost of sales was $62.8BN. I am pretty sure that their margins are razor-thin in retail, but the service side had higher margins.
A Motley Fool article claims that while AWS is successful, it isn't profitable, mainly because of the accounting behind capital expenditure. Last year Amazon spent more in interest than their entire operating income. http://www.fool.com/investing/general/2015/02/04/amazon-just...
Profit for something like AWS is likely difficult to put a number on.
They built it so they could run Amazon.com off the infrastructure, but then decided they could make money leasing their under-subscribed portions.
If they are not making a cut-and-dried profit from AWS (I'd wager they are; they are one of the most expensive cloud providers, and many other providers turn a healthy profit), then they are at least dramatically offsetting their own Amazon.com infrastructure costs, which are mandatory anyway.
The reason they're not making profits as a company is they're spending the money on new projects.
That doesn't mean the AWS part is in itself unprofitable. I'd wager it is profitable, given that the business is somewhat mature and they have a huge chunk of that market.