Amazon EC2 Spot Instances - And Now How Much Would You Pay? (aws.typepad.com)
101 points by jeffbarr on Dec 14, 2009 | 48 comments



It never ceases to amaze me how an online bookstore can be so ahead of all the self-professed hosting providers.

I'm excited to see what innovations come out of this. Of course, this will only become really interesting when there are other providers doing the same.


Its origins as a bookstore are incidental. Amazon started out as an online bookstore because Bezos knew the web (or web commerce, I'm not sure which) was growing at 2300% a year and wanted to get in on it in the best possible area. That area turned out to be books, because they were cheap to ship; required catalogs too large to print, but easy to browse through online; and had a large profit margin. It then expanded from there. It was and always will be about general online commerce.


It never ceases to amaze me how an online bookstore can be so ahead of all the self-professed hosting providers.

Scale. It doesn't make sense to do things like this until you have thousands and thousands of nodes -- most hosting companies simply aren't anywhere near as big as Amazon.


I disagree. When Amazon started out, their datacenter couldn't have been nearly as big as those of top hosting companies. I can't imagine it would have taken more than a handful of Google-style shipping crates to host the site.

I'd attribute their innovation more to their core values:

Ignoring competitors - probably the biggest factor. They actually sat down and thought about what people want, rather than doing what everyone else was doing. This puts them ahead of traditional hosting providers, as well as competition-oriented companies like MS.

Customer focus - having an insanely strong focus on the customer distinguishes them from equally innovative competitors like Google.

Commitment - Amazon recognized from the start that they had limited resources, and very carefully chose to enter the cloud computing market, ignoring all the other proposals. This focused them more, while also giving the public more confidence in the service. Compare Google App Engine, which isn't taken as seriously.

Other than that, their relatively small size helps. Their customers know that Jeff won't eat them for lunch at the first opportunity. And having a few billion dollars of cash in the bank doesn't hurt either.


"When Amazon started out, their datacenter couldn't have been nearly as big as those of top hosting companies. I can't imagine it would have taken more than a handful of Google-style shipping crates to host the site."

If you're talking about when EC2 first started (2006), then I think you underestimate Amazon's scale by a couple orders of magnitude. For example, it's been a long time since Amazon has had "their datacenter" rather than many large datacenters around the world. [Note: I was an Amazon employee from 2005 to 2008.]

Consider that Amazon had $10 billion in revenue in 2006 (how many pageviews does it take to end up with that many orders?) and has a lot of very complex processing under the hood (over 200 distinct services called just to render the home page; search and recommendation engines; payments; inventory control; fraud detection; etc.). Then you can start to estimate just how much infrastructure they require.


When Amazon started out, their datacenter couldn't have been nearly as big as those of top hosting companies.

Sure. But they didn't do Reserved Instances and Spot Instances when they started out.

You only need a few racks in order to make renting Xen instances by the hour feasible; but in order to make a market for Spot Instances reasonable, you need to have enough "spare" capacity to support a stable market (i.e., one where a single customer won't dramatically shift the market price) -- given the variety of instance sizes, I doubt this would work without having hundreds of racks of spare capacity.


Gotcha, you were talking specifically about spot pricing. I read the OP's comment more as asking "Why are they so consistently innovative?".


Customer focus - having an insanely strong focus on the customer distinguishes them from equally innovative competitors like Google.

I'm not so sure about that -- Google certainly claim (and intend) to be very strongly focused on the customer. Do you think Amazon are more effective at this than Google?


Both are highly data-driven and responsive to customer metrics. But Amazon sees personalized customer support as a core value that they provide in-house. Google sees support as a pure cost, to be eliminated through automation.

(That's not to say that Amazon is always great at support, but I do feel that there's a different attitude about it, having worked at Amazon and having many friends working at Google.)


Because Bezos never thought of Amazon as (only) an online bookstore. And Amazon has not been just an online bookstore for a very long time.


Genius.

This maximizes their hardware investment and allows a whole new class of applications which need large amounts of cheap computing power but are not fussy about when the work gets done.

GENIUS. Revolutionary beyond my capacity to visualize. This is going to change the way the cloud computing space works.

The really interesting part is that physical location is one of the price parameters: picture the spinning globe, with real-time applications constantly migrating to instances where it's daytime for maximum performance, and asynchronous applications hugging the dark side where computing is cheap.

As a friend of mine said, this is to cloud computing as AdSense was to online advertising: i.e. revolutionary, and an absolute cash fountain.


What's interesting about this to me is that when combined with the currently free inbound bandwidth, you could have a pretty damn cheap crawler.


I was about to post a puzzled disagreement, but they did waive inbound bandwidth charges a week ago. http://aws.typepad.com/aws/2009/12/aws-price-reductions.html


Now if they would get their outbound charges in line with what you pay at 'regular' hosting providers it would actually start to make sense. Currently if you shop around a bit you can pay anywhere from $.80 to $1.00 for a dedicated megabit, including the machine.

Amazon doesn't come close to that.


Currently if you shop around a bit you can pay anywhere from $.80 to $1.00 for a dedicated megabit, including the machine.

You're clearly a much better shopper than I am. Where are you seeing these rates?


I'm not going to advertise my hosting provider here, but I'll mail you.

j.

Mail's in your inbox. Oh, and my bad, it should have been euros, not $ (force of habit, sorry).


I'd be interested in the same too :-) email is in my profile... Thanks


Um, genius? You're suffering from hyperbolism.

It's technically impressive what they've done with their infrastructure, from both an engineering and a business/management perspective. But the idea is just time-shared computing, which has been reimplemented a few times since the first companies offered compute time on their mainframes.


I wrote a list of jobs I could perhaps pipe to this method of instance procurement, but the fact that the process could be terminated at any moment really messes things up.

I'm not even talking about "high-availability", but the cessation of a key node in a dependency tree of instances messes stuff up (at least for my problem space).

What is missing here is a "5 minute warning" API call before shutdown that would let me engineer in a) the opportunity to save any long-duration calculation to disk and b) have my software make a decision to instantiate a full-price instance if my queue is too long and having no cheap-o priced instances running is going to mess things up.
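
Roughly the kind of handler I have in mind, sketched in Python; the warning hook and both helpers are hypothetical placeholders, not real EC2 API calls:

    # Hypothetical sketch of what a "5 minute warning" hook could enable.
    # Nothing here is a real EC2 API; the helpers are placeholders.
    import pickle

    def launch_on_demand_instance():
        pass  # placeholder: start a full-price EC2 instance here

    def requeue_remaining_work(job_state):
        pass  # placeholder: push unfinished work back onto the queue

    def on_termination_warning(job_state, queue_length):
        # (a) checkpoint the long-duration calculation to disk
        with open('/mnt/checkpoint.pkl', 'wb') as f:
            pickle.dump(job_state, f)
        # (b) decide whether to fall back to a full-price instance
        if queue_length > 100:   # arbitrary backlog threshold
            launch_on_demand_instance()
        else:
            requeue_remaining_work(job_state)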

Thoughts?


Some sort of warning system sounds nice, but since your program should be saving as often as it can anyway, a 5-minute warning wouldn't actually be useful: if there was a save point, you've already saved, and if there wasn't, well, you couldn't have saved anything useful given a 5-minute warning anyway. Amazon also, of course, pushes all their other products to help deal with this problem.

They do also mention that it is 'by the hour' (I'd double-check whether that includes fractional hours), so after the instance has been up for ~55 minutes, you effectively have your 5-minute warning.

You do, however, have a really good point with the up-sell off the 5-minute warning. I wonder if that could be too easily gamed. When that 5-minute warning comes, the spot instances tell the normal-instance key node, which then requests a bunch of spot instances at 2x the current spot-instance price to be able to finish the current 'set'. (But then also request a bunch more spot instances at below the current spot price, in an attempt to drive down spot instance pricing.)

The other thing is that it is region-specific, which means that you could target the EU region from the US west coast to get work done on spot instances during the day.


after the instance has been up for ~55 minutes, you have your 5 minute warning.

Nope; Amazon can terminate spot instances at any time. Thought experiment: Assume EC2 is full (which it will be from now on) and someone requests a regular instance. To create the instance promptly, EC2 will have to shut down a spot instance.

(But then also request a bunch more spot instances at below the current spot price, in an attempt to drive down spot instance pricing.)

That's not how auctions work. The number of bids below the winning price is irrelevant.


a 5 minute warning wouldn't actually be useful because if there was a save point, you've saved,

Not with cloud storage, where disk I/O costs money. You really only want to save at the end of a task rather than during it. Or you save the results off-cloud in a backing store hosted elsewhere.

(all of which assumes you are working at scale, where the cumulative costs become prohibitive)


Run the key nodes on ordinary EC2 nodes, and only use Spot nodes for things that you can cope with disappearing, i.e. things that can be frequently checkpointed and won't disturb other nodes when they disappear.


b) have my software make a decision to instantiate a full-price instance if my queue is too long and having no cheap-o priced instances running is going to mess things up.

If having your instance die is going to cause problems, why not just bid higher in the first place? You're only going to pay the spot price, regardless of what you bid.
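
To make that concrete, a toy illustration of the billing rule (no AWS API involved, just the arithmetic):

    # Toy illustration of spot billing: you're charged the market
    # price, not your bid; the bid only sets your termination ceiling.
    my_bid = 0.40        # max $/hour I'm willing to pay
    spot_price = 0.12    # current market price, $/hour
    still_running = spot_price <= my_bid
    hourly_charge = spot_price if still_running else 0.0
    print(still_running, hourly_charge)   # True 0.12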


That's true.

But thinking that out... if Amazon allowed everyone to do that programmatically via an API warning, then I guess everyone would simply build their software to bid up 1c at the time of warning... and at scale, everyone doing that would simply inflate the spot price higher and higher to the point where it would approach the full cost.


That's assuming people need to keep things running at all times, whatever the cost. This is for those people who want to do their calculations only when CPU is cheap.


No, because this could become a hack... using the mechanism I describe above, I could keep instances going at below market rate.

I can see some people gaming the system in this way if the system allows it.


The instance always runs at market rate. You set a _maximum_ price but are charged less than that if the market rate is lower. So there's no sense in building such a reactive system when Amazon does it for you.


have my software make a decision to instantiate a full-price instance if my queue is too long and having no cheap-o priced instances running is going to mess things up.

Maybe auto scaling can do this for you; if it detects that the group has no instances it will start a new (full-priced) one.
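
Something like this sketch with boto's autoscale module might do it; the class and parameter names here are from memory, so treat them as assumptions and check the docs:

    # Sketch, not tested: keep at least one full-priced instance alive
    # via Auto Scaling, so the key node gets replaced if it disappears.
    # API names are from boto's autoscale module as I remember them.
    import boto
    from boto.ec2.autoscale import LaunchConfiguration, AutoScalingGroup

    conn = boto.connect_autoscale()

    lc = LaunchConfiguration(name='key-node-lc',
                             image_id='ami-12345678',   # your AMI
                             instance_type='m1.small')
    conn.create_launch_configuration(lc)

    group = AutoScalingGroup(name='key-node-group',
                             availability_zones=['us-east-1a'],
                             launch_config=lc,
                             min_size=1, max_size=1)
    conn.create_auto_scaling_group(group)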


"The simplest way to know the current status of your Spot Instances is to monitor your Spot requests and running instances via the AWS Management Console or Amazon EC2 API."

Hmm. Since this new feature can actively terminate instances, it makes a problem I have with EC2 even more of an issue: I'd like a way to query with an instance ID and find out exactly when it terminated and how much bandwidth it used (something like a "usage/accounting" API). It would help greatly to know your exact financial situation with Amazon. A notification system (register a URL to get a POST when an instance dies or its bandwidth use exceeds a certain threshold, etc.) would also be useful.

If a user wants very accurate information about these things, they currently need to set up an HA environment offsite to poll (which may not even be enough if there are network issues). That has always frustrated me, when the information required ("how much have I spent?") is right there in the database after the instance is done.
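
For what it's worth, the offsite polling I mean looks roughly like this (a boto sketch with a made-up instance ID); it narrows down when an instance went away, but still tells you nothing about bandwidth or charges:

    # Sketch: poll instance state from outside EC2 with boto.
    # This only approximates the termination time.
    import time
    import boto

    conn = boto.connect_ec2()          # uses credentials from the env
    instance_id = 'i-12345678'         # hypothetical instance ID

    while True:
        reservations = conn.get_all_instances([instance_id])
        if not reservations:
            print('%s is no longer listed at all' % instance_id)
            break
        instance = reservations[0].instances[0]
        if instance.state == 'terminated':
            print('%s terminated around %s' % (instance_id, time.ctime()))
            break
        time.sleep(60)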


They may have just added that feature for spot instances; the command is called ec2-create-spot-datafeed-subscription.

This is the description: "Creates the data feed for Spot Instances, enabling you to view Spot Instance usage logs. You can create one data feed per account."

I haven't tried it yet, but if anyone has, can they post their impressions?


It's uploading the same log file every 30 seconds to my S3 bucket, so I suspect they're still working out the kinks :)

That said, the documentation on the datafeed stuff is annoyingly sparse - the format is reasonably self-documenting, but I wish they'd commit to it somewhere.


This is the moment of commoditisation of computing. I reckon there will be price fixings at commodities exchanges like the Chicago CBOT in the foreseeable future, and standard node classes provided by various vendors.

What will be the consequences? Probably quite some pressure on margins and on quality, the only variables left to differentiate the product. Once there are other vendors, of course.


This is the moment of commoditisation of computing.

I'd call it the next moment in the ongoing commoditization of computing. But everybody has a different threshold of perception. ;)


At this point, I'd call it the just passed moment, not the next moment -- but maybe we can compromise and agree that this is a moment in the commoditization of computing. :-)


Instead of outright termination, they should give the instance creator a choice when the spot instance is first started:

1. Terminate outright when the time comes.

2. Give the instance a 5 minute warning (for a fee).

3. Keep the instance running but switch it to the standard (non-spot) per-hour pricing of a typical EC2 instance.

Much more friendly and option 3 will give them greater revenue.


> Much more friendly and option 3 will give them greater revenue.

Or perhaps not: if this were the option offered, no one would pay for standard EC2 instances at all; they'd just put up spot reservations and take the lower price when it's available, paying the ordinary price when it's not.

They can do that now, of course, but there's significantly more complexity involved.


I'm no economist, but my suspicion is that the auto-termination is a key factor in maintaining a separate market value for this class of instances.


They need to make it so that when you do an elastic mapreduce job, you can get the cheapest job that will finish in a specified amount of time.


... the cheapest job that will finish in a specified amount of time.

And how exactly would you determine that? We're talking about a spot market here -- and a large part of the supply comes from Reserved Instances which were purchased for disaster recovery purposes and aren't being used.

If a disaster strikes, the spot price will go up in a hurry.


The pricing has to be cyclical on a daily/weekly basis, right? Plus, the algorithm could be opportunistic and take advantage of temporary dips in price/utilization.
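
Something like this sketch, perhaps, which only bids when the recent spot price dips below a threshold; it assumes a boto build with spot support, and the call names are from memory:

    # Sketch: opportunistic bidding on price dips. Assumes a boto
    # version with spot support; verify the call names against the docs.
    import boto

    conn = boto.connect_ec2()
    THRESHOLD = 0.05   # $/hour we're willing to pay

    history = conn.get_spot_price_history(instance_type='m1.small',
                                          product_description='Linux/UNIX')
    latest = max(history, key=lambda h: h.timestamp).price if history else None

    if latest is not None and latest < THRESHOLD:
        # bid the threshold itself so brief wiggles don't kill us
        conn.request_spot_instances(price=str(THRESHOLD),
                                    image_id='ami-12345678',  # your AMI
                                    count=10,
                                    instance_type='m1.small')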


Why would someone buy Reserved Instances for disaster recovery purposes? Why pay for them ahead of time if they aren't being used?


Because you want to make sure they're actually available when you need them, and that the rest of your primary provider's customers haven't thought the same thing and filled EC2 to capacity before you get in there.

I know I wouldn't want to be the one to have to tell my boss that we couldn't get our DR setup running because we didn't have any contract in place that ensured capacity was available.


Mainly so that there'll be no delay in capacity when allocating the instance.

You break even on the reservation fee if it's on half of the time.


From an architectural point of view, because EC2 will terminate instances whose bid price becomes lower than the Spot Price, you'll want to regularly checkpoint work in progress

Are they saying that they will kill your instances or kill the bids for instances??


Kill the instance itself. It recommends you back up your results frequently to, say, Amazon SimpleDB :-)
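
A minimal sketch of that with boto's SimpleDB bindings (domain and item names here are made up; SimpleDB attribute values are capped at 1024 bytes, so this only suits small checkpoints):

    # Sketch: periodic checkpointing of small intermediate results
    # to SimpleDB with boto. Anything bigger belongs in S3 instead.
    import boto

    sdb = boto.connect_sdb()
    domain = sdb.create_domain('spot-checkpoints')   # idempotent

    def checkpoint(job_id, step, partial_result):
        domain.put_attributes(job_id, {'step': str(step),
                                       'result': str(partial_result)})

    def resume(job_id):
        attrs = domain.get_attributes(job_id)
        return int(attrs['step']) if attrs else 0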


Glad to see companies finally getting on board with this kind of pricing model and system, which has been around for years (econ 101).



