Big Just Got Bigger - 5 Terabyte Object Support in Amazon S3 (allthingsdistributed.com)
61 points by werner on Dec 10, 2010 | 31 comments



It is hard for me to even imagine what the world was like before Amazon Web Services, despite having only used them for half of a year. I truly hope Amazon is making ungodly sums of money with this increasingly powerful platform. Why Google is laser-focused on social and local, yet letting Amazon eat their lunch in this space is beyond me. I know Google has AppEngine and Storage, but they seem to be lacking the full-featured, cohesive collection of cloud services that Amazon has so clearly invested themselves in. I truly wonder if we will wake up five years from now and the vast majority of the web will be running on Amazon infrastructure.


I think Google considers their infrastructure and expertise with web-scale data processing to be a durable source of competitive advantage. You need to be substantially as good as Google to do web-scale crawling and analysis to build a general purpose search engine. Google makes billions upon billions of dollars from having that available. Are they going to make billions upon billions of dollars with a huge competitive moat and 30% gross margins by publishing their competitive advantage as OSS and getting into hosting? Probably not.

Which is not to say Google isn't a friend of OSS and transparency -- indeed, they are the world's most reliable fans of OSSing the competitive advantages of companies which are not Google.


> "Are they going to make billions upon billions of dollars with a huge competitive moat and 30% gross margins by publishing their competitive advantage as OSS and getting into hosting? Probably not."

But are they going to make billions upon billions of dollars when their competitors have instant access to a service that levels the playing field?

[edit] Okay, not quite level, but even with Amazon collecting their margin, it dramatically closes the gap in highly scaled computing, to the point where if I were Google I would not rely on cost advantage as a long-term strategy.

Competing on price is rarely a game you want to get into, it tends to turn into a race to the bottom more often than not.


What does OSS have to do with this? Amazon isn't releasing the source for their infrastructure services.


No. It's much more cost effective to run your own infrastructure.

It's always struck me as irresponsible for reddit to spend the sums they do, for example. You could easily spend half that and get four times the hardware (in addition to an admin to manage it).

Some businesses just don't want to deal with it, period. I get that. But at some point it's almost irresponsible. Like sending an employee to Starbucks for everyone in the office a few times a day because you don't want the hassle of setting up with a coffee service. ;-)


If it really was more cost effective to run your own infrastructure, then I wonder why Netflix is going out of their way to migrate all of their infrastructure to Level3 and AWS. They're even discussing the details at the last cloud computing meetup: http://www.meetup.com/cloudcomputing/calendar/14476942/

[Edit: Slides for Adrian's Talk on Netflix->AWS Here: http://www.slideshare.net/adrianco/netflix-on-cloud-combined...

Video of his talk here: http://blip.tv/file/4252897 ]

Unless you are ruthlessly disciplined (Craigslist), it rarely makes sense to roll out your own infrastructure. When you start to sustain IT Systems in a corporation, the real costs are not only what's visible, but also the cost of managing, hiring, budgeting, change control, procurement, etc...

Plus - you get to ride on all the scale discounts of your cloud hosting company. And, huge win - you don't have to choose the "Safe" technology solutions (Cisco + HP + EMC, or Juniper + Dell + Hitachi). Instead you can let your cloud hosting company go spend a fortune on vetting an inexpensive, innovative, but unproven technology, and then ride on their investment in that vetting process.


I disagree. Amazon's cloud is great for getting started, no doubt about that. And yes, they take care of all the administration. But the hardware and bandwidth costs are absurdly expensive. Probably somewhere between 2x - 10x as much. Netflix is outsourcing it because they are rich and can afford the scalability that Amazon offers.


The mantra at every large company I've ever been at once we get past the initial "Get it done" phase, is, "Do it cheaper, cheaper, cheaper."

I think the thing that everyone overlooks when they price AWS out is that they are looking at the cost of the bare metal. They aren't pricing their Network Engineers, their DBAs, their Storage SysAdmin, their Data Center Ops Manager, or the finance overhead required to buy/track all this gear.

Forget about the fact that no company over 200 employees I've ever been involved in could add 30-40 servers in less than 30 days.

The reasons to not go to AWS have to do with things like control, security, customization. Cost is the #2 reason everyone I've talked to is looking at moving their infrastructure over to AWS. (Ability to quickly scale is usually #1)
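
To make the bare-metal-versus-everything-else point concrete, here is a minimal Python sketch. Every number and name in it (`SelfHostedCosts`, the salary total, the 10% overhead rate) is a made-up assumption for illustration, not anyone's real budget.

```python
from dataclasses import dataclass


@dataclass
class SelfHostedCosts:
    """Hypothetical monthly cost model for running your own infrastructure."""
    hardware_monthly: float   # amortized servers, cabinets, power (assumed)
    bandwidth_monthly: float  # assumed
    staff_annual: float       # combined salaries for the ops roles (assumed)
    overhead_rate: float      # finance/procurement overhead as a fraction of hard costs (assumed)

    def bare_metal_monthly(self) -> float:
        # What a quick price comparison usually looks at.
        return self.hardware_monthly + self.bandwidth_monthly

    def full_monthly(self) -> float:
        # Hard costs plus overhead, plus one month of staff salaries.
        hard = self.bare_metal_monthly()
        return hard * (1 + self.overhead_rate) + self.staff_annual / 12


costs = SelfHostedCosts(
    hardware_monthly=10_000,
    bandwidth_monthly=2_000,
    staff_annual=240_000,
    overhead_rate=0.10,
)
print(f"bare metal only: ${costs.bare_metal_monthly():,.0f}/month")
print(f"with people and overhead: ${costs.full_monthly():,.0f}/month")
```

With these invented inputs the staffing line is bigger than the hardware line, which is the part that tends to get left out of quick comparisons.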


"Netflix is outsourcing it because they are rich and can afford the scalability that Amazon offers." - You honestly think that Netflix is getting an economy of scale from AWS? I think it works the opposite way. Perhaps they took the time to actually examine the hidden costs instead of using a calculator for 15 minutes on Tiger Direct.


As someone else stated, AWS doesn't provide you systems administrators. Storage administrators sure, but that's not difficult to absorb.

So taking that out of the equation, you're left with hardware and bandwidth. I imagine Netflix didn't get the same off-the-shelf price you or I would, so speculating on what their TCO is and how it would compare to a local build-out is probably fruitless.

For a small business, you can keep your hard costs (hardware/cabinets/power/bandwidth) under $10K/month and get a terabyte of memory in a pair of Dell R910s, a triply redundant 44TB SAN, a couple beefy Dell R610s with 96GB of RAM for database servers, Flash for the table-spaces and SAN read/write caches.

You can't come close to that kind of power at AWS pricing.

Sure there may be cases where AWS makes sense at scale, but there are many more where it simply doesn't. Look at Basecamp. There's a reason they're switching (switched?) to a local SAN vs S3. It's insanely expensive to store anything more than tens of gigabytes in the cloud if you need DAS-like performance.

When it comes to bullet-proofing a system, automate the hell out of it, and skip the service contracts. Buy redundancies.

Oh, and stay far away from Oracle/Sun. ;-)


For $1, I can spin up a complete clone of my production environment in less than 10 minutes.

I can modify my chef recipes to do postgresql database failover and test it on the new servers.

Once I'm confident that everything works, I can destroy the new servers and move that configuration over to the production servers.

If I need 5 TB more hard drive space, it takes less than 10 minutes to get it.

That's the value of EC2 to me. I don't really care if it costs $5,000 more a year (or whatever), my business can support that additional cost fine.


This is what amazes me too. AWS is so much more expensive, even when factoring in TCO. It's good for very fast scaling and experimentation, though.

I did some very rough calculations in a blog post some time ago

http://codemonkeyism.com/dark-side-virtualized-servers-cloud...

and AWS seems to be at least 2x more expensive.


EC2 aside, my own calculation at least shows that Amazon S3 is much more cost-effective than hosting on your own machine.


Care to share? I would be interested, as my calculations seem to show at least twice the cost for AWS.


Do your calculations include 99.999999999% durability by storing copies in three separate facilities?
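
For a sense of scale, here is a small Python sketch of what eleven nines works out to. It assumes the figure means a per-object loss probability of 1 in 10^11 per year and that losses are independent; the function name and the object count are hypothetical.

```python
from fractions import Fraction


def expected_years_per_lost_object(n_objects: int, nines: int = 11) -> Fraction:
    """Average years between single-object losses under the stated assumptions."""
    # Exact arithmetic avoids float rounding in 1 - 0.99999999999.
    annual_loss_probability = Fraction(1, 10 ** nines)
    expected_losses_per_year = n_objects * annual_loss_probability
    return 1 / expected_losses_per_year


years = expected_years_per_lost_object(10_000)
print(f"{int(years):,} years per lost object")
```

Under those assumptions, 10,000 stored objects lose one object about every 10 million years on average. A self-hosted comparison would need its own estimate for the same quantity.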


I understand you are right in the context of today. A few years from now, however, I would imagine that gains in the efficiency of the business model as well as competitive pressures will drive prices down below the costs of running your own infrastructure.


It depends on what you need to do.

- If you use Amazon for fail-over purposes then their cost cannot be beat since you are effectively only paying them for the few hours a year that your system is down (if at all). Same story if you can just use them to shave peaks off your load a few hours a week.

- On the other hand: if your base load is similar to what you could run on a 1U rack server then it's obviously much cheaper to run that yourself rather than using AWS
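
Both cases above come down to a break-even point in hours used per month. A rough Python sketch, where the hourly rate and the fixed monthly cost are invented numbers and the function names are hypothetical:

```python
def break_even_hours(fixed_monthly: float, hourly_rate: float) -> float:
    """Hours per month at which pay-per-hour cost equals the fixed cost."""
    if hourly_rate <= 0:
        raise ValueError("hourly_rate must be positive")
    return fixed_monthly / hourly_rate


def cheaper_option(hours_used: float, fixed_monthly: float, hourly_rate: float) -> str:
    """Compare paying by the hour against a fixed monthly cost."""
    on_demand_cost = hours_used * hourly_rate
    return "on-demand" if on_demand_cost < fixed_monthly else "self-hosted"


FIXED_MONTHLY = 200.0  # assumed cost of owning and hosting one 1U server
HOURLY_RATE = 0.50     # assumed on-demand price for a comparable instance

print(break_even_hours(FIXED_MONTHLY, HOURLY_RATE))    # break-even hours per month
print(cheaper_option(10, FIXED_MONTHLY, HOURLY_RATE))  # fail-over or peak shaving
print(cheaper_option(720, FIXED_MONTHLY, HOURLY_RATE)) # always-on base load
```

This ignores staff time and data transfer, so treat it as a first-pass filter only.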


That first bit only works if you're already backing up to AWS. Moving terabytes of data into the cloud isn't quick, and spinning up servers quickly isn't something that's going to help if your primary site goes down for many (most?) types of applications.

The bottleneck isn't the provisioning speed. I can run a few shell scripts and have a new local VM in a couple of minutes. Less if I cared to. That doesn't get media or databases from here to there though.


Indeed. While I would love to migrate to AWS, it is still vastly more expensive than our current power/bandwidth/cage space costs.



Depends on your business. If you need the ability to scale up rapidly, don't have much operational expertise or have high availability requirements, AWS is probably the cheapest option.


What would you estimate Reddit's costs are now? What would their costs be if they had a full-time (decent) admin and their own hardware in a data centre?


Actually you probably should add in a team of admins unless you assume a single person is capable of 24/7 service and handle the possibility of multiple large-scale issues simultaneously. Also factor in expertise (not just familiarity) in all areas of infrastructure.


You need the admins either way; AWS does not help with that. In my experience, most problems are with the application (Database, Memory, Locking, ...) not with the infrastructure (Load balancer, Linux, ...).

[Edit]: The problems Reddit, Digg, Foursquare had were problems with application code and application infrastructure (MongoDB, Cassandra, RabbitMQ, Redis, ...). AWS admins would not - and have not - helped with that. So you need your own admins, AWS does not spare you those.


"But I just want to serve 5 Terabytes!"

I like that they increased the limit, not 5x, not 100x, but 1000x. It clearly sends the message that they are the leaders in cloud storage.


I can't believe this hasn't received more comments. I find this absolutely astonishing and promising. The future is mighty bright.


How long before the AWS division is spun off as its own independent company?


Technically AWS is a separate company: Amazon Web Services LLC. Presumably Amazon.com is the only shareholder, though.

I doubt AWS will go public, though -- it's too central to Amazon's existence. Amazon has never really been about books.


I guess we have to thank Netflix for this? :D


Netflix Cloud Architect Adrian Cockcroft explains: http://blip.tv/file/4252897

slides here: http://www.slideshare.net/adrianco/netflix-on-cloud-combined...


great. unless it's Wikileaks documents...



