The author seems to consider potentially losing 1 out of every 10,000 objects in a year unacceptable. It's true that this isn't the reliability I would want for critical files, but many things are non-critical. For example, YouTube stores multiple copies of every file: the original, 360p, 480p, possibly 720p and 1080p, possibly in multiple codecs. One copy is "critical"; the others are nice to have. On a site like that, you can treat the original and the 360p as critical, so you can always serve the video and you always have the source. If the 720p version disappears, you just regenerate it, and people can really wait an hour for that, especially since the video still works in the meantime, albeit at 360p.
Likewise, think about the thumbnails that Facebook stores of images. If one of them disappears, it can be regenerated. There's no reason to store so many copies.
Amazon's new service level shouldn't be used for data you want to keep absolutely safe. It should be used for data that you can regenerate if it disappears since it will save you money. If you have to re-transcode 1 out of every 10,000 videos a year (0.01%) while saving 50% on storage costs, you're in a good place.
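Here's a minimal sketch of that regenerate-on-miss pattern in Python, using today's boto3 API; the bucket names and the transcode() helper are hypothetical stand-ins:

    import boto3
    from botocore.exceptions import ClientError

    s3 = boto3.client("s3")

    CRITICAL_BUCKET = "videos-standard"  # original + 360p on standard storage (hypothetical name)
    DERIVED_BUCKET = "videos-rrs"        # higher renditions on RRS (hypothetical name)

    def transcode(original: bytes, quality: str) -> bytes:
        # Stand-in for a real transcoder such as ffmpeg.
        raise NotImplementedError

    def get_rendition(video_id: str, quality: str) -> bytes:
        """Serve a derived rendition; rebuild it from the original if RRS lost it."""
        key = f"{video_id}/{quality}.mp4"
        try:
            return s3.get_object(Bucket=DERIVED_BUCKET, Key=key)["Body"].read()
        except ClientError as err:
            if err.response["Error"]["Code"] != "NoSuchKey":
                raise
            # The cheap copy is gone: re-encode from the safely stored original.
            src = s3.get_object(Bucket=CRITICAL_BUCKET,
                                Key=f"{video_id}/original.mp4")["Body"].read()
            rebuilt = transcode(src, quality)
            s3.put_object(Bucket=DERIVED_BUCKET, Key=key, Body=rebuilt,
                          StorageClass="REDUCED_REDUNDANCY")
            return rebuilt

While the rebuild runs you serve the 360p copy, so a lost rendition costs you a transcode job, not a broken video.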
To us at Nasuni (I'm an employee), losing objects in the cloud is unacceptable, and the blog post reinforces that. The Nasuni Filer (our product) is a local-to-the-site, cloud-backed filesystem, and losing objects in a filesystem is simply unacceptable.
Therefore, using RRS doesn't make sense for our current and potential customers, who expect that our filesystem, and the snapshots of that filesystem reaching back in time, will not lose data. Sure, you save some money using it, but what happens (as the post points out) when you lose a root entry for your filesystem?
RRS is interesting in the way that buying a really cheap hard drive is: you know you're paying for reduced reliability, and if that's an acceptable risk for you, fine. For us, right now, it's an unacceptable risk to our customers.
So, we agree with you - Amazon's new service level shouldn't be used for data you want to keep absolutely safe, and keeping data safe is our job as a company and product.
I still find even RRS pretty expensive for my backup needs. It probably still has more redundancy, and faster I/O, than I need. I run a family server to back up media and documents. The data there is already a consolidated, versioned backup of other locations, stored on a RAID 5 array. With these two levels of redundancy already in place, all I want is cheap per-GB online storage I can rsync to.
With only the photo/video/music collections of 4 people, I've filled 1TB very fast and will have to swap in bigger drives shortly. Backing up even 1TB to RRS costs about $1,200 a year. With that money I could just buy a second server, put it on an existing commercial DSL connection somewhere else, and run a daily rsync.
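The arithmetic behind that figure, assuming RRS's roughly $0.10 per GB-month price at the time (storage only; transfer and request fees ignored):

    # Back-of-envelope storage cost for 1TB on RRS.
    gb = 1000                 # 1TB, decimal
    rate = 0.10               # assumed RRS price, USD per GB-month
    annual = gb * rate * 12
    print(f"${annual:,.0f}/year")   # -> $1,200/year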
Does anyone know of a solution like tarsnap/rsync.net that's an order of magnitude cheaper? Most backup services require proprietary clients that I hesitate to rely on, and I suspect their "unlimited storage" claims are just like "unlimited bandwidth" claims.
99.99% durability does not mean "1 out of 10,000 objects WILL drop"; it means "you can _legally_ expect 9,999 out of 10,000 objects not to be dropped, so don't come back suing us if 1 out of 10,000 objects goes missing."
It's basically a legal clause in the SLA, to protect Amazon from a flood of lawsuits.
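To make the distinction concrete: at 99.99% annual durability, per-object loss is an expectation, not a schedule. A quick sketch, assuming independent per-object losses:

    # What 99.99% annual durability implies for a collection of objects.
    loss_rate = 1e-4    # 1 - 0.9999

    for n in (1, 10_000, 1_000_000):
        expected = n * loss_rate            # average objects lost per year
        p_any = 1 - (1 - loss_rate) ** n    # chance of losing at least one
        print(f"{n:>9} objects: expect {expected:g} lost/year, "
              f"P(any loss) = {p_any:.2%}")

With 10,000 objects you expect one loss per year on average, but there's still roughly a 37% chance you lose nothing at all.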