Newer, larger "snowball" appliance (current = 50TB and newer 80TB version), as well as access in additional international regions.
Network-optimized S3 uploads (faster inbound transfers) for an extra 4 cents/GB.
Sidenote: I'm curious whether you can get the same speedup for inbound transfers by putting a CloudFront distribution in front of your S3 bucket for no additional charge (neither service charges for inbound transfer).
Even though their Amazon S3 Transfer Acceleration Speed Comparison page [1] tests uploads only, this new feature is documented to accelerate downloads as well.
I guess optimized WAN routing between S3 regions and Amazon's edge locations usually beats regular internet routing.
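For anyone who wants to try it, here's a minimal sketch of enabling the feature and uploading through the accelerate endpoint, assuming boto3 (the bucket name and region below are placeholders; the console exposes the same toggle):

    import boto3
    from botocore.config import Config

    s3 = boto3.client("s3", region_name="us-east-1")

    # One-time setup: enable Transfer Acceleration on the bucket.
    s3.put_bucket_accelerate_configuration(
        Bucket="my-bucket",  # placeholder name
        AccelerateConfiguration={"Status": "Enabled"},
    )

    # A second client that routes requests through the accelerate
    # endpoint (bucketname.s3-accelerate.amazonaws.com) instead of
    # the regular regional endpoint.
    s3_accel = boto3.client(
        "s3",
        region_name="us-east-1",
        config=Config(s3={"use_accelerate_endpoint": True}),
    )
    s3_accel.upload_file("backup.tar.gz", "my-bucket", "backup.tar.gz")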
We played with this a bit using both CloudFront and other CDNs. While we did see some small improvements, it wasn't nearly as good as we were expecting on the upload side. It may be fixed now, but at the time there were also many issues getting the proper headers to pass through CloudFront to S3 for chunked uploads.
You can't upload through CloudFront. It's a good analogy though: you can think of this feature as a "CDN for uploads", if you'll excuse the abuse of the CDN acronym.
"You can use existing Amazon CloudFront distributions for upload requests by simply enabling support in the AWS management console. When end users upload content, CloudFront will send the upload request back to the origin web server (such as an Amazon S3 bucket, an Amazon EC2 instance, an Elastic Load Balancer, or your own origin server) over an optimized route that uses persistent connections, TCP/IP and network path optimizations."
I stand corrected, thanks for the info. CloudFront doesn't appear to support multi-part uploads, however, which I guess is the primary value-add of the newly announced service.
That gets less important with each passing year as connections get faster, even on mobile. I agree that without multi-part support you can't resume interrupted uploads, though.
Anyone know if the quoted cost of $0.04/GB for Transfer Acceleration is instead of the $0.03/GB for standard S3 ingress, or in addition to it? I.e., is the final cost per GB $0.07 or $0.04 when using Transfer Acceleration?
Edit: as pointed out below, there is actually no ingress cost for standard S3 uploads; the $0.03/GB is monthly storage. So this service costs $0.04/GB instead of $0.00.
> Anyone know if the quoted cost of $0.04/GB for Transfer Acceleration is instead of the $0.03/GB for standard S3 ingress, or in addition to it? Ie. is the final cost per GB $0.07 or $0.04 when using Transfer Acceleration?
S3 ingress is free. You're thinking of storage pricing.
Did anyone else get pretty terrible results from the speed tester? I was pretty optimistic after seeing their sample screenshot - am I missing something? http://i.imgur.com/UwNBGf3.png
It says on the page:
"Note: In general, the farther away you are from an Amazon S3 region, the higher the speed improvement you can expect from using Amazon S3 Transfer Acceleration. If you see similar speed results with and without the acceleration, your upload bandwidth or a system constraint might be limiting your speed."
So I suspect you might be having that problem in your test.
We have 20Gbit of Direct Connect with public peering enabled. With acceleration, our uploads are 40% slower in our primary region and 2-5% slower in the other US regions. We only see a benefit going to APAC - granted, we aren't your "typical" or "ideal" user for this.
Same here, got 11%-24% slower in most regions, and only 3% faster in US-EAST-1. Sao Paulo got 60% faster, though.
Regardless of my results, I've gotta hand it to Amazon on a really kickass speed comparison page. That's a great user experience. It immediately tells me whether I should be using this from my particular corporate network, without requiring me to waste a day or so trying it out.
An error occurred during a connection to aws.amazon.com. Peer attempted old style (potentially vulnerable) handshake. Error code: SSL_ERROR_UNSAFE_NEGOTIATION
> The bucket names must match the names of the website that you are hosting. For example, to host your example.com website on Amazon S3, you would create a bucket named example.com.
Looks like they don't want you replacing CloudFront with Amazon S3 Transfer Acceleration - which is a royal pain for some of my use cases (internal-facing websites that don't need a full CDN).
S3 Website Hosting is distinct from CloudFront; you can create a CloudFront distribution backed by a bucket with any name. You can combine the two, but it isn't required.
Looks like they're using the CloudFront infrastructure to route files to S3 via CloudFront edges. Uploads to a CloudFront edge down the street will be a lot faster than uploads straight to the S3 datacenter.
Will this allow the TCP sliding window to be tuned any further? Any idea if this could approach UDT upload speeds for large files?
Yes. CloudFront gives you far better control over things like headers, routing for different paths, error pages, etc., than vanilla S3 HTTP access does, and it's almost certainly lower latency than S3 (assuming your files aren't GB-sized).
That is not correct. You can host a website from an S3 bucket without using CloudFront. Although of course, downloads will come directly from the S3 bucket and not from a CF edge location.
Does anybody have a sense of how fast uploading to S3 can get? I'm on a gigabit link here in Singapore, uploading to the Singapore S3 region (via Arq) - and I'm disappointed that I rarely see better than about 40 Mbit/s.
Make sure you chunk it and use the S3 multi-part upload API. That way you can break the file into N chunks and upload them in parallel, getting closer to saturating your uplink. The same works for downloads, where you can use byte-range GETs to do an essentially multi-part download.
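A rough sketch of that, assuming boto3's managed transfer layer, which does the chunking and parallel part uploads for you (the file name, bucket, and tuning numbers are placeholders):

    import boto3
    from boto3.s3.transfer import TransferConfig

    config = TransferConfig(
        multipart_threshold=64 * 1024 * 1024,  # use multi-part above 64 MB
        multipart_chunksize=64 * 1024 * 1024,  # 64 MB parts
        max_concurrency=10,                    # 10 parts in flight at once
    )

    s3 = boto3.client("s3", region_name="ap-southeast-1")
    s3.upload_file("big-backup.tar", "my-bucket", "big-backup.tar", Config=config)

    # Downloads work the same way: download_file() issues parallel
    # byte-range GETs driven by the same TransferConfig knobs.
    s3.download_file("my-bucket", "big-backup.tar", "big-backup.copy", Config=config)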
Testing in my office (central US) to N. Virginia and Oregon was getting about 90 Mbit/sec. I'm guessing that's the cap on our connection, not an Amazon limit.
I presume you mean 100 MB (or just plain 100 MBytes) - so, 53 megabits/second. So a little better, but nowhere near 1 gigabit (which I presume you have, it's so cheap here - not that it seems to be useful for anything other than great speedtest results)
I use homeplugs because the router and computer are in different rooms, so I only get about ~300Mbit from my room. I've never actually tried from right next to the router.
My point was just that I get better results using CloudFront than using plain S3 or S3 multi-part upload.
"What's the fastest & cheapest way to get data off Amazon's cloud services?"
We[1] may have a good answer for you ...
Let's assume you have terabytes of data; otherwise there's no difficulty, right?
So, when you combine the HN readers' discount pricing for 10TB datasets, which is 4c/GB/mo., with the fact that we support 's3cmd' in our environment:
ssh user@rsync.net s3cmd get s3://rsync/mscdex.exe
... and the fact that we have no other charges (no transfer/usage/bandwidth charges) ...
... and the fact that we have 10gbps connectivity through he.net ...
It's possible that we would be a good fit.
You fire up a 10TB account for $400/mo and issue s3cmd commands, over SSH, on your rsync.net account, which "pulls" the data from S3 at pretty much whatever speed Amazon can throw at us.[2]
I know of three ways to get data out of Amazon's cloud; which is fastest depends on how much data you're talking about (rough cost math follows the list).
1) Download the data and pay egress bandwidth charges. Starts at $90/TB, but gets cheaper with more usage.
2) Use import/export snowball to have disks shipped to you. This is $30/TB + $250 per 80TB device.
3) Use direct connect to connect fiber directly to the region. Costs $1620/mo for each 10G line + $30/TB + an unknown amount to your fiber provider.
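If it helps, a back-of-the-envelope comparison of the three options in Python, using the prices quoted above (the Direct Connect fiber-provider cost is unknown and left out, and real egress pricing tiers down with volume, so treat the first number as an upper bound):

    import math

    def egress_cost(tb):
        return 90.0 * tb  # option 1: starting rate of $90/TB

    def snowball_cost(tb):
        return 30.0 * tb + 250.0 * math.ceil(tb / 80)  # option 2

    def direct_connect_cost(tb, months=1):
        return 1620.0 * months + 30.0 * tb  # option 3, fiber provider not included

    for tb in (1, 10, 50, 100):
        print(tb, egress_cost(tb), snowball_cost(tb), direct_connect_cost(tb))

By these numbers, Snowball starts undercutting plain egress at roughly 4TB, where $90/TB crosses $30/TB + $250.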
It's too bad that export traffic can't be marked as "low priority" with a cheaper cost. I imagine there are times at night when utilization is low and big export jobs could be run during those times. (It's obviously not in AWS's interest to make it cheap to get data out of their data centers.)
For downloading & uploading very large datasets, would it make more sense to proxy the through Amazon to a Snowball and ship it. Very possible is the answer is it depends, just trying to get a sense of how data transfer via Amazon is works and is priced.
Well, to be fair, that is not the same as "you cannot upload gzip'd content".
Your complaint seems to be "S3 doesn't automatically gunzip gzip files that I upload", which sounds like the desired behaviour to me. I.e., if I ever upload gzipped content to S3, it is because I want it served compressed over HTTP, or because I am moving a compressed backup file to S3. In neither case would gunzipping be desirable, although I do appreciate that compressing during the transfer, and decompressing on the other end, could save some bytes in transit.
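For the "serve it compressed" case, a minimal sketch assuming boto3 (bucket and file names are placeholders); setting Content-Encoding at upload time is what tells browsers to gunzip transparently on their end:

    import gzip
    import boto3

    with open("app.js", "rb") as f:
        body = gzip.compress(f.read())

    s3 = boto3.client("s3")
    s3.put_object(
        Bucket="my-bucket",
        Key="app.js",
        Body=body,
        ContentType="application/javascript",
        ContentEncoding="gzip",  # clients decompress transparently
    )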
> Yes, the other things I could have meant would have been odd, so I'm not sure why you spent time typing them out.
It seemed clear that you were either confused about which features S3 supports, or not expressing your complaint clearly. The reason for typing them out was to politely point this out, while describing some of the real-world use cases for gzip'd files in S3 (partly to highlight why it would not make sense for S3 to gunzip files automatically).
I felt this would be a more useful comment than "you are wrong, you can upload gzip'd files to S3", or just downvoting your comment for being incorrect.
"Moving lots of data either requires a huge pipe, or a ton of storage disks."
With that, they offer their Snowball device, which, if I'm understanding correctly, holds up to 50TB (now 80TB); they physically ship it to you, and then you ship it back. How does this fix either of the constraints (disk space / connection pipe)?
With Snowball, you don't need a huge pipe because your data is being sneaker-netted to one of their data centers. You don't need a ton of storage disks because they're lending Snowball to you in the short-term as a means of copying your data to S3 (and/or Glacier) for the long-term.
If you can spare a 1 gigabit connection to saturate with S3 uploads, you can send 50TB in about two weeks at typical real-world throughput (a fully saturated line would do it in under five days, but sustained WAN transfers rarely hit line rate). Snowball takes about a week end-to-end: request it, wait for it to arrive, fill it (about a day, assuming you have a 10Gbit connection to the Snowball), ship it back, and they copy the contents into AWS storage. If you don't have a spare 1 gigabit connection, Snowball's advantage is that much bigger. Even if you don't have 10Gbit hardware to fill the Snowball with, a local, dedicated 1 gigabit connection to it would be much more reliable.
A week to go from requesting a Snowball to having your data in the cloud. For example: place the request Monday morning, receive it Wednesday afternoon, immediately start filling it, ship it back Thursday afternoon; they receive it Saturday, hook it up, and your data is in your AWS storage by the end of the day Sunday.
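For reference, the rough math behind those timelines (the 35% effective-throughput figure is an assumption about real-world WAN behaviour, not a measurement):

    def days_to_transfer(terabytes, gbps, efficiency=1.0):
        bits = terabytes * 8e12  # TB -> bits, decimal units
        return bits / (gbps * 1e9 * efficiency) / 86400.0

    print(days_to_transfer(50, 1.0))        # ~4.6 days at full 1Gbit line rate
    print(days_to_transfer(50, 1.0, 0.35))  # ~13 days at ~35% effective throughput
    print(days_to_transfer(50, 10.0))       # ~11 hours: filling Snowball at 10Gbit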
To you and @nxzero: I'm not debating whether it's better or worse, I'm trying to find out if I'm actually understanding it right. From some of the other comments, it appears that it is just a matter of them sending you a physical disk - it makes sense now.
It doesn't fix the constraint of needing a ton of disks, but it solves it for you - they're loaning you a big pile of disk space to use for the transfer.
Newer, larger "snowball" appliance (current = 50TB and newer 80TB version), as well as access in additional international regions.
Network optimized S3 uploads (faster inbound transfers) for an extra 4 cents/GB.
Sidenote: I'm curious if you can get the same speedup for inbound transfers by using a Cloudfront distribution in front of your S3 bucket for no additional charge (no cost for inbound transfer on either service).