
Do you have any sources or more information about the per-account S3 limits?



I don't have any published sources; it's something they told me, but it's hinted at here: http://docs.aws.amazon.com/AmazonS3/latest/dev/request-rate-...

They explicitly mention the per-account RPS limit in that doc, which is related.


RPS to S3 is limited, but throughput to S3 is not, except per bucket. Higher throughput can be achieved by sharding your data across multiple buckets. It's also important to properly namespace your keys within a bucket so that they're distributed efficiently across the underlying data partitions.
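As a rough sketch of what I mean by namespacing (the hash-prefix scheme and bucket name here are just an example, not anything from the docs):

    import hashlib
    import boto3

    s3 = boto3.client("s3")

    def sharded_key(logical_key):
        # Prepend a short hash so keys don't share one long common prefix;
        # S3 partitions by key prefix, so sequential names (dates, counters)
        # can all land on the same partition and get throttled together.
        prefix = hashlib.md5(logical_key.encode()).hexdigest()[:4]
        return prefix + "/" + logical_key

    s3.put_object(
        Bucket="my-data-bucket",  # hypothetical bucket name
        Key=sharded_key("2016/03/14/events.json.gz"),
        Body=b"...",
    )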


Unless that is a semi-recent change, that is not what I was explicitly told. To be fair, my information is at least two years old now.


My experience is based solely on recent production workloads pulling TBs of data out of S3 very quickly to restore it to a less-than-reliable indexed datastore. YMMV.


Can you quote the part where they mention a per-account RPS limit? I can't find it.


> However, if you expect a rapid increase in the request rate for a bucket to more than 300 PUT/LIST/DELETE requests per second or more than 800 GET requests per second, we recommend that you open a support case to prepare for the workload and avoid any temporary limits on your request rate.

You have to know how to read their docs. :) This is basically code for, "there is a default limit here that you have to get raised if you want to go above it".


The full quote is:

>Amazon S3 scales to support very high request rates. If your request rate grows steadily, Amazon S3 automatically partitions your buckets as needed to support higher request rates. However, if you expect a rapid increase in the request rate for a bucket to more than 300 PUT/LIST/DELETE requests per second or more than 800 GET requests per second, we recommend that you open a support case to prepare for the workload and avoid any temporary limits on your request rate. To open a support case, go to Contact Us.

So this looks like an auto-scaling issue. The doc states that S3 automatically partitions buckets to support higher request rates; however, if we know that a bucket is going to need to scale dramatically, we can ask the S3 team, in advance, to pre-scale it.
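To make the "temporary limits" concrete: until a bucket's partitions catch up, S3 can answer bursts with 503 Slow Down responses, so clients are expected to back off and retry. A minimal sketch of that (the error codes are the documented throttling codes; the helper itself is just illustrative):

    import time
    import boto3
    from botocore.exceptions import ClientError

    s3 = boto3.client("s3")

    def get_with_backoff(bucket, key, max_retries=6):
        # Retry throttled responses with exponential backoff instead of
        # hammering a partition that is still being split.
        for attempt in range(max_retries):
            try:
                return s3.get_object(Bucket=bucket, Key=key)["Body"].read()
            except ClientError as err:
                code = err.response["Error"]["Code"]
                if code not in ("SlowDown", "ServiceUnavailable", "Throttling"):
                    raise
                time.sleep(min(2 ** attempt * 0.1, 10))  # 0.1s, 0.2s, ... capped at 10s
        raise RuntimeError("still throttled after %d attempts on %s" % (max_retries, key))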

I'm sure there is an account limit, but running 1,000 CPUs already requires requesting an increase in the account's EC2 instance limit. Are you saying that a team trying to access 150 GB of files, or to make 1,000 RPS, as the article documents, will hit that limit? From your experience, how big is this hard limit? Is it Netflix scale, or is it in the GB or TB range?


We routinely pull a dataset of hundreds of GBs to 100+ instances (1,600+ cores) in parallel. We have never seen throughput degrade as the number of nodes grows; S3 consistently delivers the maximum per-instance throughput of 2-4 Gbps.
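For reference, the per-instance side of a pull like that looks roughly like the following (a sketch; the bucket, prefix, destination, and thread count are placeholders, not our actual setup):

    import os
    from concurrent.futures import ThreadPoolExecutor
    import boto3

    s3 = boto3.client("s3")
    BUCKET = "my-dataset-bucket"   # placeholder
    PREFIX = "shard-042/"          # each instance pulls its own slice of keys
    DEST = "/mnt/scratch"

    def list_keys(bucket, prefix):
        # Enumerate all objects under the prefix assigned to this instance.
        paginator = s3.get_paginator("list_objects_v2")
        for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
            for obj in page.get("Contents", []):
                yield obj["Key"]

    def fetch(key):
        local = os.path.join(DEST, os.path.basename(key))
        # download_file uses the transfer manager, which issues ranged GETs
        # for large objects.
        s3.download_file(BUCKET, key, local)
        return local

    # A few dozen concurrent GETs per instance is typically what it takes
    # to keep the NIC saturated.
    with ThreadPoolExecutor(max_workers=32) as pool:
        for path in pool.map(fetch, list_keys(BUCKET, PREFIX)):
            pass  # any download error is re-raised here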


Take into account OP's former jobs. I imagine if anyone would run into such a limit, it would be Reddit or Netflix.



