It's not exactly what you're asking for, but we have a large bucket with billions of files (don't ever do this, it was a terrible idea) and we manage deletions via lifecycle rules. If your file naming convention and data retention policy permit it, it's far easier than calling delete with 1,000 keys at a time.
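To make that concrete, here's roughly the kind of expiration rule I mean, as a boto3 sketch (the bucket name, prefix, and retention period are placeholders, not our actual setup):

    import boto3

    s3 = boto3.client("s3")

    # Expire everything under a prefix after N days instead of issuing
    # DeleteObjects calls 1,000 keys at a time. Names and values are examples.
    s3.put_bucket_lifecycle_configuration(
        Bucket="my-huge-bucket",
        LifecycleConfiguration={
            "Rules": [
                {
                    "ID": "expire-raw-data",
                    "Filter": {"Prefix": "raw/"},
                    "Status": "Enabled",
                    "Expiration": {"Days": 30},
                }
            ]
        },
    )

S3 then does the deletes for you in the background, and as far as I know expiration (unlike the Glacier transition below) doesn't carry a per-object request charge.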
Also just a word of warning, if you do have a lot of files, and you're thinking "let's transition them to glacier", don't do it. The transfer cost from S3->Glacier is absolutely insane ($0.05 per 1,000 objects). I managed to generate $11k worth of charges doing a "small" test of 218M files and a lifecycle policy. Only use glacier for large individual files.
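To spell out the math: transitions are billed per lifecycle request, i.e. per object, so the object count is all that matters. Rough numbers, not an exact bill:

    # Glacier transition requests were ~$0.05 per 1,000 objects at the time.
    objects = 218_000_000
    cost = objects / 1_000 * 0.05
    print(f"${cost:,.0f}")  # ~$10,900 -- the ~$11k mentioned above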
I have to ask: what’s performance like for operations on the bucket objects?
Edit: I ask because AWS suggests a key naming convention when you have a large number of objects, to make sure you're distributing objects across storage nodes and to prevent bottlenecks.
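For context, the convention I mean boils down to prepending a short random/hashed prefix to your keys, roughly like this (just a sketch; the hash length and key layout are illustrative):

    import hashlib

    def partitioned_key(key: str, prefix_len: int = 4) -> str:
        """Prepend a short hash so sequential key names spread across partitions."""
        digest = hashlib.md5(key.encode()).hexdigest()[:prefix_len]
        return f"{digest}/{key}"

    print(partitioned_key("logs/2019/01/01/events.json"))
    # -> something like "a1b2/logs/2019/01/01/events.json"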
“This S3 request rate performance increase removes any previous guidance to randomize object prefixes to achieve faster performance. That means you can now use logical or sequential naming patterns in S3 object naming without any performance implications.”
No difference for PUT, GET, and DELETE. Don't know about LIST, but if it degrades, it's not significant. I worked with buckets with exabytes of data and billions of objects.
Never noticed any speed difference due to bucket size. S3 is generally slow anyway (250ms for a write isn't uncommon) but it scales very well and we use it for raw data storage that's not in our critical path, so the latency isn't a problem.
Edit response: I've always used the partitioning conventions they suggest, so I'm not sure what sort of impact you'd see without them.
For us, it was due to the relatively high PUT cost when you're storing a large number of small files. We ended up changing our approach: we now store blocks (~10MB archives) in S3 instead of individual files. The S3 portion of our AWS bill was previously 50% PUT charges / 50% long-term storage charges. After the change, the PUT portion dropped to nearly $0 and our overall AWS bill fell by almost 30%, while we still store the same amount of data per month.
E.g., if you write 1 million 10KB files per day to S3, you're looking at $150/mo in PUT costs (at $0.005 per 1,000 PUT requests). If you instead pack those into 1,000 10MB blocks per day, you're looking at $0.15/mo in PUT costs.
Because S3 supports HTTP range requests, we can still fetch individual files without an intermediate layer (though our write layer did get slightly more complex), and our GET and storage costs are identical.
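In case it's useful, here's the rough shape of the approach; this is a sketch only, and the names, manifest format, and block handling are assumptions rather than our exact implementation:

    import io
    import boto3

    s3 = boto3.client("s3")
    BUCKET = "example-bucket"  # placeholder

    def pack_and_upload(block_key, files):
        """Concatenate small files into one ~10MB block object (a single PUT).
        `files` is {filename: bytes}; returns {filename: (offset, length)},
        which you need to persist somewhere (DB, index object, etc.)."""
        buf = io.BytesIO()
        manifest = {}
        for name, data in files.items():
            manifest[name] = (buf.tell(), len(data))
            buf.write(data)
        s3.put_object(Bucket=BUCKET, Key=block_key, Body=buf.getvalue())
        return manifest

    def fetch_file(block_key, offset, length):
        """Read one packed file back with an HTTP range request (a single GET)."""
        resp = s3.get_object(
            Bucket=BUCKET,
            Key=block_key,
            Range=f"bytes={offset}-{offset + length - 1}",
        )
        return resp["Body"].read()

The GET pricing works out the same because a ranged GET is still one GET request, and storage is billed on total bytes either way.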