
Bulk deletes on S3 should be done using a lifecycle policy to avoid the per-object delete cost



This (although, as others said, it’s the list cost, not the delete cost). If you want to nuke a bucket, don’t waste your time with the API; just age the files out. I’ve done this with buckets containing tens of millions of files, and it was five minutes’ work.
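Aging everything out amounts to one lifecycle rule with an empty prefix filter and a short expiration. A minimal sketch, assuming boto3 and a hypothetical bucket name (the helper just builds the configuration dict; the commented-out call applies it):

```python
def expire_everything_rule(rule_id="expire-all", days=1):
    # Lifecycle rule matching every object (empty prefix filter).
    # S3 expires matching objects after `days` days, with no
    # per-request charge for the deletes it performs.
    return {
        "Rules": [
            {
                "ID": rule_id,
                "Filter": {"Prefix": ""},  # match all keys
                "Status": "Enabled",
                "Expiration": {"Days": days},
                # Also clean up old versions and abandoned multipart uploads.
                "NoncurrentVersionExpiration": {"NoncurrentDays": days},
                "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": days},
            }
        ]
    }

# With boto3 (not run here; "my-bucket" is a placeholder):
# import boto3
# boto3.client("s3").put_bucket_lifecycle_configuration(
#     Bucket="my-bucket",
#     LifecycleConfiguration=expire_everything_rule(),
# )
```

Expiration runs asynchronously, so the objects disappear over the following day or so rather than immediately.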


Both ListObjects and DeleteObject are API actions that incur the standard "request" cost ... aren't they?

>>

PUT, COPY, POST, LIST requests (per 1,000 requests) = $0.005

GET, SELECT, and all other requests (per 1,000 requests) = $0.0004

>>

"You pay for requests made against your S3 buckets and objects. S3 request costs are based on the request type, and are charged on the quantity of requests as listed in the table below" https://aws.amazon.com/s3/pricing/#:~:text=You%20pay%20for%2...


From the same paragraph:

“DELETE and CANCEL requests are free.”


Deletes are free, but you have to pay to list the objects before deleting them if you aren't tracking keys individually.


Turn on a weekly manifest to get a list of files, and then you can delete based off that. Much faster and better than listing billions of files.


What’s a weekly manifest?


I believe ludjer is referring to S3 Inventory. This is a daily or weekly file containing metadata on every object within an S3 bucket. It does not use the synchronous List APIs.

> You can use Amazon S3 Inventory to help manage your storage. For example, you can use it to audit and report on the replication and encryption status of your objects for business, compliance, and regulatory needs. You can also simplify and speed up business workflows and big data jobs by using Amazon S3 Inventory, which provides a scheduled alternative to the Amazon S3 synchronous List API operations. Amazon S3 Inventory does not use the List API operations to audit your objects and does not affect the request rate of your bucket.

>

> Amazon S3 Inventory provides comma-separated values (CSV), Apache optimized row columnar (ORC) or Apache Parquet output files that list your objects and their corresponding metadata on a daily or weekly basis for an S3 bucket or objects with a shared prefix (that is, objects that have names that begin with a common string). If you set up a weekly inventory, a report is generated every Sunday (UTC time zone) after the initial report. For information about Amazon S3 Inventory pricing, see Amazon S3 pricing.

https://docs.aws.amazon.com/AmazonS3/latest/userguide/storag...
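Driving deletes off the inventory report means parsing its CSV and batching keys, since DeleteObjects accepts at most 1,000 keys per call. A rough sketch, assuming the default CSV layout where the key is the second column (your report's columns depend on which optional fields you enabled):

```python
import csv
import io

def batches_of_keys(csv_text, batch_size=1000):
    # Inventory CSV rows look like `bucket,key,...` in the default
    # layout (assumption); DeleteObjects takes at most 1,000 keys
    # per request, so yield them in chunks of that size.
    keys = [row[1] for row in csv.reader(io.StringIO(csv_text)) if row]
    for i in range(0, len(keys), batch_size):
        yield [{"Key": k} for k in keys[i:i + batch_size]]

# With boto3 (not run here; "my-bucket" is a placeholder):
# s3 = boto3.client("s3")
# for batch in batches_of_keys(inventory_csv):
#     s3.delete_objects(Bucket="my-bucket", Delete={"Objects": batch})
```

Note that inventory keys may be URL-encoded depending on the report configuration, so check a few rows before deleting in bulk.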


Ignore the replies about inventories. It's $0.005 per 1,000 list requests, and a list request returns up to 1,000 items: 1 million items for $0.005, or 1 billion items for $5.

Still use lifecycle policies (I visualized how it ages out 1 billion items here[1]), but list request prices are not a factor worth mentioning.

1. https://tomforb.es/visualizing-how-s3-deletes-1-billion-obje...
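The arithmetic above can be checked in a few lines (prices from the quoted S3 pricing page; ceiling division because a final partial page is still one request):

```python
LIST_COST_PER_1000 = 0.005  # USD per 1,000 LIST requests
KEYS_PER_LIST = 1000        # max keys per ListObjectsV2 page

def list_cost(object_count):
    # Number of LIST pages needed (ceiling division),
    # times the per-request price.
    pages = -(-object_count // KEYS_PER_LIST)
    return pages * LIST_COST_PER_1000 / 1000

print(list_cost(1_000_000_000))  # 1 billion objects -> 5.0 (dollars)
print(list_cost(1_000_000))      # 1 million objects -> 0.005
```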



