
The AWS Go SDK now has a connection-pool-based S3 download/upload manager API that lets you saturate your network connection (e.g. 40 Gbit/s EC2-S3) using far less memory and CPU than is possible with Python.

A colleague of mine developed this tool to make this functionality available in a CLI: https://github.com/chanzuckerberg/s3parcp




With AWS it's really hard to tell which of their SDKs are at cutting-edge feature parity and which are not. It's a real shame that only some of them are.

Just last week I wrote basically the same thing as an ad-hoc solution using boto3, because I had tens of TB of data to pull out of Glacier and distribute across S3 buckets. It wasn't a big deal because I'm experienced at writing parallel network code in Python and moving large data streams, and boto3 has good documentation, but things like this really shouldn't be left as an exercise for the SDK consumer.
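For anyone wondering what that kind of ad-hoc fan-out might look like, here is a minimal sketch, not the actual code described above. `copy_all` and `copy_one` are hypothetical names; `copy_one` is assumed to be a callable you supply that copies a single object (for example a thin wrapper around a boto3 S3 client call) and raises on failure.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def copy_all(keys, copy_one, max_workers=32):
    """Run copy_one(key) for every key on a thread pool.

    copy_one is a caller-supplied (hypothetical) function that copies one
    object and raises an exception on failure.
    Returns (done, failed): a list of keys that succeeded and a dict
    mapping each failed key to the exception it raised.
    """
    done, failed = [], {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit everything up front; the pool caps actual concurrency.
        futures = {pool.submit(copy_one, key): key for key in keys}
        for fut in as_completed(futures):
            key = futures[fut]
            try:
                fut.result()
                done.append(key)
            except Exception as exc:
                # Record the failure instead of aborting the whole batch.
                failed[key] = exc
    return done, failed
```

Threads are a reasonable fit here because the work is network-bound; collecting failures per key makes it easy to retry only what went wrong.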


Do you know if there is a "sync" function like the one in aws-cli?

I've been thinking of starting to use Go to deploy some statically compiled stuff that doesn't need Python as a dependency :P

Edit: In this case for DO Spaces. Much cheaper.


RClone does all that you require and much more.

RClone: https://rclone.org/ S3 Backend: https://rclone.org/s3/


Be cautious, though: rclone "sync" is based on file metadata (e.g. last modified). It does not recompute local ETags to decide which files need to be synced.

For instance, if you "cp -a" a directory and then apply sync, it could do nothing and return success if the copied files were last modified before the ones in S3.

For our use case at work, we wanted to be _sure_ that sync always works as intended, so we ended up recomputing ETags locally and comparing them to the ones in S3 to know what to sync (we had been bitten by the last-modified issue before).
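A rough sketch of what recomputing an ETag locally can look like, not the code we actually run. It relies on the widely observed (but, as far as I know, not formally guaranteed) behaviour that a plain PUT yields the hex MD5 of the body, while a multipart upload yields the MD5 of the concatenated per-part MD5 digests followed by `-<part count>`. Assumptions: the part size matches the one used at upload time (8 MiB here is only a guess), objects smaller than one part were uploaded with a plain PUT, and the objects are not encrypted in a way that changes the ETag. `s3_style_etag` and `needs_sync` are hypothetical helper names.

```python
import hashlib


def s3_style_etag(fileobj, part_size=8 * 1024 * 1024):
    """Compute an S3-style ETag for a binary file object.

    Assumes the object was uploaded with the same part_size; a different
    part size produces a different multipart ETag for the same bytes.
    """
    part_digests = []
    while True:
        chunk = fileobj.read(part_size)
        if not chunk:
            break
        part_digests.append(hashlib.md5(chunk).digest())

    if not part_digests:
        # Empty object: MD5 of zero bytes.
        return hashlib.md5(b"").hexdigest()
    if len(part_digests) == 1:
        # Assumed single PUT: the ETag is the plain MD5 of the content.
        return part_digests[0].hex()
    # Assumed multipart: MD5 over the concatenated binary part digests.
    combined = hashlib.md5(b"".join(part_digests)).hexdigest()
    return f"{combined}-{len(part_digests)}"


def needs_sync(local_etag, remote_etag):
    """True if the ETags differ; S3 returns the ETag wrapped in quotes."""
    return local_etag != remote_etag.strip('"')
```

Usage would be along the lines of opening the local file with `open(path, "rb")`, passing it to `s3_style_etag`, and comparing the result against the ETag reported for the object.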


I'm not aware of a supported API in the AWS Go SDK for this, but there is a sketch here: https://github.com/aws/aws-sdk-go/tree/main/example/service/...



