The AWS Go SDK now has a connection pool based S3 download/upload manager API that allows saturating your (e.g. 40Gbit/s EC2-S3) network connection using far less memory and CPU than is possible with Python.
With AWS it's really hard to tell which of their SDKs are at cutting-edge feature parity and which are not; that some are and some aren't is a real shame.
Just last week I wrote basically the same thing as an ad-hoc solution using boto3, because I had tens of TB of data to pull out of Glacier and distribute across S3 buckets. It wasn't a big deal, since I'm experienced at writing parallel network code in Python and moving large data streams, and boto3 has good documentation, but things like this really shouldn't be left as an exercise to the SDK consumer.
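The pattern I used is roughly the following: list the keys, then fan the per-object copies out over a thread pool (boto3 clients are thread-safe, so one shared client works). A minimal sketch, with the actual S3 call stubbed out so it runs standalone; the bucket and key names are hypothetical, and in the real script the worker body was a boto3 managed copy:

```python
from concurrent.futures import ThreadPoolExecutor

def copy_object(key):
    # In the real script this was a boto3 managed cross-bucket copy, e.g.:
    #   s3.copy({"Bucket": SRC_BUCKET, "Key": key}, DST_BUCKET, key)
    # Stubbed out here so the sketch is self-contained and runnable.
    return key

def parallel_copy(keys, workers=32):
    # Fan the per-object copies out over a thread pool. S3 throughput
    # scales with concurrent requests, so this is where the speedup comes from.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(copy_object, keys))

copied = parallel_copy([f"data/part-{i}" for i in range(100)])
print(len(copied))  # → 100
```

For very large individual objects you would additionally tune boto3's `TransferConfig` (multipart threshold, chunk size, per-object concurrency) rather than relying on the thread pool alone.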
Be cautious though: rclone "sync" decides what to transfer based on file metadata (e.g. last modified); it does not recompute local etags to determine which files actually need to be synced.
For instance, if you "cp -a" a directory (which preserves modification times) and then run sync, it can do nothing and report success, because the copied files' last-modified times predate those of the objects already in S3.
For our use case at work, we wanted to be _sure_ that sync always works as intended, so we ended up recomputing etags locally and comparing them to the ones in S3 to decide what to sync (we had been bitten by the last-modified issue before).
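For reference, the local etag computation is just the well-known S3 multipart convention: the MD5 of the concatenated per-part MD5 digests, suffixed with the part count (a plain MD5 for single-part objects). A minimal sketch, assuming the object was uploaded without SSE-KMS and that you know the part size used at upload time (8 MiB is a common client default, e.g. boto3's):

```python
import hashlib

def s3_etag(path, chunk_size=8 * 1024 * 1024):
    # Compute the ETag S3 would report for this file, assuming a plain
    # (non-KMS-encrypted) upload with parts of `chunk_size` bytes.
    md5s = []
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5s.append(hashlib.md5(chunk))
    if not md5s:
        # Empty file: S3 reports the MD5 of zero bytes.
        return hashlib.md5(b"").hexdigest()
    if len(md5s) == 1:
        # Single-part upload: ETag is just the object's MD5.
        return md5s[0].hexdigest()
    # Multipart upload: MD5 of the concatenated part digests, plus "-<nparts>".
    digest = hashlib.md5(b"".join(m.digest() for m in md5s))
    return f"{digest.hexdigest()}-{len(md5s)}"
```

The caveat is that the part size must match what the uploader used, otherwise the computed multipart etag won't agree with S3's even for identical content.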
A colleague of mine developed this tool to make this functionality available in a CLI: https://github.com/chanzuckerberg/s3parcp