If it's within the same cloud, though, why wouldn't there be an internal shortcut so it doesn't have to go fully "out" then "back in"?

Heck, why does it have to physically move at all? Why doesn't this effectively come down to a rename within AWS' system, like when you "move" a file within the same hard drive?




That's probably possible, but not publicly exposed. Like a lot of the people on the Reddit post said, reach out to support. They have a LOT of data and knobs to turn, in my experience.


Not OP but we also had to move a relatively large S3 bucket (significantly larger than 25TB) and unfortunately AWS doesn't have a way to change the ownership of the S3 bucket. My guess is it has to do with how the underlying system stores the objects.

We ended up writing a Go program to copy the objects from one bucket to another so we could parallelize the migration. This was before AWS had announced S3 Batch Operations, so I'm not sure how much better it would be to use that today. Deleting the bucket also took us over a week. Even though we had deleted all objects in the bucket, the eventually consistent nature of S3 meant we weren't allowed to delete the bucket until all objects were fully removed from S3. All we could get from AWS support was to wait a few more days and reach back out if we still couldn't delete it then.

Edit: depending on your object naming scheme you might also run into the S3 prefix rate limits.


Your advice is completely valid, yet it is a little absurd to be in a situation where something as basic as "mv" is a support-only technology, not available to mere mortals spending millions a month.

I've seen many similar situations where something that boils down to "UPDATE SET x = y WHERE z" requires a support ticket at a minimum, or is flat-out impossible because even their internal staff don't know how to do it.


At least when I did something similar, moving the data itself was not the problem: that happens entirely server-side in S3. The problem is the round-trip time on the empty-body API calls themselves. The AWS CLI, being Python, is slow and maxes out on HTTPS and signature overhead.

If the copy API supported wildcards, there wouldn't be any discussion about this at all.


If you are moving it across data centers (Availability Zones), it won't be possible to do it purely 'symbolically'.


Sure, but you still shouldn't have to manually squeeze it through public HTTP APIs



