
I wouldn't store my entire Google Takeout archive in Azure without encrypting it first. If someone were to hack my Azure account, then they would have everything.

What this project is doing is adding another potential point of privacy failure. My suggestion would be for users not to use the public proxy, but to modify their own proxy to GPG-encrypt each Takeout file with their public key as it is passed through to Azure Storage.


For those concerned about this, the public proxy should really just be considered a demo. You may also want to use it only for the YouTube portion of your Takeout.

Unfortunately, even if people did modify their own proxies to encrypt in-flight, it would quickly blow through any CPU budget on Cloudflare Workers. The transload only stays cheap because the worker never touches the bytes: when you finish processing and return stream objects on Cloudflare Workers, Cloudflare basically unloads the worker, stops CPU billing, and just handles shoveling bytes between the two sockets.
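
Roughly, the cheap path looks like this (a minimal sketch of the passthrough pattern, not GTR's actual proxy code; the "src" query parameter is made up for illustration):

    // Hypothetical passthrough worker: fetch an upstream URL and hand its
    // body stream straight back. Since no JS touches the bytes, Cloudflare
    // can stop metering CPU while it shovels the stream along.
    export default {
      async fetch(request: Request): Promise<Response> {
        const src = new URL(request.url).searchParams.get("src");
        if (!src) return new Response("missing src", { status: 400 });
        const upstream = await fetch(src, { headers: request.headers });
        // Returning upstream.body untouched is what keeps this off the CPU
        // meter; piping it through a GPG/WebCrypto transform would not be.
        return new Response(upstream.body, upstream);
      },
    };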

The closest thing I think would work is Azure Storage's own encryption. It's still theoretical though; I haven't tried it with block-by-block transloads, much less uploads. Unfortunately it's symmetric, and Azure holds the symmetric key during the transfer but pinky-promises to wipe it once the transfer is done. That should deter most adversaries.

https://learn.microsoft.com/en-us/azure/storage/common/stora...

https://github.com/nelsonjchen/gargantuan-takeout-rocket/iss...
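
If it does pan out, it would presumably mean sending Azure's customer-provided-key headers on each block call. A rough sketch of what a plain Put Block with those headers might look like (untested; whether the same headers work for the Put Block From URL transload path is exactly the open question, and all values here are placeholders):

    // Sketch: Put Block with a customer-provided key (CPK). Azure uses the
    // key server-side and says it only holds it in memory for the duration
    // of the request.
    async function putBlockWithCpk(
      blobUrl: string,          // https://<account>.blob.core.windows.net/<container>/<blob>?<sas>
      blockId: string,          // base64-encoded block id
      body: Uint8Array,
      keyBase64: string,        // base64 of a 32-byte AES-256 key
      keySha256Base64: string,  // base64 of the SHA-256 of the raw key bytes
    ): Promise<Response> {
      const url = `${blobUrl}&comp=block&blockid=${encodeURIComponent(blockId)}`;
      return fetch(url, {
        method: "PUT",
        headers: {
          "x-ms-version": "2021-08-06",
          "x-ms-encryption-key": keyBase64,
          "x-ms-encryption-key-sha256": keySha256Base64,
          "x-ms-encryption-algorithm": "AES256",
        },
        body,
      });
    }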

There's also the obvious option of just downloading the archives onto an Azure VM, encrypting them, throwing them back into the storage account/bucket, and deleting the unencrypted transloaded bytes. Downloading from Azure Storage to an Azure VM should be very quick. You might even be able to do it "stateless", without intermediary files on the VM! There is still a window where the data sits unencrypted, but you can close it up afterwards.
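
The "stateless" version could be a single pipe on the VM: blob download -> gpg -> blob upload. A sketch assuming @azure/storage-blob and a gpg binary with your public key imported (connection string, container, and blob names are placeholders):

    // Stream a blob down, pipe it through gpg, stream the ciphertext back
    // up, then delete the plaintext blob. Nothing touches the VM's disk.
    import { BlobServiceClient } from "@azure/storage-blob";
    import { spawn } from "node:child_process";

    async function encryptBlobInPlace(
      connStr: string, container: string, blobName: string, gpgRecipient: string,
    ): Promise<void> {
      const cc = BlobServiceClient.fromConnectionString(connStr).getContainerClient(container);

      const download = await cc.getBlobClient(blobName).download();
      // --trust-model always avoids an interactive trust prompt hanging the pipe.
      const gpg = spawn("gpg", ["--encrypt", "--trust-model", "always",
                                "--recipient", gpgRecipient, "--output", "-"]);
      download.readableStreamBody!.pipe(gpg.stdin);

      // Upload gpg's stdout as <blobName>.gpg, then drop the unencrypted copy.
      await cc.getBlockBlobClient(`${blobName}.gpg`).uploadStream(gpg.stdout, 8 * 1024 * 1024, 4);
      await cc.deleteBlob(blobName);
    }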

The bytes become way more handleable once they're outside Google's URLs and on Azure. Use GTR to get the bytes out of Google Takeout's quagmire.


This is great. I tried doing something similar: Takeout can put the files in Google Drive for you, so I tried creating a Cloudflare Worker that reads the files from Google Drive and streams them directly to Backblaze B2 (S3-compatible).

The worker was supposed to run for free since the files were just streaming and no CPU time should have been used, but in practice the CF Workers kept stopping because I was exhausting the free limit, so I guess something in my code did end up using CPU.

