I'm not in the datacenter business and have no experience with storage at PB scale, so I've been conservative.
I've chosen to err on the side of estimating it to be more expensive, because I think that makes the end result more convincing:
30m is chump change for parties like Amazon, and in reality it'll cost significantly less. 1m might well do. Maybe it's less still. You could combine flagging users with flagging low-certainty or keyword-containing transcriptions.
Either way, you don't need collusion with intelligence parties, just an unscrupulous or naive exec at Amazon who thinks the data might be worth a lot for training future learning models. Of course, the more sinister but legal option of reselling to government agencies is financially attractive as well.