As someone else who works on systems like these, I agree training data is the whole problem. However you can use some techniques like homomorphic encryption and gradient pooling to collect training data from client code while remaining end-to-end encryption. It's hard, but it's not impossible.