This is an amazing side project. Would you share some details about how you've hosted the models? Also, if possible, some training details: how long did it take you to train them? Did you do it on your local GPU, a cloud provider, or a free service like Colab?
I'm pretty serious, so I've put some money into it. And even more time.
I've got a 2x1080Ti setup I used locally back in the day, but it's really slow. I still train stuff on it, but only things I know will train successfully over a long run (e.g. the MelGAN model).
I use rented V100 GPUs to train the speaker models. They're quick and allow me to refine the datasets and parameters much more quickly than if I was doing all of it on my own box. Colabs are great and I could probably get along with them if I wasn't running so many experiments in parallel.
I can get reasonable results in a few hours on an 8xV100. Once I home in on a direction I like, I'll let it train for a few days. (The David Attenborough model is a result of this.)
I still have a ton of refinement to do. I'm also working on singing models, and these should be ready by the weekend.
I've thought about buying beefy GPUs at this point, as I've proven to myself this isn't just a temporary hobby. Cloud compute is expensive.
The models are hosted on Rust microservices (a frontend proxy that fans out to multiple model servers), deployed to a Kubernetes cluster. I'm planning to add more intelligence to the proxy and the individual model containers so they can scale independently.
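In case the shape of that proxy layer is useful to anyone: here's a minimal sketch of the fan-out pattern, not my actual code. It assumes axum + reqwest (my framework choices aren't stated above), and the speaker names, ports, and service hostnames are all made up for illustration. The proxy just routes each synthesis request to whichever model server hosts the requested speaker and relays the audio back.

```rust
// Assumed deps: axum = "0.6", reqwest = "0.11",
// tokio = { version = "1", features = ["full"] }
use std::{collections::HashMap, sync::Arc};

use axum::{
    body::Bytes,
    extract::{Path, State},
    http::StatusCode,
    routing::post,
    Router,
};

#[derive(Clone)]
struct AppState {
    // Maps a speaker name to the base URL of the model server hosting it.
    // In a k8s deployment these would be cluster-internal service addresses.
    backends: Arc<HashMap<&'static str, &'static str>>,
    client: reqwest::Client,
}

async fn synthesize(
    Path(speaker): Path<String>,
    State(state): State<AppState>,
    text: Bytes, // request body: the text to synthesize
) -> Result<Vec<u8>, (StatusCode, String)> {
    // Look up which model server hosts the requested speaker.
    let backend = state
        .backends
        .get(speaker.as_str())
        .ok_or_else(|| (StatusCode::NOT_FOUND, format!("unknown speaker: {speaker}")))?;

    // Forward the text to that server and relay its audio response.
    let resp = state
        .client
        .post(format!("{backend}/synthesize"))
        .body(text)
        .send()
        .await
        .map_err(|e| (StatusCode::BAD_GATEWAY, e.to_string()))?;

    let audio = resp
        .bytes()
        .await
        .map_err(|e| (StatusCode::BAD_GATEWAY, e.to_string()))?;
    Ok(audio.to_vec())
}

#[tokio::main]
async fn main() {
    // Hypothetical speaker -> model-server mapping. Each backend is its own
    // container, which is what makes independent scaling possible later.
    let backends: HashMap<_, _> = [
        ("attenborough", "http://attenborough-model:8000"),
        ("singer", "http://singer-model:8000"),
    ]
    .into_iter()
    .collect();

    let state = AppState {
        backends: Arc::new(backends),
        client: reqwest::Client::new(),
    };

    let app = Router::new()
        .route("/synthesize/:speaker", post(synthesize))
        .with_state(state);

    axum::Server::bind(&"0.0.0.0:8080".parse().unwrap())
        .serve(app.into_make_service())
        .await
        .unwrap();
}
```

The "more intelligence" part would live in that lookup step: instead of a static map, the proxy could watch per-backend queue depth or latency and make routing/scaling decisions from there.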