current leader in open source voice cloning is RVC, would like to see how it com...

echelon · 2024-01-01T19:54:23 1704138863

RVC is voice conversion (audio to audio), and it's typically finetuned.

This is zero-shot TTS. Samples create vector encodings that serve as input to inference. There's no retraining the model unless you want it to generalize or perform better.
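
To make the "encodings as input, no retraining" point concrete, here's a toy sketch. Everything in it is made up for illustration (`FROZEN_WEIGHTS`, `speaker_embedding`, `synthesize` are hypothetical names, not any real model's API): the reference sample gets averaged into a fixed-size vector, and that vector is just one more input at inference time.

```python
from statistics import fmean

# Hypothetical frozen "model": these weights are never updated per speaker.
FROZEN_WEIGHTS = (0.5, -0.25, 1.0)

def speaker_embedding(sample_frames):
    """Average per-frame feature vectors into one fixed-size speaker vector."""
    return tuple(fmean(column) for column in zip(*sample_frames))

def synthesize(text, embedding, weights=FROZEN_WEIGHTS):
    """Toy stand-in for inference: one value per character,
    shifted by a speaker-dependent bias (dot product with the embedding)."""
    bias = sum(w * e for w, e in zip(weights, embedding))
    return [ord(ch) / 255 + bias for ch in text]

# Two "speakers", each represented only by a short reference sample.
alice = speaker_embedding([(0.2, 0.4, 0.6), (0.4, 0.6, 0.8)])
bob = speaker_embedding([(1.0, 0.0, 0.0), (1.0, 0.0, 0.0)])

out_alice = synthesize("hi", alice)
out_bob = synthesize("hi", bob)
```

Same text and same weights, different output per speaker. Finetuning would mean changing `FROZEN_WEIGHTS` for each voice, which is the step zero-shot skips.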

cchance · 2024-01-01T23:30:51 1704151851

It isn't, though. People need to read the paper and the comments from the author: they aren't actually doing the voice generation. They pass the text off to VITS, and their secret sauce is the tone mapping they do on that VITS output. So if anything they're a competitor to RVC; it's just that the version they published also includes VITS.
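
Roughly, the split I'm describing looks like this. It's a toy sketch with hypothetical function names (`base_tts`, `convert_tone`, and the rest are placeholders, not the project's or RVC's actual code), just to show that the cloning step only ever sees audio:

```python
def base_tts(text):
    """Hypothetical stand-in for a base TTS model such as VITS: text -> audio."""
    return [ord(ch) / 255 for ch in text]

def convert_tone(audio, target_tone):
    """Hypothetical audio-to-audio stage: recolour existing audio toward a target voice."""
    return [sample * target_tone for sample in audio]

def clone_tts(text, target_tone):
    # The text never reaches the conversion stage; that stage only sees audio.
    return convert_tone(base_tts(text), target_tone)

def voice_conversion(source_audio, target_tone):
    # RVC-style usage: start from existing audio instead of text.
    return convert_tone(source_audio, target_tone)

recorded = base_tts("hello")  # pretend this was recorded speech
same = clone_tts("hello", 0.8) == voice_conversion(recorded, 0.8)
```

Here `same` comes out `True`: once the base TTS has produced audio, the rest is the same audio-to-audio problem that voice conversion solves.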

echelon · 2024-01-02T00:54:59 1704156899

Interesting.

Funny enough, a lot of RVC packages are using VITS to do RVC for TTS.