Whisper is an STT model; you can use whisperx to transcribe audio locally via t...

jcuenod · on Nov 2, 2023

I've just been looking for SOTA TTS. I found coqui.ai and elevenlabs.io (and a bunch of others). They're good (and better than older TTS), but I am not fooled by any of them. Do you have recommendations?

selfhoster11 · on Nov 4, 2023

Gemelo was the other one listed. I doubt you'll get anything sounding more natural than ElevenLabs with the following settings:

* Model: Multilingual v2

* All options and sliders to boost similarity: set to max/yes

* Stability slider: tune experimentally to a value where the voice sounds natural enough without the output becoming unstable
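
For anyone who would rather script this than use the web UI, here is a rough sketch of how those settings might map onto an ElevenLabs text-to-speech request. It only builds the request and sends nothing. The endpoint path, the `xi-api-key` header, the `eleven_multilingual_v2` model id and the `voice_settings` field names are my understanding of the public REST API, so check them against the current docs. The helper names, the voice id, and the default stability values are placeholders of mine, not recommendations from the comment above.

```python
# Assumed base URL of the ElevenLabs text-to-speech endpoint; verify against the docs.
API_BASE = "https://api.elevenlabs.io/v1/text-to-speech"


def build_tts_request(text, voice_id, api_key, stability=0.5):
    """Build (url, headers, payload) for one TTS call. Hypothetical helper; no network I/O."""
    if not 0.0 <= stability <= 1.0:
        raise ValueError("stability must be between 0.0 and 1.0")
    payload = {
        "text": text,
        "model_id": "eleven_multilingual_v2",  # "Multilingual v2" (assumed id)
        "voice_settings": {
            "stability": stability,        # the value to tune by ear
            "similarity_boost": 1.0,       # similarity slider set to max
            "use_speaker_boost": True,     # similarity-related toggle set to yes
        },
    }
    headers = {"xi-api-key": api_key, "Content-Type": "application/json"}
    return f"{API_BASE}/{voice_id}", headers, payload


def stability_sweep(text, voice_id, api_key, values=(0.3, 0.5, 0.7)):
    """Build one request per candidate stability value so the results can be compared by ear."""
    return [build_tts_request(text, voice_id, api_key, s) for s in values]
```

Each tuple from `stability_sweep` can then be sent with whatever HTTP client you prefer (for example `requests.post(url, headers=headers, json=payload)`), saving the returned audio per stability value and keeping the one that sounds most natural.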