E.g. have you heard the voices from https://play.ht/ ?
Those capture inflection and tonality quite well, and are able to reproduce near-natural sounding laughter, pauses, and other human qualities. I wouldn't say it's perfect just yet, but give it another ~5 years and it will get there.
They are multilingual as well.
For me it was the first time (afaik) hearing TTS of this quality in my native language (which is a major one).