
I feel like it would be much harder to create a set of hard controls, like MIDI, to shape the voice acting than to build a co-embedding space of voices and descriptions of those voices and just say "Say this quietly and meanly". Thoughts?



Exactly! The only issue is having a well-labelled dataset with those types of cues. We have an idea of how to do it, though!



