tasubotadas on April 17, 2020 | on: Building an end-to-end Speech Recognition model in...

LibriSpeech is 1000h, WSJ is 73h.

Google is training on datasets as big as 30k h, and MS seems to be working with a 10k h dataset.

At the moment, I am working on a similar e2e system, but with a dataset of only 80h it is really challenging to get the model to generalize well.
