Hacker News

While the article focuses on speech recognition for arbitrary speech, it misses the fact that speech interfaces for specific situations are now actually useful. Today, I told my car "play track 'severed head'," and it actually played the correct song. I asked my phone "what is my next appointment?" and heard my calendar. I then said "dial 206 421 8989," my phone dialed properly, and so on and so forth. This is all without any explicit training on my part. No need to read Mark Twain or anything like that.

There are still problems here, but the technology for speech interfaces has gone from terrible to OK in the last seven years. I'm looking forward to seeing where it goes next.




"But sticking to a few topics, like numbers, helped. Saying “one” into the phone works about as well as pressing a button, approaching 100% accuracy. But loosen the vocabulary constraint and recognition begins to drift, turning to vertigo in the wide-open vastness of linguistic space."

"As with speech recognition, parsing works best inside snug linguistic boxes, like medical terminology, but weakens when you take down the fences holding back the untamed wilds."

No, the article did not miss that. It was a core point of its argument; the entire arc is about how we made steady progress on the narrow cases but crapped out on the general case.
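The "snug linguistic boxes" point from the quoted passage can be made concrete with a toy sketch (my own illustration, not anything from the article): a nearest-neighbor "recognizer" that snaps a noisy transcription to the closest word in a closed vocabulary by edit distance. With only ten digit words, noisy inputs almost always land on the right one; widen the vocabulary with near-homophones and collisions appear. The vocabulary, inputs, and function names here are all invented for illustration.

```python
def edit_distance(a: str, b: str) -> int:
    """Classic Levenshtein distance via two-row dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

DIGITS = ["one", "two", "three", "four", "five",
          "six", "seven", "eight", "nine", "ten"]

def recognize(noisy: str, vocab: list[str]) -> str:
    """Return the vocabulary word closest to the noisy transcription."""
    return min(vocab, key=lambda w: edit_distance(noisy, w))

# Inside the snug box of ten digit words, noisy inputs snap correctly:
print(recognize("fiv", DIGITS))    # -> "five"
print(recognize("sevn", DIGITS))   # -> "seven"
print(recognize("nin", DIGITS))    # -> "nine"

# Loosen the vocabulary and near-ties appear. "nun" is as close to
# "nin" as "nine" is, so the intended word can lose the tie:
print(recognize("nin", ["nun"] + DIGITS))  # -> "nun"
```

The toy makes the quoted asymmetry visible: accuracy in the constrained case comes from the tiny hypothesis space, not from the matcher being smart, which is exactly why it degrades as the vocabulary grows.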




