Hacker News

Just as mouse input has evolved to include multitouch and 3D touch gestures, voice input can also evolve. The full range of tone, inflection, pitch, and so on is available from the human voice.

I wonder if NLP research should have started as our ancestors did, with grunts and hoots and cries. Instead, it has focused on recognizing full words and sentences while almost completely ignoring inflection.

Another dimension to add to vocal input is direction. If you have mics in all corners of a room, the direction you speak in can determine whether "turn off" operates your TV, your lights, or your oven.
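The directional idea above is usually done with time-difference-of-arrival: the same sound reaches each mic at a slightly different moment, and the delay between a pair of mics pins down a bearing. A minimal sketch with NumPy, assuming two mics at a made-up 0.2 m spacing and 16 kHz sampling (all constants here are illustrative, not from the comment):

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, at room temperature
MIC_SPACING = 0.2       # metres between the two mics (assumed)
SAMPLE_RATE = 16000     # Hz (assumed)

def estimate_angle(left: np.ndarray, right: np.ndarray) -> float:
    """Estimate a source bearing from the inter-mic delay.

    Cross-correlate the two channels to find the lag (in samples) at
    which they best align, convert that lag to a time delay, then to an
    angle via the far-field approximation sin(theta) = delay * c / d.
    """
    corr = np.correlate(left, right, mode="full")
    lag = np.argmax(corr) - (len(right) - 1)   # lag in samples
    delay = lag / SAMPLE_RATE                  # seconds
    # Clamp so numerical error can't push us outside arcsin's domain.
    ratio = np.clip(delay * SPEED_OF_SOUND / MIC_SPACING, -1.0, 1.0)
    return float(np.degrees(np.arcsin(ratio)))

# Synthetic check: the same burst arriving 4 samples later on the right
# channel means the source is off towards the left mic (negative angle
# under this sign convention).
rng = np.random.default_rng(0)
click = rng.standard_normal(64)
left = np.concatenate([click, np.zeros(4)])
right = np.concatenate([np.zeros(4), click])
angle = estimate_angle(left, right)
```

With more than two mics you get several pairwise bearings and can intersect them, which is roughly how smart speakers decide which way you are facing.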




Very good points. I can't wait until devices can read the emotions or inflections in my voice. I can voice-to-text most of my short messages, but anything that requires punctuation, or god forbid emoji, still requires manual input. And I don't want to have to say "period" or "exclamation mark" to indicate my desired punctuation. If I say something unusually loudly, insert an exclamation mark. If I pause at the end of a sentence (Word has been able to recognize a grammatically complete sentence for decades) and don't say "um" or "uh", put a period. If my inflection rises or there is a question word in the sentence, add a question mark.
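The rules in that comment (loudness → "!", pause without a filler → ".", rising pitch or a question word → "?") map straightforwardly onto per-word prosody features. A toy sketch, assuming a hypothetical ASR front end that tags each word with relative loudness, pitch trend, and trailing pause; the thresholds and the `Word` record are invented for illustration:

```python
from dataclasses import dataclass

@dataclass
class Word:
    text: str
    loudness_db: float   # level relative to the speaker's average (dB)
    pitch_rising: bool   # did pitch trend upward over this word?
    pause_after: float   # silence before the next word (seconds)

QUESTION_WORDS = {"who", "what", "when", "where", "why", "how"}
FILLERS = {"um", "uh"}
PAUSE_THRESHOLD = 0.5    # seconds of silence suggesting a boundary (assumed)
SHOUT_THRESHOLD = 6.0    # dB above average suggesting emphasis (assumed)

def punctuate(words):
    """Insert punctuation from prosody using the comment's heuristics."""
    out, sentence = [], []
    for i, w in enumerate(words):
        sentence.append(w)
        # A long pause ends a sentence, unless the word was a filler.
        boundary = (w.pause_after >= PAUSE_THRESHOLD
                    and w.text.lower() not in FILLERS) or i == len(words) - 1
        if not boundary:
            continue
        if any(x.loudness_db >= SHOUT_THRESHOLD for x in sentence):
            mark = "!"   # unusually loud -> exclamation
        elif sentence[-1].pitch_rising or sentence[0].text.lower() in QUESTION_WORDS:
            mark = "?"   # rising inflection or question word -> question
        else:
            mark = "."   # plain pause -> period
        out.append(" ".join(x.text for x in sentence) + mark)
        sentence = []
    return " ".join(out)

demo = [
    Word("where", -1.0, False, 0.1),
    Word("are", 0.0, False, 0.1),
    Word("you", 1.0, True, 0.8),
    Word("stop", 8.0, False, 0.9),
]
result = punctuate(demo)
# -> "where are you? stop!"
```

Capitalization and emoji would need their own rules, but the point stands: the prosodic signal carries enough information that punctuation needn't be dictated aloud.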

There is a lot of room for improvement in voice processing along several of these dimensions.





