
Wait, so if your universe is ["Coke", "Pepsi", "Sprite"], then this library will tell you which one the user spoke?

Just to be clear, it wouldn't take you from a bucket of raw sound samples to a string like "I'd like a Coke, please"?




Yeah, this tool is focused on recognizing specific utterances given the current context, e.g. (to take an example they use) voice-operating a browser. You say something like "close tab", and it recognizes that you've said one of the commands valid in that context and acts accordingly. They also have an example somewhere of controlling a calculator by speaking digits and arithmetic operations. It doesn't seem to be aimed at free-form audio-to-text transcription, though some of the technologies it's built on, such as CMU Sphinx, are fairly general.
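To make the "closed set per context" idea concrete, here's a rough sketch in Python. It is not this library's API. It assumes you already have some text hypothesis of what was heard and just picks the closest command allowed in the current context, using fuzzy string matching from the standard library as a stand-in for the acoustic scoring a real recognizer would do. The names `CONTEXT_COMMANDS` and `match_command` are made up for illustration.

```python
import difflib

# Hypothetical per-context vocabularies (illustrative, not from the library).
CONTEXT_COMMANDS = {
    "browser": ["close tab", "new tab", "go back", "reload"],
    "drinks": ["Coke", "Pepsi", "Sprite"],
}


def match_command(heard, context, cutoff=0.6):
    """Return the command in `context` closest to `heard`, or None.

    Assumption: `heard` is already a text guess. A real recognizer would
    score the audio against the allowed phrases instead of comparing strings.
    """
    # Map lowercased commands back to their original spelling.
    lowered = {c.lower(): c for c in CONTEXT_COMMANDS[context]}
    hits = difflib.get_close_matches(
        heard.lower(), list(lowered), n=1, cutoff=cutoff
    )
    return lowered[hits[0]] if hits else None


print(match_command("close tap", "browser"))
print(match_command("coke", "drinks"))
print(match_command("I'd like a Coke, please", "drinks"))
```

The last call is the point of the question above: a full sentence doesn't resemble any single entry closely enough, so the sketch returns None rather than transcribing it. Only utterances close to something in the current context's list get matched.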



