
Oh, and let me address it with something other than "OK, Google". Let me name it, or set the trigger phrase, or do an interpretive dance. Anything but that awful phrase.



I might be totally wrong about this, but I think the reason "OK Google" and "Hey Alexa" are immutable phrases is that the listening for them is implemented in an ASIC at the hardware level, in order to save battery life. That is, instead of a software-based `while(listening) {...}`, an actual hardware component looks for the correct waveforms in the microphone output.


I looked into this recently. Voice triggering is typically done on a DSP, and the power drawn by the digital processing is typically already small compared to the microphone's (which is not much itself). For power-optimized applications the DSP implementation is itself power optimized (low clock speed, low leakage) and processing is done in two steps: a coarse, basic recognition pass with some false-alarm probability, followed by an accurate second pass. For a plugged-in device there's no need to super-optimize; the DSP part should be in the single-digit mW range. A fixed trigger does make the system simpler, however: no configuration to manage, no risk of the kids randomly changing the trigger to something funny, etc.
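To make the two-step idea concrete, here's a toy sketch in Python. It is not how any real assistant does it: actual wake-word detectors run trained acoustic models on the DSP, and every name and threshold below (`coarse_stage`, `accurate_stage`, `energy_threshold`, `match_threshold`) is made up for illustration. The point is only the shape of the cascade: a cheap check that runs all the time and tolerates false alarms, and a costlier check that runs only when the cheap one fires.

```python
import math


def frame_energy(frame):
    """Mean squared amplitude of one audio frame (a list of floats)."""
    return sum(s * s for s in frame) / len(frame)


def coarse_stage(frame, energy_threshold=0.01):
    """Step 1 (hypothetical): cheap gate that passes anything loud enough.

    It will let plenty of non-trigger sounds through; that is the
    accepted false-alarm probability of the coarse step.
    """
    return frame_energy(frame) >= energy_threshold


def normalized_correlation(a, b):
    """Cosine similarity between two equal-length sample sequences."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def accurate_stage(frame, template, match_threshold=0.9):
    """Step 2 (hypothetical): costlier comparison against a stored template.

    A stand-in for a real acoustic model, assumed here to be a plain
    waveform match.
    """
    return normalized_correlation(frame, template) >= match_threshold


def detect_trigger(frame, template):
    """Run the accurate step only when the coarse step fires."""
    if not coarse_stage(frame):
        return False
    return accurate_stage(frame, template)
```

With a fixed trigger, `template` would be baked in at build time, which is the "no configuration to manage" simplification.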


That should have been painfully obvious from the outset.

Has no one but me had friends try to trick my phone into voice command mode by yelling "OK Google" at it when I unlock it?



