The Gyroscopes in Your Phone Could Let Apps Eavesdrop on Conversations (wired.com)
77 points by digitalcreate on Aug 14, 2014 | 17 comments



But he says the research is only intended to show the possibility of the spying technique, not to perfect it. “We’re security experts, not speech recognition experts,” Boneh says.

It shows. Sampling at 200 Hz means your maximum detectable frequency is 100 Hz, per the Nyquist theorem - that'll capture ~40% of the typical male frequency range (fundamental only - not enough for harmonics or formants) and little or nothing of the typical female frequency range. I question the claimed 65% success rate, and would like to know a lot more about the experimental conditions before I'd be inclined to accept it. I do enough synthesis to know what that sort of sample rate sounds like without having to test, and the short answer is 'awful'. I can see possibly getting numbers out of it when the phone is held up to one's head, but only in perfectly controlled environments. For contrast, POTS bandwidth is 300 to 3400 Hz.
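A rough back-of-the-envelope sketch of that folding (Python; the tone frequencies are just illustrative picks, not from the paper): per Nyquist, any tone above 100 Hz gets aliased back into the 0-100 Hz band when sampled at 200 Hz.

    fs = 200.0  # gyroscope sample rate (Hz)

    def aliased_frequency(f, fs):
        """Frequency at which a pure tone of f Hz appears after sampling at fs Hz."""
        f_folded = f % fs
        return f_folded if f_folded <= fs / 2 else fs - f_folded

    for f in (85, 120, 250, 440):  # 85 Hz ~ low male fundamental; the rest are formant-range
        print("%4d Hz tone shows up at about %5.1f Hz" % (f, aliased_frequency(f, fs)))
    # Only the 85 Hz tone survives intact; everything above 100 Hz folds somewhere misleading.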

Or if an app really needed to access the gyroscope at high frequencies, it could be forced to ask permission. “There’s no reason a video game needs to access it 200 times a second,” says Boneh.

I think it's quite plausible that people might be able to detect a lag greater than 5 ms (one sample period at 200 Hz) in the right game. That's around the envelope of involuntary timing variation for professional drummers.


That is what one would get with a single gyroscope. Now imagine several gyroscopes listening in, with their sampling frequencies slightly/randomly out of phase: an array of receivers, if you will. For this to work one needs to figure out the 'sync' of the gyroscopes, but I don't see a fundamental problem with this approach.
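A toy sketch of that interleaving idea (Python/NumPy; every name and number here is made up for illustration), assuming two gyros at 200 Hz with a known half-period clock offset:

    import numpy as np

    fs = 200.0       # per-gyro sample rate (Hz)
    dt = 1.0 / fs
    duration = 0.1   # seconds of signal
    f_tone = 150.0   # above the single-gyro Nyquist of 100 Hz

    # Gyro A samples at t = 0, 5 ms, 10 ms, ...; gyro B is offset by half a period (2.5 ms).
    t_a = np.arange(0.0, duration, dt)
    t_b = t_a + dt / 2.0
    tone = lambda t: np.sin(2 * np.pi * f_tone * t)

    # Merging the two streams gives an effective 400 Hz sample rate, so the 150 Hz
    # tone now sits below the combined Nyquist of 200 Hz instead of aliasing.
    t_merged = np.sort(np.concatenate([t_a, t_b]))
    x_merged = tone(t_merged)

The catch is exactly the 'sync' problem: you need to know the relative clock offsets, and all the phones have to be close enough to hear the same signal.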


I don't know; requiring several phones within hearing range seems like it would severely limit the applicability of this technique. N phones give you at best N * (the maximum sample rate of one phone), and that's assuming their sampling phases are perfectly staggered, so I'd think you'd still need quite a few to get anything useful.

Plus, the 100 Hz figure is optimistic. I don't know what kind of anti-aliasing these sensors have. With none you get an aliased signal, and with a low-order filter the practical maximum detectable frequency would be even lower.
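To put rough numbers on the low-order-filter point (assuming, purely for illustration, a single-pole low-pass with a 50 Hz corner in front of the sensor's output):

    import math

    def one_pole_attenuation_db(f, fc):
        """Magnitude response of a single-pole low-pass filter, in dB."""
        return -10 * math.log10(1 + (f / fc) ** 2)

    fc = 50.0  # assumed corner frequency
    for f in (100, 200, 400, 1000):
        print("%5d Hz: %6.1f dB" % (f, one_pole_attenuation_db(f, fc)))
    # Roughly -7 dB at 100 Hz and only about -26 dB at 1 kHz: plenty of speech
    # energy above Nyquist leaks through and aliases back into the passband.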


Reminds me of the recent Visual Microphone algorithm, in which researchers recreate sound by observing the micro-vibrations of objects (e.g. potato chip bags and house plants).


Here, in case people are interested:

http://people.csail.mit.edu/mrub/VisualMic/

I saw a talk on this recently by Prof. Bill Freeman, a co-author of the above work. The visual mic idea is based on a general technique for amplifying small motions; the motions here are the tiny movements of the bag.

The more generic idea is here: http://people.csail.mit.edu/nwadhwa/phase-video/

It seems like it could have applications to lots of areas.


https://news.ycombinator.com/item?id=8131785 - item and discussion from early August about the MIT research


That's an old trick: bouncing a laser off objects near the source of the sound.


I think OP was referring to a recent demo that uses only video. No special equipment required.


I'm curious whether anybody has had any real success with this concept (recording audio with light) as a personal project. My results were lackluster, to say the least.


Laser microphones are an old trick; using high-speed camera video of ordinary objects is a decidedly new one.


Similarly, in 2011 it was shown that in-phone motion sensors could be used to deduce typing in other apps:

Android: http://www.theregister.co.uk/2011/08/17/android_key_logger/

iPhone: http://www.wired.com/2011/10/iphone-keylogger-spying/


FFS, another Wired "if you take this interesting-but-very-primitive research result and ignore the orders of magnitude in sampling rate improvement needed, OMG SPYING" article.

Slightly more feasible than the last one, which focused on detecting sound via image differences in high-speed (thousands of fps) camera footage, but still...


Attacks only get better.


I suppose it would depend on the exact gyroscope; apparently some are more sensitive to audio frequency noise than others; some details in this document:

http://www.invensense.com/mems/gyro/documents/whitepapers/A%...

Its placement within the device will also affect the sensitivity to audio, so it will vary between device models - the article doesn't mention if 65% is worst-case, best-case, or an average.


65% accuracy on a limited character set hardly seems worth it at this point.


"Attacks always get better; they never get worse." – The NSA[1]

[1] http://tools.ietf.org/html/rfc4270#section-6


Maybe not at this point, but if they spent some time writing better filtering and speech-recognition algorithms, it might achieve a much higher accuracy.
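A hand-wavy sketch of what the front end of that might look like (Python/NumPy; none of this is from the paper, just the spectrogram-style feature extraction one would probably start with):

    import numpy as np

    def gyro_frames_to_features(samples, frame_len=64, hop=32):
        """Split a 1-D gyro stream (200 Hz) into overlapping windows and
        return log-magnitude spectra, one feature vector per frame."""
        feats = []
        for start in range(0, len(samples) - frame_len + 1, hop):
            frame = samples[start:start + frame_len] * np.hanning(frame_len)
            feats.append(np.log1p(np.abs(np.fft.rfft(frame))))
        return np.array(feats)  # shape: (n_frames, frame_len // 2 + 1)

    # Frames like these would then be fed to whatever classifier you like
    # (nearest-neighbour, HMM, small neural net), trained on labelled recordings.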

Since it's currently so easy to get sensor data on Android phones, there's a fairly strong incentive for anyone interested in eavesdropping to do so.

It's pretty cool, anyway.



