Candy Wrappers Are Listening (2014) (ieee.org)
109 points by wglb on Jan 22, 2018 | 42 comments



Fun tidbit: Three-letter intelligence agencies have known about this attack vector since before this paper was published, and are worried it could be exploited in the wild.

https://www.theguardian.com/world/2013/aug/20/nsa-snowden-fi...

[British government officials] expressed fears that foreign governments, in particular Russia or China, could hack into the Guardian's IT network. But the Guardian explained the security surrounding the documents, which were held in isolation and not stored on any Guardian system.

However, in a subsequent meeting, an intelligence agency expert argued that the material was still vulnerable. He said by way of example that if there was a plastic cup in the room where the work was being carried out foreign agents could train a laser on it to pick up the vibrations of what was being said. Vibrations on windows could similarly be monitored remotely by laser.


I've actually done this in a lab with a laser - it's a fun trick. What the original article talks about though is passively measuring sound from objects using a camera - no laser. Although they're measuring the same thing (physical vibrations of objects) the way they're doing it is actually quite different.


A friend of mine used that technique while working in an entomology lab studying the mating habits of cockroaches. They were trying to figure out what kinds of vocalizations the cockroaches were making while mating, and bouncing a laser off their shell was by far the easiest way to record those very quiet sounds.

It's been a few years, but I think this was the manufacturer: https://www.polytec.com/us/vibrometry/products/

I wouldn't be surprised if the non-laser techniques eventually show up as off-the-shelf products too, simply to reduce costs. A simple camera could be cheaper than a laser, and might have the advantage of being able to record sounds off multiple points at once.


Also, I think a big advantage for the camera-based solutions is that they're totally passive, while laser-based need to interact with the object.


To make a camera-based solution practical, it seems like one would need a specialized camera with modest resolution but a very high frame rate. Without enough frame rate, you can't capture high frequencies. Even if speech could be discerned from a spectrum that rolls off at 300 Hz, well under the peak energy of a typical human's speech, you would still need many hundreds of frames per second.
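
To put rough numbers on that: with one sample per frame, the sampling theorem says the frame rate has to exceed twice the highest frequency you want to keep. A minimal Python sketch, where the function name and the 3400 Hz example are my own illustration and not taken from the article:

```python
def min_frame_rate(max_freq_hz: float) -> float:
    """Lower bound on frame rate (fps) for capturing content up to max_freq_hz.

    Assumes one sample per frame (no rolling-shutter tricks). Strictly, the
    frame rate must exceed this bound rather than merely equal it.
    """
    if max_freq_hz <= 0:
        raise ValueError("max_freq_hz must be positive")
    return 2.0 * max_freq_hz


if __name__ == "__main__":
    # 300 Hz is the figure from the comment above; 3400 Hz is the traditional
    # upper edge of telephone-quality speech, used here only as an example.
    for freq in (300.0, 3400.0):
        print(f"{freq:.0f} Hz -> more than {min_frame_rate(freq):.0f} fps")
```

So even the 300 Hz case needs more than 600 fps under this simple model.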


If you watch the video at the end, you'll see that they actually exploit the fact that CMOS pixels are read out sequentially and use the data from each row to extract surprisingly decent high frequency data from even 60 Hz video!


The article says they were able to use the rolling shutter on standard cameras to extract data at a much higher frequency than the nominal frame rate would allow.


The article talks about this. The rolling-shutter effect means that you can get information out at higher than frame rate.


Pretty cool idea but I would be a little worried that the laser itself could influence the mating habits due to heat/distraction etc.


The laser thing has been around for a while and is quite well known. The innovation here is that they found a way to do it without any lasers and possibly even with consumer cameras.


Bill Freeman, the PhD supervisor on this project, is a really creative computer vision researcher. Read through his CV: he has a great ability to pick interesting problems slightly outside the mainstream. I wish I had that ability.


Also a really cool application of this rolling shutter is video of guitar strings:

https://www.youtube.com/watch?v=8YGQmV3NxMI

(Note that real guitar strings don't vibrate in this way; they vibrate across their entire length, with limited higher-order harmonics. The visible waveforms here are an artifact of the rolling-shutter capture process.)


This could help settle debates like the one about the moon landing. Take this video of the flag moving (https://www.nasa.gov/multimedia/hd/apollo11_hdpage.html - the highest-definition version I could find) and extract the sound, if there is any. http://people.csail.mit.edu/mrub/vidmag/ is the site that he mentions at the end, and the TED talk goes into more detail about the workflow.

Not a programmer so not sure how to use the code to implement it. Any takers?


This is going to do nothing to solve the "debate". Moon landing deniers will find an excuse to dismiss this evidence like they have for the rest.


The conspirators purposefully made the flag from nylon, rather than candy wrappers, in order to foil future attempts of sound extraction.


Am I missing something? Wouldn't there be no sound to extract because the moon has no atmosphere?

Or maybe that's your point — if it were faked, there would be sound, otherwise not.


As the original footage was done on film, the time difference between the top and bottom of a single frame is zero, unlike a "rolling shutter" progressively scanned sensor. The framerate is also much lower than the 60fps phone camera used in the last example from the MIT video, which was only barely enough to get meaningful data out.

So unfortunately, we probably won't be using this to get much out of old traditionally archived footage - not because of resolution of the frame, but time resolution.


For more details, full paper can be found at https://dspace.mit.edu/handle/1721.1/100023


Laser vibrometers are neat. They can also be used to sniff keystrokes remotely by pointing them at a laptop lid. That research was also released before this paper.[1]

[1]https://www.youtube.com/watch?v=xKSq9efXmh8


The linked article describes a passive monitoring system achieving the same result, no laser required.


Even more "scary" (if it really works): using reflected microwaves in a windowless room in a similar way: https://www.eetimes.com/document.asp?doc_id=1274746


I wonder how much archival footage we might be able to give a second look knowing what we know now.


I'm assuming it would need to be extremely high frame rate video to extract decipherable voice data. Even at 60 frames per second, which I understand is at the high end for normal video, you get a maximum frequency of only 30 Hz, which is nearing the low end of human perception.

So it's interesting and a definite risk if an attacker is supplying their own camera, but definitely does NOT mean you can pull voice data from the vibrating tablecloth in a YouTube video, right?
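
For what it's worth, the 30 Hz figure is the Nyquist limit for one sample per frame. Tones above it don't disappear; they fold back to a lower apparent frequency. A minimal Python sketch of that folding, assuming ideal uniform sampling with no rolling shutter (`alias_frequency` is a name made up for illustration):

```python
def alias_frequency(signal_hz: float, sample_rate_hz: float) -> float:
    """Apparent frequency of a pure tone after uniform sampling.

    Frequencies fold around multiples of the sample rate, so the result
    always lies between 0 and sample_rate_hz / 2.
    """
    if sample_rate_hz <= 0:
        raise ValueError("sample_rate_hz must be positive")
    folded = signal_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)


if __name__ == "__main__":
    # Example tones are arbitrary; 60 fps is the frame rate from the comment.
    for tone in (25.0, 35.0, 300.0, 440.0):
        print(f"{tone:.0f} Hz filmed at 60 fps looks like "
              f"{alias_frequency(tone, 60.0):.0f} Hz")
```

Under this model a 440 Hz tone filmed at 60 fps would show up as a 20 Hz wobble, and a 300 Hz tone would look like no motion at all.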


Not from a YouTube video, because the video compression obliterates the small details needed to recover the frequency. But the article notes that a rolling shutter allows even a 60 fps camera to capture higher frequencies. Each scan line of the captured video is a sub-frame time slice. If the object occupies a hundred scan lines, then you have 100 slightly shifted 60 Hz samples that you can combine to reconstruct higher frequencies.
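
Here is a toy Python simulation of that idea, not the authors' actual method. It assumes each row yields one clean displacement sample and that rows are read out evenly across the whole frame period; real sensors typically pause between frames, which leaves gaps this sketch ignores. All names and numbers are illustrative.

```python
import math


def row_sample_times(frame_rate: int, rows: int, n_frames: int) -> list[float]:
    """Timestamp of every row, assuming evenly spaced readout with no gap."""
    row_dt = 1.0 / (frame_rate * rows)
    return [(f * rows + r) * row_dt for f in range(n_frames) for r in range(rows)]


def dominant_frequency(samples: list[float], sample_rate: float) -> float:
    """Frequency (Hz) of the strongest non-DC bin of a naive DFT."""
    n = len(samples)
    best_bin, best_power = 0, 0.0
    for k in range(1, n // 2):
        re = sum(s * math.cos(2 * math.pi * k * i / n) for i, s in enumerate(samples))
        im = sum(s * math.sin(2 * math.pi * k * i / n) for i, s in enumerate(samples))
        power = re * re + im * im
        if power > best_power:
            best_bin, best_power = k, power
    return best_bin * sample_rate / n


def recover_tone(tone_hz: float, frame_rate: int, rows: int, n_frames: int) -> float:
    """Sample a pure tone once per row and estimate its frequency."""
    times = row_sample_times(frame_rate, rows, n_frames)
    samples = [math.sin(2 * math.pi * tone_hz * t) for t in times]
    return dominant_frequency(samples, frame_rate * rows)


if __name__ == "__main__":
    # 6 frames at 60 fps with 100 rows each: 600 samples over 0.1 s.
    print(recover_tone(440.0, frame_rate=60, rows=100, n_frames=6))
```

With 100 rows per frame the effective sample rate in this idealized model is 6000 Hz, so the 440 Hz tone is recovered even though it sits far above the 30 Hz limit of one sample per frame.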


They show a technique which exploits the rolling shutter effect to recover frequencies at up to five times the frame rate. On the other hand, they also mention that artifacts in highly compressed video stop their techniques from working, so any secrets in your YouTube vids are probably safe.


Depends what's on camera and how smart your software is. A full wavelength of audible sound is between a few centimetres and a few tens of metres. If you have a few objects in scene at such known distances, you can calculate how far out of phase they are, interpolate them and piece together a much higher effective sampling rate.
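
As a rough check on those numbers, wavelength is just the speed of sound divided by frequency, and an object farther from the speaker sees the same wave with a phase lag. A small Python sketch, assuming sound in air at about 343 m/s and a plane wave travelling straight along the line between the objects (the function names are illustrative):

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed: dry air at roughly 20 degrees C


def wavelength_m(freq_hz: float) -> float:
    """Wavelength in metres of a tone at freq_hz in air."""
    return SPEED_OF_SOUND_M_S / freq_hz


def phase_offset_rad(freq_hz: float, extra_distance_m: float) -> float:
    """Phase lag (0 to 2*pi) at an object extra_distance_m farther from the source."""
    delay_s = extra_distance_m / SPEED_OF_SOUND_M_S
    return (2 * math.pi * freq_hz * delay_s) % (2 * math.pi)


if __name__ == "__main__":
    # 20 Hz and 20 kHz are the usual nominal limits of human hearing.
    for freq in (20.0, 1000.0, 20000.0):
        print(f"{freq:.0f} Hz: wavelength {wavelength_m(freq):.4f} m")
    # Two objects 0.5 m apart, 343 Hz tone: half a wavelength, so about pi.
    print(phase_offset_rad(343.0, 0.5))
```

Whether those phase offsets are enough to interleave samples in practice is a separate question; this only shows the geometry.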


I thought this was as old as continuous-wave lasers, albeit the original idea was bouncing it off a window.


If smartphone accelerometers become more advanced, perhaps they can be used in the same way :)


I'm trying to work this one out in my head ...

You've left a smartphone lying around and you want to read accelerometer data to determine what was said in proximity. Just turn on the mic instead.

You've left a smartphone lying around to remotely read light bouncing off it ... accelerometer not required. Also, just turn the mic on instead.


Apps don't have mic access universally. It has to be granted by the user.


The user leaving the phone behind for the purposes of snooping on the room.


Well fine, but who do you listen to? Is face to face interaction still a thing?


What is a viable defense against this attack? Some kind of white noise machine?


Abstain from candy. And coke. And *checks article again* bricks.


A private windowless room.


Skype with headphones and a keyboard for the secret stuff?


You can point a laser at a keyboard and log the keystrokes; you can probably do the same by pointing one at a headphone.



creepy


[flagged]


You've posted a lot of unsubstantive comments to Hacker News. We eventually ban accounts that do that, so would you please stop?

The idea here is as follows. If you have a substantive point to make, make it thoughtfully; if you don't, please don't comment until you do.

https://news.ycombinator.com/newsguidelines.html


The paper can be found at [1]. Did you have a specific grievance with their procedure/conclusions, or just baseless cynicism?

[1] https://dspace.mit.edu/handle/1721.1/100023


Your bullshit meter is too sensitive.



