Hacker News
Using ultrasound attack to disarm a smart-home system (theregister.com)
143 points by sohkamyung on April 5, 2023 | 36 comments



> could be remotely hijacked, using carefully crafted near-ultrasonic sounds, and forced to make unwanted phone calls and money transfers, disable alarm systems, or unlock doors.

I know Siri is much too dumb to facilitate a money transfer…


Getting Siri to actually do what you want by standing close by and shouting tends not to work - not sure how some ultrasonic whisper is going to get the job done.


At least with Alexa devices disabling alarm systems and unlocking doors requires a PIN, mostly to defend against the much more low tech attack of shouting through the victim's letter box.


For six months Siri interpreted my requests to call anyone as "Call Paul." It still might, but at that point I gave up and changed Paul's name in Contacts to stop accidentally bothering the poor guy.


This doesn't make any sense to me.

Why would voice recognition software be interpreting ultrasonic (or near-ultrasonic) signals at all?

First, it doesn't make sense they'd be trained on them. So why would models be interpreting these as speech at all?

And second, it doesn't make sense they'd make it from the microphone to the recognition engine -- surely there's a low pass filter in there to remove all extraneous noise above the vocal range?

I don't get it.

(Edit: could it be some kind of downsampling aliasing artifact that is interpreted as normal vocal frequency, precisely because they skip a low pass filter that would prevent it?)
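For what it's worth, the aliasing hypothesis is easy to demonstrate. A minimal numpy sketch, assuming a hypothetical pipeline that captures at 48 kHz and naively decimates to 16 kHz for the recognizer with no anti-aliasing filter: a 20 kHz tone folds straight down into the voice band.

```python
import numpy as np

fs = 48000
t = np.arange(fs) / fs               # one second of audio
tone = np.sin(2 * np.pi * 20000 * t) # 20 kHz, near-ultrasonic

# Naive decimation to 16 kHz: keep every 3rd sample, no low-pass filter
naive = tone[::3]

spectrum = np.abs(np.fft.rfft(naive))
freqs = np.fft.rfftfreq(len(naive), d=3 / fs)
peak = freqs[np.argmax(spectrum)]
print(peak)  # 4000.0 — the 20 kHz tone aliases to 4 kHz, mid voice band
```

With a proper low-pass filter before decimation, the tone would simply vanish instead of reappearing at 4 kHz.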


Why the heck wouldn't they have the wake-word listener confine itself to human-audible frequency ranges? Seems like that would be a really simple fix with zero loss of real-world functionality....


It might be that although the sound is inaudible to humans, it gets distorted around and inside the device, and that distorted sound is at audible frequencies that the alexa microphone picks up. Something like how these ultrasonic directional speakers work: https://www.holosonics.com/what-makes-a-sound-source-directi...
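One proposed mechanism for that is nonlinear demodulation: if the mic/amplifier chain has even a small second-order nonlinearity, an audible tone amplitude-modulated onto an ultrasonic carrier gets demodulated back into the audible band. A toy numpy sketch (the `0.2 * x**2` term is an arbitrary stand-in for real hardware distortion, not a measured value):

```python
import numpy as np

fs = 192_000
t = np.arange(fs) / fs
fm, fc = 1_000, 30_000  # audible 1 kHz tone AM-modulated onto a 30 kHz carrier
x = (1 + 0.8 * np.cos(2 * np.pi * fm * t)) * np.cos(2 * np.pi * fc * t)

mic = x + 0.2 * x**2  # hypothetical second-order nonlinearity in the mic chain

def level(sig, f):
    # amplitude at frequency f (1 Hz bins for a 1-second signal)
    return np.abs(np.fft.rfft(sig))[int(f)] / len(sig)

print(level(x, fm), level(mic, fm))
# the clean signal has no energy at 1 kHz; the distorted one does
```

In the clean signal all the energy sits at 29, 30, and 31 kHz; after the squaring term, a 1 kHz component appears that any speech recognizer would hear.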


Yeah, it's weird. There was already a scare about ultrasonics being used for ad tracking in 2016 [1] so it's not like this is an unknown attack vector, and I thought the subsequent patch efforts had already added filters that stopped the phones listening on ultrasonics.

I also remember seeing a presentation on the first gen Echo which went into its noise cancelling tech, making sure that stuff coming out of the speaker wasn't received by the mic, so the success of the speaker-to-mic attack vector also seems totally bizarre.

1: https://www.wired.com/2016/11/block-ultrasonic-signals-didnt...


That's on purpose. They created a specific inaudible frequency that Alexa listens for which causes it to ignore the wake word. This is how they keep from annoying everyone and also blowing up their own servers if, for example, they want to run an Alexa commercial during the Super Bowl.


> Reddit user aspyhackr may have figured out the trick Amazon uses here. Apparently, the Alexa commercials are intentionally muted in the 3,000Hz to 6,000Hz range of the audio spectrum, which apparently tips off the system that the “Alexa” phrase being spoken isn’t in fact a real command and should be ignored.

Seems to be the inverse - if the wake word _lacks_ these frequencies, then Echos ignore it.
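If that Reddit claim is right, the check could be as simple as a band-energy ratio. A hypothetical sketch of such a detector, using white noise as a stand-in for real wake-word audio and notching the 3-6 kHz band the way a broadcast version supposedly is:

```python
import numpy as np

rng = np.random.default_rng(0)
fs = 16_000
noise = rng.standard_normal(fs)  # stand-in for one second of "live" speech

# "Broadcast" version: zero out 3-6 kHz, the band Echos reportedly check
spec = np.fft.rfft(noise)
freqs = np.fft.rfftfreq(len(noise), 1 / fs)
spec[(freqs >= 3000) & (freqs <= 6000)] = 0
notched = np.fft.irfft(spec)

def band_ratio(sig):
    # fraction of total energy that falls in the 3-6 kHz band
    power = np.abs(np.fft.rfft(sig)) ** 2
    f = np.fft.rfftfreq(len(sig), 1 / fs)
    return power[(f >= 3000) & (f <= 6000)].sum() / power.sum()

print(band_ratio(noise), band_ratio(notched))
# live audio has plenty of 3-6 kHz energy; the notched version has ~none
```

A device could then ignore any "Alexa" utterance whose 3-6 kHz ratio falls below some threshold.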


Yes, I had misremembered it. That does leave the mystery of why they are listening for inaudible frequencies. I wonder if it might have to do with their plans for devices that can connect to other connected devices when they aren't connected to the user's wifi.


Listening for inaudible words surely isn't on purpose?

Listening on a frequency to disable wake-word detection is a whole other thing...


The article taught me that I can control Siri’s feedback volume by saying, for example, “Hey Siri, speak 25 percent.”


I had no idea about that. Quite tickled by it stopping and asking if I'm sure when I asked to speak at 100 percent.


I learned that too, but now I'm wondering why her voice suddenly went silent a few weeks ago...


I also experienced Siri talking at what seems like 1% volume recently, out of no change that I can recall making. Just learned how to fix that through these comments, thanks.


This has been driving me mad. It seems certain actions have had all audible feedback disabled, but there's no real sense to which ones.

When I say across a room to set a reminder or add something to my shopping list my Homepods will just silently do so, with no indication but a flash of the screen. I have no idea if it's registered what I was saying or not.

When I ask to turn on the lights in a room, it'll do a bing-bong noise at me to indicate that it's registered despite the fact I can see the lights turning on. It's utter nonsense.


My guess is that the combination of a power button long-press continued by pressing the volume down button is something that might happen accidentally while you have the iPhone in your pocket.

Once Siri is active, the hardware volume buttons control the feedback volume.


Reminds me of this from a couple years ago where researchers figured out they could use lasers to do the same thing:

https://cse.engin.umich.edu/stories/researchers-take-control...


> It's also worth noting that the length of malicious commands must be below 77 milliseconds — that's the average reaction time for the four voice assistants across multiple devices.

I don't get it. Why? You can speak to smart assistants for much longer than that, can't you?


This had me confused at first too. Basically you are playing audio on the device, and that audio sends a command to the same device. If I’m watching a video on my phone and say “Hey Siri”, it will interrupt the audio playing from my phone as it listens to me. But it isn’t instantly interrupting the audio; it apparently takes about 77 milliseconds.


Ah I see, thank you!


What I'm wondering is how a useful command can be expressed in 77 ms?


"weather"


Phreaking is back?


Will the right frequencies in the right order get me free Amazon delivery?


They tried this before on a smart-hotel in Havana.


This makes me wonder if the research was inspired by the clever "prompt injection" attacks in the ChatGPT/LLM world (or vice versa).


Of course. And even the attack itself was proposed by chatgpt.


Quite sure the microphones on any smart speaker or phone don't have the build quality to 'hear' near-ultrasonic audio.


Near ultrasound audio doesn't need any special microphone. I just tested with my Samsung Galaxy S3 (i9300 version) and it has no trouble detecting 20kHz. I don't see why newer phones couldn't do the same.
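If anyone wants to run the same test, here is a short sketch that writes a one-second 20 kHz test tone to a WAV file using numpy and the stdlib wave module (48 kHz sample rate so the tone is below Nyquist; filename is arbitrary):

```python
import wave

import numpy as np

fs = 48_000  # sample rate must exceed 40 kHz to represent a 20 kHz tone
t = np.arange(fs) / fs
# half-amplitude sine, scaled to 16-bit PCM
tone = (0.5 * np.sin(2 * np.pi * 20_000 * t) * 32767).astype(np.int16)

with wave.open("tone20k.wav", "wb") as w:
    w.setnchannels(1)    # mono
    w.setsampwidth(2)    # 16-bit samples
    w.setframerate(fs)
    w.writeframes(tone.tobytes())
```

Play the file near the phone and watch any spectrum-analyzer app: if a 20 kHz peak shows up, the mic is picking it up.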


20kHz is not ultrasonic.


I simply do not expose critical switches to Siri from Home Assistant.


I only have Siri kick in when I press the button.


I disabled it after a potentially very awkward event. I was talking to my wife when Siri on my Apple Watch misunderstood me, triggered itself, and sent a text saying "I love you" to our previous maid (whom we had fired, for reasons).


Years ago a classmate in college was showing Siri to one of our teachers. As a joke he said "Siri, I love you!" and Siri replied "I'm sorry, I can't find a location for <teacher's daughter>".



