Ultrasonic Payments (charliegerard.dev)
181 points by ofou on June 8, 2022 | 90 comments



I'm reminded of Google Tone[0], which beamed URLs audibly to nearby browsers in an AirDrop-style experience. A neat trick, but ultimately useless given that most devices have less obtrusive ways of sharing data P2P. The ultrasonic aspect of this experiment makes the technology a lot more useful.

Tone also came with the unfortunate side effect of Google software having constant access to your microphone.

0: https://chrome.google.com/webstore/detail/google-tone/nnckeh...


I disagree that there are better ways to transmit data P2P on most devices. Even the smoothest setups for things like IoT devices usually require a Bluetooth connection with a passcode, or a connection to a lone wifi signal after which you need to switch back. Using sound would be a perfect way to set up IoT speakers or really anything close by. Instead of having to activate Bluetooth pairing and find the sometimes strangely named device among the dozen or so devices that inevitably show up, ultrasound would be a great way to transmit the initial credentials.


The key exchange problem doesn't go away because you're using ultrasound.


Correct, although it's not that it goes away; it's more that it has different properties. In the short time I worked in the IoT space, I was a big proponent of exploring an option like this for bootstrapping the wifi connection in a device that otherwise had just a button or two.

The basic problem is, the wifi password needs to be shared with a device without an interface.

The traditional method at the time was for the device to boot with an unsecured wireless network. An app on the phone connects to that network, shares the wifi password, and then rendezvouses on the wifi network to continue provisioning. I think there are some protocols for this that I don't fully remember now. There's a lot to consider in this method: can someone sniff the transfer, how is that protected, what is the range to pick it up, can someone trick the app into connecting to some other device, and so on. And if you put mitigations in, how can those be countered?

With sound / ultrasound based provisioning (the device I'm talking about already had a microphone and speaker), the range is limited to whoever can hear the device. The signal is much weaker, so an eavesdropper might need to be in the same room. This might make weaknesses in the overall model of key exchange less of a concern, as the properties change to something a lot harder to intercept.


It feels to me that this is better solved with NFC. The hardware is in principle cheaper with NFC (certainly, cheaper transducers) and my understanding is that NFC is more robust to snoopers. Ultrasound, by contrast, is explicitly broadcast.


I worked on this a few years ago, so NFC wasn't a viable option at the time. I think even today, some research would need to be done on how widespread support is; we can't guarantee everyone has a brand new phone. But I agree NFC would be worth serious consideration now.

When considering robustness of support, one nice thing about the wifi approach is that in theory you can fall back to just a web browser. So basically any device could be used for provisioning.

And the audio method would probably support a wide range of devices. We didn't actually consider ultrasonic, just an audio codec that was in use at the time and doesn't sound awful.


I’m kind of bummed nobody has copied how they transfer files on The Expanse. For those that haven’t watched, if someone is standing across from you, you simply swipe it to them.

I feel like Apple’s ‘Find my’ tech could work to get positional data and make it possible. You’d need whitelists and the like of course.


AirDrop is what you're looking for.

The issue at play here is that most devices have no concept of locality or relative positioning. They may know (at best) the approximate distance between themselves and another device based on the type of transmission, signal level, noise levels, etc.

Phones also can't understand intent. How does it know whether you mean your friend or the person sitting behind your friend?


I use AirDrop and it works well, but it's still not seamless.

I think there would still be a quick confirmation button listing who owns what phone you're sending it to, and it could default to people you know in vague situations - but generally whoever is closer.


> Tone also came with the unfortunate side effect of Google software having constant access to your microphone.

In today's world, most phones or "smart" devices are also constantly listening; I want to believe they don't listen until the trigger phrase is uttered, which could also be implemented for these ultrasonic applications, but I'm not entirely convinced and them always listening is but a silent over-the-air update or setting change away.


> I want to believe they don't listen until the trigger phrase is uttered

They can't know whether the phrase was uttered unless they constantly listen.


Anthropomorphic language is unhelpful here. Filling a ring buffer and running transforms on it looking for a certain shape is one kind of 'constantly listening'; recording a faithful audio stream and sending it to the cloud is another kind. They do the former.


To be fair, the initial phrase recognition probably wouldn't require 'listening' as such - the trigger phrase could be a very simple program that doesn't have the capability to do anything other than recognise a keyword, and then it bootstraps a program that actually listens to you when it detects it


This is exactly how it works.

The microphone is active and "listening" all the time.

The firmware that detects the wake word compares the constant input stream against waveforms that are designated "wake words". Firmware can sometimes be updated for custom or trained words, but it doesn't hold a large dictionary.

If a reasonable match is found, it kicks off the full recording/recognition/streaming code, squirts any buffered audio at it (to catch words that come directly after the wake word and before the full handler is ready), and then things proceed according to plan. Depending on the device and service, recognition might happen locally or in the cloud.
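
A toy sketch of that flow in Python, for illustration only. The class name `WakeWordGate`, the `matches_wake_word` callback, and the use of strings as stand-ins for audio frames are all assumptions; real firmware compares waveforms and is not structured like this.

```python
from collections import deque


class WakeWordGate:
    """Toy model: a small ring buffer guarded by a wake-word matcher."""

    def __init__(self, buffer_frames, matches_wake_word):
        # deque(maxlen=...) silently drops the oldest frame when full,
        # so nothing older than buffer_frames is ever retained.
        self.ring = deque(maxlen=buffer_frames)
        self.matches_wake_word = matches_wake_word  # hypothetical detector
        self.awake = False
        self.handed_off = []  # frames passed to the "full" handler

    def feed(self, frame):
        if self.awake:
            # After the wake word, every frame goes to the full handler.
            self.handed_off.append(frame)
            return
        self.ring.append(frame)
        if self.matches_wake_word(list(self.ring)):
            self.awake = True
            # Hand over whatever is still buffered, then start streaming.
            self.handed_off.extend(self.ring)
            self.ring.clear()


# Strings stand in for audio frames; the matcher is a placeholder.
gate = WakeWordGate(3, lambda frames: frames[-2:] == ["hey", "device"])
for frame in ["a", "b", "c", "hey", "device", "play", "music"]:
    gate.feed(frame)
```

In this sketch the frames "a" and "b" fall out of the ring buffer before the match and never reach the handler, which is the distinction the parent comments are drawing.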


IIRC, Webex Teams used to do something like this if you walked into a conference room that had a Webex teleconference setup. I haven't used it in a few years, though.



Google also used ultrasonic pairing/auth for Chromecast https://www.theverge.com/2014/6/26/5846726/chromecast-will-u...


This is such a cool experiment. Anything that involves sound like this is extremely interesting to me (I have a cochlear implant). Maybe I can hack it so I know when people are making payments :D


Aw, can you give an idea of how much nuisance such devices can cause you daily? Also, is there a way to prevent cochlear implants from detecting these frequencies? (Do only some cochlear implants do this, or most of them?)

I really wanna use tech like this sometime in the future, but I also wanna care about accessibility.


I also have a cochlear implant; it doesn't allow me to hear ultrasonic frequencies -- in fact, my hearing range is still a subset of "normal."


I have one as well, and I've noticed that area proximity detectors (in hallways used to detect presence/motion) do get picked up and come through as loud pops.


There are also electricity poles with a strong EM field that cause a humming sound.


How loud? These are audible to me without an implant (unless some are audible and some are inaudible)


You might be hearing a relay, which is itself switched by the proximity sensor.


This reminds me of how the student meal subsidies are implemented in Slovenia, and in my opinion it was quite unwieldy. You call a phone number and place your phone's earpiece on another device with a microphone. Then, some personal data is transmitted using (ultra?)sound. I remember it being quite unreliable, but that might be down to using the telephone network as the data carrier.


I remember back in the day, during the eighties, one of the radio stations broadcast ZX Spectrum games over the air. You would record that noise to a tape, and later you could load it on your ZX and play. It worked remarkably well.


Generally datasette formats were meant to work with the extremely lossy media of tape. Also, I'm not sure but telephone audio tends to be much more compressed compared to radio. That on top of the fact you are going from earpiece to mic with an air gap and the background noise, it's probably much much worse.


Regular telephone audio is usually filtered to 0.3-3.4 kHz (this is on analog phone lines). Digital phone lines use PCM with A-law (or µ-law in the US and some other places) logarithmic sample encoding at an 8 kHz sampling rate, giving more or less the same result as an analog phone line, possibly with less distortion. FM radio has much higher bandwidth, usually up to around 15 kHz, and with better bass too.
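
To put rough numbers on that, here is a small Python sketch. It uses the continuous µ-law companding formula with µ = 255 and assumes 8 bits per sample; it is not the exact segmented encoding that real telephone codecs implement, and the function names are made up for this example.

```python
import math

MU = 255  # µ-law parameter commonly used with 8-bit samples


def mu_law_compress(x):
    """Compress a sample in [-1, 1] logarithmically (continuous µ-law)."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def mu_law_expand(y):
    """Inverse of mu_law_compress."""
    return math.copysign(((1 + MU) ** abs(y) - 1) / MU, y)


SAMPLE_RATE_HZ = 8000
BITS_PER_SAMPLE = 8  # assumption: one byte per companded sample

# Highest frequency an 8 kHz sample rate can represent (Nyquist limit).
nyquist_hz = SAMPLE_RATE_HZ / 2
# Raw bitrate of one digital voice channel.
bitrate_bps = SAMPLE_RATE_HZ * BITS_PER_SAMPLE

# Quiet samples are boosted before quantization, loud ones are squeezed.
quiet_in, quiet_out = 0.01, mu_law_compress(0.01)
```

The 4 kHz Nyquist limit is why anything ultrasonic cannot survive a phone line, and the logarithmic curve spends most of the 8 bits on quiet samples.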


> (ultra?)sound

Well it's not ultra I'd say, that screeching is quite audible. Very similar in sound to an old phone modem.

I think the system it uses is the same as for Moneta, which can be used in much the same way but gets billed to the sim account instead. I'm sure other countries also use the same principle for some services.


> using (ultra?)sound

Phone lines have (or at least had) narrow frequency ranges. I'm not an expert but I'd assume this is just normal sound like any old modem.


Reminds me of Clinkle (https://en.wikipedia.org/wiki/Clinkle) which hoped to use sound to transmit payments.


Key difference, this one has demonstrated it working!

Jokes aside, I'm sure Clinkle had something working in a demo form but obviously the problem is sound as a digital communication medium is terrible outside of specific use cases (air gap attacks?).

Clinkle is up there in the who's who of blow ups.


re clinkle: i have definitely witnessed it working in person. not that that says anything in particular about clinkle. and yeah, knowing a lot of the people who worked there i definitely feel great delight and kinship with that particular blow up. and i worked at Color (Labs) for a bit, which, in terms of blow ups was like Clinkle before Clinkle, so i feel like i've had front row tickets to two of the best ones from that era! :)


we did have it. as a former Clinkle employee I can confirm we had infrasound payments in the app and backend support for it. (I was on backend)


Also Chirp. I wrote an error correcting modem for them. Getting a simplex packet from A to B is the easy part. But any kind of bidirectional protocol (say for flow control) falls apart in an acoustic medium because duplex operations are thwarted by echo and phase. It's a long time since I played with the idea but I reckon you could do a lot more now with phones that have mic arrays and enough CPU power to do more serious DSP.


as someone who worked at Clinkle (on backend) we did indeed have this early on. it was one of the demos used during my interviews to show me the app. The feature used infrasound instead of ultrasound like this article.

unfortunately coffee grinders and such would mess with transmission. the sound engineer working on it also left shortly before I started so the feature never got fixed and was eventually killed.


I will say this is a cool project. However, the implementation of the idea in reality is even worse than NFC. While NFC is useful in many contexts, it requires you to put a Faraday cage around your card and manually turn NFC on or off to not get your money stolen, and even then it can be circumvented. Relaxing the distance constraints further makes it even easier to steal money. I'd imagine a lot of the engineering would go into making it safe (having some setup such that the ultrasonic sounds vary to some degree... then maybe even an authentication protocol... then... then... then...).

The moment I saw the headline, I was incredulous. The actual project implementation in the article is a rather cool hack, so props to the author. I dread the day someone actually picks this up, assuming someone does actually choose to.


Cool technology, bad problem. I fully agree that getting more distance is problematic, not only in the sense of getting your data stolen, but also because payment should be proactive. You want the customer to take an action (holding the card close to a device which shows the amount deducted) to conclusively agree to that payment. You don't want the cashier to press a button and have anybody standing too close pay for it.


modem/phone couplers, anyone?


Phone carriers cut off the frequency spectrum.


i'm just referring to the acoustic isolation from the mechanical coupler. obviously (?) the passband for telephone comms is far short of ultrasonic


What do you mean by acoustic isolation?



Yes, but what can ultrasonic do here?


The complaint was that the wrong payment can be made, because audio leaks.


yes. Now I understand, but then the stated advantage of longer distance is also gone...


How would you “steal” money from a contactless card or a phone?


1) Gain access to something like a Stripe Terminal (https://stripe.com/gb/terminal) You should probably avoid using your real identity here.

2) Type in a charge like $50

3) Discreetly wave the device at your target's wallet

4) Repeat steps 2-3 as much as possible in a short amount of time.

5) Hope you can withdraw the funds before anyone notices.

I don't think this is a wildly plausible attack, and at least here in the UK your target's card issuer takes 100% liability for fraudulent charges.


This attack (and some variants of it, e.g. fooling the proximity detection or man in the middle) works because the acknowledgement action that the user does is simply having the device nearby. This seems like a poor choice of acknowledgement action for something that transfers money. Payment devices should probably have a physical or soft button that you have to press to acknowledge payment.


Strong disagree. The usability hit is not worth the added security. Having a cutoff for PIN entry requirement and the card issuer taking responsibility for fraud means customers are quite safe (as long as they look at their charges).


Work could be done to make it more usable. With a phone, it could be a button you could press just by holding it. With a smart watch, it could be hooked into any kind of bluetooth sensor. The point is that in normal society, you don't have that much control over who and what gets into proximity with you, and having a system where anything that does get into proximity can take money from you without you even acknowledging that in any way is just a bad way of doing things.


You could do something like "you need to be physically holding the card with your hand", which would complete some circuit. I can't think of many cases where that wouldn't work, except perhaps people who don't take their cards out of their wallets(?).


> 1) Gain access to something like a Stripe Terminal (https://stripe.com/gb/terminal)

Getting a payments terminal is not easy. It requires ID verification and a working business bank account (acquirer), and these terminals are highly regulated. Someone doing this can get caught easily by just a couple of customers reporting the fraudulent transactions. This is a very small risk and is rarely seen.


> 3) Discretely wave the device at your targets wallet

Phone payments generally require the phone to be unlocked.

Also, it's a credit card transaction: the user will complain later, attacker will get into legal trouble, and user will be refunded fully.


Most new wallets act as Faraday cages.


For example: NFC Proxy?


Interesting experiment, although it seems that most of Asia has settled on NFC (card emulation or token), QR, and 2D barcodes for offline (at least to customers) payments, which are more practical to deploy using existing infrastructure. I know Tez (aka Google Pay in India, which was pushed in other countries) has implemented it. Indians, does Google Pay still have ultrasonic transfers?


Google Pay in India no longer has any of those features. In fact it has been completely reworked to support UPI payments [1].

And it has completely transformed how payments happen. Payments happen directly from bank account to bank account regardless of what app you're using (Samsung Pay, PhonePe, GPay, your bank's app). You can use QR codes or a simple username@bank to make payments.

I am not aware of any payment systems that support the use of NFC (NFC is not that commonplace for anything except cards, and I have had trouble setting up cards in Google Pay). One of the biggest benefits of QR based payments is that they do not require any special hardware, just a mobile phone and Internet access (which is very common and accessible here). And they have little to no transaction fees. So they have been quickly adopted by literally everyone.

[1] https://en.m.wikipedia.org/wiki/Unified_Payments_Interface


> I am not aware of any payment systems that support the use of NFC

I basically lumped the whole of Asia into a melting pot, whoops. While you're correct that India's UPI doesn't have an NFC mode, other countries do (NETS of Singapore and various systems in Japan for example).


Oh yeah, I think I got the idea. The initial direction with apps like Tez too was trying to use NFC for payment. That just didn't catch on, which I'd probably pin on the fact that the majority of the population here uses low-to-mid-range smartphones, which often skip NFC support.


Tez did for a while support UPI and ultrasonic. It just used sound to transfer the UPI ID


Pedantic, but I believe QR is a type of 2D barcode, but your comment implies the contrary. Am I wrong?


Ugh, I definitely meant 1D barcodes of course, but you get the point.


I think it’s cool but with payments you’ll have to address security considerations. How can you prevent someone from spoofing a transaction or spoofing audio for payments?


This specific demo doesn't actually send payment data over audio, just a link to pay from. Post-payment would still go through signed Stripe webhooks like a normal online payment.

Seems like an interesting way to start a transaction without needing to buy any specialized equipment.


You don’t need to read the payment data. I think a phishing/spoofing attempt may be possible by playing a louder or directional ultrasonic message to introduce an alternate payment url. Or you may be able to accomplish a denial of service via jamming.

(If someone knows more, please step in and comment.) I’m guessing some of these could be issues with NFC too, but from what I have skimmed online it seems both the tag and receiver would need to be modified to work at larger than normal distances of “1-5cm”[1]. Also, much more power is needed to extend the range of NFC than sound, since sound strength diminishes with the square of the distance and, from what I have skimmed, the magnetic induction used by near-field NFC diminishes with the cube of the distance [2][3].

1: https://seritag.com/learn/using-nfc/nfc-tag-scan-distance-ex...

2: https://www.physicsforums.com/threads/magnetic-field-strengt...

3: https://physics.stackexchange.com/questions/44037/why-is-nea...
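
As a quick back-of-the-envelope check of that scaling argument, here is a Python sketch that takes the exponents quoted above at face value (2 for sound, 3 for the near-field magnetic coupling). The 5 cm reference distance and the 50 cm target are arbitrary numbers chosen for illustration, and the helper name is made up.

```python
def relative_strength(distance_cm, reference_cm, exponent):
    """Signal strength at distance_cm relative to reference_cm,
    assuming strength falls off as 1 / distance**exponent."""
    return (reference_cm / distance_cm) ** exponent


REFERENCE_CM = 5   # assumed normal NFC-style working distance
TARGET_CM = 50     # assumed distance an attacker would like to reach

sound = relative_strength(TARGET_CM, REFERENCE_CM, 2)     # inverse square
magnetic = relative_strength(TARGET_CM, REFERENCE_CM, 3)  # inverse cube

# How much more source strength is needed to get back to the
# reference level at the target distance under each model.
sound_boost = 1 / sound
magnetic_boost = 1 / magnetic
```

Under these assumptions, going ten times farther costs roughly a hundredfold boost for sound and roughly a thousandfold boost for the near-field case, which matches the direction of the argument above.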


Like an audio QR code, I guess.


Higher level protocol would establish handshake & encryption I imagine.


Very cool idea! I once worked on something similar [1], but even more low-tech, for the CHI student design contest [2] (we won). The idea was to do it over an IVR call, without requiring smartphones. The users were older people in India who find it hard to use and trust payments. I would love to work on this. I feel we need more projects in offline and digitally accessible payments. On one end we have Apple Pay Later and on the other we have cash.

[1] http://rohitg.in/portfolio/works/paisa.html

[2] http://st.sigchi.org/publications/toc/chi-2017-ea.html


Slightly tangential: Charlie Gerard's website is full of interesting experiments like this in the Computer-Human-Interaction space!

Recommended.


You weren't wrong! Reminds me a lot of Gwern's website (https://gwern.net/)


Almost a decade ago Alipay experimented with this https://techcrunch.com/2013/04/14/alipay-launches-sound-wave...

It seems to have gone nowhere.


This makes me think of the Amazon Dash button, which had a microphone [0] that would listen for ultrasound emitted from your phone to configure wifi credentials.

[0] http://www.blog.jay-greco.com/wp/?p=116


Ah, it looks like the repo linked in the article should actually be from their personal account, not their stripe account [0].

It also looks like one of the key components is the `quiet.js` library [1], which itself is an Emscripten port of `libquiet` [2].

[0] https://github.com/charliegerard/ultrasonic-payments

[1] https://github.com/quiet/quiet-js/

[2] https://github.com/quiet/quiet


LISNR's been solving this problem for the last 10 years. They have several patents in this space dealing with some of the noted challenges around transmitting secure data, increasing data throughput, bi-directional support, etc. The LISNR SDK is used to process millions of transactions and growing. They have made several announcements and are working with, and installed in, several POS terminals.


Couldn't get the demo to work (iphone/safari -> mac/brave; audio was playing and brave was recording).


LINE messenger uses similar technology for nearby contact discovery. Here's how it sounds.(Audio transposed to the audible range.)

https://youtu.be/GM4KP-YxKvo


LISNR is the leader in Ultrasonic Data Transfer, including payment data.

https://LISNR.com

*full disclosure I worked there 2019 - 2020


I came here to say that this is not exactly new and to point to LISNR. Although I was not exactly impressed by the payment use case, the LISNR tech is pretty cool.


It's also how the newish Furbys interact with a phone or tablet app.

And, I believe, how Cisco teleconference systems allow wireless configuration of a screen share.


Crazy to see a Clinkle demo 8 years late


I have concerns about ultrasonic pollution if this is widely adopted.


Will it cause issues with the receiver when multiple senders try to send at the same time? Won't it cause chaos in the sound waves?


This is a problem with any other kind of shared-medium emission, radio spectrum included. The way they solve this is by multiplexing, in various ways. Here's a quick google intro:

https://www.intechopen.com/chapters/66562

Some packets may get lost, but thanks to checksums we can detect and try some other clear channels/methods/etc.
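
A minimal Python sketch of the checksum part, assuming a made-up packet layout of payload followed by a 4-byte CRC-32. This is not the framing that quiet.js or any of the products mentioned in this thread actually use; the function names and the example URL are placeholders.

```python
import zlib


def make_packet(payload: bytes) -> bytes:
    """Append a big-endian CRC-32 of the payload (hypothetical layout)."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def parse_packet(packet: bytes):
    """Return the payload if the checksum matches, else None."""
    if len(packet) < 4:
        return None
    payload, received = packet[:-4], int.from_bytes(packet[-4:], "big")
    if zlib.crc32(payload) != received:
        return None  # corrupted in transit: caller can retry elsewhere
    return payload


packet = make_packet(b"https://example.com/pay")

# Simulate two senders colliding by flipping one bit of the first byte.
garbled = bytes([packet[0] ^ 0x01]) + packet[1:]

clean_result = parse_packet(packet)
garbled_result = parse_packet(garbled)
```

A receiver that gets `None` back knows the packet was damaged and can wait for a retransmission or try another channel. A checksum only detects corruption; it does nothing against someone deliberately sending a different, well-formed packet.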


Seems like you could just use algorithms used in other wireless communications.


Guide dogs seem like a non-starter.


Is this actually a problem? The length of sound is most likely pretty short as it's sending a short spurt of data (like a link). The speaker on the phone is also not very powerful so this would be akin to having someone playing music on their phone (for a very short period of time).


This is a bad idea.

A while ago Alipay had this ultrasonic payment method. I queued for a vending machine and opened Alipay in advance because I had never used this "ultrasonic payment" feature before. Then, just when the guy before me pressed the button, my Alipay accidentally paid for him. Luckily he and I wanted drinks of the same price, so he paid for mine instead.

Using QR-code or NFC for payment won't mess up like this. And later Alipay completely canceled ultrasonic payment feature. I guess many people encountered the same problem. You can't control omnidirectional ultrasound broadcast. It's very insecure.


Same thing happened with GPay for India. Ultrasonic is too unreliable at the best of times


I find it all fascinating, but at the same time, a bit sad that we must resort to such hacks in order to have an interoperable medium.

It should be easy to send packets wirelessly across devices of different platforms in 2022.



