This is irrelevant. Speech-to-text costs $0.006 per invocation (for < 15 seconds) [1], or you can solve 166 captchas for $1. There are already services out there which will solve captchas for $0.50/1000 [2], an order of magnitude cheaper. The fact that Google has a service which will do this inefficiently changes nothing about the threat/cost ecosystem. CAPTCHAs aren't about being a perfect defense, they're about increasing cost to operate at scale.
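The arithmetic behind those numbers, using the prices cited:

```python
# Back-of-envelope: cost per captcha via the speech-to-text API [1]
# versus a commodity human solving service [2].
stt_cost_per_call = 0.006   # USD per clip under 15 seconds
farm_cost_per_1000 = 0.50   # USD per 1000 solved captchas

captchas_per_dollar_stt = int(1 / stt_cost_per_call)       # 166
captchas_per_dollar_farm = int(1000 / farm_cost_per_1000)  # 2000
```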
There are free speech recognition services, such as Wit.ai, and a number of others that have generous free tiers. Also, it's likely that spammers will not bother finding a free public API, and will just use the endpoints of Google Voice Search, or any other similar service from other companies.
The unCaptcha paper and the team's research is very much relevant, because it informs the public about the effectiveness of these security systems, and it helps website admins consider these threats and possibly adapt to them.
I would love to pay money (or do some sort of proof-of-work hashing) rather than solve Google's infuriating, privacy-hostile CAPTCHAs. I rather suspect that Google, as an advertising and consumer-surveillance firm, gets rather a lot of information out of the system.
I definitely preferred it when reCaptcha was helping archive.org digitize books for the good of humanity. It felt much more altruistic than helping Google train its neural networks.
That's a funny, very accurate point. I would absolutely pay 10c to have a service automatically bypass Google's CAPTCHA for me every time I encounter one.
From your link, it actually says $2.99 per 1000 ReCaptchas so $0.30 for 100. Still cheaper than the automated method of the author.
Also, the automated method is probably more reliable than humans and much faster. And the cost of the speech-to-text API could be lowered by using cheaper services or an in-house model.
(Looking at other services, they all seem to agree on the $0.20-$0.40 range, mostly dictated by the hourly wage of their workers)
Funny story: I can no longer moderate Disqus on my site because it pops up reCaptcha, and I don't even attempt reCaptcha anymore. And I can't ask for support because it pops up reCaptcha. And I can't export my data and delete my account because -- you guessed it -- it pops up reCaptcha.
One of these days I'll gird my loins and go into battle to convince a bot that I'm not a bot. One last time.
Sometimes it won't accept correct answers either, and you're just wasting minutes training their classifier. I found the audio challenge much faster to get past, so I switched to it a couple of months ago, then finally decided to automate it with a browser extension.
If you manage to force google to serve you the noscript version, it tends to accept correct answers the first time with no re-challenges. The javascript version of recaptcha won't let me through no matter how many times I give it correct answers.
However websites must specifically allow the noscript version to be used; by default it's disabled for all websites.
What do I have to do to force the noscript version to activate? At this point it's getting impossible for me to pass captchas, and the audio challenge is no help since I just get half a word or some other unintelligible utterance.
Sometimes disabling javascript using e.g. umatrix can do it, but only in cases where that website operator has elected to use the most permissive setting (this seems to be rare.)
4channel.org is one site that allows noscript captcha, you can try it out there. But I've rarely seen it possible on other sites.
They also punish you for blocking cookies, using first party isolation, fingerprinting resistance, etc. ReCaptcha v2 operates much like ReCaptcha v3 (which does away with the question/answer interrogation entirely) without telling users it works that way. v2 will frequently reject correct answers just to punish users who aren't sucking up to the Google surveillance system.
Is this related to the Google account? Because since switching to firefox and everything not-google I have been constantly harrassed by recaptcha everywhere.
I think it's mostly the google account (and not blocking access to its cookies), but (as an FF user) I feel that just being on FF instead of Chrome is also a strike against you.
I saw this a week ago on reddit. The researchers told Google about this vulnerability and Google doesn't care about it, they are totally OK with it. You can see here that the captcha doesn't block robots, but blocks people and makes browsing inconvenient. reCaptcha is a way Google mines data from us for free.
> Google doesn't care about it, they are totally OK with it
Google hasn't said they don't care about it; where did you see that?
They merely allowed the code to be released despite it still working against the current version. Previous experience (namely the original unCaptcha) shows that they intend to find a way to fix it.
> You can see here that captcha doesn't block robots, but blocks people and makes browsing inconvenient.
Total BS. Remember, it's not Google that uses it, it's website owners (us). If what you claim were true, we wouldn't be using it; we'd use something else that did what we wanted.
> reCaptcha is a way google mines data from us for free.
Of course, through the visual selection it displays when "unsure". Although I don't know the details, it seems pretty obvious that even once it's sure you're human, it sometimes asks you to identify things in pictures anyway so as to provide training data (for Maps, Waymo, image search, whatever...).
I 100% agree, but it's not like captchas are complicated things, or that we didn't all more or less switch to them from something else.
I can say for my current needs right now that if I created a page tomorrow where non-subscribers could post content, I would use a captcha because it removes enough bots to be worth it, and I would go with reCaptcha because I find it the better one for end users (as a user, I prefer seeing it on websites compared to other solutions).
100% nonsense. When I use Firefox with third-party cookies blocked, it does not accept my correct answers. I waste 5 minutes of my life begging Google to let me in.
Allow third-party cookies, log in to Google, and I only have to check a box.
Same when I use a VPN: it does not accept any of my correct answers.
So when Google sees that I am trying to protect my privacy, it punishes me by making me work for them.
One more thing: if I try to use the audio challenge in the first case, it tells me outright that I am using some automated method to solve the captcha and won't allow it. So much fun.
I have literally never been asked to solve one of those image recognition recaptchas in my main browser profile. (While it happens once a month in incognito windows.)
So it's not at all obvious that known humans are being asked to solve captchas just for the purposes of training.
It barely asks you if you use Chrome and/or are logged into a Google account. ReCaptcha is how you make Firefox and IE/Edge users without a google account hate you.
Because believe me, if I get asked to click another 50 cars without good reason (3 failed logins would be a good reason), I'll blame your site for being dumb, not Google.
I have only been served image captchas for as long as I can remember; I honestly thought the warped-text captchas had been phased out entirely.
I think the warped text ones have been phased out, but AFAIK it's in favor of the ones people mentioned above and some black magic for detecting humans without needing to click things.
(I work for Google, on nothing related to browsers or recaptcha, this is purely my impression from encountering it logged in and out.)
For the record, the ones I meant are the second round of solving: sometimes reCaptcha asks you one challenge (which I believe is genuine), and then after you succeed it asks again with another set of pictures (sometimes another question), which I believe is for training.
I get it semi-regularly (like once or twice a week), but I also have some automated tooling using my account AND I travel quite often, so location checks probably flag me as weird.
Remember that the original reCaptcha also did that with text to help train OCR: it would send a known word and an unknown word; if you succeeded at the known word, it would record your answer for the unknown one, and after enough people gave the same answer, it would adopt it as the proper OCR'ed text.
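That scheme can be sketched like this (a sketch only; the data structures and the agreement threshold are made up for illustration):

```python
from collections import Counter

AGREEMENT_THRESHOLD = 3   # hypothetical number of matching answers to accept

known_answers = {}        # word_id -> accepted transcription (ground truth)
pending_votes = {}        # word_id -> Counter of candidate transcriptions

def submit(known_id, known_answer, unknown_id, unknown_answer):
    """Record one user's pair of answers. The unknown word's answer only
    counts if the user got the control (known) word right."""
    if known_answers.get(known_id) != known_answer.strip().lower():
        return False                        # failed the control word
    votes = pending_votes.setdefault(unknown_id, Counter())
    votes[unknown_answer.strip().lower()] += 1
    answer, count = votes.most_common(1)[0]
    if count >= AGREEMENT_THRESHOLD:
        known_answers[unknown_id] = answer  # promote to ground truth
    return True
```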
I feel that this depends on how fingerprintable your browser is. Signed in to Chrome? You’ll likely see nothing. But logged out and using Safari’s Private Browsing? You will likely have to do it multiple times.
I have the feeling this is also configurable somewhere.
A few months ago, my bank added the image-clicking one to its login screen. I've always gotten past that one on the first attempt. But with the same profile on the same Firefox, all other sites always take multiple tries.
Google allows website operators to configure how sensitive they want v2 or v3 to be, but in practice it makes little difference. The only apparent effect on v2 is that the least sensitive setting permits the use of the noscript version. The least sensitive setting of v2 will still harshly hassle and punish users who aren't in compliance with google, unless they use the noscript version. Then and only then, it lets people through for correct answers every time.
The 'owner' of the recaptcha API key can set a slider on a scale from "Easiest for users (some security features turned off)" to "Most secure (all security features turned on)"
> They merely allowed the code to be released despite it still working against the current version. Previous experience (namely the original unCaptcha) shows that they intend to find a way to fix it.
I'm using Buster[0] for this purpose, and it relies on the same method. Available on Firefox, Chrome and Opera in their respective add-on stores, and no additional steps are needed (like in this project).
From my experience, it works perfectly in a default session and not at all in private browsing mode. I've never bothered to figure out why that is (possibly some other add-on interfering).
reCAPTCHA relies on things like Google cookies to lower the "user is a bot" risk score. Higher risk scores (such as when you go via a blank slate browsing session) result in more/more difficult challenges.
That's just code for "it rejects correct answers to frustrate you." If you manage to get the noscript version of the captcha with otherwise the same browser state it will accept a correct answer the first time nearly every time. Presumably this is because they didn't bother to implement their "hassle the user" code in the noscript version; it's probably neglected by google since it's disabled by default.
For instance, the sloooow fade in of challenge tiles... what legitimate purpose does that serve? That's not there to make it harder for bots. That's there just to hassle and punish real humans that google dislikes because they don't buy into the google 'ecosystem'. The more they dislike you, the slower the fade in gets. The fade-in can be several seconds long in severe cases.
I run a combination of uBlock Origin, Privacy Badger and Firefox's tracking protection. Can confirm, tiles take 5 seconds to fade in, I have to do 3-5 rounds of it, and unless it's really important I'll just tell reCaptcha to piss off.
I've built the same in the past to solve ReCaptchas and my question is:
Why on earth did they publish this?
I've kept it secret because Google will close this loophole and probably make it more difficult for disabled people to verify that they're humans. And Google is not dumb: They already know that speech recognition "breaks" their bot detection, just like screen readers - this is about accessibility. Publishing stuff like this will increase the pressure so they will be forced to "improve" their bot detection system - which simply means that even more people won't be able to solve those captchas.
Heck, a few weeks ago I tried to solve a ReCaptcha for literally 10 minutes! My answers were right; it was a matter of discrimination. My point is: my bot automation can solve a captcha faster than a human being can. This is silly and ineffective.
And about the people who published this: they think they're doing someone a favor. But I can't see how it's in anybody's interest to release this to the public (especially on a site like HN, where Googlers are reading).
If they proposed a better solution for website owners to secure their sites, fine.
But everyone who's talking about "vulnerabilities" like this makes it more difficult for real people to access the websites that they want to use. I know disabled people who can't solve those captchas - it's just too much of a hassle while it's easy for my bot automation to do it.
We should really ask ourselves what we're really trying to improve here.
I used to work at SoundHound, and 3 years ago we had some weird, illegitimate-looking accounts using our Houndify platform... it turned out they were for breaking reCaptcha. It was a bittersweet verification that our voice recognition was ahead of Google's, but we had to put in protections against that sort of abuse so we weren't enabling spammers...
From what I can read in the tweet and on GitHub, the researcher hasn't proven this works at scale.
The point of recaptcha is blocking "captcha farms" or automated bots from abusively creating accounts, buying tickets, etc.
The author hasn't demonstrated that this attack is effective in those scenarios. The only thing he has shown is a very convoluted way for a human to solve a recaptcha (harder for 99.9% of humans than the standard recaptcha experience)
That would explain why Google didn't care about them publishing this.
Because the tweet summarizes the interesting bit, while the project page not so much. Which is a good way to motivate people to learn more about the project, right?
It's ironically sad that this protection is most easily defeated through its accessibility purpose. The pessimist in me thinks no good deed is possible in this world.
You would think they could add a hidden track at a frequency imperceptible to humans, or a pattern of volume changes, to prevent the audio from being passed through their API.
Of course the next step will be to resample the sound to remove the steganography...and the arms race continues.
Steganography would only prevent people from using their exact service, whereas making the audio more difficult in general would prevent people from using any existing service.
I think a simpler solution would be to check whether a speech-to-text API request yields the correct answer, as part of validating a captcha submission.
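One way to read that idea (a sketch; `transcribe` is a stand-in for whatever commodity speech-to-text call an attacker would use): if a recognizer can solve the clip, don't trust a correct answer on it, and serve a fresh clip instead.

```python
def validate_audio_answer(audio_clip, expected, submitted, transcribe):
    """Validate an audio-captcha submission, treating clips that a
    commodity speech-to-text service can solve as compromised.
    `transcribe` is a placeholder, not a real API."""
    normalize = lambda s: s.strip().lower()
    if normalize(submitted) != normalize(expected):
        return "wrong"   # plain incorrect answer
    if normalize(transcribe(audio_clip)) == normalize(expected):
        return "retry"   # machine-solvable clip: issue a new challenge
    return "ok"
```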
Isn't the text based ReCaptcha pretty ancient at this point? Google's current version is either the simple checkbox (which I assume is checking various things) or the image based version where you have to click on traffic signs. I like to assume that's a live feed from a Waymo car and I'm saving lives.
> "Isn't the text based ReCaptcha pretty ancient at this point?"
That was v1, they shut it down in March 2018. You won't see it anymore anywhere.
> "Google's current version is either the simple checkbox (which I assume is checking various things) or the image based version where you have to click on traffic signs."
That's v2. v2 will present you with a simple checkbox if you're very compliant with the google surveillance system, or will present you with image challenges if you're not (or if it's just in the mood). v2 is very capricious and will reject correct answers from users google wishes to punish for, e.g., using firefox, using adblockers, using resistfingerprinting, blocking google's cookies, etc.
The recently released v3 is the worst of them all; it does away with the image challenges of v2 completely. The user never interacts with it directly, never has an opportunity to persuade v3 that they're a real human by answering any sort of questions. It's nothing more than a measure of how compliant you are with google's surveillance.
> The recently released v3 is the worst of them all; it does away with the image challenges of v2 completely. The user never interacts with it directly, never has an opportunity to persuade v3 that they're a real human by answering any sort of questions. It's nothing more than a measure of how compliant you are with google's surveillance.
I did a very quick experiment with reCAPTCHA v3:
* using Firefox in private browsing mode (not logged-in to anything)
* with a VPN
* using uBlock Origin
* Do-Not-Track on, disabling 3rd party trackers
* Only went to one page and filled one form with garbage data
My score was 0.7, which is pretty decent I would say.
I did a similar experiment using Ghost Inspector (a platform for automating browser testing, something similar to Selenium, but not sure what they use exactly), and my scores were consistently 0.1.
I'm also a bit suspicious of Google, and have trouble with the fact that this is the only solution on the market, and it's free for websites to use. But I'm not sure your statement is entirely accurate judging from my very limited experience.
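For reference, the score in these experiments comes from Google's server-side verification endpoint. A minimal sketch (the siteverify URL and the `success`/`score` response fields are per the reCAPTCHA v3 docs; the threshold helper is purely illustrative, since the cutoff is whatever the site owner picks):

```python
import json
import urllib.parse
import urllib.request

def recaptcha_v3_score(secret, token):
    """Verify a reCAPTCHA v3 token server-side; returns the score
    (0.0 = likely bot, 1.0 = likely human) or None on failure."""
    data = urllib.parse.urlencode({"secret": secret, "response": token}).encode()
    with urllib.request.urlopen(
        "https://www.google.com/recaptcha/api/siteverify", data=data
    ) as resp:
        result = json.load(resp)
    return result.get("score") if result.get("success") else None

def passes(score, threshold=0.5):
    """Site-chosen cutoff: below it, the site can require extra checks."""
    return score is not None and score >= threshold
```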
On Firefox, if I disable third-party cookies, it never accepts my correct answers. I have to fight with Google for 5 minutes to prove that I am human. Same when I use a VPN. Allow third-party cookies, log in to a Google account, no VPN, and it's just one checkbox.
Edit: if I try to use the audio challenge in the first case, it tells me outright that I am using some automated method to solve the captcha, and they won't allow it. So much fun.
I think nothing appears if my score is above Google's pass mark (they decide my score in the end). If not, a reCaptcha appears, which is what happens to me, and I go through the same process.
My mistake, I checked just now. With a VPN, Firefox, and third-party cookies blocked, I got a 0.3 score, which is quite bad. Logged in to Google, with third-party cookies allowed and a VPN, gives 0.9.
I get 0.3 with firefox right now when blocking their cookies, using first party isolation and resist fingerprinting. In the past I've gotten as low as 0.1 with a similar configuration.
This happens to me whenever I use the Galaxy Browser. First request is always "rate-limited". When I immediately reload the page, it goes through. I always thought it was a cheap tactic to have people install their app.
A refresh usually helps, and this "feature" isn't exclusive to Firefox.
For some reason, it is exclusive to the links opened from another apps and doesn't appear when accessing a link directly (via a refresh). That might narrow down your search a bit.
It happens to me often when I open a reddit link from twitter in an embedded WebView (using the bacon reader app) so it's unlikely to be a mozilla issue
"The team has allowed us to release the code, despite its current success."
First off, permission was never needed to release the code. Second, Google's interest in captchas is not to protect websites but to further its machine-learning algorithms. Unless the captcha mechanism were discredited to such an extent that nobody used it, they would be happy to accept captcha requests from automated systems. Google may seem like it sometimes, but it is not your friend.
[1] https://cloud.google.com/speech-to-text/pricing
[2] https://2captcha.com/, the first hit I found with the search [captcha solving service price]
Disclosure: I work for Google on security and cloud, but not on anything related to captchas or speech to text.