Hacker News
Tool can make photos undetectable to facial recognition without ruining them (dailydot.com)
228 points by 1cvmask on April 27, 2021 | 115 comments



A silly thing. As others point out, if it spreads, photos altered by it will quickly become part of the training set, and the method will be defeated.

And in general, it's an unwinnable quest: by definition, you cannot have a system that makes photos "undetectable to facial recognition without ruining them", because if another human can recognize you on a photo, then a machine can be taught to do the same.


The logic works both ways. Given a recognition system you should be able to build/train another system to find and amplify errors in the former. The more robust the former is the more likely the latter adversarial system is to be of lower complexity (cheaper) giving it some advantage.


The problem is that when you put a file out into the world, you can't continuously update it with the latest advances in adversarial machine learning. It'll have a reprieve ranging from months to years, then be vulnerable forever more.


But there's an upper limit since the photos still need to be recognizable to people.


I think the better word is “trained”, not “robust”.

By making almost invisible (to humans) distortions to a picture, a previously well-trained algorithm is now facing a completely unknown problem.


That implies the amount of training is the only factor. It may be that new architectures or training methodologies could be developed to reject large swathes of these types of adversarial examples. Though of course, if adversarial training is used, adversarial examples will obviously still exist at some level, and so the arms race continues.


But for every adversarial distortion algorithm, you’d in theory have to double your training set, wouldn’t you?


I'm not sure I follow the point about robust facial recognition being cheaper to break. Like others pointed out, I don't see how this could be a scalable tactic.


It's an arms race then, sure. But why not stay in the race?


I don't think it is an arms race; that implies that if you keep playing you're really just buying time and raising the stakes until one side ultimately wins all in a zero-sum game. SPOILER: the image classifiers will not only win in the end, all that defense you played is now fuel to make them even better.

I think a better notion is that of sunk cost. We have lost any approach that fights AI head-on. We should stop chasing this path and switch tactics: changing online behaviours, education about privacy, or, since developers love technical solutions, pursuing technology solutions that leverage AI against this sort of cause. Example: what if we all switched to a morphing, deep-fake-enabled visual representation? The better image identification got, the more intense the mistakes would be.


The term we’re looking for might be “cat and mouse game.”


The last I looked, Jerry is still in the running.


I'm reminded of the suits they wore in A Scanner Darkly.


With the Purdue trial, I keep coming back to how this weird story was actually describing our reality.


The difference is that if you 'lose', all your previous victories change to losses as well, as all the images you've released that were previously impervious are now fair game.


That still doesn't mean it's worse than not doing anything at all.


It may, depending on the resources required. The thing about arms races is that they can easily impoverish everyone involved for no ultimate gain relative to one another.


Fair enough. I'd argue that in this case the winning move is not to make the potential liability of the data available.

As an 'arms race', the outcomes are very unbalanced.


It does mean doing the wrong thing is worse than doing nothing. See: opportunity cost.


Legitimate point.


In the long run, you are dead, at which point you probably don't care as much.


Not necessarily. The phrase "buy some time" springs to mind. By the time your previous "wins" become "losses" it may not matter anymore.


The end of the arms race is seen in captchas — where to defeat computer recognition they are largely outside the capabilities of human recognition.


That is the nature of captchas by design. 2000s captchas can probably be solved by off-the-shelf OCR.


Because arms races are pointless.

See War Games (1983)


But WarGames is a fictional movie about a mutually assured destruction scenario? And even though it was famously watched by President Reagan [0], his response was decidedly not to abide by the philosophy of "the only winning move is not to play". Reagan presided over massive budget increases to keep up and attempt to win the arms race, including the extremely expensive but impotent "Star Wars" program [1].

Unlike "Star Wars", increasing the computational complexity and costs of mass face surveillance is not a decades-too-soon monolithic project.

[0] https://en.wikipedia.org/wiki/WarGames#Influence

[1] https://www.armscontrol.org/act/2004_07-08/Reagan


I didn't watch the entire movie, just that clip- but I think that if you consider it slightly differently, the only winning move is not to launch, because everyone will die. If you don't have enough stuff to make sure everyone dies, then there is a chance of a winning move that results in some percentage of one party's population surviving and as a result they might take that chance.

IMO, As a U.S. President, Reagan made the right decision to keep pushing, especially considering that he knew to some extent that the Soviets, their enemies, couldn't keep up forever financially. It's a really really cold calculation, but one that worked out.

As we stand now, I think that privacy is on the losing side here, unfortunately- at the end of the day, we're up against our own and other governments, some of which have quite authoritarian regimes and have absolutely zero problems with at the very least using force to acquire funding.

In WarGames, the only way to win is "not to play" when the playing field is level. This playing field is not level; to fight this off, we need to be able to level it somehow. Maybe I'm not smart enough or just can't see something, but I frankly don't know how that would be possible.

...holy cow, that's dark.


Regarding the Star Wars program, perhaps the high cost was part of the motivation? (To saddle the opponent with similar costs)


I hear this quite often, but I just wonder if that is hindsight rewriting/spinning the situation? Did the people in charge of Star Wars program know it was a sham and knew from the beginning that's what it was, or were they really trying to build a space based defense program that turned out to be too complicated and too expensive to do?


In reality "star wars" was a moniker applied to several different programs which were part of the Strategic Defense Initiative geared towards missile defense. Part of the Strategic Defense Initiative was the ISTO, which funded basic research at national labs, which included some promising but ultimately impractical technologies - as is typical of basic research. Tech like isomer lasers and particle weapons made headlines, while missile guidance systems and radar systems and other such technologies which were much more seriously pursued didn't. Many of the technologies developed by the SDI remain at the core of modern missile defense systems. In fact, when one looks at the history in context, SDI only started getting wound down when the Soviet Union stopped being a threat and thus there was no longer a need to stop a large scale nuclear missile strike; while programs got renamed and were refocused on stopping different threats, the work never really stopped. So it would be inaccurate to describe the program as either a deliberate or accidental boondoggle.

That being said, ever since the very beginning of missile defense, it was always understood that the system would never be a permanently impenetrable shield. At best it might take a few years for adversaries to develop a new weapon to get through, but most likely a system would only be able to deal with a portion of targets. But part of the purpose of missile defense is precisely to raise the costs for the adversary - if now they have to launch 3 missiles to guarantee a hit where before they needed 1, you've essentially tripled their costs, not including the development of any new technology, which could also be considerable. The fact is the Soviet Union did spend a substantial amount on developing technologies to defeat Star Wars, and even today China and Russia are spending significant resources developing technologies like hypersonic missiles to deal with the successor programs. So while the implication that the US purposely pursued technologies it knew would never work to trick the Soviet Union into following suit is not true, the US was trying to hurt its adversaries economically.


This is outright wrong. At some point one side outperforms, or, in-universe (a la Cold War), outspends/demoralizes the other, and there is victory - aka control of the global financial markets and geopolitical dominance.

Is it a net win for humanity? Having stockpiles of nukes isn't. The space, science, athletic, and social welfare race is.

Competition drives progress. Hegemony leads to stagnation.


Ah, so mutually assured destruction has been impossible this whole time? What a relief!


I think you may be casually dismissing the entire field of game theory.


As far as I know, the elephant in the room problem [1] is still unsolved? Without support for double take, an adversary attack would still be fairly simple by simply including out of place information?

[1] https://www.quantamagazine.org/machine-learning-confronts-th...


> And in general, it's an unwinnable quest: by definition, you cannot have a system that makes photos "undetectable to facial recognition without ruining them", because if another human can recognize you on a photo, then a machine can be taught to do the same.

When I was about thirteen years old and in secondary school, there was a question on the information technology test which essentially amounted to answering whether machines would ever be able to perform any task a man can, and the “correct” answer was “no”.

I protested this and thankfully the teacher agreed that such speculation should not be part of the curriculum and that though unlikely, one can never truly know.

It's funny how a.i. has advanced so rapidly now that it's almost considered a certainty that, for any given task, a machine can eventually be made that does it as well as a man.


The machines are just nibbling at the edges of what we can do. I think the jury is still out whether GAI will be a thing.


I am wondering if there are some powerful biases in the algorithms, for example "cats", so you add a few small shadows that a human will not notice and the ANN will trigger "cat" all the time. Then if they fix this, you need to find all the other triggers that are hypersensitive and focus on targeting those.

So you would add pixels in an image that trigger cat, car, bus, street, etc. You will make the technology more expensive and harder to use by majority of bad actors.


> You will make the technology more expensive and harder to use by majority of bad actors.

I mean, we do have encryption, which is cheap for users and expensive for bad actors. The question is whether an analogue exists for AI.

In a sense, there's three types of security: exponential, proportional, and logarithmic. Encryption strength is exponential, number of pins in a lock is proportional, and bug fixing may be logarithmic.


What if we use DeepFakes to overwhelm the internet with pictures of people who don't exist? Thereby giving them LOTS of photos that have to be sifted through?


That would require that all social media is flooded with fake pictures (maybe more than it already is) making the Internet less useful for humans too.


> You will make the technology more expensive and harder to use by majority of bad actors.

It's software, mostly. People will share SOTA algorithms for free, like they do today. I don't think the cost for bad actors will rise much (and it's not like most of them have money problems).

On the contrary, I imagine it'll just accelerate development of computer vision that reproduces human vision in a bug-compatible way. The only way to confuse it will be to make the image confusing to humans too.


>It's software, mostly. People will share SOTA algorithms for free, like they do today. I don't think the cost for bad actors will rise much (and it's not like most of them have money problems).

Can I get access to high quality voice recognition or text to speech software? The only way I know is to use cloud APIs to get access to the latest and greatest.


The algorithms, yes (see e.g. wav2letter and others), but not the training data. Mozilla's Common Voice (https://commonvoice.mozilla.org/en) is trying to make training data available, but proprietary systems can use that as well as the existing data advantage they have.


Most bad actors do have money problems, or at least are meaningfully resource-constrained.

The really scary ones, sure, they may as well have unlimited resources. But that's almost the biggest part of what makes them scary.


It could be very useful as a tool to avoid over-fitting though :)


Isn't this basically how GANs already work?


Not really. Well, sort of. If you have a small dataset, you can use data augmentation to "stretch" the dataset into a larger size. It's unknown whether data augmentation helps in general though (i.e. when you have lots of training data).

Example augmentations: color dropout (photo becomes greyscale), cutmix (splicing two different training examples together), blur, noise, and so on.
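
For concreteness, here is a minimal sketch of a few of those augmentations using torchvision; the transform names are real, but the parameters and the noise lambda are illustrative choices, and cutmix is omitted because it mixes whole batches together with their labels:

    # A sketch of the augmentations listed above, using torchvision.
    # Parameters are illustrative; cutmix (batch-level, needs labels) is omitted.
    import torch
    from torchvision import transforms

    augment = transforms.Compose([
        transforms.RandomHorizontalFlip(),
        transforms.RandomGrayscale(p=0.1),       # "color dropout"
        transforms.GaussianBlur(kernel_size=3),  # blur
        transforms.ToTensor(),
        transforms.Lambda(lambda x: (x + 0.02 * torch.randn_like(x)).clamp(0, 1)),  # noise
    ])

    # augmented = augment(pil_image)  # applied per image during training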


> if another human can recognize you on a photo, then a machine can be taught to do the same.

Sounds like:

If a human can do X then a machine can be taught to do the same.

Which is not necessarily true.


It's easy to say that a super generalised version of another person's claim isn't always true. But face recognition in static pictures works pretty well by now, and it would just take a little bit of digging to find examples where a trained model can do it better than a human.


What object in a picture can be detected just by a human but not a machine?



mirrors? not the reflection but the reflector.



> Which is not necessarily true.

Yet.


Humans are machines, so why wouldn't that be necessarily true?


Machines are designed, constructed, and built by humans.

If humans are "machines," then who built us?


I think it's debatable whether that's a necessary condition for something to be reasonably considered a machine.


I'm not saying this is an exact analogy to facial recognition, but there are all kinds of long-term unwinnable quests we participate in, because the alternative of giving up is worse: vaccines, antibacterial measures, crime prevention, fighting poverty, and so on.

There's clearly value in all of those things, even though we have no clear path to (or even defensible expectation of) ever completely solving any of them. In these cases we define progress as mitigation or delay of the consequences rather than elimination of them.

Some people can imagine a technological solution to them, but that's currently science fiction — a counterfactual — and is it more counterfactual to imagine a perfect adversarial image generator than it is to imagine a perfect antibacterial agent, or a post-scarcity economy? I don't think so, personally.


There's a difference between something we can't "win" but can suppress to low levels, and something that makes no difference.

This tool makes no difference. If it's deployed in any non-trivial capacity, anybody doing nefarious facial recognition will just train their recognizer against it.

These sorts of stories pop up every few months on HN about some tool that will defeat deep learning. But invariably they are also deep learning powered tools trained against a specific recognizer. The takeaway should be about how fragile current recognizers are, and not about any one tool that's going to "defeat" facial recognition.


It's like using a lock against thieves: you don't have to make an unbreakable lock, you just need to have a lock which is harder to crack than your neighbors'.


Disagree. The reason that logic works with thieves is that the thieves have limited time and resources, and both high risk and high opportunity cost.

In this metaphor, if Thief #1 can't do it, he hands it over to a "bank" of 1,000,000,000 Thieves who will get to work opening it, while Thief #1 moves on to the next house.


Agree w gparent. The thieves largely use the same toolsets and aren't making bespoke tools for each job. If the obfuscation volume doesn't rise to a level where it is worth addressing by a standard toolset then the thieves won't have access.


After all the snakeoil that depends on facial recognition is bypassed, that will still only hold until it's retrained.


Unless you can generate novel ways to do it continuously


That's a viable approach so long as three things are true. First, you can continue to do so in a way that your adversaries cannot follow your stream of novel changes. If they can follow they can adjust as you do and your novel ways fail upon arrival. Think Star Trek Borg.

Second, this approach works so long as novel ways can continuously be found and the systems you are attacking are continuously vulnerable to incremental attacks. In other words, this approach requires that the recognition you are attacking never generalize in any way beyond addressing your attacks. As others have noted, human brains can recognize past many kinds of distortion, including novel ones. This implies that computers can learn to do the same.

Third, and perhaps most concerning, this approach is only really useful if your adversaries never have, use, or find value in performing recognition on stored imagery. Stored imagery in the hands of your adversaries cannot be subject to ongoing novel distortions. It's fixed and can therefore be subject to recognition that has adapted at a later date. In short, for a novel approach to be useful it needs to stand up to attack for a sufficient amount of time as to not be useful for the attacker.

Using a continuous stream of novel distortions is an intriguing idea! It's a type of polymorphic attack. I think it's worth considering in the context of the traits that would make it most useful: an unpredictable and unfollowable stream of mutations, an un-generalizable approach, and resisting attack for long periods of time.


Also, Google isn't best in class at reverse image lookup, so they chose to test themselves against the lower bar. If you try this exercise in Yandex it finds results of Biden; you just have to crop out the surrounding text.


You need a machine to beat a machine.


Nah, I can always just pull the plug on the machine. Who wins now?


Took both versions of the lady-on-a-yellow-background photo, flipped the altered one horizontally, and overlapped them in Photoshop. Looking now at the layer difference, I see nothing other than a slight outline highlight and some artifacts on the background (which can easily be cropped). The faces are identical, none of the biometric features have been altered, and once you flip the photo back to normal orientation it's basically the same photo - the diff is almost completely black.
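
For anyone who wants to reproduce that check without Photoshop, here is a rough equivalent in Python; the filenames are placeholders for the two versions of the photo:

    # Rough equivalent of the Photoshop layer-difference check described above.
    # "original.png" and "altered.png" are placeholder filenames.
    import numpy as np
    from PIL import Image, ImageOps

    orig = np.asarray(Image.open("original.png").convert("RGB"), dtype=np.int16)
    alt = Image.open("altered.png").convert("RGB")
    alt_unflipped = np.asarray(ImageOps.mirror(alt), dtype=np.int16)  # undo the horizontal flip

    diff = np.abs(orig - alt_unflipped)
    print("max pixel difference:", diff.max(), "mean:", diff.mean())
    # A near-black difference image (tiny max/mean) means the "protection" is
    # essentially just the flip plus some minor artifacts.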


You're right. As far as I can tell all they have done is flip the image and then put out a press release that they've created some new technology to beat "AI" without actually doing anything. The average reporter is beyond non-technical and doesn't have the chops to suss out charlatans.


Look up adversarial noise. That's a technology that can fool SOTA methods.


Adversarial attacks rely on a specific model’s gradient (since you’re essentially trying to find the most sensitive pixels).

Adversarial noise that affects model A won’t necessarily work on model B. That said, most people transfer train from well trained nets (ImageNet, Inception, etc).

Finally, not all SOTA methods are susceptible to adversarial attacks, eg capsule networks.
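
To illustrate what "relying on a specific model's gradient" looks like in practice, here is a minimal FGSM-style sketch in PyTorch; `model`, `image`, and `label` are placeholders, and this is the textbook fast-gradient-sign method rather than whatever the tool in the article uses:

    # Minimal FGSM-style perturbation (PyTorch). The noise is derived from the
    # gradient of *this* model's loss, which is why it need not transfer to a
    # differently trained model. `model`, `image`, `label` are placeholders.
    import torch
    import torch.nn.functional as F

    def fgsm_perturb(model, image, label, epsilon=0.01):
        image = image.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(model(image), label)
        loss.backward()
        # Nudge each pixel slightly in the direction that increases the loss.
        return (image + epsilon * image.grad.sign()).clamp(0, 1).detach()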


>Finally, not all SOTA methods are susceptible to adversarial attacks, eg capsule networks.

They appear to be susceptible: https://arxiv.org/pdf/1906.03612.pdf


That’s neat; hadn’t seen that paper. Thanks for sharing.


Always wondered why a low-pass filter isn't a standard part of the training pipeline.


The early convolution layers could implement a low-pass filter with the appropriate weights.

Presumably the learning algorithm would do so if it were beneficial.


And yet there are tools to confuse networks with high frequency artifacts. If the network isn’t trained to ignore that, it won’t - but you don’t need a neural network to perform a low pass filter step if you can do that efficiently before asking the net what it sees on the already preprocessed image.
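
A sketch of that preprocessing idea, assuming you control the pipeline in front of the classifier; the filenames and blur radius are illustrative:

    # Low-pass the image before the classifier ever sees it, to smear out
    # high-frequency adversarial noise. Radius and filenames are illustrative.
    from PIL import Image, ImageFilter

    img = Image.open("input.png")
    low_passed = img.filter(ImageFilter.GaussianBlur(radius=1.5))
    # prediction = model(preprocess(low_passed))  # classify only the filtered image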


For specific models.


It’s even worse than that.

I pasted the photo verbatim into the first online face recognition tool I could find.

Found the face with no problem whatsoever.

https://imgur.com/gallery/5beg8qg


I don't think they're claiming it can't detect the existence of faces, rather that it can't recognize them.

But this appears to be utter BS. They're not showing the results of a facial detection system, they're simply sticking it in TinEye. TinEye is a tool that helps you find the same photo elsewhere; it has absolutely zero, zip, zilch to do with facial detection.


I’m flagging this as false news - both I and others have pointed out that even the example in the text doesn’t fool even basic online services for face recognition.

https://imgur.com/gallery/5beg8qg


The fight against facial recognition will be a back and forth between the facial recognition itself and algorithms trying to prevent it (like this one). One issue is that parties using facial recognition have a big advantage: They can run new algorithms against old pictures.

Tools like these might be useful in some cases, like preventing stalking _right now_, but can't (sadly) guarantee a prevention in any way for the future.

I assume this tool will be marketed differently.


The problem is more that if you have access to the ground truth data and use such a tool to create images without detectable faces, you can artificially create more training data. In the end a model trained on this data will be more robust, and the underlying method, like any form of CNN, will be able to generalize better.


> The new DoNotPay app acts as a “robot lawyer,” letting you “sue anyone by pressing a button.”

> DoNotPay also offers services that help you renegotiate a cheaper flight or hotel price.

> “We are in an AI arms race between good and evil,” Browder added. “We hope to help consumers fight back, protect their privacy, and avoid stalkers on dating apps!”

They should add video streaming and messaging, then it would be the only app you ever need \s


DoNotPay asks for my bank account information (routing, checking) before even letting me set a password.

That's a bit too quick for me, DoNotPay. Let's have dinner first.


How much of the effect is actually just a result of them flipping the image?

TinEye, as I understand, doesn’t even use facial recognition.


Yes, the TinEye example doesn't seem like a great metric. TinEye is pretty great -- it can still match images after scaling and resolution changes, which this technique obviously foils (for now) -- but it's not trying to do facial recognition. I think this is simply beating TinEye's "perceptual hashing" technique, mostly just by flipping the picture, and has nothing at all to do with facial recognition.

A better test would be to load a few dozen pictures of Biden into Google Photos, name the face, and see if it correctly names their Biden example. I'd bet dollars to doughnuts it can.

I don't think this is affecting facial recognition at all. Google Photos can often recognize people in the far background, turned away, and out of focus, so I don't see how bit-flipping a few pixels will affect that.

----

Edit: Indeed, the TinEye result can trivially be achieved by flipping the image, as they do. Link to a search of a flipped photo of Obama it can't find. I contend this has zero application to "facial recognition," which is not what TinEye attempts to do.

https://tineye.com/search/b9d6f95346a7636bbd85e1bcfb931a9e82...
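
To illustrate why a mirror flip is enough to break this kind of matching, here is a toy difference hash (dHash); TinEye's actual algorithm is proprietary, so this only shows the general idea with a placeholder filename:

    # Toy difference hash (dHash): the mirrored image's hash differs in many
    # bits from the original, so naive perceptual-hash lookup fails.
    from PIL import Image, ImageOps

    def dhash(img, size=8):
        g = img.convert("L").resize((size + 1, size))
        px = list(g.getdata())
        bits = [px[r * (size + 1) + c] > px[r * (size + 1) + c + 1]
                for r in range(size) for c in range(size)]
        return sum(b << i for i, b in enumerate(bits))

    img = Image.open("photo.jpg")  # placeholder filename
    h1, h2 = dhash(img), dhash(ImageOps.mirror(img))
    print("differing bits:", bin(h1 ^ h2).count("1"), "of 64")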


I have a couple questions I didn't see addressed in the article. I like this tech, and am curious to learn more.

1. Are the changes detectable?

2. Are the changes reversible?

3. Are the changes deterministic?

1 and 2 go toward whether this can be counteracted in some way. 3 is concerned with whether we end up making an alternate population of photos that can eventually tie back to the same real person.


I flip some of the movie snapshots I feature on my movie quiz game [1] horizontally. I do that randomly, not on all images (e.g. not the ones that have text on them). It fools Google Images Reverse Search a lot.

Not a great solution but at least it slows down the cheaters and since it's a game where answering fast is important, it works nicely.

I tried adding noise to fool Google AI but that didn't work at all.

[1]: https://twitter.com/whattheshot


I've been meaning to print the EURion constellation[0] pattern (repeating) onto a t-shirt just to see what it might break. But unsurprisingly getting it printed onto a t-shirt is easier said than done due to multiple printers/software/etc directly blocking it.

[0] https://en.wikipedia.org/wiki/EURion_constellation


I literally just printed it. Copy pasted the image from wiki into Libreoffice Writer, and printed on my monochrome laser printer.


That is very funny. You could call it double irony: what you expected to happen is exactly what happened, but not in the way you expected it to happen.

I guess it's time for some old school analog printing. I'll see if I can whip up a 3d printable stamp


    // OpenSCAD sketch of a stamp: five EURion rings form the face, each
    // joined by a tapered support to a central handle column.
    clearance=17; // [50]
    height=150; // [300]
    r1=25-17/2; r2=25+17/2; // inner/outer radius of each ring
    for(xy = [[269,73],[85,170],[237,228],[475,280],[263,487]]){ // ring centres
        translate([xy[0],-xy[1],0]) difference(){
            cylinder(r=r2, h=clearance*2, center=true);
            cylinder(r=r1, h=clearance*2+.1, center=true);
        }
        hull(){
            translate([xy[0],-xy[1],0]) cylinder(r=r2, h=clearance);
            translate([237,-228,0])
            cylinder(r=r2, h=height-clearance);
        }
    }
https://www.thingiverse.com/thing:4841547


Just tested the example picture in the tweet with the MS Azure facial recognition service and it worked pretty well. Maybe the test picture is just marketing crap; I want to see a real-world example of this tech before believing in it.


Article about a picture has no pictures, and link to picture is 404.

The whole thing is based on the premise that the inventor doesn't understand what Reverse Image Search is (it's not facial recognition)

> But the changes confuse the A.I; even an altered photo of Joe Biden will return zero results in Google/Tineye Reverse Search!


This says that it "does not ruin the image," but that's somewhat subjective. Faces are naturally asymmetric, and it looks like they always flip the image, which will lead to a bit of an uncanny-valley feeling on many photos, particularly for people you know well.


Stanford cs231n has a little demo illustrating how easy this is, if you know the model weights, called "Fooling images". In short, you backprop to the data to find out which pixels were important for the classification result. Then you ever so slightly modify these, right up to the point where the model misclassifies. [1]

[1]: https://nbviewer.jupyter.org/github/madalinabuzau/cs231n-con...


Sounds very interesting! However, I am very curious whether this works against simple workarounds such as taking a screenshot of the picture or reducing image quality. Since the original and the altered images are visually identical to the human eye, from the limited knowledge I have of the topic, I infer that the changes made by the algorithm are all on the high-frequency side of the spectrum.
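
One cheap way to test that hypothesis is to re-encode the cloaked image with lossy compression, since JPEG quantization discards much of the high-frequency detail; a sketch, with a placeholder filename and quality setting:

    # Re-encode the cloaked image as JPEG; quantization discards much of the
    # high-frequency detail where subtle perturbations tend to live.
    # "altered.png" and quality=70 are illustrative choices.
    import io
    from PIL import Image

    img = Image.open("altered.png").convert("RGB")
    buf = io.BytesIO()
    img.save(buf, format="JPEG", quality=70)
    buf.seek(0)
    Image.open(buf).save("altered_recompressed.jpg")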


For the next several minutes, hours, days, weeks, months, or years.

The technology that detects faces LEARNS. The training may take a bit of time, but this obfuscation will last about as long as masks did.

The Chinese Government has had over 90% success in detecting faces that are more than 70% obscured.

How long before their software finds this 'hiding' trivial to unmask?


There's research on the same idea but with a (presumably) sounder method behind it: https://sandlab.cs.uchicago.edu/fawkes/


Putting the photo of Biden through PimEyes shows that it finds matches just fine using facial recognition - https://pimeyes.com/en/search/fcfcfcff98181800fefec8187e4000...


I suspect they only tested this on Google image search, which doesn't actually try to do any kind of fancy facial recognition - it just (I believe) drastically downsizes the image and looks for images that are similar to the downsized version. So mirroring the image would be good enough to "trick" Google.


Is there a service that can give you as much information as possible regarding a photo? I want to know how anonymous my photos are. Thank you.


Yandex will not be fooled by the small screenshot of Biden they provide (if you crop out the text), and overall yandex is just much better at reverse image lookup than google or tineye.

I think I've seen rumours that the accuracy of the google reverse lookup is artificially lowered because of privacy concerns or something like that though, so it might not simply be superior Russian engineering behind this result.


What doesn't defeat AI just makes the GAN smarter


Maybe they are just using the fawkes app


...until someone writes a workaround


1. This will become popular

2. People will start posting photos edited with it "We're safe now! Everyone post your faces!"

3. It'll get added to training sets

4. It'll be sunk within 3 years


If you can defeat something for 3 years, I'd call that a win.


3 years is practically an era in information technology.


Try 3 minutes


everything eventually devolves into war games doesn't it?


If we keep building a more and more adversarial society, yes.


Cue arms race


PR spam



