Convert Apple NeuralHash model for CSAM Detection to ONNX (github.com/asuharietygvar)
505 points by homarp on Aug 18, 2021 | 178 comments



Ongoing related threads:

Apple defends anti-child abuse imagery tech after claims of ‘hash collisions’ - https://news.ycombinator.com/item?id=28225706

Hash collision in Apple NeuralHash model - https://news.ycombinator.com/item?id=28219068


> Neural hash generated here might be a few bits off from one generated on an iOS device. This is expected since different iOS devices generate slightly different hashes anyway. The reason is that neural networks are based on floating-point calculations. The accuracy is highly dependent on the hardware. For smaller networks it won't make any difference. But NeuralHash has 200+ layers, resulting in significant cumulative errors.

This is a little unexpected. I'm not sure whether this has any implications for CSAM detection as a whole. Wouldn't this require Apple to add multiple versions of the NeuralHash of the same image (one for each platform/hardware) to the database to counter this issue? If that is the case, doesn't this in turn weaken the detection threshold, as the same image may match multiple times on different devices?


This may explain why they (weirdly) only announced it for iOS and iPadOS; as far as I can tell, they didn't announce it for macOS.

My first thought was that they didn't want to make the model too easily accessible by putting it on macOS, in order to avoid adversarial attacks.

But knowing this now, Intel Macs are an issue because (not, as I previously wrote, because they differ in floating-point implementation from ARM -- thanks my123 for the correction) they will have to run the network on a wide variety of GPUs (at the very least multiple AMD archs and Intel's iGPU), so maybe that also factored into their decision? They would have had to deploy multiple models and (I believe, unless they could make the models converge exactly?) multiple distinct databases server-side to check against.

To people knowledgeable on the topic, would having two versions of the models increase the attack surface?

Edit: Also, I didn't realise that, because of how perceptual hashes work, they would need their own matching threshold, independent of the "30 pictures matched to launch a human review" threshold. Apple's communication push implied exact matches. I'm not sure they used the right tool here (putting aside for now the fact that this is running client-side).


It wasn't part of the original announcement afaik, but it is coming to macOS Monterey: https://www.apple.com/child-safety/

Edit: cwizou correctly points out not all of the features (per Apple) will be on Monterey but the code exists.


Is it? I checked your link and they clearly separate which features come to which OS; here's how I read it:

- Communication safety in Messages

> "This feature is coming in an update later this year to accounts set up as families in iCloud for iOS 15, iPadOS 15, and macOS Monterey."

- CSAM detection

> "To help address this, new technology in iOS and iPadOS"

- Expanding guidance in Siri and Search

> "These updates to Siri and Search are coming later this year in an update to iOS 15, iPadOS 15, watchOS 8, and macOS Monterey."

So while the two other features are coming, the CSAM detection is singled out as not coming to macOS.

But! At the same time (and I saw this after the editing window closed), the GitHub repo clearly states that you can grab the models from macOS builds 11.4 onwards:

> If you have a recent version of macOS (11.4+) or jailbroken iOS (14.7+) installed, simply grab these files from /System/Library/Frameworks/Vision.framework/Resources/ (on macOS) or /System/Library/Frameworks/Vision.framework/ (on iOS).

So my best guess is, they trialed it on macOS as they did on iOS (and put the model there, contrary to what I had assumed) but chose not to enable it yet, perhaps because of the rounding-error issue, or something else.

Edit: This repo by KhaosT refers to 11.3 for API availability, but it's the same ballpark: Apple is already shipping it as part of their Vision framework under an obfuscated class name, and the code sample runs the model directly on macOS: https://github.com/KhaosT/nhcalc/blob/5f5260295ba584019cbad6...


Ah, good catch and write-up. I believe you're right, and it's likely a matter of time for the Mac. Hard to tell if this means it's shipping with macOS but just not enabled yet.


The model runs on the GPU or the Neural Engine, CPU arch isn't really a factor.


My bad, I edited the previous post, thanks for this. Assuming this runs on Intel's iGPU, they would still need the ability to run on AMD's GPU for the iMac Pro and Mac Pro, so that's at least two extra separate cases.


It's not a user facing feature, and x86 macs are the past already - I doubt they'll bother porting it.


My primary expectation is that this tech will be used for DMCA 2.0, and "for the kids" is the best way to launch it.


This basically invalidates any claims Apple made about accuracy, and brings up an interesting point about the hashing mechanism: it seems two visually similar images will also have similar hashes. This is interesting because humans quickly learn such patterns: for example, many here will know what dQw4w9WgXcQ is without thinking about it at all.


> it seems two visually similar images will also have similar hashes

This is by design: the whole idea of a perceptual hash is that the more similar the two hashes are, the more similar the two images are, so I don't think it invalidates any claims.

Perceptual hashes are different from cryptographic hashes, where any change in the message completely changes the hash.
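A quick sketch of the contrast (only the SHA-256 part is real; the 96-bit "perceptual" values below are made up for illustration):

    import hashlib

    # Cryptographic hash: flip one bit of the input and roughly half the output bits change.
    a = hashlib.sha256(b"hello world").digest()
    b = hashlib.sha256(b"hello worle").digest()   # 'd' -> 'e' is a single-bit change
    print(sum(bin(x ^ y).count("1") for x, y in zip(a, b)), "of 256 bits differ")  # typically ~128

    # Perceptual hash: similar inputs are *meant* to land on nearby values, so two
    # hypothetical 96-bit hashes of near-identical images differ in only a bit or two.
    p1, p2 = 0x11D9B097AC960BD2C6C131FA, 0x11D9B097AC960BD2C6C131FB
    print(bin(p1 ^ p2).count("1"), "of 96 bits differ")                            # 1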


> The whole idea of a perceptual hash is that the more similar the two hashes are, the more similar the two images are

If that is the case, then the word "hash" is terribly mis-applied here.


Hash is applied correctly here. A hash function is "any function that can be used to map data of arbitrary size to fixed-size values." The properties of being a(n) (essentially) unique fingerprint, or of small changes in input causing large changes in output, are properties of cryptographic hashes. Perceptual hashes do not have those properties.


Good explanation, thanks. I only knew about cryptographic hashes, or those that are used for hash tables, where you absolutely do not want collisions. Anyhow, I'm not really comfortable with this usage of the word "hash"; it's the complete opposite of the meaning I'm used to.


Maybe the term fingerprint is better


It greatly increases the collision space if you only have to get near a bad number.


> The whole idea of a perceptual hash is that the more similar the two hashes are, the more similar the two images are

This is already proven to be inaccurate. There are adversarial hashes and collisions possible in the system. You don't have to be very skeptically minded to think that this is intentional. Links to examples of this are already posted in this thread.

You are banking on an ideal scenario of this technology not the reality.

EDIT: Proof on the front page on HN right now https://github.com/AsuharietYgvar/AppleNeuralHash2ONNX/issue...


I think you may have misread my comment: I did not mean that the similarity of hashes invalidates any claims.


> Wouldn't this require Apple to add multiple versions of NeuralHash of the same image (one for each platform/hardware) into the database to counter this issue?

Not if their processor architectures are all the same, or close enough that they can write (and have written) an emulation layer to get bit-identical behaviour.


Floating point arithmetic in an algorithm that can land you in jail? Why not!


This algorithm can not land you in jail. Nobody would be jailed based on this algorithm.

The algorithm alerts a human, who actually looks and makes the call.


I think it would just require generating the table of hashes once on each type of hardware in use (whether CPU or GPU), then doing the lookup only in the table that matches the hardware that generated it.


To re-do the hashes, you would need to run it on the original offending photo database, which -- as an unofficial party doing so -- could land you in trouble, wouldn't it?

And what if you re-do the hashes on a Mac with auto-backup to iCloud -- next thing you know, the entire offending database has been sync'd into your iCloud account :-/


They are probably using the Hamming distance (https://en.wikipedia.org/wiki/Hamming_distance) to allow some leeway, which again adds to the potential for false positives.


Yes, this and other distance metrics are what's used to do reverse-image and image-similarity lookups with perceptual hashes.
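A minimal sketch of that kind of thresholded lookup (the hash values and threshold here are made up for illustration, not Apple's actual parameters):

    def hamming(h1: int, h2: int) -> int:
        """Number of differing bits between two fixed-width hashes."""
        return bin(h1 ^ h2).count("1")

    # Hypothetical 96-bit database hashes, plus a tolerance for a few bits of
    # floating-point drift across devices.
    DATABASE = [0x11D9B097AC960BD2C6C131FA, 0x7E3A90C4D2B1F00812AB34CD]
    THRESHOLD = 4

    def matches(candidate: int) -> bool:
        return any(hamming(candidate, known) <= THRESHOLD for known in DATABASE)

    print(matches(0x11D9B097AC960BD2C6C131F9))   # True: only 2 bits away from the first entry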


I don't understand the concept of "slightly different hash". Aren't hashes supposed to be either equal or completely different?


You're thinking of cryptographic hashes. There are many kinds of hash (geographic, perceptual, semantic, etc), many of which are designed to only be slightly different.


There is a class of hashes known as locality-sensitive hashes, which are designed to preserve some metric of "closeness".

https://en.wikipedia.org/wiki/Locality-sensitive_hashing
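A toy random-hyperplane LSH, which is roughly the trick used to turn a float feature vector into hash bits (the sizes and seed here are arbitrary):

    import numpy as np

    rng = np.random.default_rng(0)
    PLANES = rng.standard_normal((96, 128))      # 96 output bits from a 128-dim descriptor

    def lsh_bits(descriptor: np.ndarray) -> np.ndarray:
        # Keep only which side of each random hyperplane the descriptor falls on.
        return (PLANES @ descriptor >= 0).astype(np.uint8)

    x = rng.standard_normal(128)                 # stand-in for a network's image descriptor
    y = x + 0.01 * rng.standard_normal(128)      # a slightly perturbed "similar" input
    print(int((lsh_bits(x) != lsh_bits(y)).sum()))   # nearby inputs flip few or no bits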


Now that the model is known, I wonder how hard it is to create "adversarial collisions": given an image and a target hash, perturb the image in a way that is barely perceptible for a human, so that it matches the target hash.



While super impressive, I haven't seen the thing that would actually destroy the algorithm, which is: given a hash and a random reference image, produce a new image that has that hash and looks like the reference image.


What you've seen is worse than that.

All you need to do to cause trouble right now would be to get a bad image, hash it yourself, make a collision, and distribute that.

Let's say for the time being that the list hashes themselves will be server-side. You won't ever get that list, but you don't need it in order to cause a collision. You would need your own supply of CSAM to hash yourself, which while distasteful is clearly also not impossible.


It might be useful to read the threat model document. Associated data from client neural hash matches are compared with the known CSAM database again on the server using a private perceptual hash before being forwarded to human reviewers, so all such an attack would do is expose non-private image derivatives to Apple. It would likely not put an account at risk for referral to NCMEC. In this sense, privacy is indeed preserved versus other server scanning solutions where an adversarial perceptual hash (PhotoDNA is effectively public as well, per an article shared on HN) would trigger human review of all the account’s data.


I assume these "non-private image derivatives" are downscaled versions of the original image. But for downscaling there also are adversarial techniques: perturb an image such that the downscaled version looks like a target image. See https://bdtechtalks.com/2020/08/03/machine-learning-adversar...


For the server side component, the challenge is knowing the target of adversarial generation since, again, the perceptual hash used is secret. At that point, you are reduced to the threat present in every other existing CSAM detection system.


> the perceptual hash used is secret

Yes, although I'm sure a sufficiently motivated attacker can obtain some CSAM that they are reasonably sure is present in the database, and generate the NeuralHash themselves.

> At that point, you are reduced to the threat present in every other existing CSAM detection system.

A difference could be that server-side CSAM detection will verify the entire image, and not just the image derivative, before notifying the authorities.


> Yes, although I'm sure a sufficiently motivated attacker can obtain some CSAM that they are reasonably sure is present in the database, and generate the NeuralHash themselves

Remind us what the attack is here? The neural hash and the visual derivative both have to match for an image to trigger detection.


I believe something like: an adversary creates an innocuous photo where:

* The photo itself is benign.

* The photo’s NeuralHash matches known CSAM.

* The photo’s image derivative is not benign. It looks visually like CSAM.

* The photo’s image derivative matches known CSAM per a private perceptual hash.

The above, combined, could have a victim reported to NCMEC without being aware they were targeted. Since Apple operates on image derivatives, they could be fooled, unlike other cloud providers. That is the claim.

At that point, the victim could point law enforcement to the original CloudKit asset (safety vouchers include a reference to the associated asset) and clear their name. However, involving law enforcement can always be traumatic.


Sure, but did I miss something that suggests that all 4 of those conditions can actually be met? That seems like the part that has been made up.

Afaik the image derivative isn’t checked for looking like CSAM. It’s checked for looking like the specific CSAM from the database.


CSAM in the database is confidential, so the human reviewer just needs to be plausibly confident that they have CSAM in front of them. However, it’s not clear to me that you can pull off all three simultaneously. Furthermore, the attack doesn’t specify how they would adversarially generate an image derivative for a secret perceptual hash that they can’t run gradient descent on.

If someone wanted to plant CSAM and had control of an iCloud account, it seems far easier to send some emails with those images since iCloud Mail is actively scanned and nobody checks their iCloud Mail account, especially not the sent folder.


CSAM in the database is confidential.

The question is whether the visual derivatives are checked against derivatives from the database or just against abstract criteria. That seems to be an unknown.

> However, it’s not clear to me that you can pull off all three simultaneously

Agreed. People here seem to keep assuming that you can, but so far nobody has demonstrated that it is possible.


> before being forwarded to human reviewers

Does that mean that Apple employs people who manually review images known to be child pornography 9-to-5? Is it legal?


Yes, and so does every other major cloud service provider. The working conditions of these people are notoriously difficult and should be the subject of attention.


I'd imagine the key is that it's "manually review images suspected to be child pornography". The point of the review process is presumably that there are possible hash-collisions / false-positives, so the reviewers are what cause that transition from suspected to known.


Privacy is preserved by a team of humans looking at your private photos. Sounds like a good deal. What are we preserving again?


See https://news.ycombinator.com/item?id=28105849, which shows a POC to generate adversarial collisions for any neural network based perceptual hash scheme. The reason it works is because "(the network) is continuous(ly differentiable) and vulnerable to (gradient-)optimisation based attack".
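A toy version of that optimisation loop, with a stand-in differentiable model in place of the real network (the model, loss, sizes, and step count are all illustrative, not the linked POC):

    import torch

    # Stand-in for a model that maps an image to 96 pre-threshold hash logits;
    # sign(logits) would be the hash bits.
    model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 64 * 64, 96))
    model.eval()

    target_bits = (torch.randint(0, 2, (96,)) * 2 - 1).float()   # +/-1 pattern to collide with
    original = torch.rand(1, 3, 64, 64)                          # any benign starting image
    image = original.clone().requires_grad_(True)
    opt = torch.optim.Adam([image], lr=1e-2)

    for _ in range(500):
        opt.zero_grad()
        logits = model(image)[0]
        # Push each logit to the target side of zero (hinge with a small margin)
        # while keeping the perturbation from the original image small.
        loss = torch.relu(0.1 - target_bits * logits).sum() + 1e-2 * (image - original).pow(2).sum()
        loss.backward()
        opt.step()
        with torch.no_grad():
            image.clamp_(0, 1)

    print(float((model(image)[0].sign() == target_bits).float().mean()))  # fraction of target bits hit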


Spy agencies could add CSAM images adversarially modified to match legit content they want to find. Then they need to have someone in Apple's team to intercept the reports. This way they can scan for any image.


That reminded me of the AT&T Room 641A

https://en.wikipedia.org/wiki/Room_641A


If this is the level that a "spy agency" is going to get involved, they would already skip all this BS and just upload the images directly to themselves.


Or for criminals to generate perceptually similar illegal images that are no longer triggered as a 'bad' hash.


So, these adversarial collisions are the images that I need to send to my enemies so that they go to prison when they upload those images to iCloud? It seems trivially easy to exploit.


You can technically hide an adversarial collision inside a completely legit, normal image. It won't be seen by human eyes but it will trigger a detection. In addition, you can do the complete opposite by perturbing a CSAM image to output a completely different hash to circumvent detection. All of these vulnerabilities are well known for perceptual hashes.


So right there seems to be an issue to me. It seems like if you were trading in CSAM, you would run CLEANER -all on anything and everything. Because you know someone has already written that as proof of concept here.


You can also just send your enemies CSAM, it is more effective at imprisoning them.


They probably would not hold onto them and also now you have a paper trail of sending CSAM to people. But if you were to alter some innocent looking photos and send them to someone, they might store those.


Yes, but that won’t work since innocent looking photos won’t match the visual derivative.


That sounds like a good reason not to create a virtual SWATing vector then.


No. Because, as documented in the published material, hits are also reviewed by humans.


Downscaled derivative images are what are checked, and if the reviewer even suspects that the downscaled image contains sexual content of a minor, they're required to report it. At that point, it is the authorities who decide whether or not to continue to investigate and determine if the user possesses CSAM. That investigation alone can ruin lives.


The "downscaled" image 360x360. It's pretty easy to compare two images at that size to determine if they are the same. You're not going to end up investigated unless you actually have actual picture of child pornography. Also, not just one, the process doesn't even start until you have a fairly large number.


It's highly unlikely that the Apple reviewer will have access to the actual image in the database; instead they will assess whether the blurry, b/w, etc. modified thumbnail could possibly be illegal. Given the level of secrecy surrounding the contents of the DB, even the normal NCMEC reviewers may not have access to those for comparison. There are plenty of CG images, small legal models, or even images of erotic toys which could be used as the base image and would look very suspicious as a low-quality thumbnail.


There will be someone along the line who will check that the image is, in fact, the actual listed image. Otherwise, there is no legal case to be made.


> There will be someone along the line who will check that the image is, in fact, the actual listed image.

That someone will be law enforcement, and they will get a warrant for all of your electronic devices in order to determine if you actually have CSAM or not. It's literally their job to investigate whether crimes were committed or not. Those investigations alone can ruin lives, even more so if arrests are made based on the tips or suspicions.


I wonder what scenarios are facilitated by sending adversarial collisions over simply sending over CSAM.


The images are sent for manual verification before you're turned in, so no.



An interesting tidbit: "Believe it or not, [the NeuralHash algorithm for on-device CSAM detection] already exists as early as iOS 14.3, hidden under obfuscated class names."


So it was quite literally introduced as a trojan horse.

"We're so excited to bring you all these new features and bugfixes in iOS 14.3, plus one more thing you'll hear about and object to in future. Too bad."


You've never heard of feature flags?


I'm sure Apple would like everyone to call their trojan that. "Feature flag", lol.


You mean like...?

    let scanFile4CSAM: Bool
    if #available(iOS 16.0, *) {
        scanFile4CSAM = true
    } else {
        scanFile4CSAM = is_iCloudPhotosFile && Device.Settings.is_iCloudPhotosEnabled
    }

Edit: "These efforts will evolve and expand over time."[1]

[1] https://www.apple.com/child-safety/


In addition to generating adversarial collisions, someone mentioned that it can also be used to train a decoder network to reverse any NeuralHash back to its input image.


That assumes that 96 bits of information are sufficient for (in some sense) uniquely describing the input image. Which, on the one hand, is of course the purpose of the system, but on the other is also clearly mathematically impossible (a 360x360 RGB8 image has 3110400 bits of information).

That is, for each 96 bit neural hash value, there exist (on average) 2^3110304 unique input images that hash to that same value.

Again, these are of course trivial facts, which do not rule out that image recovery (in a "get back something that looks similar to the original input" sense) is possible, but you should be aware that "similar" to the network need not mean "similar" to a human.
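The arithmetic, for anyone who wants to check it:

    image_bits = 360 * 360 * 3 * 8   # 3,110,400 bits in a 360x360 RGB8 image
    hash_bits = 96
    print(image_bits - hash_bits)    # 3,110,304 -> on average 2**3,110,304 images share each hash value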


Just like any autoencoder, it is not about getting back the exact original, which is of course impossible. It is about summarizing the image in 96 bits of information, which is quite enough to leak the gist of the original image. For example, [1] talks about reversing Microsoft's PhotoDNA.

> but you should be aware that "similar" to the network need not mean "similar" to a human.

With techniques like GANs and DLSS, it is quite possible to generate a photorealistic image similar enough to the original one, or at least leaking some private information.

[1]: https://www.hackerfactor.com/blog/index.php?/archives/929-On...


"...but you should be aware that "similar" to the network need not mean "similar" to a human..."

EXCEPT... NeuralHash also claims to be robust to modifications to images that would result in a similar-to-a-human image. If the 96 bits are enough to tag such similar-to-humans results, why couldn't a brute-force approach yield such similar-to-humans images? Indeed, a nefarious person intent on producing CSAM could set up something like a generative-adversarial system that produced CSAM images using the hashes along with other clues.


Because there are still an absolutely overwhelmingly huge number of different, completely nonsensical images that all generate the same hash, and small perturbations of those nonsensical blobs also generate the same hash.

96 bits is just not enough data to generate anything meaningful; just give up on that thought.


This absolutely needs to be done. Also, does Apple deploy different models for different regions/cohorts?


> someone mentioned that it can also be used to train a decoder network to reverse any NeuralHash back to its input image.

That someone is simply wrong.


> ONNX is an open format built to represent machine learning models. ONNX defines a common set of operators - the building blocks of machine learning and deep learning models - and a common file format to enable AI developers to use models with a variety of frameworks, tools, runtimes, and compilers.

- onnx.ai

I have never heard of this before and had to look it up. Is it widely used? Can I define a model in onnx and run it "everywhere", instead of learning pytorch or tensorflow?


Yes, though you may have to check which operators are available when you move from one framework to another using ONNX. One example is ONNX.js, made by Microsoft, which makes it possible to run a model using JavaScript. If an operator is not available (e.g. ArgMax for WebGL) you have to switch to something equivalent and retrain. List of supported operators for ONNX.js: https://github.com/microsoft/onnxjs/blob/master/docs/operato...


ONNX has really poor support from Microsoft. I suspect they've basically abandoned it internally (their onnxjs variant is orders of magnitude slower than tfjs[1]). It's a good 'neutral standard' for the moment, but we should all probably move away from it long term.

[1] https://github.com/microsoft/onnxjs/issues/304


ONNX != onnxjs

ONNX is a representation format for ML models (mostly neural networks). onnxjs is just a browser runtime for ONNX models. While it may be true that onnxjs is neglected, please note that the 'main' runtime, onnxruntime, is under heavy active development[1].

Moreover, Microsoft is not the sole steward of the ONNX ecosystem. They are one of many contributors, alongside companies like Facebook, Amazon, Nvidia, and many others [2].

I don't think ONNX is going away anytime soon. Not so sure about the TF ecosystem though.

[1] https://github.com/microsoft/onnxruntime/releases

[2] https://onnx.ai/about.html


I've tried inference with the Python ONNX runtime and it usually varies between hitting an OOM limit (while with TF it works fine) and being an order of magnitude slower. Even if the codebase is still being changed, I don't see much reason for people to use it other than as a convenient distribution format.


Interesting, I did not encounter such discrepancies in my work with these tools.

There could be multiple reasons for the degraded performance:

- Are we comparing apples to apples here (heh), e.g. ResNet-50 vs ResNet-50?

- Was the ONNX model ported from TF? There are known issues with that path (https://onnxruntime.ai/docs/how-to/tune-performance.html#my-...)

- Have you tried tuning an execution provider for your specific target platform?(https://onnxruntime.ai/docs/reference/execution-providers/#s...)


ONNX is a representation and interchange format that training/inference frameworks can use to represent models. However, ONNX does not have a tensor engine of its own. So you would still define and train your model in tensorflow, pytorch, etc, and then save the model in ONNX format, at which point it can be ported to any inference service that supports the ONNX format.
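A minimal sketch of that workflow with PyTorch and onnxruntime (the model and shapes are arbitrary stand-ins):

    import torch
    import onnxruntime as ort

    # Define (and normally train) the model in PyTorch...
    model = torch.nn.Sequential(torch.nn.Linear(4, 8), torch.nn.ReLU(), torch.nn.Linear(8, 2))
    model.eval()

    # ...export it to the ONNX interchange format...
    dummy = torch.randn(1, 4)
    torch.onnx.export(model, dummy, "tiny.onnx", input_names=["x"], output_names=["y"])

    # ...then run it with any ONNX-capable runtime, here onnxruntime.
    sess = ort.InferenceSession("tiny.onnx")
    print(sess.run(["y"], {"x": dummy.numpy()})[0].shape)   # (1, 2)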


Yes, the goal is to run it framework-independent. PyTorch, for example, can export models to ONNX.


Well, you would still need to learn PyTorch/TensorFlow and define your network there; you then export it to ONNX format for deployment.


I think you need to use PyTorch or TensorFlow to create the network and train it. After training, you export it to ONNX. I suggest converting from ONNX to OpenVINO.


So the naysayers were right all along, from the original repository uncovering the "NeuralHash" private APIs:

> Resizing the image yields same hash, even down to below 200x100. Cropping or rotating the image yields different hashes.


What did the naysayers say? That the algorithm wouldn't be able to handle cropping and rotation? Did Apple claim that it would?


That this tech is sufficient to spy on the population, but insufficient to defeat even basic evasion.


As far as my understanding goes, you can apply a distance measure on the (neural)hashes of the input image and the reference image. A proximity threshold will determine if it's a variant of the original image or not. So, it should probably be pretty good at defeating basic evasion.

It is not like a cryptographic hash, where altering a single bit will completely change the output.


I think you're overestimating the capabilities of neural networks here, and especially ones that we know the exact weights for. It is fairly trivial to generate invisible noise that makes an input image get an entirely different hash.


> It is fairly trivial to generate invisible noise that makes an input image get an entirely different hash.

What I had in mind when referring to 'basic evasion' was 'cropping or rotating', as per your original comment.

All that being said, I admit that generating adversarial examples for models with known weights is not a difficult task.


Perceptual hashing systems that do derivative image lookups don't rely on exact hash comparisons, but fuzzy hash comparisons using a distance metric like the Hamming distance to find similar images.

If two hashes are off by a bit or two, chances are that the two images are derived from the same, or similar, source image.


I can restate what I said again for you: I can generate noise that makes the Hamming distance or whatever metric you prefer arbitrarily large without changing the contents noticeably.


For those wondering why you can't just delete these files: on macOS Big Sur and later, the system volume is signed by Apple, and in order to actually delete these files you need to go through a whole bunch of steps that apparently require completely disabling FileVault (full-disk encryption).

https://apple.stackexchange.com/questions/395508/can-i-mount...

So in the end we'll be left with a choice.

1. Allow Apple to scan your files.

2. Disable any kind of encryption letting anyone who steals your laptop access all your files.


> For those wondering why you can't just delete these files

Maybe I'm missing context, what files are you referring to?


Probably the model weight files


Yes, if the model files are deleted then the software attempting to scan files would obviously have to error out.


On Apple Silicon Macs this isn’t true, as files are encrypted on disk with “data protection” (same thing as on iOS). You can enable FileVault but it’s just extra.

Also, I've edited /etc without disabling FileVault; is it just /System that's protected this way?


System is separate I believe. Every single file is cryptographically signed.


Or 3. Disable iCloud Photo Library and have no scanning done at all.


For now. A simple change to hook IOImage load in a future update and it will be.


Except that everything is always like this. A simple change to location services on your phone to be a 24/7 reporting system. A simple change to your password manager to send all the passwords in plaintext somewhere.


This is something I don't see enough. I'm not terribly fond of these changes, but people who say "Well, that's it, I'm leaving iOS" strike me as either lying or not understanding anything about technology. If you've used iCloud /anything/, then Apple /could/ already be scanning it or handing it over to the FBI, etc. Similarly, iOS (and Android, unless built from source) is closed source, and you already have no idea what's actually running on it. So knowing all of that, on-device scanning is where you draw the line? That's just odd. Every "slippery slope" argument raised about this photo scanning should already have disqualified /every/ iPhone or stock-Android phone for these people. The fact that these hashes and/or some of this code has been in iOS since 14.3 (not activated) proves these people are full of it, since no one noticed or said anything until it was announced (or a day before, when the Twitter thread went out).


Here's an interesting adversarial attack. If we reverse engineer the Apple NeuralHash format and replace these files, we could create a system that DDoSes Apple's manual verification system by flooding it with false positives caused by a faulty NeuralHash. This would overload Apple's manual review system and effectively make it uneconomical to run.


Winning would always be easy if you didn't have an adversary. Apple (having access to the original low-resolution photo) could build a relatively simple filter into their verification pipeline.

Even if they didn't have access to the original (for whatever reason), they could train their own learning algorithm (supervised by their manual verification checkers) to detect the fake submissions.


I was under the impression that the hashes were the only thing transferred, as the source image is encrypted. (Edit: On second thought, manual reviewing needs the source image.)

Anyhow, adversarial attacks transfer reasonably well across models so you could create attacks on models you think Apple would use internally.

I'd imagine the first thing Apple would do is put attack spammers on an ignore list. However, that would only work until the images start propagating on the wider internet via forums and social media.


I'd be curious how the attack could transfer across models. Apple would have access to a huge amount of information about target image that they could include in their analysis and would be denied to the attacker.

The crowdsourced attack idea would also be contingent on thousands of people willingly being flagged as pedophiles.


For transferring, I mean specifically that it is known that attacks transfer in a general sense e.g., https://www.usenix.org/system/files/sec19-demontis.pdf It’s not easy in this setting due to the information asymmetry, but if someone wants to dedicate the resources it should be possible. They could get N random models and create images that fool all N.

For crowdsourcing, I mean that the crowd is a distribution network and not the attack creator (e.g., they are a self-distributing virus/“worm”). The attacker takes those images (the initial “worm” programs) and uploads them to reddit as a catchy meme for worldwide distribution. Meme sharers wouldn’t be aware that the meme has a hash collision with CP.

The entire defense of this sort of thing is obscurity (further obscured by using “magical” machine learning), since nothing is proven about how collision resistant the algorithm is. At Apple’s scale, it’s as careless as rolling your own crypto hash.


There is no feedback loop to support the development of the attack. Apple in this case would just be filtering and ignoring false positives without notifying the attacker if they were successful or not.

I still don't quite see how the 'worm' would work in practice. It's not just a matter of sharing a link to an image, but you have to add it to your photo album. Maybe I'm getting old, but I don't have a single meme-type image in my album.

I don't think this can be classed as a security-by-obscurity problem. Security by obscurity fails because the 'key' (the obscure bit) can't be easily changed if it leaks. But given Apple has both the target and candidate images available to it, it can in effect generate a new key at the expense of having to do additional computation.


The way I think about it is that the attacker has plenty of time (years) to either guess or discover the viability of an attack. For example, inspecting packets and reverse engineering, correlating photos with police visits, creating a hash database using black market data, etc. will all be used to determine if an attack is working.

Same thing goes with the worm. Posting infected memes is one way and that’ll passively get some downloads. Another way is to text the pictures via bots to people and hope their SMS is iCloud backed up. The point is if someone figures out how to make an attack, they (or entity they sell it to) will almost surely spend great effort in social engineering the distribution.

I agree that it’s probably not the right term. My main concern though is that the class of attacks that is effective against one model will likely be effective against the entire class of similar models. So Apple’s main defense is the limited feedback loop caused by keeping part of the modeling private rather than a traditional computational complexity defense, which would allow disclosure of the entire implementation. It’s taken one week for the proof of concept to be demonstrated and most of that was likely boilerplate API code; it would take perhaps even less time to retarget if Apple disclosed the entire implementation with a different model. The demonstrated attack was textbook material, so it’s not unreasonable to believe that all attacks are textbook material.


How does this actually help society? After this announcement, child abusers won't use their iPhones for this stuff (I can't believe they did in the first place).


Their approach is so poorly targeted at the claimed problem, and so ineffectual, I don’t think it is reasonable to take them at face value re: what they say they are trying to accomplish.

For the folks who are interested in stopping abuse of children, there are many other approaches that would break the market for new abuse and new CSAM. This just isn’t going to move the needle and I have to assume they know that.

I’ve completely lost trust in Apple because I can’t understand what their motivations are. I _do_ understand the technology, so I’m pretty tired of articles suggesting this is some sort of misunderstanding and not Apple taking a giant leap towards enabling authoritarianism, and of course building literal thoughtcrime enforcement into end user devices, which is beyond even what 1984 imagined.


This is why I suspect something shifted with the government. Like they're being forced into doing this somehow.

This was the minimal legal requirement or something.

The whole thing is baffling to me.


> This is why I suspect something shifted with the government. Like they're being forced into doing this somehow.

Seems like undue speculation. The government cares about far more than just CSAM. They care about terrorism, human and drug trafficking, organized crime, gangs, drug manufacturing, fraud etc.

This type of speculation only makes sense if Apple intends to expand their CSAM detection system to detect those other things, as well.


I 100% think Apple intends to expand on it. That's why I'm against it.


Apple does say something[1] to that effect:

> This program is ambitious, and protecting children is an important responsibility. These efforts will evolve and expand over time.

[1] https://www.apple.com/child-safety/


Does nobody remember the EARN IT Act?


I do remember it, it’s just very weird that Apple hasn’t cited it as a motivation.

If they had led with that, we’d be having a conversation about the evildoers in Congress rather than Cupertino.


Why do so many people get caught sharing CSAM on FB then?

I remember reading about the CSAM ring the FBI (IIRC) infiltrated and ran for a period of time; that group had strict rules on how to access and share material that, if followed, would have completely protected them, but the majority of them were sloppy and got caught. Criminals really aren't that smart, by and large. Will this catch the smartest of them? Probably not, but it will catch a good number, I'm sure.

All that said, I'm not a fan of these changes, I just dislike arguments that don't hold water against it.


I haven't seen anyone here having an issue with Facebook scanning for CSAM, for the exact reasons you described. Social media is frequently abused for sharing CSAM.


If I'm reading this right, it seems like the NeuralHash runs on both macOS and iOS, since the weights can be found on both systems. I thought NeuralHash was only running on iOS?


Does Apple even encrypt the actual image data on device? Their system document says "payload = visual derivative + neural hash", and only that is encrypted with a secondary level of encryption. And they didn't go through with E2EE for iCloud, last I heard. This elaborate system makes no sense when they very well could have done it in the cloud.

It feels like an elaborate privacy-theatre Trojan horse to introduce on-device surveillance.


The algorithm seems very simple. How does it perform compared with the Marr wavelet algorithm?


Simplicity is a good thing. One of the perceptual hashes I found in a URL on HN was literally just downscaling the image, converting it to greyscale, and coalescing that into an n-bit hash, with the Hamming distance used to compare hashes.
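Something like this minimal average-hash sketch (real perceptual hashes such as pHash or NeuralHash are more involved, but the Hamming-distance comparison step is the same; the file names are hypothetical):

    from PIL import Image

    def average_hash(path: str, size: int = 8) -> int:
        # Downscale, convert to greyscale, and set one bit per pixel depending on
        # whether it is above or below the mean brightness.
        img = Image.open(path).convert("L").resize((size, size))
        pixels = list(img.getdata())
        mean = sum(pixels) / len(pixels)
        bits = 0
        for p in pixels:
            bits = (bits << 1) | (p >= mean)
        return bits                       # 64 bits for the default 8x8 size

    def hamming(a: int, b: int) -> int:
        return bin(a ^ b).count("1")

    # hamming(average_hash("a.jpg"), average_hash("b.jpg")) <= 5 is a typical "same image" test.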


Not when it means that simple rotation / cropping evades detection.


Several articles stated that given a perceptual hash you could somehow reverse it into a (very) low resolution image. However the README provides an example hash and it's only 24 characters. How is that possible?


Really disgusting idea: I wonder if it's possible for someone to use this as a 'discriminator' in a GAN to configure a generator to recreate the CP this is trying to avoid distributing in the first place.


Not really; there's not enough information in the NeuralHashes. You'd get pictures like this,[0] (from [1]) instead.

[0]: https://user-images.githubusercontent.com/1328/129860810-f41...

[1]: https://github.com/AsuharietYgvar/AppleNeuralHash2ONNX/issue...


That is assuming that adding plausibility constraints wouldn't fix this issue. I don't know if this is feasible though.


That's not really correct. That's a forcibly created colliding image; it's not the output of the NeuralHash. Also, as reported elsewhere, it's absolutely possible to do so.


No, it’s not possible.

If you think there is a credible mechanism, please link to it.


It might have to do with the output possibly being a probability vector as opposed to a binary hash. The whole thing is thus differentiable and optimizable (if a dog image was incorrectly placed in the bad hash bucket it might only be on the border of it while the real CP corresponding to the hash is found at the probabilistic maxima of the hash bucket). Just guessing.


That isn’t correct, nor is it credible.

See: https://www.apple.com/child-safety/pdf/Security_Threat_Model...


Where am I supposed to look in that pdf to understand that it isn't correct or credible? It is certainly true that the model has differentiable and thus optimizable outputs.


Sorry - I posted that link in the wrong place.

Either way, if the claim is that it’s possible to reverse engineer CSAM from the hashes, proof is needed, and nobody has provided even a proof of concept.

The person I responded to was claiming it had been demonstrated. I asked for a link to evidence. You just made a hypothesis about how it might work. That’s not helpful.


Are you trying to stop abuse of children, or enforce a standard that the idea of images of children is bad?

If you’re actually trying to stop abuse, having the computer create fake CP seems like an ideal outcome, since it would avoid the need for abuse of children.

Flooding the market with fakes and then directing consumers of the fakes to whatever mental health resources are available seems like it would fit the claimed problem far better than what apple is currently trying.


This would be bizarre - wouldn't this mean that Apple are essentially shipping illegal images with their OS? (Subject to some as yet unknown decoder)


With the right algorithm you can turn any given string of bits into any other string of bits. So is the image in the data, or is it really in the algorithm?


If the decoder was "trained" on and only works with predictable data, then it might be the algorithm that's illegal, but if a completely new illegal image is created, hashed, fed into the decoder and the decoder produces a valid illegal image, then the illegal data must be in the input, not the algorithm.

This is basically rule 1 of testing neural networks: if the testing data is different from the training data and the results are still correct, your network is "reading" the data correctly and not just memorising a list of known values. I guess this means you'd also need to prove that the decoder doesn't turn most hashes of non-illegal images into illegal images, but if you also did that, you'd have a pretty strong case that the illegal data is in the hash.


Did Apple use the bad images to train the neural network? If yes, I suppose that makes this possibility more realistic.


> Did Apple use the bad images to train the neural network?

NCMEC did, certainly, but I don't think Apple ever got the actual images themselves; just the resultant hashes.


Makes me wonder if there's a possibility of e.g. faces on fbi's most wanted being snuck into the dataset somewhere in the chain.


> if there's a possibility of e.g. faces on fbi's most wanted being snuck into the dataset

Sure, it's possible, but that doesn't seem to have happened in the past decade of PhotoDNA scanning cloud photos to match hashes provided by NCMEC - why would it suddenly start happening now?


> Sure, it's possible, but that doesn't seem to have happened in the past decade of PhotoDNA scanning cloud photos to match hashes provided by NCMEC

If it's happened, it's unlikely the public would know about it.


You really don't understand the difference in scale, distributed-sensor-network-wise, between the two capabilities, do you?

Server-centric is the primitive that gives you periodic batch scanning. Client-resident lets you build up a real-time detection network.

Also, as they say in the financial world: past performance is not indicative of future results. No one would have thought to do so because this step hadn't been taken yet. Now that it has been, it's an easier prospect to sell. This is how the slippery slope works.


> periodic batch [] real-time detection network

What's the realistic difference here between "my phone scans the photo on upload to iCloud Photos" and "iCloud Photos scans the photo when it's uploaded"?

Latency of upload doesn't come into play here because the scan results are part of the uploaded photo metadata; they're not submitted distinctly according to Apple's technical description.

(And given the threshold needed before you can decrypt any of the tagged photos with the client side system, the server side scanning would be much more "real-time" in this case, no?)


Why would Apple expose this API call?


Can someone help me understand how the model was found and extracted?


So apparently the code for NeuralHash has been on iOS devices since 14.3, hidden under obfuscated class names. This guy found it and rebuilt the whole thing.


For people wondering why Apple is doing this, does nobody remember the EARN IT Act last year, that was so close to passing?


Dear lord when I read that headline I thought for a second apple was working on a brain implant. Coffee time.


Welp. I am no iPhone user, nor a dev, just a random guy trying to wrap my head around this "CSAM" Trojan horse, which WILL be used by tyrants and crony governments to spy on and persecute their citizens, with Apple "having to obey the law". They previously had no business interfering with the rights of customers; now they are active participants.

How much do you want to bet Google will bring something similar "to keep up with industry demands and partner requests"? That would be the day I either go full LineageOS (if they decide not to join the party) or switch to a dumb flip phone forever. I will not subject myself to this, because I know the government WILL hunt me down for being a dissident.


Google already implemented much the same thing many years ago. Apple is playing catch-up.


Google and every other cloud storage provider has been scanning images that you upload to their servers. This is perfectly justified, since they would be legally and morally accomplices to whatever illegal thing you were doing. But scanning someone's locally stored images on a device that they own is a completely different situation.

It's the difference between the airport checking my luggage for illegal drugs and the police showing up at every person's house once in a while to check if any drugs happen to be on the premises.


> But scanning someone's locally stored images on a device that they own is a completely different situation.

They're only scanning photos that you upload to iCloud Photos - this is not (currently) a blanket "we'll scan all your local photos whenever" situation.

> It's the difference between the airport checking my luggage for illegal drugs

... and FedEx/UPS checking your outgoing packages for drugs.


>... and FedEx/UPS checking your outgoing packages for drugs.

Interesting USPS isn't mentioned there. Another example of 4th Amendment workarounds through private industry? Or are you just not aware of what the USPS equivalent program would be?

Genuinely curious.


USPS x-rays many packages - I’d venture almost all of them shipped from a major population center.



This looks like a good way to get a package marked as suspicious, and suspicious packages have their contents searched.


Until you DDoS the inspection framework. Turns out the material is pretty cheap! Heck, I can pay for you to return the lead foil wrapping to be "environmentally conscious by taking on the onus for packaging recycling!" Doubling my in-flight noise ratio, and generating good press.

You have to think big. You need people to open a package, inspect, and reseal it.

I can make a dropshipping business, turn a profit, and swamp your unboxers to the point you stop trying. I'm not out to do that, so I wouldn't bother, but there are ways, and you'd be surprised at the hijinks you can get up to when an adversarial attack on infrastructure or process is done right. Quite a lot of modern systems that just werk do so because, on the whole, people don't spend much time being dicks. Unfortunately, my brain seems to have a part that enjoys the challenge in its idle time.

Measuring is hard. Especially when someone has their hand on the "noise" dial.


This is quite off-topic from the fourth amendment workaround originally insinuated. I agree modern society depends on trust more than one may expect or wish. DDoSing the USPS to prove it doesn’t seem like a very pro-social idea, and I do like society.


> It's the difference between the airport checking my luggage for illegal drugs and the police showing up at every person's house once in a while to check if any drugs happen to be on the premises.

Actually, you're right. That's exactly what it's like. The former is how your Government routinely invades your privacy; the latter is an unequivocal violation of the Fourth Amendment of the US Constitution.

As it is with on-cloud versus on-device scanning. If Apple were compelled by the US Government to expand the on-device scanning to search for anything, that too would be an unequivocal Fourth Amendment violation.

Whereas any scanning that occurs on the cloud is not subject to Fourth Amendment protection. It's excepted under the so-called "third party doctrine", which effectively means the Government can rifle through whatever they want for any reason.


Do you have a link?


They have been scanning Gmail for a while using this, as nearly every provider does : https://en.wikipedia.org/wiki/PhotoDNA


From what I can tell, that is not used on Android or user devices though. So it's not the same thing as what Apple is doing.


Yes, what Apple does is slightly different technically: they run a network on-device, then when you upload the picture to their cloud services they attach the generated hash to the picture.

It's only on their servers that they do the check against the database of CSAM content. So in that sense, it's pretty much the same as what other providers do: it remains attached to their online service, and they check the hash against the database instead of checking the picture (as others do).

If you don't use their iCloud service, the hash is never checked.

I still don't think having the client as part of the system is a good thing, but in terms of abuse it's about the same thing.

What Apple's system allows is a way to do a check while keeping the data encrypted on some 3rd-party service. That part certainly raises questions should it be extended.


Thanks, that makes sense!



From the article:

> "He was trying to get around getting caught, he was trying to keep it inside his email," said Detective David Nettles of the Houston Metro Internet Crimes Against Children Taskforce reports the station. "I can't see that information, I can't see that photo, but Google can."

"email" here is presumably Gmail, which Google owns and is responsible for. The dude was storing illegal data on Google servers in plain form. His personal devices were only checked by LEOs after a warrant was issued. Definitely not the same as Apple scanning the images on your device (not their servers).


IIRC, Apple will only scan if iCloud Photos is enabled:

https://www.apple.com/child-safety/pdf/Expanded_Protections_...

> This feature only impacts users who have chosen to use iCloud Photos to store their photos.


This is a red herring. It wouldn't need to happen on device then, because iCloud Photos are not end to end encrypted and Apple can scan them on the server side today and achieve the same result.

The only reason to scan clientside for a cloud service is to scan files that are not uploaded, or are end to end encrypted.

Apple already maintains an e2e backdoor (in the form of non e2e iCloud Backup) for the FBI and US intelligence agencies. It is extremely unlikely that they will e2e encrypt iCloud Photos.


I think you just hit on exactly what the plan probably was. I suspect this was the first step in making iCloud photos (not backup) E2E encrypted.

I suppose had they not gone down this road, the headlines would have been "Apple makes it easier to share child porn online".


I said unlikely: they maintain an e2e backdoor in iCloud Backup. Technically, e2e encrypting iCloud Photos at this point would be a no-op as Apple is already escrowing the device e2e keys in the backup (eg for iMessage).

I doubt they'd bother doing e2e for iCloud Photos if they're intentionally not doing it for iCloud Backup.


> Apple is already escrowing the device e2e keys in the backup

Citation? I don't believe this is correct, or at least it's an incomplete assertion.

Assuming they do get it with the iCloud backup, these keys would be inside the device's Keychain file, which is encrypted at rest by the Secure Enclave. Thus even with access to a full, unencrypted backup of your iPhone, the keychain itself cannot be decrypted by Apple.

(It can't be decrypted by you either, if it's restored to different hardware. This is why iCloud Keychain exists. And that is end-to-end encrypted.)



