> that was actually what the outcry about Apple's version was about. The more the hash algorithm tried to produce the same hash for different variants of a photo, the more likely it was that someone could get their hands on a flagged hash and theoretically send you an innocuous looking photo that registered as CSAM.
That was totally infeasible. There were two separate hashes, a public one and a private one, and there needed to be multiple false positives for the system to trigger. So not only would you need to generate collisions for two separate hashes simultaneously, including one for which there are no public details, you would need to do it for several images.
People made a lot of assumptions about the system without actually reading the papers Apple published on how it would work. So there's this caricature of the system in people's minds that is a lot simpler and easier to fool than the reality. That's what Apple was forced to react to.
And as a minor point of clarification: the structure of the published perceptual hash (and presumably the non-public one as well) was vulnerable to essentially arbitrary second preimage attacks, not just collisions.
This means that I can take an image and can usually adjust it to have an arbitrary hash, even a hash of an image I've never seen.
It's much more powerful than a collision attack (where the attacker must modify both images).
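For the curious, the attack is conceptually simple: the hash is computed by a neural network, so you can backpropagate through it and nudge pixels until the output bits match whatever target you want. Here's a minimal sketch against a toy differentiable hash; the projection, sizes, and loss are illustrative stand-ins, not the real NeuralHash pipeline:

```python
# Toy differentiable perceptual hash: a frozen random projection over
# pixels followed by sign(). The real NeuralHash puts a CNN in front of
# the projection, but the attack shape is identical. Illustrative only.
import torch

torch.manual_seed(0)
HASH_BITS = 96
proj = torch.randn(HASH_BITS, 3 * 64 * 64)

def hash_bits(img):
    return proj @ img.flatten() > 0          # the actual hash bits

def soft_bits(img):
    return torch.tanh(proj @ img.flatten())  # differentiable relaxation

# Target: the hash of some image we may never have seen.
target = torch.randint(0, 2, (HASH_BITS,)).bool()
target_sign = target.float() * 2 - 1         # {0,1} -> {-1,+1}

source = torch.rand(3, 64, 64)               # innocuous starting image
delta = torch.zeros_like(source, requires_grad=True)
opt = torch.optim.Adam([delta], lr=0.01)

for step in range(5000):
    adv = (source + delta).clamp(0, 1)
    # Margin loss pushes every soft bit toward the target sign; the L1
    # penalty keeps the perturbation small so the image looks unchanged.
    loss = torch.relu(0.5 - soft_bits(adv) * target_sign).sum()
    loss = loss + 10 * delta.abs().mean()
    opt.zero_grad(); loss.backward(); opt.step()
    with torch.no_grad():
        if hash_bits((source + delta).clamp(0, 1)).eq(target).all():
            print(f"second preimage found after {step} steps")
            break
```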
I don't have any serious doubt that I would be able to generate second preimages for two such hashes, but given that the second hash was never published, that remains speculation. As, AFAIK, the first person to develop and demonstrate the second preimage attack against it, I'd like to think that my speculation on this is at least somewhat better than chance. :)
The user's privacy is compromised at the point where the public hash registers too many hits. An attacker who can implant one hit (e.g. by giving you an altered image that matches genuine illicit material, or by using Stable Diffusion to generate fake illicit material, altering it to match the unaltered hash of an image you possess, and then submitting it to NCMEC) can obviously also implant multiple.
At that juncture the cryptographic keys are leaked to Apple, and all further security depends on Apple telling the truth about their process, not being compelled by administrative subpoena, and not ever being unwittingly compromised by hackers or intelligence operatives.
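To see why the threshold is a cliff rather than a gradient, here's a minimal Shamir-style sketch of the mechanism Apple described: every matching image carries a share of an account-level key, shares below the threshold reveal nothing, and the moment enough matches accumulate the whole key reconstructs. Field size and parameters here are illustrative, not Apple's:

```python
# Minimal Shamir-style sketch of the threshold mechanism: each matching
# image contributes one share of an account-level key; below the threshold
# the shares reveal nothing, at the threshold the key reconstructs exactly.
import random

P = 2**127 - 1                      # prime field for the shares
THRESHOLD = 3                       # matches needed to unlock the key
SECRET = random.randrange(P)        # stand-in for the account decryption key

# Random degree-(THRESHOLD-1) polynomial with the key as constant term.
coeffs = [SECRET] + [random.randrange(P) for _ in range(THRESHOLD - 1)]

def share(x):
    # One share per matching image: the polynomial evaluated at x.
    return sum(c * pow(x, i, P) for i, c in enumerate(coeffs)) % P

def reconstruct(points):
    # Lagrange interpolation at x=0 recovers the constant term.
    secret = 0
    for xj, yj in points:
        num, den = 1, 1
        for xm, _ in points:
            if xm != xj:
                num = num * -xm % P
                den = den * (xj - xm) % P
        secret = (secret + yj * num * pow(den, P - 2, P)) % P
    return secret

hits = [(x, share(x)) for x in (1, 2, 3)]     # three implanted matches
assert reconstruct(hits[:2]) != SECRET        # two hits: key still hidden
assert reconstruct(hits) == SECRET            # threshold reached: key leaks
print("key reconstructed from", THRESHOLD, "matches")
```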
The extra steps of a second perceptual hash and human review are thus not all that relevant, nor were they clearly enough defined to allow any analysis of their security properties. In particular, the second perceptual hash's security is apparently at least partially dependent on its obscurity, but you have no reason to believe it won't be obtained by hackers, rogue employees, intelligence operatives, etc. (And if its obscurity isn't relevant, then why not publish it?)
Even if the hashes were flawless, however, the system would be relatively straightforward to attack through less sophisticated means, and it would retain the overarching philosophical flaw:
Your computing device is your trusted agent-- you share with it material more confidential than anything you share with your doctor, your lawyer, or your priest. You paid to purchase it. You pay to power it. Increasingly you cannot communicate with family or business partners, or carry out essential and mandatory interactions with your government, without using it. Your computing device mediates almost every aspect of your life. As a trusted agent it has absolutely no business scanning your files against unaccountable secret databases, encrypted against your inspection, and undetectably phoning home matches like a KGB spy that you're forced to confide in and house. To do so is a gross betrayal, one that shouldn't just be a bad idea-- it ought to be unlawful.
Service providers scanning content is morally fraught itself, but under the unfortunate current legal standard you have little to no expectation of privacy in information you provide to a third party. And that against-your-own-interests scanning is done on computers owned and operated by the scanners, rather than by you. And it's done using access to your information that they already have, so it's a realization of the consequences of existing poor privacy rather than a new invasion.
The transition to your own devices scanning against you is a bridge too far, no matter how much technical obfuscation is layered onto it.
As someone who has developed privacy technology, I found the entire presentation additionally offensive because Apple misrepresented the PSI components as protecting the user's privacy, when in reality the only purpose of their existence was concealing the list of hashes from users and thus protecting it from review and criticism. It's one thing for a security scheme to provide insufficient protections; it's quite another to fraudulently present technology that is weaponized against the user as somehow being for them.
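For illustration, here's a toy sketch of the DH-style blinding this kind of PSI construction is built on. Note what the blinding actually accomplishes: the device only ever receives the database raised to the server's secret exponent, so the device's owner cannot read or audit the list, while the server can still recognize matches. The group and flow here are simplified stand-ins, not Apple's actual protocol:

```python
# Toy sketch of DH-style blinding (insecure toy parameters, illustrative
# only). The point: the device only ever holds the database raised to the
# server's secret exponent, so the device's owner cannot read or audit the
# hash list, while the server can still recognize matches.
import hashlib
import secrets

P = 2**255 - 19                     # a prime modulus, used as a toy group

def hash_to_group(data: bytes) -> int:
    # Map a perceptual hash into the group (squaring keeps it in the
    # quadratic-residue subgroup; good enough for a sketch).
    return pow(int.from_bytes(hashlib.sha256(data).digest(), "big") % P, 2, P)

# --- server side: the secret list and a blinding key ---
server_key = secrets.randbelow(P - 3) + 2
database = [b"hash-of-known-image-1", b"hash-of-known-image-2"]

# This is all the device ever receives of the list: each entry raised to
# server_key. Without the key these look like random group elements.
blinded_db = {pow(hash_to_group(h), server_key, P) for h in database}

# --- device side ---
device_hash = b"hash-of-known-image-1"        # a local image's hash
elem = hash_to_group(device_hash)

# The device can't compute elem**server_key, so it can't test membership
# offline: the list it stores is unreadable to its own owner.
assert elem not in blinded_db

# --- what the server's side of the protocol can compute ---
assert pow(elem, server_key, P) in blinded_db
print("server can match; device cannot review the list")
```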