> You can't use a salt, the contact you want to match has to be able to compute ...

maxerickson · on March 15, 2016

Yeah, somehow I missed a couple of zeros, it would be 2.7 million.

Such a salt wouldn't tighten up the window though, for contact discovery to work, all the outputs have to be available to whatever is doing the matching.

jsnathan · on March 15, 2016

OK hold on. Say the rotating salt thing doesn't work, for whatever reason.

But you yourself made a better suggestion! If we do hash (phone #) contact pairs, and every client publishes multiple hashes, one for each of its contacts, then a cheating server would need to compute (max) (10^10 * 10^10)/3600 ~ 2.7e17 hours worth of hashes.

That would actually be secure, no?

Edit: messed up the math myself this time; I think this is correct now though

maxerickson · on March 15, 2016

For the contact discovery to be private, the phone has to calculate the hash. So the server cost also depends on the performance differential (many entities would likely still spend $50,000 to discover the contacts of a high value target).

Adding information increases the cost, but it also reduces the likelihood of a match.

You've edited your comment since I wrote the above. Anyway, if the attack is directed at a single user, there's only one 10^10 involved (and really, the phone number space isn't that big).

jsnathan · on March 15, 2016

Really sorry about the edits! Thinking in real time :)

So about the size of the phone number space. Interestingly I guess you could at least limit it by country. But, unless you already know the mobile phone numbers of the target's acquaintances, (in which case there is no real danger of 'leaking' your social graph), you can't really limit it more, because mobile phone numbers are pretty random within each country's phone number space (I think).

So say the US has at least 0.5 billion assigned mobile numbers, and you don't know which are assigned, so the space could easily be 10^10 for just the US. (I can't find any quick numbers on Google right now).

But even if we go down to say, 100 million, we would still get (10^7 * 10^7)/3600 ~ 2.7e11 hours worth of hashing to get all contact pairs. Which is still secure.

Now as far as limiting one end point to a single phone number - that would be a special case. I'm not sure if this isn't moving the goal post a bit.

As long as the server does not have access to your own phone number, it, or anyone who has hacked into the server, cannot really use that to limit their computations. Unless they are already targeting you through other means (possibly you are a person of interest to someone), and so know your phone #.

In that case yes, they could reveal (part of) your social graph for a (maybe) affordable cost. But in that very special case, I think we are dealing with nation-state adversaries.

And in that case, the telcos probably already have a list of your phone # contacts. So it would probably be easier to simply get it from them.

The only thing left to be revealed would be that you communicated with a specific acquaintance using that particular app, at some particular time.

So a targeted attack could leak that metadata.

It's not perfect.. but I think it might just be good enough for most users.

If we go off the deep end with an attack tailored to a single phone #, we also have to worry about a whole lot of other attack vectors that are probably more important.

Lastly, note that the user (if sufficiently paranoid) could simply opt out of the automatic contact search post-install.

dfox · on March 15, 2016

As for mobile phone numbers being randomly placed inside larger space of all phone numbers, while this makes some sense for US/NANP, in most of the rest of the world mobile numbers have some semi-well defined prefixes (as in "number that starts with 6 or 7 is certainly mobile phone, discerning what operator is sligthly non-trivial") and even for NANP there probably is some publicly available database that says whether given number is mobile or not.

maxerickson · on March 15, 2016

I don't think it's moving the goalposts, the ability to focus an attack on an individual number is an important consideration for an individual evaluating the system. That only their contacts are revealed is not much consolation when their contacts are revealed.

The ability to move out to the phone network at observe metadata there does make the discussion somewhat academic.