
As you may have read in the feedback request, Mozilla is proposing to use differential privacy, which is very different from tracking.

For more information, see https://en.wikipedia.org/wiki/Differential_privacy for instance.




My point is not the way you label gathering information from your users but rather that it is about implementing something Google proposed.

If the mechanism works, fine, but why should I use Firefox over Chromium then? Opt-out data collection is in violation of my core beliefs and of what I believed to be Mozilla's principles.

Collecting data without asking the user about it is - to me - in violation of the very definition of privacy, and calling some way to anonymise data (who guarantees that the cryptographic approach to this is not obsolete in a few years?) "differential privacy" is at the very least dishonest.


Existing telemetry in Firefox already works on an opt-out basis. This changes nothing.


Existing telemetry doesn't collect browsing data.


So, I read that, and already see two problems. One - DP provides privacy by deniability. How does that apply to URLs (or even just domains)? For a domain to show up, I have to have visited it (unless Firefox will report back random domains).

Two - DP is only really private over a small data set per individual. If DP were enabled for even two days, you could get a very accurate picture of the sites I visit, since a majority of the domains reported would necessarily be accurate values.
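To make the second concern concrete, here is a minimal sketch of plain randomized response repeated over many independent reports of the same fact ("did this user visit domain X?"). The function names, the 0.5 truth probability, and the majority-vote attack are assumptions chosen for illustration; nothing here is taken from Mozilla's proposal.

```python
import random
from typing import List


def randomized_response(truth: bool, p_truth: float, rng: random.Random) -> bool:
    """Report the true bit with probability p_truth, otherwise a fair coin flip."""
    if rng.random() < p_truth:
        return truth
    return rng.random() < 0.5


def majority_guess(reports: List[bool]) -> bool:
    """Attacker's guess: whichever answer appears in more than half the reports."""
    return sum(reports) * 2 > len(reports)


def attack_accuracy(n_reports: int, p_truth: float = 0.5,
                    trials: int = 2000, seed: int = 0) -> float:
    """Fraction of simulated users whose true bit the attacker recovers."""
    rng = random.Random(seed)
    correct = 0
    for _ in range(trials):
        truth = rng.random() < 0.5
        # Fresh, independent noise on every report (the worrying case).
        reports = [randomized_response(truth, p_truth, rng)
                   for _ in range(n_reports)]
        if majority_guess(reports) == truth:
            correct += 1
    return correct / trials


if __name__ == "__main__":
    for n in (1, 5, 51):
        print(n, attack_accuracy(n))
```

With a single report the attacker is right only about three times out of four, which is the deniability. With dozens of independently randomized reports of the same fact, the noise averages out and the guess becomes close to certain. Whether this applies to the actual proposal depends on whether the noise is redrawn for each report or fixed once per value.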


One: I'm pretty sure that the idea is to report back random (existing) domains, yes.

Two: That's an interesting question. You'd need to ask it to someone with more domain knowledge than me.


> I'm pretty sure that the idea is to report back random (existing) domains, yes.

Here's a concern that comes up from that implementation option: any outliers from the set of existing domains (which would likely simply be implemented as a list of strings) would immediately be able to be called out as a "True" value, while a single reporting of a domain could reliably be called out as a "False" value. Unless, of course, you choose a randomization algorithm which exhibits a very strong clustering trait.

You could also limit reports to those domains which are in the whitelist, but that would voluntarily neuter the reporting; something they seem less-than-eager to do.

Ultimately, it will all come down to the implementation details, which are unlikely to be available until after the opt-in release, and auditable by a remarkably small number of people in the open source community.


RAPPOR uses a Bloom filter. It doesn't report the domain itself; it reports (a corrupted version of) a handful of bits of a hash of the domain.
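A rough sketch of that idea, to show what "a corrupted version of a handful of bits of a hash" could look like. The array size, the number of hash functions, the use of SHA-256, and the single flip probability are all assumptions for illustration, not RAPPOR's actual parameters; the real scheme randomizes in two stages rather than one.

```python
import hashlib
import random
from typing import List

NUM_BITS = 32    # assumed size of the Bloom filter
NUM_HASHES = 2   # assumed number of hash functions


def bloom_bits(domain: str) -> List[int]:
    """Set a handful of positions in a small bit array based on hashes of the domain."""
    bits = [0] * NUM_BITS
    for i in range(NUM_HASHES):
        digest = hashlib.sha256(f"{i}:{domain}".encode("utf-8")).digest()
        index = int.from_bytes(digest[:4], "big") % NUM_BITS
        bits[index] = 1
    return bits


def corrupt(bits: List[int], flip_prob: float, rng: random.Random) -> List[int]:
    """Flip each bit independently with probability flip_prob before reporting."""
    return [b ^ 1 if rng.random() < flip_prob else b for b in bits]


if __name__ == "__main__":
    rng = random.Random()
    true_bits = bloom_bits("example.com")
    report = corrupt(true_bits, flip_prob=0.25, rng=rng)
    print("true  :", "".join(map(str, true_bits)))
    print("report:", "".join(map(str, report)))
```

The collector only ever sees the noisy bit array, never the domain string. Any single report is consistent with many different domains, and the hope is that only aggregates over many users are informative.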


Good info, thanks!




