So, from what I gathered it's just the favicons of the search engines, some mozilla country stuff (not sure why) and google safe browsing (which you can turn off and is a good feature for casual users)
So what's the issue here? All this isn't really a problem, and Safe Browsing is, in my opinion, even good for normal users.
I'm pretty sure that these get requests don't even fetch with cookies, so fingerprinting is probably impossible here. The most info they can get is "hey, this ip uses Firefox". Harmless in itself. It can be compounded with more info to track someone, but all of this info already contains IP/browser info so this doesn't help at all.
If I had to summarize the bad security design that is epidemic in the software industry, this would be a good candidate. Security is about always being vigilant and minimizing potential risks, not the roughly boolean categorization that most people use, where everything is judged "safe" or "unsafe".
TL;DR - Stop using "default allow"!
You would think that programmers (and engineers in general) would understand how small, simple pieces can become incredibly useful when you find a clever way to combine them. Unfortunately, the lesson of "information hiding" - and all of the stuff we derive from it, like {object,aspect}-oriented programming and other encapsulation techniques - ends up being ignored when discussing security.
Applied to web browsers, the burden of proof should be to justify why a request is both necessary and safe. The "necessary" criterion is important, because we do not know how this data could be combined in the future, possibly in damaging ways. A good example of this is how the NSA uses Google's PREF cookies[1] to track sessions. Another example is panopticlick-style browser fingerprinting, where every bit leaked can make the fingerprint more accurate.
The fact that this is about favicons is a perfect demonstration of this "default allow" attitude. Not only is the request unnecessary, the icon can be supplied with the browser[2], completely removing the need for any GET request. Minimizing external requests isn't even considered.
[2] copyright and trademark are poor excuses - if the search engine wants to be in the default set of search tools provided by the browser, they can trivially authorize the redistribution of a simple icon.
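Bundling the icon is straightforward in practice: an icon can be embedded directly in a search-engine description as a data: URI, so the browser never needs to fetch it. A minimal sketch of producing such a URI (the placeholder bytes stand in for a real favicon file):

```python
import base64

def icon_to_data_uri(ico_bytes: bytes) -> str:
    """Encode raw .ico bytes as a data: URI suitable for embedding
    in a bundled search-engine description, so no GET request is needed."""
    b64 = base64.b64encode(ico_bytes).decode("ascii")
    return "data:image/x-icon;base64," + b64

# Illustrative: a 4-byte placeholder payload stands in for a real favicon.
uri = icon_to_data_uri(b"\x00\x00\x01\x00")
print(uri)  # data:image/x-icon;base64,AAABAA==
```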
>copyright and trademark are poor excuses - if the search engine wants to be in the default set of search tools provided by the browser, they can trivially authorize the redistribution of a simple icon
Debian's policy does not allow them to accept "special dispensation" for copyright and trademark permission - either everyone gets it or Debian can't accept it. This is part of why Iceweasel was split from Firefox in the first place:
Per the Debian social contract:
>License Must Not Be Specific to Debian
>_The rights attached to the program must not depend on the program's being part of a Debian system._ If the program is extracted from Debian and used or distributed without Debian but otherwise within the terms of the program's license, all parties to whom the program is redistributed should have the same rights as those that are granted in conjunction with the Debian system.
Why would you assume that such permission would be "special" to Debian? It's an icon. There is zero reason for a search engine to restrict such use (and distribution).
>There is zero reason for a search engine to restrict such use (and distribution).
Because these logos depict their trademarked images, and the licensing that they'd offer would likely not allow use for any purpose. Debian's social policy requires that everything included in their distribution be freely licensed for any use.
> and the licensing that they'd offer would likely not allow use for any purpose
Then they don't want to be in the default set of search providers. Seriously, has the art of negotiating been lost completely? Google/Yahoo/etc need that strategic placement far more than they need to enforce some minor point about their logo's trademark. Debian's social policy doesn't mean they have to let themselves be browbeat by businesses without even an attempt at negotiating.
Do you really think Google or Yahoo would just say, "No, we don't care about being in Firefox/Iceweasel's default search list."? Would their shareholders be happy about losing market share over want of a trivial licence?
>Do you really think Google or Yahoo would just say, "No, we don't care about being in Firefox/Iceweasel's default search list"?
Given that they haven't freely licensed their images and this has been an issue for some time I'm going to have to say the answer is "yes" because Firefox/Iceweasel download the missing images on startup.
> simple pieces can become incredibly useful when you find a clever way to combine them
Here's the thing. There's nothing to combine this info with that doesn't already have this info. IP+User agent. This info is sent with every request and any other gathering of data will contain these anyway.
So it's not just harmless in itself, it's harmless, period.
How is this different from OCSP pings and stuff like that?
I find it difficult to believe you cannot answer that question on your own.
The difference is that OCSP provides a needed service, the very purpose being related to a TLS connection that the user requested. Checking the CRL is an important part of the TLS security process, and it would be stupid to ignore it.
A favicon provides no security benefit, and is entirely optional.
Yes, it may be true that making a GET request for both a CRL and a favicon leaks more or less the same information. That isn't relevant, and misses the point entirely about minimizing network use. We don't know what data is useful. It is entirely possible that the situation can change in the future and previously harmless data can become a part of something greater.
Minimizing network use to what is necessary is an implementation of the "default deny" policy, and it is the only sane security policy because we cannot predict the future. Enumerating badness[1] is always going to end up playing catch-up.
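The "default deny" stance can be stated as a tiny allowlist check: a request goes out only if it has been explicitly justified, rather than blocking an enumerated list of bad destinations. A minimal sketch, with hypothetical host names:

```python
# "Default deny": only hosts with a justified purpose may be contacted.
# The host names here are illustrative, not from any real browser config.
ALLOWED_HOSTS = {"updates.example.org", "crl.example-ca.org"}

def may_request(host: str) -> bool:
    # Unknown hosts are rejected by default; no enumerating badness.
    return host in ALLOWED_HOSTS

print(may_request("crl.example-ca.org"))       # True: explicitly justified
print(may_request("favicon-cdn.example.com"))  # False: never proven necessary
```

The inverse policy ("default allow" with a blocklist) is exactly the catch-up game the comment warns about: every destination not yet known to be bad is permitted.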
I was talking about the safe browsing request. Arguably that's even more important to the average user than TLS safety. MITMs are rare, malware sites are plentiful.
I do think that the favicons could just be integrated.
I don't think there's any point minimizing the network in this specific case. "We don't know what data is useful.", sure, but any other data this can be compounded with _already contains the same information_.
Though yes, stuff could change in the future. I can't see how it could change in any way to make this useful, but you're right, this isn't something we can predict. That point is valid.
Since we know the NSA is reading this, it's telling them that you (and they have a good idea who you are if you are connecting from a residential line) just launched your browser. I'd say that's pretty intrusive.
They don't care that you checked your mailbox.
The problem is that they save your actions for the future. Everything you do is logged. That does not mean that they care now, but in case they need to care later, everything you have done is on record.
In the case of Debian, it tells them you are using a specific version of iceweasel, which probably is quite deterministic in combination with the rest of the dragnet.
Whenever a new version of Iceweasel hits the repos, a bunch of people upgrade, and perform identical requests. You can figure out how frequently someone checks for updates I guess, but other than that what would you learn? The browser hasn't had the chance to acquire any information that would distinguish you from any of the other Debian users yet.
I recall reading something on the internet that IP + browser fingerprint is good enough to uniquely identify a large number of people. Has this changed, or is it otherwise untrue?
Getting the type of "browser fingerprint" they're depending on here requires a bunch of Javascript. You can't get that data from just looking at a single request, which is all that they're getting here.
This is a handful of GET requests for images. You would need a page to fingerprint. The site could get an IP, likely the user agent. Safe browsing requests are sandboxed from all the other Google cookies.
Isn't the API key your fingerprint? The key is not shown here, because the author was showing what the requests look like on first run, subsequent requests would contain your unique ID in them.
Not picking on you specifically, I saw several responses saying the same thing.
I'd be willing to bet that Firefox would treat them as such, even if they returned something different. Still, those favicons should be bundled instead of making requests, IMO.
Because we can see the sum of the requests, I think it's safe to say that they are probably just images.
There's nothing to stop, say, ebay from serving PHP scripts under an ICO extension, processing the request as a pageload and then ultimately returning the icon file, but for anything useful to have been gleaned, it would have generated more requests.
Either it did generate more requests and Mozilla didn't honor them (in which case, yay), or it didn't generate more requests (in which case, yay). The former would be slightly preferred as the latter doesn't prevent ebay from later changing their strategy.
Browser fingerprinting requires at least Javascript, and to do it properly you need the ability to run Java/Flash too. These are HTTP requests, not HTML pages which are opened in the browser. Since it's not using cookies, these requests only contain the user agent string, which is very far off from any useful kind of tracking-fingerprint.
Number one is that it's happening when users don't expect it. As an end user, I don't expect my browser to start making requests until I tell it to.
And it's silly to say the information is "harmless in itself," because most of the information Google, Facebook, Yahoo, etc. collect is harmless in itself. The whole point of those companies is to collect as many tiny pieces of "harmless" information as possible to build up profiles about people.
TBH, I don't trust Google at all on privacy matters any more. The way they try to tie my work Gmail to my personal Gmail, to my youtube viewing and cram everything into Google+ has really rubbed me the wrong way, and I've been moving away from their services as much as possible. The last thing I want is my browser contacting them without my knowledge.
The information "hey I have an IP and I'm using this browser, and I have a browser at this time" is going to be send A LOT when using the browser for what it's made for. The problems come later when sending every URL to another party (safe browsing). Also from google "Privacy: API users exchange data with the server using hashed URLs, so the server never knows the actual URLs queried by the clients." So it's possibly safe ?
I'm totally Google free save whatever traces of Google tech are in CyanogenMod on my phone and I can't say my life has been negatively affected by going this route.
This is what happens when folks with a radically-non-mainstream view of privacy try to use an app built for mainstream folks by folks with slightly more mainstream opinions about privacy.
The sole purpose of Iceweasel is to not be subject to any restrictions (mozilla approval of changes) that may come with distributing trademarked "Firefox" software.
I looked into it further, and now I see where my confusion came from.
[Debian] Iceweasel is a fork [from Firefox] with the following purpose:
1. backporting of security fixes to the declared Debian stable version.
2. no inclusion of trademarked Mozilla artwork (because of #1 above)
Beyond that, they will be basically identical. (quoting Roberto C. Sanchez's post on the debian-devel mailing list)
But there was another Iceweasel, GNU Iceweasel. To avoid confusion with Debian Iceweasel, it has been renamed to GNU Icecat.
GNU IceCat, formerly known as GNU IceWeasel,[3] is a free software rebranding of the Mozilla Firefox web browser distributed by the GNU Project. It is compatible with Linux, Windows, Android and OS X.[4] The GNU Project keeps IceCat in synchronization with upstream development of Firefox while removing all trademarked artwork. It also maintains a large list of free software plugins. In addition, it features a few security features not found in the mainline Firefox browser.
This article is about Debian's Iceweasel, which is why my comment was wrong.
Google maintains the Safe Browsing Lookup API, which has a privacy drawback: "The URLs to be looked up are not hashed so the server knows which URLs the API users have looked up". The Safe Browsing API v2, on the other hand, has the following privacy advantage: "API users exchange data with the server using hashed URLs so the server never knows the actual URLs queried by the clients". The Firefox and Safari browsers use the latter.
https://en.wikipedia.org/wiki/Google_Safe_Browsing#Privacy
> Safe Browsing also stores a mandatory preferences cookie on the computer[9] which the US National Security Agency allegedly uses to identify individual computers for purposes of exploitation.[10]
That may or may not be true, but must one be a radical to be concerned?
If you open Firefox and browse to a few sites, it will send that cookie. If you then take your computer down to the coffee shop and keep browsing, even if you don't log into anything, it will still send that cookie in the clear.
There are other ways that the NSA can figure out a list of IP addresses you've been using, but this is 1) totally silent, and 2) is common to a lot of systems.
Firefox's safebrowsing feature uses a separate cookie jar, so if you are logged into Google those cookies will never be sent via the safebrowsing API.
Also, Firefox hashes the URL and compares the prefix of that hash to a master table downloaded from Google. If the URL matches a prefix in the table, Firefox requests all URLs that begin with that prefix hash. Google is never sent the full hashed URL.
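That flow - hash the URL locally, compare only a short prefix, and ask the server for full hashes matching that prefix - can be sketched as follows. SHA-256 and a 4-byte prefix length are assumptions here, and `bad_prefixes` stands in for the table the browser downloads ahead of time:

```python
import hashlib

PREFIX_LEN = 4  # bytes; a short prefix length is assumed for illustration

def url_prefix(url: str) -> bytes:
    """Hash the (already canonicalized) URL and keep only a short prefix.
    Only this prefix would ever leave the browser."""
    return hashlib.sha256(url.encode("utf-8")).digest()[:PREFIX_LEN]

# Local table of known-bad prefixes, downloaded ahead of time (illustrative).
bad_prefixes = {url_prefix("http://malware.example.com/")}

def needs_followup(url: str) -> bool:
    """True if the local prefix table hits, meaning a full-hash fetch is
    needed. The server learns the prefix, never the URL or its full hash."""
    return url_prefix(url) in bad_prefixes

print(needs_followup("http://malware.example.com/"))  # True
print(needs_followup("http://news.ycombinator.com/"))  # a miss, barring a prefix collision
```

Note the false-positive angle: many URLs share any given 4-byte prefix, which is exactly what keeps the followup request from revealing which URL triggered it.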
Please tell me you didn't just edit that wikipedia article and cite your own comment.
Not only is that not a valid wikipedia cite, it's not even right. Only hash prefixes are ever sent to Google, and only if it's already tested locally that the hash prefix includes malicious sites.
- I couldn't find a better link for the citation and it seemed like rather important info that should be in the wiki. Maybe someone else would find a better one to replace it with.
- I figured linking to an HN discussion would serve as a great citation, even if I was mistaken about something. Looks like HN didn't disappoint. :)
EDIT: I've updated the wiki text to remove my previous edits and added a mention about the use of hash prefixes.
> Not only is that not a valid wikipedia cite, it's not even right. Only hash prefixes are ever sent to Google, and only if it's already tested locally that the hash prefix includes malicious sites.

Humorously, this exact same exchange took place in the linked conversation: https://lists.debian.org/debian-devel/2015/07/msg00232.html
Thanks for the link and pointing that out, I stand corrected. I'm curious to know how big the prefix is. Depending on its size this either remains privacy theater or not.
That claim is false. Google is big, but even they couldn't handle the load of a large fraction of the world's web browsers sending them a request for every page loaded. That'd be insane.
Google Safe Browsing is based on a Bloom filter. The browser downloads the filter in a series of requests when it first starts up, or when the filter is out of date. It also sends a followup request if it finds a hit, but this is rare unless you're actually about to visit a site that's been flagged as unsafe.
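The download-once, check-locally model described here can be illustrated with a toy Bloom filter (other comments in this thread note that current implementations use hash-prefix lists rather than a Bloom filter; this sketch just shows the mechanism the comment names):

```python
import hashlib

class BloomFilter:
    """Toy Bloom filter: k hash positions over an m-bit array.
    Lookups can false-positive (triggering a rare followup request)
    but never false-negative."""

    def __init__(self, m_bits: int = 1024, k: int = 3):
        self.m = m_bits
        self.k = k
        self.bits = bytearray(m_bits // 8)

    def _positions(self, item: str):
        # Derive k positions by salting the item with the hash index.
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(h[:4], "big") % self.m

    def add(self, item: str):
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def might_contain(self, item: str) -> bool:
        return all(self.bits[p // 8] & (1 << (p % 8))
                   for p in self._positions(item))

# The browser downloads the filter, then checks URLs locally.
f = BloomFilter()
f.add("http://bad.example.com/")
print(f.might_contain("http://bad.example.com/"))  # True: triggers the followup
```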
> It also sends a followup request if it finds a hit, but this is rare unless you're actually about to visit a site that's been flagged as unsafe
And the followup still doesn't send the URL or even a hash of the URL. It sends just a prefix of the hash to download all URL hashes matching that prefix to do the comparison to the actual current URL locally.
The request body is used to specify what the client has and wants:
* The client optionally specifies the maximum size of the download it wants to retrieve.
* The client specifies which lists it wants to retrieve.
* For each list, the client specifies the chunk numbers it already has.
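Assembling those three pieces into a request body might look like the sketch below. The line format and list name are illustrative only, not the actual wire format of the Safe Browsing protocol:

```python
def build_update_request(max_size_kb=None, lists=None):
    """Assemble an update request from the pieces the protocol description
    names: an optional size cap, the wanted lists, and the chunks already
    held for each list. Format here is illustrative, not the real one."""
    lines = []
    if max_size_kb is not None:
        lines.append(f"s;{max_size_kb}")           # optional download size cap
    for name, chunks in (lists or {}).items():
        lines.append(f"{name};a:{chunks}")         # list name + chunks held
    return "\n".join(lines)

body = build_update_request(
    max_size_kb=200,
    lists={"goog-malware-example": "1-3,5"},       # hypothetical list name
)
print(body)
```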
It's interesting, especially since Firefox currently heavily uses "privacy" as their selling point:
>Committed to you, your privacy and an open Web [1]
>We’ve always designed Firefox to protect and respect your private information. That’s why we’re proud to be voted the Most Trusted Internet Company for Privacy. [1]
These browsers are all constantly accessing many sites. Here's[1] a comment I posted a month ago about Firefox. The summary: here's (at minimum) what Firefox accesses when it starts up as a Guest in OS X, and this is after I unchecked a bunch of boxes:
Try it yourself as Guest. But make sure that Parental Controls are on. That way OS X will popup these sites and ask permission. Firefox is unusable w/o opting in all of these.
The whole shrugging off and downplaying of issues like this is exactly the reason the internet is complete shit when it comes to security.
Why, after all the exploits, insecure software and bad decisions, can people still not see that they can't anticipate everything?
For instance, here's a scenario I can easily envision: The NSA strongarms Ebay into letting them sniff TCP connections to their favicon, combines the TCP fingerprint with the browser user agent to uniquely identify you from perhaps millions of other users. Geolocate your IP to determine where you are and bam, they know everything about you they need to know. Tin-foil hat? Of course. Plausible? Totally. Doable? Absolutely. They don't need to be perfect, just good enough.
The problem is that it's a big responsibility. You can't just do it once and dump it on the internet. It has to be kept up to date with the latest Firefox versions, and it has to have prompt releases (within a day). You also need to be absolutely sure that your changes aren't creating more problems than they're fixing.
127.0.0.1 www.google-analytics.com
127.0.0.1 ssl.google-analytics.com
127.0.0.1 www.hosted-pixel.com # I Swear I'm Not Making This Up
On some but not all operating systems it's better to use 0.0.0.0.
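The effect of entries like the ones above is easy to check by parsing the file: any name mapped to a sinkhole address never leaves the machine. A small sketch (a real hosts file also maps `localhost` to 127.0.0.1, which this simplified parser would flag too):

```python
def blocked_hosts(hosts_text: str) -> set:
    """Parse hosts-file text and return the names mapped to a sinkhole
    address. Both the 127.0.0.1 and 0.0.0.0 styles are recognized."""
    sinkholes = {"127.0.0.1", "0.0.0.0"}
    blocked = set()
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments
        if not line:
            continue
        addr, *names = line.split()
        if addr in sinkholes:
            blocked.update(names)
    return blocked

sample = """127.0.0.1 www.google-analytics.com
0.0.0.0 ssl.google-analytics.com # same effect, different sinkhole
"""
print(sorted(blocked_hosts(sample)))
# ['ssl.google-analytics.com', 'www.google-analytics.com']
```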
It's better to block it with a firewall but your aged grandmother doesn't know how to configure them.
iOS and I expect Android have hosts files but you must jailbreak to edit them. On iOS you can do that with iFile from the Cydia store. iFile once cost money but it's free now.