EFF Browser Tracker Simulator (firstpartysimulator.org)
169 points by anigbrowl on June 16, 2021 | hide | past | favorite | 80 comments



Fingerprinting seems to be a losing battle given the ever expanding browser API coupled with sites outright breaking if you disable Javascript, or blocking you as an attacker.

The best strategy I can think of is compartmentalisation. Essentially, use several completely separate browsers. One with Javascript, tracking and ads allowed, but used strictly for logging into GMail or banking or stuff like that.

Another browser with reasonable safeguards (like say ad blocking) but nothing excessive like disabling Javascript, for authenticated sites.

And finally a highly locked down browser or the Tor Browser for general "browsing" of news sites and such.

This can reduce the ability for the tech giants and ad companies to track your activity outside each browser, interfering with their profile building.


I practically do this, with Firefox and the Temporary Containers addon. I have other stuff installed too so there are sites that don't work - these I open in a separate Chromium, in an Incognito window.

OP, the problem with your general browser is that most sites will be broken. You'll need to do a captcha before each shitty modern news piece, and that's even after you've told your Tor browser to enable JS and use less strict security. But still, I think with compartmentalization one can make tracking reasonably hard. For now.


Will Firefox containers do anything against fingerprinting, however? My impression was that fingerprinting is all about the specifics of the environment, without using the browser session at all, so different sessions in the same browser will have the same fingerprint. Happy to hear an explanation of why I am wrong, however.


> Will Firefox containers do anything against fingerprinting however?

My understanding is that they do nothing. You are on the same FF version with the same preferences and extensions and all. AFAIK, the containers will only keep cookies separate.


>Tor Browser

I can think of no better way to guarantee you'll be put on a surveillance list of state level actors than to use Tor.


Depends on where you reside. In some countries (Turkey/China), Tor can get you jailed/disappeared. In pretend-democracies like here in France, that is not the case.

So sure, you'll be watched over to some extent. But the fact is, if you don't use Tor you're EVEN MORE watched over, because the intelligence services have "black boxes" (that's what the law calls them) placed on the network, and anyway your ISP is required by law to keep track of each and every move of yours for 2 years.

So they know I'm using Tor - so what? There are millions of us using Tor, and every day there are more. At least they can't say I'm reading mediaslibres.org, one of the only free press outlets online that's not censored by the French state.

(well now they can because i just confessed, but i didn't have to)


> Tor can get you jailed/disappeared

Citation needed.


I can think of no better reason to normalize the use of Tor than to fight the chilling effects of mass surveillance.


That's a huge oversimplification.


> The best strategy I can think of is compartmentalisation. Essentially, use several completely separate browsers.

This is exactly what Qubes OS provides out of the box, including Tor Browser. Has been working for me very well.


3 may be too much. I understand the need for a dynamic application platform with a lot of tracking avenues, and the need for a static marked-up documents platform. But why something in between?

Why is it acceptable for a website to require JS just to read some content? We should put more pressure on website owners to stop this nonsense.


another option could be to create a random fingerprint for every site


Note that the EFF's data set is heavily biased towards users who care about privacy (i.e., the people who visit the website and test with it). My results on this site indicate that nearly 1 in 11 users disables Javascript, which is obviously not true of the web at large.

Useful tool, and I'm really glad the EFF provides it, but take it with a grain of salt. There's nothing magic going on here, and there's no guarantee that the dataset of people visiting other more popular websites will be the same.


Most traffic on the web is from "bots" not browsers.

These "fingerprint" analyses all seem to fail to account for that fact.

"Bots" generally do not run Javascript.

The more information the client sends, the easier it is to create a more complex "fingerprint" and the more difficult it is for users to "copy" this fingerprint.

    GET / HTTP/1.1
    Host: firstpartysimulator.org
    Connection: close
No Javascript.

Of course, anyone can make a fingerprint from those three headers. However it is quite generic and very easy for any user to "copy".

Many website operators incorrectly assume this must be a "bot".

I can use a "modern" web browser that tries to send all sorts of information, but because I use a localhost-bound forward proxy that deletes, adds and/or modifies headers, these three headers are all the remote website will ever see.

The question is, what advertiser is seriously going to spend time and effort trying to show ads to a person who only sends three generic headers and does not run Javascript? There is little value in tracking such users if they do not convert.
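Out of curiosity, here is what producing that bare request looks like programmatically; a minimal Python sketch (the hostname is only illustrative, and the TLS transport is left out):

```python
# Build the same bare three-header GET request shown above, byte for byte.
# No User-Agent, no Accept headers: nothing for a fingerprinter to chew on.

def build_minimal_request(host: str) -> bytes:
    lines = [
        "GET / HTTP/1.1",
        f"Host: {host}",
        "Connection: close",
        "",  # blank line terminates the header block
        "",
    ]
    return "\r\n".join(lines).encode("ascii")

raw = build_minimal_request("firstpartysimulator.org")
print(raw.decode("ascii"))
```

Any raw TCP/TLS client (netcat, tcpclient, `openssl s_client`) can then carry those bytes to the server.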


I find the more you try to protect your identity, preserve your privacy, protect your security the less sites want you to visit. You get stuck behind CloudFlare interactive prompts (as half the planet seems to be using CloudFlare in some way) whilst it tries to work out if you’re a bot. The other sites either assume you’re an attacker and say bye, or try and stick you in some kind of CAPTCHA tar-pit. In short the web becomes unusable the better you get at this.


Indeed. This is one of the greatest (and most under-appreciated) irritations of the whole CDN revolution: if you're a site serving 1e6 (or 1e9!) people, and trying to tell humans/computers apart via statistical means, the tail of the "human/computer" distribution really matters. Most site operators also don't seem to realise that the same human appears at the outskirts of the human/computer classification each time, and therefore comes to mightily hate Cloudflare (etc).

My partner uses -- for work -- vanilla Google Chrome on Windows for her browsing life. I use, for work, FF on Linux, usually behind a VPN or two in a different country from where we are.

I see maybe 100, maybe 1,000 times more captchas than she does. Embedded reCaptcha JS code on a page inevitably shows me fifteen pages of traffic lights and lets her sail through. Sites often geo-IP me (incorrectly) to the wrong location. It's a very far cry from the HTML 1.0 days. And nobody outside of HN even understands why this is a thing, and an annoying thing at that!


Well, I do not send the minimum as a privacy measure. I do it for speed and efficiency. Because I am so often using netcat, tcpclient or the like to make requests, I have learnt that most headers browsers send are totally unnecessary. I have had the opposite experience to what you describe. The web is really fast. Cloudflare especially. They offer ESNI, which I do use for privacy. No point in encrypting DNS if we ignore SNI. I make all requests to Cloudflare using an ESNI-enabled openssl as a backend for the forward proxy. Very rare to encounter the CAPTCHA. I am more annoyed by AWS who require SNI but offer no ESNI service.

Honestly, to me the web feels unusable if I try to use a "modern" graphical browser with Javascript and a DNS resolver enabled. The way most people use the web, I guess. To me it feels like a tar-pit. I do some basic stuff like banking, etc. that way, but for recreational use, I do not use the major browsers.


The problem I describe is that barriers get in the way the more you try to protect your privacy or security. It's not a speed issue - the web works very fast for me with NoScript and uBlock Origin. It's just that lots of sites no longer like you visiting, or treat you as an attacker - you're soft-blocked. If you ramp it up, even more of the web becomes a pain to use: e.g. using privoxy/polipo/squid, Pi-holes, VPNs etc., removing or changing headers, screen sizes, geolocation, timezone and so on to make yourself less fingerprintable. Of course, all you're really doing now is making yourself stand out like a sore thumb.

The web is even faster when I'm using telnet or openssl s_client -connect, but that's not the point, is it - nor even a fair comparison. I just want to browse the web without being tracked and categorised/labelled. It seems that's too much to ask.

Regarding CDN/Caching:

Google made a change[1] to how caching works which also has some CDN impact.

Instead of saving a resource with its full URL only, they have added two more bits of data to the saved information. Chrome saves the top-level site and the current-frame site next to the full URL of the cached resource. The browser uses the information to determine whether it should serve resources from the cache or not.

[1] https://developers.google.com/web/updates/2020/10/http-cache...
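The widened cache key described in the linked article can be modeled as a simple tuple-keyed dictionary; a toy Python sketch of the idea, not Chrome's actual implementation:

```python
# Toy model of HTTP cache partitioning: the key is (top-level site,
# current-frame site, resource URL), so a resource cached while browsing
# site A is a cache miss when the same URL is embedded under site B,
# defeating cross-site "is this already cached?" timing probes.

cache = {}

def cache_put(top_site, frame_site, url, body):
    cache[(top_site, frame_site, url)] = body

def cache_get(top_site, frame_site, url):
    return cache.get((top_site, frame_site, url))

font = "https://cdn.example/font.woff2"  # hypothetical shared resource
cache_put("a.example", "a.example", font, b"<font bytes>")
print(cache_get("a.example", "a.example", font))  # hit
print(cache_get("b.example", "b.example", font))  # miss: None
```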


You are likely getting stuck behind Cloudflare prompts because you are attempting to 'hide your identity' via tor or VPN service.

You're likely attempting to use the same channels as attackers, but want to be treated differently 'just because'

Kind of hard to write rules for criteria like that.


I do not use tor or a VPN service.


"Most traffic on the web is from bots"

That's true; but it misses the point. If some bot visits some website, that doesn't expose me to being tracked. It might expose the bot to being tracked. It's when I visit a site, using my real browser, that tracking becomes an issue.


"Bots", i.e., clients that are not graphical browsers running Javascript, never have to see ads. "Bots" do not get tracked; they generally do not run Javascript or store cookies. Users do, generally. This is because users generally agree to run company-sponsored browsers that, whether intentional or not, use default features and settings that cater to web advertising and tracking. Thus, if the objective is to avoid ads and tracking, so-called "bots" must be doing something right. That is the point being made here.


Firefox has a "privacy.resistFingerprinting" setting which eliminates many fingerprinting mechanisms (at least time zone, resolution, canvas, WebGL, maybe others). Does anyone know what major sites it breaks, or what other drawbacks there are to setting it to True? It defaults to False.
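For anyone wanting to try it, the pref can be toggled in about:config, or pinned across restarts with a `user.js` file in the profile directory (the standard Firefox mechanism for persisting prefs):

```js
// user.js in the Firefox profile directory; applied at every startup
user_pref("privacy.resistFingerprinting", true);
```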


I just enabled it to try how it changes the results. Many data points vanished, but suddenly the screen size doesn't show my actual screen size (1920x1080, not very unique) but the size of my viewport (very unique, considering I have a sidebar active). Can somebody confirm this? Why would this setting affect this?


I made the same observation. I also have a sidebar, and after enabling resistFingerprinting, the screen size is a unique identifier for me.

Edit: Here [1] is a relevant link from the Mozilla support forum. It links to an old bug report in Tor, where they claim that this actually reveals less information about you. IMO it still makes tracking easier, but I didn't really dig into it.

[1] https://support.mozilla.org/en-US/questions/1200851#answer-1...


Doesn't it mean that each time you resize your browser you get a different fingerprint? Sounds good to me. You just shouldn't make it full screen.


As a sibling comment stated, the immediately noticeable effect is most, if not all, websites showing timestamps in UTC.

I don't want to make such a strong statement as to wholly blame it on the fingerprint resistance setting, but within a week of turning it on, my Google account that I use for my Android devices had all payments suspended.

As in, I could not buy anything from Google nor use their Pay service (not that I otherwise use it for everyday purchases).

Until I gave them a photo of my ID.

I finally got my account reinstated last week, over half a year after this happened.

I originally saw it as a fun experiment to see how long I could go without needing to buy anything in the spirit of deGoogling, but I decided at this point, just in case, that I'd rather not find myself in some sort of preventable, inconvenient tough-spot somewhere down the line (largely out of an uncertain sense of physical safety).


> Until I gave them a photo of my ID.

That's fucking nuts. Why did you even accept such a thing? Why is it ok for a private company to ask for your ID and keep a copy indefinitely?


When you want them to provide financial services


Google does financial services???? Why would you accept that? Don't you have a bank already? Sorry, I'm trying not to judge, but I seriously don't understand why we would give Google even MORE power over our lives than they already have (too much).


For most people, the convenience of Google Pay outweighs the risk that they won't be able to buy coffee with it one morning if their account gets suspended, or that they'll be targeted with coffee-bean ads instead of other ads.


or if you want to look at some pornography - at least that seems to be the way the UK is heading...


Unfortunately, privacy.resistFingerprinting intentionally breaks the prefers-color-scheme CSS media query, telling all websites that you want a light color scheme. It would be nice if they could at least have made it so that it doesn't indicate any preference.


Maybe try the DarkReader plugin (for FF - not sure if there is an equivalent for other browsers)? I use it since hardly any sites that I visit have a dark theme available.


I had to set it back to False due to one annoying consequence: since the time zone is reported as UTC, all my workmates on Slack assumed I was working from another country, and I had to mentally calculate time differences. It was the only major issue, but it was too much of a hassle.


it breaks pasting on facebook


Pasting what? I have this set and tried pasting various things on FB and everything worked as expected. The only downside I've noticed after roughly 12 hours is that zoom (ctrl+ in FF) is reset which makes text barely readable on high-dpi screens.

Maybe you have some other add-ons interfering?


on another try, you might be right.


> Your browser fingerprint appears to be unique among the 225,692 tested in the past 45 days.

Never was the confirmation that I'm a unique snowflake less welcome.


This only counts if they recognize you when you return. If they think you're unique again, then the tracking didn't work at all. That was the case last time I used their tool, Panopticlick.

Edit: I tested this current tool and it recognized me after I returned. Opened it first normally, I'm unique. Opened it in a private window, unique. Closed that and opened a new private window, and bam, one other previous browser had my fingerprint already.


Same here, maybe it's my portrait-orientation screen.


It tells you how much each component contributes lower in the results page. Screen info should be under "Screen Size and Color Depth". For my 2560x1600x24 display 1 in ~1582 have it. The canvas fingerprint was the greatest source for me with 1 in ~2870.

That being said, the goal of fingerprinting is that the vast majority of users come out unique, even if each component is mostly normal individually.


For me it was Hash of canvas fingerprint

Bits of identifying information: 17.84

One in x browsers have this value: 233841.0


Tried it with Tor to see if Tor provided much better privacy. Relatively it is better than another browser, but still 12.n bits of uniquely identifying information (1:4096). From a k-anonymity perspective, I might as well transmit a full zip code to all the parties at each site the browser interacts with.

I don't have the data handy, but the idea of expressing this in terms of conceptual zip codes could be useful. E.g. a 5-digit US zip code has a wide range, but handwavy, let's say the average population covered by all 5 digits of a given zip code is maybe 5k people. 4 digits could be 50k people, 3 digits 500k, etc.

12 bits of information is one in a field of 4096, which implies to me that the Tor Browser fingerprint yields the unique-information equivalent of sending a consistent zip code, just not your physical one. It means that your net level of anonymity requires other Tor users to also be in that "zip code" of fingerprints, which seems to have a very low probability.

I am surprised Tor doesn't appear to tumble these fingerprint features.
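The arithmetic behind the zip-code framing, with the Tor user count being a rough assumption on my part:

```python
# 12 identifying bits => one fingerprint "zip code" per 2^12 browsers.
bits = 12
bucket = 2 ** bits
print(bucket)  # 4096

# If fingerprints were spread uniformly (they are not) over an assumed
# ~2 million daily Tor users, each "zip code" would hold roughly:
tor_users = 2_000_000
print(tor_users // bucket)  # 488
```

The uniform spread is the best case; real fingerprints cluster, so many buckets are far emptier than this, which is exactly the worry above.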


For me the most interesting part is where they break down how many bits of information is leaked by the various signals.

Do browsers ever try to fuzz any of these values? For example lying about what GPU I'm using, or occasionally swapping between equivalent timezones?


> Do browsers ever try to fuzz any of these values?

Brave does, and it is absolutely the right approach to fingerprinting. They call it "farbling" rather than "fuzzing":

https://github.com/brave/brave-browser/issues/8787

The alternative is what Firefox does (halfheartedly), which is to try to make users look similar to each other.

It really bothers me that Firefox is so stuck on this approach -- in every other way I prefer Firefox over Brave and I really do not want to switch just for this one feature. But Firefox's halfway implementation of "try to make everybody look similar" is doomed: a serious effort would turn Firefox into TorBrowser-with-javascript-disabled or worse, which is never going to work.

To Firefox's credit, one area where Brave is having problems with this is font fingerprinting, because they don't control their own rendering engine (it's Chromium). So Firefox could "do Brave-style antifingerprinting" better than Brave does, if they got serious about it:

https://github.com/brave/brave-browser/issues/816
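The core of the farbling idea can be sketched in a few lines of Python (a hypothetical illustration, not Brave's actual code): derive the noise from a per-session seed plus the top-level domain, so each site sees a value that is stable within the session but useless across sites and sessions.

```python
import hashlib

def farbled_value(base: bytes, session_seed: bytes, domain: str) -> str:
    """Perturb a fingerprintable value (e.g. a canvas hash) deterministically
    per (session, site), so repeat reads by one site stay consistent."""
    digest = hashlib.sha256(session_seed + domain.encode() + base)
    return digest.hexdigest()[:16]

seed = b"fresh-random-seed-each-session"  # hypothetical per-session secret
a = farbled_value(b"canvas-pixels", seed, "tracker.example")
b = farbled_value(b"canvas-pixels", seed, "tracker.example")
c = farbled_value(b"canvas-pixels", seed, "other.example")
print(a == b)  # stable within a site and session
print(a == c)  # differs across sites
```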


GPUs specifically are hard to mask, because GPUs are terrible ( ;D ) and you have to expose a lot of details for WebGL apps to work correctly (despite my efforts to limit the feature set in the distant past). Hiding or altering those details could break them, so you have to assume that any site that's interested can know exactly what GPU you have.


IMHO for most users it would make all sense to just break webgl apps except if you explicitly whitelist a particular site where you really want to use a webgl app there. If some random blog or webstore pretends to be a webgl app for tracking purposes, the browser should just auto-deny everything.


The original safari webgl implementation did just that, and the dialog was useless as you had no way of knowing if a site was trustworthy.


It's not useless, because it's not about trustworthiness but rather about expected functionality. I may consider google.com and amazon.com trustworthy in general, but I don't want anything from those sites that needs WebGL, so if they try to enable it, that doesn't serve a user need and should be denied. Meanwhile, shadyjoessuperfun3dgame.com.aaaaaaaaa.ru might be less trustworthy, but if someone wants to play that superfun3dgame and it needs WebGL, they might reasonably enable WebGL - and the related tracking opportunities - for that site alone, but not for 99% of the rest of the web.

It's more about the general paradigm of recognizing that yes, there are some web apps that would need extensive functionality that might be abused (e.g. advanced copy/pasting features for an online document editing tool, or the ability to connect to USB devices, or camera/microphone functionality, etc), but also that most web browsing does not involve things that user desires to be a web app, and all those sites should be limited to a quite narrow sandbox of functionality. All these fancy features with privacy risk that are needed for some advanced web app replacing desktop features are fine and useful, but they should be exceptions on a per-site basis, not enabled for the whole browser so that they can get used by random blog that includes a standard ad-network javascript file.


I just wish they would also list mitigations for each of the leakages.


Apparently I have strong protection against tracking, but it's also saying I have a unique fingerprint among all users that have used the app (220k)? I'm just running iOS/Safari (same UA as 1 in 113 users). It seems I'd be easy to track based on my unique fingerprint? Or is there some other factor, not mentioned, that means my fingerprint isn't expected to be unique in a larger set? If so, how large?


AFAIK, the only un-fingerprintable browser is Brave (because it randomizes certain web API return values between navigations).


Somewhat unrelated but the Brave website is broken under the latest Firefox 89 + uBlock Origin (I have no other extensions/themes installed). The entire site is just a blurred, acrylic like mess.

If you open it in Firefox's safe mode, though, it works fine. That suggests it is uBlock Origin, but curiously the site doesn't work even if I disable uBlock for the domain.

PS: It seems Mozilla has removed the "report broken site" option from Firefox's Help menu. There is only "report deceptive site".


The recommended place to report such an issue is https://webcompat.com/


The Tor Browser is designed to be "un-fingerprintable" as well, though instead of randomizing return values it prefers to return the same thing for every user.


I mean it sounds reasonable, right, but it just doesn't work that well. Brave's Tor windows might be a better way of using it.

Fundamentally, trying to make everyone look exactly the same can only ever be an attempt, and it leaves a machine-learning algorithm - or even a statistician with diff - a lot of information about what is important. The worst offence is that it relies on everyone else doing the same thing. It doesn't really work.

Randomising takes all that out. With no real way to predict it, no easy way of telling what's random and what isn't, and no reliance on everyone else doing the same, you leak just as few, if not actually fewer, bits of identifiable information, rely on no one else, and feed garbage to the algorithms, which throws them off and possibly confuses you with many other people, protecting them at the same time.

I really don't get why the biggest and most well-known anti-fingerprinting browser, the Tor Browser, keeps using such a flawed method. Better than nothing, though.


If everyone looks similar you don't have to worry about trackers identifying individual sessions. In any case, if implemented correctly, both methods will achieve the same results.

Sure, an incomplete implementation of the randomizing method is less trackable than an equally incomplete implementation of the same-fingerprint method, but in practice

* even a non-perfect TBB method will protect you against pretty much all trackers

* and the TBB is, as a matter of fact, not fingerprintable on any of these tracking test sites.

Therefore I think saying that it "doesn't work very well" doesn't quite reflect reality (unless you have a good source on that, I might be mistaken here). And if your threat model requires 100% anonymity (which is an entire level above not being tracked by adtech), the only realistic way to achieve that is to disable JavaScript anyway.


It's wrong and has been for ages. It will claim things like I'm 1 in 500k even though I'm on a recent iPhone on the west coast of the USA. Simple math will tell you it's just plain wrong. The only settings that leak on an iPhone are possibly the model, time zone, and language. There are at least 70 million people in that time zone. Right now it claims my iPhone is 1 in 240k, so it's effectively claiming there are 291 iPhones of the same model among 70 million people. Yeah, no, that's clearly wrong.

They don't care that it's wrong because the exaggerated numbers scare people into believing in their cause.

Yes, I get that it's possible I'm the only iPhone among 70 million people that checked their site in the last 45 days, but that's still exaggerating their point. Only a site that has lots of traffic would need to fingerprint you. So results from a site that gets almost no traffic and sees only a couple of iPhones visit per day aren't remotely representative of a popular site that actually is trying to track.
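The arithmetic in that comment, spelled out (the 70 million figure is the commenter's own estimate):

```python
# "1 in 240k" rarity applied to an assumed 70 million people in the time zone
time_zone_population = 70_000_000
reported_rarity = 240_000
print(time_zone_population // reported_rarity)  # 291
```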


You miss the point. Your iPhone make and model, time zone, and language are perhaps 20% of your fingerprint.

After testing, you can scroll down to see different test metrics and see exactly which fingerprinting methods (WebGL hash, AudioContext, etc) all contribute to your unique fingerprint.

It’s not just your phone model.


no.. you missed the point. There is nothing else unique about my iPhone; every one of those other techniques will show up with exactly the same fingerprint for the same model of iPhone.


I got the same result and I was testing it with chrome on android, with no extensions.

I think something is pretty off.


They explain in detail the various characteristics that go into your unique fingerprint. When combined, almost everyone can be uniquely identified by it.


What seems to really get me on this test is having an eGPU plugged into my Mac.


I’m not sure this site takes into account correlations between different attributes. If not, it overestimates how unique you are.

Also, it doesn’t include your IP address, which is the most identifying piece of information. All in all, a slanted site fit to push an agenda.


> All in all, a slanted site fit to push an agenda.

What kind of agenda is it trying to push?


The EFF is a lobbying group. Presumably pushing the agenda they lobby for - namely digital rights and privacy as they would envision it.


If it doesn't take correlations into account, then the total number of identifying bits (aka entropy) would simply be the sum of the bits in each category, which does not seem to be the case. Therefore, I'd guess it does take correlations into account (presumably by calculating the total from the uniqueness of the whole fingerprint alone).
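A toy illustration of why correlations matter (hypothetical two-attribute population): when attributes are perfectly correlated, summing their individual bits double-counts, while the joint entropy, i.e. measuring uniqueness of the whole fingerprint at once, does not.

```python
import math
from collections import Counter

def entropy_bits(samples):
    """Shannon entropy of an observed distribution, in bits."""
    counts = Counter(samples)
    n = len(samples)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# OS and engine are perfectly correlated here: knowing one fixes the other.
population = [("iOS", "WebKit")] * 50 + [("Android", "Blink")] * 50

os_bits = entropy_bits([os for os, _ in population])        # 1.0
engine_bits = entropy_bits([eng for _, eng in population])  # 1.0
joint_bits = entropy_bits(population)                       # 1.0, not 2.0
print(os_bits + engine_bits, joint_bits)
```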


My IP address changes daily. As do the IP addresses of most people in several countries.


That’s enough to track you across recent sites, and if any of those sites use a cookie that spans a day, to track you across days.


Are Canvas and Webgl fingerprinting almost enough to identify almost anyone?


WebGL Vendor & Renderer

Intel Open Source Technology Center~Mesa DRI Intel(R) HD Graphics 630 (Kaby Lake GT2)

One in x browsers have this value: 13012.06

Also 1920x1200 isn't as common as I thought, "One in x browsers have this value: 95.68"


Yes, the 1920x1200 feature was surprisingly rare given that my h/w concurrency (16) was scored at about 1 in 39.


> Within our dataset of several hundred thousand visitors tested in the past 45 days, only one in 11983.11 browsers have the same fingerprint as yours.

Given the way iOS works, does this mean only one in about 12k users is on iOS 14.6 using an iPhone 12 (my configuration)? That seems to be what the fingerprint boils down to.


Time zone, your phone's memory, screen size (e.g. 12 Max, Mini, or otherwise), which browser you accessed it with (more people use Safari on iOS than Chrome).


Time zone is a good point. Just tested an iPhone 12 Pro and an iPhone 12 and the results were identical (despite their having different amounts of memory).


It uses a lot of more advanced techniques too, like rendering text and shapes in a hidden canvas + WebGL, which render slightly differently across hardware, drivers, etc. I was also surprised to see that my HTTP_ACCEPT header was shared by only 1 in nearly 5,000 devices, considering I am on a fresh (literally days-old) macOS install.


If my browser uniquifies my presence every time it is queried, then yes, I will always have a unique fingerprint.


Chrome does not protect against web tracking, whereas Edge has strong web-tracking protection.


I'm unique!



