If you give yourself the user agent of a terrible browser such as IE5, or if you give yourself the user agent of a less popular web crawler, a lot of companies won't bother to track your identity as it would be considered a waste of resources. I'm speaking as the minion of a company that deals in such things - for instance, my company won't track you for advertising purposes if you are running Linux. No profit in it!
What happens if I use Linux while browsing at home and Windows while browsing at work? How does your company handle this? I can see a couple different scenarios and am curious which is closest to practice.
1) Track data while at work. Display only at work.
2) Track data while at work. Display at home and work.
3) Track data at home and work. Display at work.
4) Don't track data at all.
Of course there are a few cases I missed, but from what you said I don't see why they'd track me at home and display at work, or use any of the other combinations I left out.
If I understand your question correctly, the answer would be 1. In most cases, they're doing cookie-based tracking and you wouldn't have any crossover unless it was site-specific retargeting and you've logged into an account on both platforms. If there's any crossover, it's because you have the same browsing behaviors on both platforms.
Oh, someone did beat me to it. I think too much (and slowly), it seems.
The user agent is much too random, though. As explained quite nicely on the EFF page, it defeats the purpose. You've managed to uniquely identify yourself as someone with a very random-looking user agent.
I was more thinking along the lines of sending random "genuine" user-agents (combinations of various versions of Chrome/Firefox/Safari on various OS).
And it would need some cleverer way to make sure the cookies match, otherwise fixing the user-agent is a bit useless.
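To make that concrete, here is a rough Python sketch of the idea, not a finished tool. The user-agent strings are only illustrative examples of "genuine-looking" values (a real pool would need to be kept current), and `PERSONAS` and `pick_persona` are hypothetical names. Each user agent gets its own cookie jar, so a cookie set under one identity is never sent with a different user agent.

```python
import random
from http.cookiejar import CookieJar

# Illustrative "genuine-looking" user agents; an assumption, not a vetted list.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/535.19 "
    "(KHTML, like Gecko) Chrome/18.0.1025.162 Safari/535.19",
    "Mozilla/5.0 (X11; Linux x86_64; rv:11.0) Gecko/20100101 Firefox/11.0",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_3) AppleWebKit/534.55.3 "
    "(KHTML, like Gecko) Version/5.1.5 Safari/534.55.3",
]

# One cookie jar per user agent, so cookies and user agent always match.
PERSONAS = {ua: CookieJar() for ua in USER_AGENTS}


def pick_persona(rng=random):
    """Pick a random user agent and return it with its own cookie jar."""
    ua = rng.choice(USER_AGENTS)
    return ua, PERSONAS[ua]
```

You would call `pick_persona()` once per browsing session and use the returned pair for every request in that session, rather than switching mid-session.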
I was looking into creating a bot that would randomly click and browse around on the web whilst logged in to Google to add some entropy to my "persona".
Alas, programming is hard.
I don't know what it is--perhaps my get-off-my-lawn mentality--but I am terrified of not owning my anonymity. Though obviously not scared enough to stop using these services...
Yes, it would certainly need some checks. Perhaps a dictionary of acceptable link-words that would be OK to "click". Maybe also a blacklist of "never-click" words and/or domains/netblocks.
Or perhaps even a whitelist of a few domains that are OK. Chances are slim that Time.com, USA Today or CNN links would lead to trouble, and the collection of "safe" links would grow exponentially with only a handful of "safe" domains.
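A minimal sketch of such a check in Python, assuming the bot has a link's URL and its visible text. The domain list mirrors the examples above; the "never-click" words and the name `is_safe_link` are made up for illustration.

```python
from urllib.parse import urlparse

# Whitelist of domains considered OK to browse (examples from the comment above).
SAFE_DOMAINS = {"time.com", "usatoday.com", "cnn.com"}

# Hypothetical blacklist of words the bot should never "click".
NEVER_CLICK_WORDS = {"logout", "delete", "unsubscribe", "buy"}


def is_safe_link(url, link_text):
    """Return True only for whitelisted domains with no blacklisted link words."""
    host = (urlparse(url).hostname or "").lower()
    # Accept the domain itself or any subdomain, but not look-alikes
    # such as "evilcnn.com".
    on_safe_domain = any(
        host == domain or host.endswith("." + domain) for domain in SAFE_DOMAINS
    )
    text = link_text.lower()
    has_bad_word = any(word in text for word in NEVER_CLICK_WORDS)
    return on_safe_domain and not has_bad_word
```

Checking the parsed hostname rather than searching the raw URL string matters here, since a substring match would happily accept a hostile domain that merely contains "cnn.com" somewhere in its path.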
This counters the tracking method used by the "Underpants Project" but does little to help against something more sophisticated that made full use of the remaining bits of identifying information.
According to Panopticlick, the biggest leaks of identifying information are the list of fonts and the plugin information string, neither of which seems to be guarded against by any existing tool except NoScript, though disabling Flash might help. However, even using NoScript, your information would still probably leak to whatever popular site is in your whitelist (e.g. Google or Facebook).
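For a sense of scale, Panopticlick expresses each leak in bits of identifying information: a value shared by one browser in N carries log2(N) bits. A small Python sketch of that arithmetic follows; the function names are hypothetical, the inputs are made-up numbers rather than Panopticlick results, and adding bits together assumes the attributes are independent, which real ones usually are not.

```python
import math


def surprisal_bits(one_in_n):
    """Bits of identifying information for a value seen in one of N browsers."""
    return math.log2(one_in_n)


def combined_bits(*one_in_ns):
    """Upper-bound total, assuming the attributes are statistically independent."""
    return sum(surprisal_bits(n) for n in one_in_ns)
```

For example, `surprisal_bits(1024)` is 10 bits, and a second attribute shared by one browser in 8 would bring the total to at most 13 bits under the independence assumption.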
Unless someone hacks Flash to always sort the font lists and finds a way to fix the plugin list leak, one way to get rid of many of these leaks is to build a locked-down virtual machine containing a fixed set of fonts, a fixed screen resolution, UTC timezone, and possibly a fixed browser together with only the en-us locale to minimise HTTP_ACCEPT information leaks (which I presume to be the most popular one). Assuming the filesystem remains mostly untouched, the font list should always be in the same order for all instances of the VM. It might not be necessary to use any specific browser, but I presume there are many possible ways to identify a browser and perhaps even its exact version from its behaviour even without using the user agent.
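To illustrate why the ordering of the font list matters, here is a small Python sketch. It assumes, purely for illustration, that a tracker hashes the reported font list as-is; `fingerprint` is a hypothetical stand-in, not how any particular tracker works.

```python
import hashlib


def fingerprint(fonts, sort_first):
    """Hash a font list, optionally sorting it first to remove ordering information."""
    items = sorted(fonts) if sort_first else list(fonts)
    return hashlib.sha256("\n".join(items).encode("utf-8")).hexdigest()


# Two machines with the same fonts installed, reported in a different order.
machine_a = ["Arial", "Courier New", "Verdana"]
machine_b = ["Verdana", "Arial", "Courier New"]
```

With `sort_first=False` the two machines hash differently even though they have identical fonts; with `sort_first=True` they become indistinguishable on this attribute, which is the effect a fixed VM image or a sorted list would aim for.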
Even after fixing all of the leaks identified by Panopticlick, the behaviour of the user could still be used for identification, perhaps by leaks caused by the user or by bugs and inadequacies in the "anonymisation suite". To pre-emptively counter this, an orthogonal approach like the one suggested by obituary_latte could be used (http://news.ycombinator.com/item?id=3880822). Such a "random browser" might need to be active most of the day to minimise timezone leakage, though that might not be very practical, or even necessarily a concern for many.
Finally, yet another thing that might help would be to use popular VPN services, which would presumably reduce the amount of information leaked by your IP address.
I think the purpose is to show how easily some of these scare tactics can be defeated. This took what, 40 minutes to get posted? Less than an hour to thwart what is being hyped up as a big deal for anonymity.
As an aside, I think you should name it CoverAlls.
EDIT: I guess I worded that a bit harshly. Underpants is meant to be a PoC for a tactic that may work in some situations, but the feeling I got is that people are taking it to be a lot more serious than it is.