Glad to see them succeeding, but personally the privacy of my web searches doesn't bother me - as long as they aren't being passed along with personally identifying information. I'm far more worried about emails, messaging, video, storage etc.
Can someone explain to me (or point me in the direction of something that explains) what Google and Bing store in terms of tracking when you are not logged in?
Obviously you can use VPNs or Tor to be really safe, but do you need to go that far if you want an untracked search on Google and Bing?
They have access to IP+time, your search query, and cookies for correlation to other requests. It's valuable information and Google openly documents that they are keeping and using it, along with anything else you leak to them: http://www.google.com/policies/privacy/
IP+time is enough to get your personal identity information from your ISP (physical location of the endpoint, billing information). I have no idea if Google's relationship with ISPs is good enough to buy that or if it's only available to cops.
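To make the "cookies for correlation" point concrete, here is a minimal sketch of how log entries holding IP, time, query, and cookie could be tied together. The record layout, the cookie IDs, and the sample entries are all made up for illustration; nothing here reflects how Google or Bing actually store their logs.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class LogEntry:
    """Hypothetical search log record; the field names are assumptions."""
    ip: str
    timestamp: int  # seconds since some epoch
    query: str
    cookie_id: str


def group_by_cookie(entries):
    """Group log entries by cookie ID, ordered by time within each group."""
    profiles = defaultdict(list)
    for entry in sorted(entries, key=lambda e: e.timestamp):
        profiles[entry.cookie_id].append(entry)
    return dict(profiles)


# Invented sample data (the IPs come from the reserved documentation ranges).
SAMPLE_LOG = [
    LogEntry("192.0.2.10", 1000, "new fridge prices", "cookie-A"),
    LogEntry("198.51.100.7", 5000, "symptoms of flu", "cookie-A"),
    LogEntry("192.0.2.10", 1200, "windmill for sale", "cookie-B"),
]

profiles = group_by_cookie(SAMPLE_LOG)
for cookie_id, history in profiles.items():
    print(cookie_id, [(e.ip, e.query) for e in history])
```

In this toy data the same cookie shows up from two different IPs, so the two searches land in one profile even though the address changed. That is the sense in which a cookie correlates requests that IP+time alone would not.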
> They have access to IP+time, your search query, and
> cookies for correlation to other requests. It's
> valuable information
Valuable for blackmail, but not really useful for anything else; the commercial value of information rapidly degrades over time. Knowing I want to buy a new fridge today is very valuable, knowing I wanted to buy one last month is nearly worthless.
> IP+time is enough to get your personal identity
> information from your ISP (physical location of the
> endpoint, billing information), I have no idea if
> Google's relationship with ISPs is good enough to
> buy that or if it's only available to cops.
Despite what the RIAA think, a user agent's IP is nowhere near accurate enough for use as identification.
I hope that ISPs do not release personal identity or billing data to arbitrary third parties. I know my ISP (sonic.net) claims they don't[1], and even privacy-insensitive companies such as AT&T have privacy policies that would forbid them from selling personal data[2].
Even if it were possible for random companies to obtain personal data from an ISP, I doubt that Google would have any interest in participating.
I love targeted advertising because it is so blatantly obvious and hilariously over-optimistic.
I search for a lot of random crap with more curiosity than intent to buy. I looked up the price of several windmills in the late 19th/early 20th century style (~$1000, by the way). For weeks or months afterwards, I saw windmill ads on a sizable fraction of the websites I visited.
To be fair, it's far more likely that I am going to buy a windmill than a random ad viewer is, but the probability is still staggeringly low. There had to be a hundred other products I was more likely to buy than the windmill, that would be more valuable to show me. But no! I had viewed their product and I! Must! Be! Targeted!
I really wonder what the set of products that do well from targeted ads looks like.
It does well because the probability of you buying a windmill multiplied by the profit per sale is still more valuable than something non-targeted like tampons (some of the advertisers get the numbers wrong, but not the ones at scale).
Interestingly, from the CPMs I've seen, re-targeted/re-marketed ads perform on par with or below contextually targeted ads. No one even comes close to Google for contextually targeted ad inventory (unless you are operating in a narrow niche and selling inventory directly, but it takes lots of time and money to even match them).
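The "probability multiplied by profit per sale" argument is just an expected-value comparison. A quick sketch, with every number invented purely for illustration (these are not real conversion rates or margins):

```python
def expected_value_per_impression(p_purchase, profit_per_sale):
    """Expected profit from showing one ad: P(purchase) * profit per sale."""
    return p_purchase * profit_per_sale


# Hypothetical figures: a retargeted windmill ad converts rarely, but each
# sale carries a large margin; a generic ad converts more often for a tiny one.
windmill_ev = expected_value_per_impression(p_purchase=0.0001, profit_per_sale=200.0)
generic_ev = expected_value_per_impression(p_purchase=0.001, profit_per_sale=1.0)

print(f"windmill ad: ${windmill_ev:.4f} per impression")
print(f"generic ad:  ${generic_ev:.4f} per impression")
```

Under these assumed numbers the windmill ad wins despite a tenfold lower purchase probability. With a different margin or conversion rate the comparison can flip, which is the "some advertisers get the numbers wrong" part.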
>>Valuable for blackmail, but not really useful
>>for anything else; the commercial value of
>>information rapidly degrades over time.
>>Knowing I want to buy a new fridge today is very
>>valuable, knowing I wanted to buy one last month is
>>nearly worthless.
I think that blackmail is already bad enough. Consider the topics somebody might want to learn about on the internet: various diseases, for example.
>>I hope that ISPs do not release personal
>>identity or billing data to arbitrary third parties.
I hope that too. But this information is gathered and stored somewhere in flawed systems operated by humans, who might decide to follow their own interests more than the interests of the customers. I know of at least one story where an employee of a search engine used his privileges to stalk other people.
> I hope that ISPs do not release personal identity or billing data to arbitrary third parties.
My ISP, Time Warner/RoadRunner, claims that they will do so. It's kind of ambiguous because their declaration combines several services and kinds of data covered by different laws. I think the applicable part for their cable ISP service is:
In the course of providing Time Warner Cable Services
to you, we may disclose your personally identifiable
information to [...] consumer and market research firms,
credit reporting agencies and authorized representatives
of governmental bodies.
Selling their DHCP logs and customer records to a commercial data aggregator (who could then sell it to anyone) appears to be compliant with their privacy policy.
> Valuable for blackmail, but not really useful for anything else; the commercial value of information rapidly degrades over time. Knowing I want to buy a new fridge today is very valuable, knowing I wanted to buy one last month is nearly worthless.
That depends on what they can get out of the data, besides the obvious. I'm thinking of the story about Target knowing that a girl was pregnant before even her father did[1]. Even longer trends can probably be derived, regarding personality traits, income, etc. That information is worth a lot even months or years after it was captured.
"Despite what the RIAA think, a user agent's IP is nowhere near accurate enough for use as identification."
Maybe not when I'm coming from our company's NAT (700+ employees behind a single IP address) - but the number of people on my Comcast connection is limited.
The IP address from which you send your request to a search engine's website may be regarded as personally identifying information. If this information becomes publicly available, the connection between your search terms and your IP address will be visible. In fact, this has happened in the past, and there was a searchable database of the leaked information online where you could look up search terms.
I don't know what data Google and Bing are collecting, but here is one quote from the Wikipedia entry on internet privacy concerning the AOL search engine:
A search engine takes all of its users and assigns each one a specific ID number. Those in control of the database often keep records of where on the Internet each member has traveled to. AOL’s system is one example. AOL has a database 21 million members deep, each with their own specific ID number. The way that AOLSearch is set up, however, allows for AOL to keep records of all the websites visited by any given member. Even though the true identity of the user isn’t known, a full profile of a member can be made just by using the information stored by AOLSearch. By keeping records of what people query through AOLSearch, the company is able to learn a great deal about them without knowing their names.
Based on the leaked PRISM presentation, they could send your search log to the NSA, and the NSA could identify you from your IP address using information your ISP provides.
> If the exit node is compromised you're f*cked for good.
It's not that simple, otherwise there wouldn't be any value in using an onion architecture. Assuming you're using HTTPS, which every decent search engine supports, whoever runs the exit node either needs to create a fake but acceptable certificate for the domain, or needs to also control entry nodes and match the entering requests with the exiting ones.
The NSA might be able to do it, but it's not just a matter of controlling an exit node.
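To illustrate the "control entry nodes and match the entering requests with the exit ones" case, here is a toy sketch of timing correlation. It assumes an observer who sees timestamps at both ends of the network; the event lists, names, and the one-second window are invented, and real traffic analysis is statistical and far more involved than this.

```python
def correlate(entry_events, exit_events, max_delay):
    """For each (client, time) seen entering, list the destinations seen
    leaving the network within max_delay seconds afterwards."""
    matches = {}
    for client, t_in in entry_events:
        matches[client] = [
            destination
            for destination, t_out in exit_events
            if 0 <= t_out - t_in <= max_delay
        ]
    return matches


# Invented observations: (who, when) at an entry node, (where, when) at an exit node.
entry_events = [("client-1", 10.0), ("client-2", 25.0)]
exit_events = [("search.example", 10.4), ("news.example", 25.3)]

print(correlate(entry_events, exit_events, max_delay=1.0))
```

An observer holding only the exit list learns which sites were visited but not by whom; it takes both lists to link a client to a destination. With many users active at once the candidate lists would overlap, which is why this is harder in practice than the toy data suggests.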