
How do you know that they target search sites? This could also be a more generic approach: if there's a link from page A to page B, also return page B when someone searches for page A's keywords and there are only a few results.



Have you read the original claim? Google did a test with completely nonsense words and 100% unrelated search results with those terms. The results showed up on Bing. Whatever Bing is doing, it is using Google's search data.

http://searchengineland.com/google-bing-is-cheating-copying-...


I didn't deny that they are using Google's data. But it could be possible that they have some kind of generic approach not specifically targeting Google (or even search engines).


Exactly, they could just observe what the user clicks after a search (on any search engine, including Bing) and use it as a signal.

If that is indeed the case, then they are not copying results from Google, they are just tracking users' choices to improve the position of interesting results on Bing. Personally, I don't see anything wrong about that.

Keep in mind that in Google's experiment, they always clicked on the first result, which is probably why their result was sent to Microsoft and why the experiment "succeeded".
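
To make the "clicks as a signal" idea concrete, here is a minimal sketch of what such a generic aggregator might look like. This is purely an illustration under my own assumptions: the function names, the (query, clicked URL) event format, and the min_clicks threshold are all hypothetical, and nothing here is known to reflect what Bing actually does.

```python
from collections import Counter, defaultdict


def build_click_signal(events):
    """Count clicked URLs per query.

    `events` is a hypothetical stream of (query, clicked_url) pairs,
    observed on any site, not only on one particular search engine.
    """
    counts = defaultdict(Counter)
    for query, url in events:
        counts[query.strip().lower()][url] += 1
    return counts


def top_clicked(counts, query, min_clicks=1):
    """Return the most-clicked URL for a query, or None if there is no data."""
    clicked = counts.get(query.strip().lower())
    if not clicked:
        return None
    url, n = clicked.most_common(1)[0]
    return url if n >= min_clicks else None


# Testers who always click the same result produce a one-sided signal.
events = [("hiybbprqag", "http://example.com/planted")] * 3
events.append(("concert tickets", "http://example.com/tickets"))
signal = build_click_signal(events)
print(top_clicked(signal, "hiybbprqag"))
```

With a nonsense query there is no competing data, so even a handful of identical clicks is enough to decide the top result.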


It needn't even be limited to what the user clicks after a search; it could simply be whatever the user clicks.

They're trying to link Content A and Content B, where "Content" includes page content, URL (including GET string terms), and various other factors. Could happen on every page.

Google-fans then tend to get a bit knicker-twisty about privacy implications, but first I dislike that line of argument as it confuses the issue (what are we mad at, the copying or the privacy? Oh I know, we'll say copying and when someone rebuts that we'll start whining about privacy), and secondly, as patio11 eloquently said: Google doesn't really want to get into a heated discussion about the evils of a search engine knowing everything you've ever searched for. Stones, glass houses, etc. ( http://news.ycombinator.com/item?id=2165682 )


> Personally, I don't see anything wrong about that.

I, however, would call it spyware.


Yeah, sure, but even the Google toolbar reports user behavior if you let it.


How long would it take for Microsoft to stage a similar trap for already installed Google toolbars? The fact it hasn't been done is a good indicator they couldn't.


One way of yielding the same effects without even factoring in search inputs would be to assume a probabilistic relationship between the words on an origin page (e.g. a Google search result page, or a bog standard web page) and the destination URL the user clicks through to. This seems like a pretty reasonable design parameter for the Bing Bar to learn "Suggested Sites" and if you're collecting that information anyway why wouldn't you add all those associations into your search engine to help it rank those tricky obscure search queries?

For synthetic words like "hiybbprqag" that rarely/never show up on the internet [except on Google search result pages generated by Google engineers systematically searching for it], the basic probability algorithm would weight heavily towards assuming an association between "hiybbprqag" and the destination URLs viewed by people immediately after looking at pages referencing "hiybbprqag". Since probably the only people looking at pages referencing "hiybbprqag" were Google testers searching for it, who had been instructed to always click on the "synthetic" Google result, the probability of someone viewing a page referencing "hiybbprqag" subsequently going to http://www.teamonetickets....wiltern-map.html would be close to 1, suggesting a pretty strong association between the terms.

If you incorporated these associations into the Bing search engine in any way, it would be perfectly reasonable for a search engine to assume that that page is the most relevant result for hiybbprqag, given the lack of any alternative data on what to show.

Obviously this isn't as simple as the other solution (and Google could have done more sophisticated tests which rule out this kind of algorithm as being behind the results), but the competitive end of search isn't simple.
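
A rough sketch of the word-to-destination association described above, assuming the toolbar sees (words on the origin page, destination URL) pairs. The data format, the function name, and the example URLs are made up for illustration; this is just the simplest conditional-frequency estimate, not a claim about the real algorithm.

```python
from collections import Counter, defaultdict


def association_scores(visits):
    """Estimate P(destination | word appeared on the origin page).

    `visits` is a hypothetical iterable of (origin_page_words, destination_url).
    """
    word_views = Counter()            # how many origin pages contained the word
    word_dest = defaultdict(Counter)  # word -> destination -> click count
    for words, dest in visits:
        for word in set(words):       # count each word once per page view
            word_views[word] += 1
            word_dest[word][dest] += 1
    return {
        word: {dest: n / word_views[word] for dest, n in dests.items()}
        for word, dests in word_dest.items()
    }


visits = [
    (["hiybbprqag", "tickets"], "http://example.com/synthetic"),
    (["hiybbprqag"], "http://example.com/synthetic"),
    (["tickets", "concert"], "http://example.com/other"),
]
scores = association_scores(visits)
print(scores["hiybbprqag"])  # the rare word maps to a single destination
print(scores["tickets"])     # the common word is split across destinations
```

A word seen only on the testers' pages ends up with probability 1 for the planted URL, while an ordinary word is diluted across many destinations.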


Right, but the nonsense word is part of the referring URL for the link that the user is clicking.
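
For what it's worth, pulling terms out of a referring URL doesn't need any search-specific logic. A small sketch using Python's standard urllib.parse (the function name and the example URL are hypothetical):

```python
from urllib.parse import parse_qs, urlsplit


def referrer_terms(referrer_url):
    """Return lowercase terms found in any GET parameter of the referrer URL."""
    query = urlsplit(referrer_url).query
    terms = []
    for values in parse_qs(query).values():
        for value in values:
            terms.extend(value.lower().split())
    return terms


print(referrer_terms("http://search.example/search?q=hiybbprqag&hl=en"))
```

Note that this treats every GET parameter the same way, so it has no idea whether the URL came from a search box or from some other form.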


Sure, the question is how you differentiate those actions. "Someone typed in [x] and then clicked [y]" could certainly be generic, but that would be a massive amount of data to crunch and would require a significant amount of congruency to be enough to modify the Bing algorithm.

How would it differentiate between a form submission and a search? And if it isn't differentiating and is still sending all of that data back to MS, that's far more disconcerting than them just watching me on Google.

It's certainly more believable that the sites were either targeted or were extracted from massive amounts of data for this purpose.



