
I didn't deny that they are using Google's data. But it could be possible that they have some kind of generic approach not specifically targeting Google (or even search engines).



Exactly, they could just observe what the user clicks after a search (on any search engine, including Bing) and use it as a signal.

If that is indeed the case, then they are not copying results from Google, they are just tracking users' choices to improve the position of interesting results on Bing. Personally, I don't see anything wrong with that.

Keep in mind that in Google's experiment, they always clicked on the first result, which is probably why their result was sent to Microsoft and why the experiment "succeeded".


It needn't even be limited to "what user clicks after search"; it could simply be "what user clicks".

They're trying to link Content A and Content B, where "Content" includes page content, URL (including GET string terms), and various other factors. This could happen on every page.

Google-fans then tend to get a bit knicker-twisty about privacy implications, but first I dislike that line of argument as it confuses the issue (what are we mad at, the copying or the privacy? Oh I know, we'll say copying and when someone rebuts that we'll start whining about privacy), and secondly, as patio11 eloquently said: Google doesn't really want to get into a heated discussion about the evils of a search engine knowing everything you've ever searched for. Stones, glass houses, etc. ( http://news.ycombinator.com/item?id=2165682 )


> Personally, I don't see anything wrong about that.

I, however, would call it spyware.


Yeah, sure, but even the Google toolbar reports user behavior if you let it.


How long would it take for Microsoft to stage a similar trap for already installed Google toolbars? The fact it hasn't been done is a good indicator they couldn't.


One way of yielding the same effects without even factoring in search inputs would be to assume a probabilistic relationship between the words on an origin page (e.g. a Google search result page, or a bog-standard web page) and the destination URL the user clicks through to. This seems like a pretty reasonable design parameter for the Bing Bar to learn "Suggested Sites", and if you're collecting that information anyway, why wouldn't you add all those associations into your search engine to help it rank those tricky obscure search queries?

For synthetic words like "hiybbprqag" that rarely/never show up on the internet [except on Google search result pages generated by Google engineers systematically searching for it], the basic probability algorithm would weight heavily towards assuming an association between "hiybbprqag" and the destination URLs viewed by people immediately after looking at pages referencing "hiybbprqag". Since probably the only people looking at pages referencing "hiybbprqag" were Google testers searching for it, who had been instructed to always click on the "synthetic" Google result, the probability of someone viewing a page referencing "hiybbprqag" subsequently going to http://www.teamonetickets....wiltern-map.html would be close to 1 - suggesting a pretty strong association between the term and that URL.

If you incorporated these associations into the Bing search engine in any way, it would be perfectly reasonable for a search engine to assume that that page is the most relevant result for "hiybbprqag", given the lack of any alternative data on what to show.
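
To make that concrete, here is a rough Python sketch of the kind of word-to-destination association I mean. This is purely illustrative and not a claim about how the Bing Bar or Bing actually works: the class name, the whitespace tokenizer, and the example.com URLs are all made up for the example, and a real system would presumably smooth the counts and mix in many other signals.

```python
from collections import Counter, defaultdict


class ClickAssociations:
    """Hypothetical model: estimate P(clicked URL | word seen on origin page)."""

    def __init__(self):
        # Number of observed page views whose text contained each word.
        self.term_views = Counter()
        # For each word, how often each destination URL was clicked next.
        self.term_clicks = defaultdict(Counter)

    def observe(self, origin_text, clicked_url):
        # Naive tokenizer (an assumption): lowercase and split on whitespace.
        # Each distinct word on the page counts once per observation.
        for term in set(origin_text.lower().split()):
            self.term_views[term] += 1
            self.term_clicks[term][clicked_url] += 1

    def p_click(self, term, url):
        # Plain relative frequency, with no smoothing.
        views = self.term_views[term]
        if views == 0:
            return 0.0
        return self.term_clicks[term][url] / views

    def best_url(self, term):
        # Most frequently clicked destination for this word, if any.
        clicks = self.term_clicks.get(term)
        if not clicks:
            return None
        return clicks.most_common(1)[0][0]


# Placeholder URLs, not the real pages from the experiment.
SYNTHETIC_URL = "https://example.com/seating-map"
OTHER_URL = "https://example.com/other"

model = ClickAssociations()

# Twenty testers view a page containing the synthetic word and always
# click the same planted result.
for _ in range(20):
    model.observe("hiybbprqag search results", SYNTHETIC_URL)

# One ordinary user views a page with common words and clicks elsewhere.
model.observe("search results for tickets", OTHER_URL)

print(model.p_click("hiybbprqag", SYNTHETIC_URL))
print(model.p_click("search", SYNTHETIC_URL))
print(model.best_url("hiybbprqag"))
```

The rare word ends up with an association of exactly 1 because nobody else ever sees it, while common words like "search" get diluted by everyone else's clicks. That is the whole effect: no search box input is needed, only page text and the next click.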

Obviously this isn't as simple as the other solution (and Google could have done more sophisticated tests which rule out this kind of algorithm as being behind the results), but the competitive end of search isn't simple.



