
> The paper you mentioned appears to be saying that Microsoft is extracting spell corrections via clicks on Google.

Well, no, that's a research paper saying they have run experiments in that direction, which doesn't imply that this is currently done in Bing. But it gives a hint about what kind of data is available from the "log files from a commercial Web browser".

> Targeting Google specifically is quite different than using lots of clicks from different places.

From the article, they have handcrafted rules for both Google and Yahoo, which together with Bing have (I think) about 95% of the market. I'd say they are not targeting Google; they are targeting the majority of search engine users. There just happen to be only 3 major search engines, so a few handcrafted regexes are sufficient.
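To make "a few handcrafted regexes" concrete, here is a rough sketch of what per-engine rules for pulling the query out of a logged URL could look like. The patterns, the `QUERY_RULES` table, and the `extract_query` helper are all my own assumptions for illustration; the paper doesn't list its actual rules.

```python
import re
from urllib.parse import unquote_plus

# Hypothetical per-engine rules: each pattern captures the raw query string.
# These URL shapes are illustrative assumptions, not the rules from the paper.
QUERY_RULES = {
    "google": re.compile(r"^https?://(?:www\.)?google\.[a-z.]+/search\?(?:.*&)?q=([^&#]*)"),
    "yahoo": re.compile(r"^https?://search\.yahoo\.com/search\?(?:.*&)?p=([^&#]*)"),
    "bing": re.compile(r"^https?://(?:www\.)?bing\.com/search\?(?:.*&)?q=([^&#]*)"),
}


def extract_query(url):
    """Return (engine, decoded query) for a recognized search URL, else None."""
    for engine, pattern in QUERY_RULES.items():
        match = pattern.match(url)
        if match:
            # unquote_plus turns '+' into spaces and decodes %XX escapes.
            return engine, unquote_plus(match.group(1))
    return None


if __name__ == "__main__":
    print(extract_query("http://www.google.com/search?hl=en&q=tarsorrhaphy"))
    print(extract_query("http://search.yahoo.com/search?p=hello+world"))
    print(extract_query("http://example.com/"))
```

With only three engines to cover, one entry per engine in a table like this is enough, which is the point about not needing anything Google-specific.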

I wouldn't be surprised if Google Maps has handcrafted (or manually tuned) scraping code to extract reviews from Yelp and other major review sites, and same for Google News for the extraction of the news body from the major online news sources. How is this different?

> It looks like you work at Microsoft--can you say any more about this?

Yeah, I should have been clearer about this. I am interning at MSR and have some involvement with Bing (and actually worked there last year), but my comments are personal and about facts that are public.

BTW, IMHO using the click logs can't be considered "copying", more like "a way to discover new sites to crawl and the keywords that lead to them". This is not copying the SERP results.

Since it "looks like" you work at Google :) can you answer this question (it was also asked here: http://news.ycombinator.com/item?id=2165963)? Doesn't Google use Chrome to get traffic statistics, through the opt-in "send usage statistics" and the malicious site protection?




>I wouldn't be surprised if Google Maps has handcrafted (or manually tuned) scraping code to extract reviews from Yelp and other major review sites, and same for Google News for the extraction of the news body from the major online news sources. How is this different?

Sorry, but Google drives traffic to their sites. That's what a search engine is supposed to do. Msft just scrapes Google's results and presents the data as its own.


> Sorry, but Google drives traffic to their sites. That's what a search engine is supposed to do.

Then why are newspapers not so happy about it? http://www.guardian.co.uk/media/2009/nov/09/murdoch-google

And, BTW, just to be clear, Msft can't "scrape". That would violate robots.txt.


> Then why are newspapers not so happy about it?

Rupert Murdoch and his kin are shortsighted, blustering fools when it comes to the 'net. Relying on their attitude to make your point is counterproductive at best.


"Doesn't Google use Chrome to get traffic statistics, through the opt-in "send usage statistics" and the malicious site protection?"

I saw that Peter Kasting from the Chrome team commented on this question at http://www.mattcutts.com/blog/google-bing/#comment-712619. Here's what he said: "I work on Chrome and we absolutely do NOT collect clickstream data through Chrome. Not even when you turn on the off-by-default “anonymous usage statistics”."



