
Are you claiming that Google never scrapes Bing search results pages? Or any other search results pages?



poacher69, we crawl the public web. If anyone blocks us with robots.txt, we won't crawl them. If you check bing.com/robots.txt, it has "Disallow: /search". So no, we won't crawl Bing's search results pages. If anything, users tend to complain when search results from Lycos or wherever show up in Google.


http://www.bing.com/robots.txt
User-agent: *
Disallow: /search

Funny thing: http://www.google.com/search?q=site%3Abing.com%2Fsearch%2F

I was gonna call out Matt for crawling Bing's search results, but I'm guessing Microsoft hasn't realized they return results from the /Search/ folder. ;)


Once again Microsoft is bitten by expecting case insensitivity.
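For anyone who wants to see the case-sensitivity point concretely, here is a minimal sketch using Python's standard urllib.robotparser. It assumes the two-line robots.txt quoted above; the user agent name "ExampleBot", the helper is_allowed, and the example paths are made up for illustration. It shows how one particular parser treats the rule and says nothing about how Googlebot is actually implemented.

```python
from urllib.robotparser import RobotFileParser

# The rules quoted earlier in the thread (assumed to be the whole file).
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
"""


def is_allowed(path, user_agent="ExampleBot"):
    """Return True if user_agent may fetch the given path under ROBOTS_TXT.

    "ExampleBot" is a hypothetical crawler name, not a real one.
    """
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, "http://www.bing.com" + path)


if __name__ == "__main__":
    # Path prefixes are compared case-sensitively, so only the
    # lowercase path is covered by "Disallow: /search".
    for path in ("/search?q=test", "/Search/foo", "/"):
        print(path, is_allowed(path))
```

With this parser, the lowercase /search path is disallowed while /Search/ is not, because the rule is matched as a case-sensitive path prefix. A site that serves the same content under both spellings would need a rule for each.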


Matt, how does Google do competitive relevance evaluations without scraping Bing?


From my experience, Googlebot doesn't crawl pages that are blocked in robots.txt files. Check out Bing's robots.txt: http://bing.com/robots.txt - notice how /search is disallowed. That typically means that Googlebot isn't able to access that page. The same goes for the other search engines: it comes down to whether they specify (through robots.txt) that Googlebot isn't allowed to crawl those results.



