
Every search engine is a screen scraper. Google has been making $$$ off of re-published content since day zero.



No.

a.) Google follows robots.txt. You can disallow Google from indexing your website. Most websites, OTOH, want Google to index them.

b.) Google does not republish the content. All the traffic is directed to the content owner, i.e. the other websites.
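
To make point a.) concrete, here is a minimal sketch of how a well-behaved crawler could check robots.txt before fetching a page. It uses Python's standard urllib.robotparser; the robots.txt contents, the example.com URL, the "OtherBot" agent name, and the can_crawl helper are all made up for illustration, and this is not a claim about how Google's crawler is actually implemented.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block Googlebot everywhere, allow everyone else.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /

User-agent: *
Allow: /
"""


def can_crawl(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(can_crawl("Googlebot", "https://example.com/page"))
    print(can_crawl("OtherBot", "https://example.com/page"))
```

Note that this only covers whether a crawler may fetch a page; it says nothing about what the crawler does with the content afterwards.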


So you are saying that each search result page is handwritten by Google editors?

Of course they are not. They are simply re-published snippets of the websites, and Google surrounds the results with ads.

While it's a symbiotic relationship that most websites want (sharing their content in exchange for placement in Google's webpages), it's not necessarily universal, and robots.txt is hardly a "contract" covering your data's usage.



