
I thought it was weird that CL didn't add a Disallow line for padmapper to robots.txt from the start (just from a PR perspective).

But robots.txt has no special legal authority; it's just a convention used to communicate a publisher's intent. I'm pretty sure the C&D letter made it 100% clear that CL did not want Padmapper crawling their site or using their data.




Padmapper doesn't crawl Craigslist. That's not how it happens.


...any more. Now they are using a third party, but at the time Craigslist sent the C&D they were scraping the site directly.


I know, but I thought the existence of robots.txt was why Google is allowed to crawl sites. If a site disagrees with the crawling they can add a robots.txt entry and Google will honor it. It at least shows that you are giving the publisher an option.
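
For what it's worth, here is a minimal sketch of how that opt-out works from the crawler's side, using Python's standard `urllib.robotparser`. The robots.txt contents, the `PadMapper` user-agent token, the `example.org` URL, and the `can_crawl` helper are all assumptions made up for illustration; this is not Craigslist's actual file or anyone's real crawler.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one named crawler, allow everyone else.
# The "PadMapper" token is an assumed user-agent name, not a confirmed one.
ROBOTS_TXT = """\
User-agent: PadMapper
Disallow: /

User-agent: *
Disallow:
"""


def can_crawl(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(can_crawl("PadMapper", "https://example.org/listing/123"))
    print(can_crawl("Googlebot", "https://example.org/listing/123"))
```

A well-behaved crawler runs a check like this before each fetch. Nothing enforces it, though; the file only states the publisher's wishes, which is the point made upthread.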



