
If your site is popular and you have a problem with crawlers, use robots.txt (in particular the Crawl-delay directive).
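For example (Crawl-delay takes a number of seconds between requests; note it is a non-standard directive, honored by some crawlers such as Bingbot but ignored by others, including Googlebot):

    User-agent: *
    Crawl-delay: 10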

Also, for less friendly crawlers a rate limiter is needed anyway :(
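A rate limiter can be as simple as a per-client token bucket. A minimal standalone sketch in Python (not tied to any web framework; the rate and burst capacity below are arbitrary placeholders):

    import time
    from collections import defaultdict

    class TokenBucket:
        """Allow bursts of up to `capacity` requests, refilled at `rate` per second."""
        def __init__(self, rate=1.0, capacity=5):
            self.rate = rate
            self.capacity = capacity
            self.tokens = capacity
            self.last = time.monotonic()

        def allow(self):
            now = time.monotonic()
            # Refill proportionally to elapsed time, capped at capacity.
            self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
            self.last = now
            if self.tokens >= 1:
                self.tokens -= 1
                return True
            return False

    buckets = defaultdict(TokenBucket)

    def handle_request(client_ip):
        # Reply 429 Too Many Requests when the client's bucket is empty.
        return 200 if buckets[client_ip].allow() else 429

In practice you'd key the buckets on something sturdier than a single IP, since crawler fleets rotate addresses, but the shape is the same.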

(Of course, the existence of such tools doesn't give carte blanche to any crawler to overload sites. But suppose a crawler implements some sensing based on response times: a significant load is probably needed before response times increase at all, which can definitely raise some eyebrows, and with autoscaling it can cost site operators a lot of money.)
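To make that concrete, response-time sensing on the crawler side might look like the sketch below (hypothetical: the threshold and backoff factors are invented). The problem is exactly that the delay only grows after the site is already visibly strained:

    import time
    import urllib.request

    def polite_fetch(urls, base_delay=1.0, slow_threshold=2.0):
        """Crawl sequentially, backing off when responses slow down."""
        delay = base_delay
        for url in urls:
            start = time.monotonic()
            with urllib.request.urlopen(url) as resp:
                body = resp.read()
            elapsed = time.monotonic() - start
            if elapsed > slow_threshold:
                delay = min(delay * 2, 60)            # site looks strained: back off hard
            else:
                delay = max(base_delay, delay * 0.9)  # gradually relax toward baseline
            yield url, body
            time.sleep(delay)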



