the_dege on March 26, 2021 | on: Only Google is really allowed to crawl the web
Sometimes website admins will also report your IPs to your service provider as a source of attacks, even when that is not true.
DocTomoe on March 26, 2021
Given how often I had misbehaving crawlers slow down servers in the early 2000s, I do not see how a crawler that disobeys robots.txt is not an attempted attack.
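For reference, here is a rough sketch of what obeying robots.txt can look like on the crawler side, using Python's standard `urllib.robotparser`. The robots.txt content, the `ExampleBot` user agent, the example.com URLs, and the helper names `build_parser` and `may_fetch` are all made up for illustration; a real crawler would fetch the file from the site it is about to crawl.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, inlined so the sketch runs without network access.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 5
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a RobotFileParser."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True only if the rules allow this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(ROBOTS_TXT)

# Check each URL before requesting it, and skip anything that is disallowed.
for url in ("https://example.com/index.html", "https://example.com/private/data"):
    verdict = "allowed" if may_fetch(parser, "ExampleBot", url) else "disallowed"
    print(url, verdict)

# Crawl-delay is a non-standard but widely used directive; a polite crawler
# would sleep at least this many seconds between requests to the same host.
print("crawl delay:", parser.crawl_delay("ExampleBot"))
```

A crawler that runs a check like this before every request, and waits out the crawl delay, is much less likely to be the kind that slows a server down.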