
I think the solution here is for everybody to masquerade as Googlebot, so we can render the whole thing moot.



Ignoring robots.txt is trivial, which is why some (many?) sites enforce it by checking the source IP and recognizing Googlebot by its IP addresses. How will you get access to one of those?


What does "recognize Googlebot from its IP addresses" mean? If I'm a human and I access a site, I have a different IP than Googlebot. How should the site know whether I'm a human or knuckleheadsbot?



If you're claiming to be User-Agent: Googlebot, but your IP doesn't look like it belongs to Google, don't you think that's a clear sign that you're FAKING IT?

The check itself could be implemented, for example, with an ASN lookup, a reverse DNS lookup, or by hard-coding Google's known IP ranges (though that's prone to going stale).
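For illustration, here is a rough Python sketch of the reverse DNS variant, not a definitive implementation. It assumes genuine Googlebot hosts have PTR names under googlebot.com or google.com; the suffix list, the function names, and the injectable `reverse`/`forward` resolver parameters are my own choices for the example. The forward lookup matters because whoever controls an IP also controls its PTR record, so a reverse lookup alone can be spoofed.

```python
import socket

# Assumption: hostname suffixes used by genuine Googlebot PTR records.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def reverse_lookup(ip):
    """Return the PTR hostname for ip, or None if the lookup fails."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return None


def forward_lookup(host):
    """Return the set of addresses host resolves to (empty on failure)."""
    try:
        return {info[4][0] for info in socket.getaddrinfo(host, None)}
    except OSError:
        return set()


def is_genuine_googlebot(user_agent, ip,
                         reverse=reverse_lookup, forward=forward_lookup):
    """Return None if the UA does not claim Googlebot, else True/False."""
    if "googlebot" not in user_agent.lower():
        return None  # no claim made, so nothing to verify

    host = reverse(ip)
    if host is None:
        return False
    if not host.lower().rstrip(".").endswith(GOOGLE_SUFFIXES):
        return False

    # Forward-confirm: the PTR name must resolve back to the same IP.
    return ip in forward(host)
```

A site would call `is_genuine_googlebot(ua, client_ip)` per request (ideally caching the result per IP, since DNS lookups are slow) and treat a `False` as someone faking the User-Agent. Ordinary visitors get `None` and are handled like any other client.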



