I think the solution here is everybody masquerades as Googlebot so we can render...

quantumofalpha · on March 26, 2021

Ignoring robots.txt is trivial, which is why some (many?) sites enforce it by verifying the source IP: they recognize Googlebot by its IP addresses. How will you get access to one of those?

p-sharma · on March 26, 2021

What does "recognize Googlebot from its IP addresses" mean? If I'm a human and I access a site, I have some other IP than Googlebot, so how should this site know if I'm a human or knuckleheadsbot?

smarx007 · on March 26, 2021

https://developers.google.com/search/docs/advanced/crawling/...

quantumofalpha · on March 26, 2021

If you're claiming to be User-Agent: Googlebot, but your IP doesn't look like it belongs to Google, don't you think that's a clear sign that you're FAKING IT?

The check itself could be implemented, for example, with an ASN lookup, a reverse DNS lookup, or by hard-coding Google's known IP ranges (though a hard-coded list is prone to going stale).
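
For the reverse DNS variant, a rough sketch in Python might look like the following. It assumes the usual two-step approach: resolve the IP to a hostname, check that the hostname falls under a Google-owned domain, then resolve that hostname forward and confirm it maps back to the same IP. The function name `is_verified_googlebot`, the injectable `reverse`/`forward` parameters, and the exact suffix list are my own assumptions for illustration (check Google's current documentation for the domains), and this version only handles IPv4.

```python
import socket

# Assumed suffix list for illustration; verify against Google's current docs.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def _reverse(ip):
    # PTR lookup: IP address -> primary hostname
    return socket.gethostbyaddr(ip)[0]


def _forward(host):
    # A lookup: hostname -> list of IPv4 addresses
    return socket.gethostbyname_ex(host)[2]


def is_verified_googlebot(ip, reverse=_reverse, forward=_forward):
    """Return True only if the PTR name is under a Google domain
    AND that name resolves back to the same IP."""
    try:
        host = reverse(ip).rstrip(".").lower()
        # Leading dots in the suffixes reject names like "evilgooglebot.com".
        if not host.endswith(GOOGLE_SUFFIXES):
            return False
        # Forward-confirm: anyone can set an arbitrary PTR record for an IP
        # they control, but they can't make Google's DNS point back at it.
        return ip in forward(host)
    except OSError:
        # socket.herror / socket.gaierror are OSError subclasses:
        # treat any lookup failure as "not verified".
        return False
```

You'd only bother calling this when the request's User-Agent claims to be Googlebot, and you'd probably want to cache the result per IP, since two DNS lookups per request is not free.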