
I understand the point of using a special user-agent to crawl webpages for indexing, but search engines should occasionally crawl with a "regular" browser UA string (full JavaScript and all, to simulate an actual browser), and from a different IP range too, of course. If the contents of the page are wildly different, penalise the site.
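A minimal sketch of that comparison step, assuming Python and an illustrative URL; a real check would also render JavaScript and come from a separate IP range, which this text-only diff skips:

    import difflib
    import urllib.request

    # Illustrative UA strings; real crawlers rotate many variants.
    GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
    BROWSER_UA = ("Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
                  "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0 Safari/537.36")

    def fetch(url, user_agent):
        req = urllib.request.Request(url, headers={"User-Agent": user_agent})
        with urllib.request.urlopen(req, timeout=10) as resp:
            return resp.read().decode("utf-8", errors="replace")

    def cloaking_score(url):
        bot_html = fetch(url, GOOGLEBOT_UA)
        human_html = fetch(url, BROWSER_UA)
        # ratio() is 1.0 for identical markup; values near 0 suggest the
        # "wildly different" pages a penalty would target.
        return difflib.SequenceMatcher(None, bot_html, human_html).ratio()

    print(cloaking_score("https://example.com/"))

Where to set the similarity threshold is a judgment call; legitimate personalisation and A/B tests also lower the score.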



They do both, or at least Google does.


Even if they do both, if the bots always obey what is entered in robots.txt and humans do not, it won't be long before that becomes the primary signal for telling the two apart.
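For concreteness, this is the asymmetry being described: a polite crawler consults robots.txt before fetching, while a human's browser never performs that step. A minimal sketch using Python's stdlib parser, with an illustrative site and path:

    from urllib.robotparser import RobotFileParser

    rp = RobotFileParser()
    rp.set_url("https://example.com/robots.txt")
    rp.read()  # fetches and parses the live robots.txt

    # A compliant bot skips disallowed paths; a human following a link
    # arrives regardless, never having read robots.txt at all.
    if rp.can_fetch("Googlebot", "https://example.com/some-article/"):
        print("allowed: a polite crawler would fetch this")
    else:
        print("disallowed: requests here come from humans or impolite bots")

The site-side tell is then behavioural: a visitor that systematically avoids every disallowed path across a long session is respecting robots.txt, and no human does that.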


And that's totally fine - if a site added its articles (for example) to its robots.txt to exploit that, it would cripple its SEO. It wouldn't happen.


I've seen stuff in robots.txt get crawled anyway if enough people link to it. In Google's results it will still show up, though without any contextual information.


Google won't crawl it, but they can still include the link in search results, usually with the title guessed from the way it was referred to on another page, and no description.


The Google bot mostly ignores the paths specified in robots.txt.


What about sites that display different content depending on the visitor's origin (geo-targeting by IP, for example)?



