Google does index pages, in the database sense.
An index in the database sense is nothing more than reorganizing data (or subsets of data) into structures optimized for searching and seeking, rather than full scans.
I'm guessing you're most familiar with B-tree indexes, which are the default in most SQL databases and are good for quickly answering exact-match and greater/less-than (range) queries. There are dozens of data structures useful for indexing, some of which are built specifically to index full-text documents. For an example, check out the GIN and GiST indexes in Postgres [1].
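To make the "seek instead of scan" idea concrete, here's a minimal Python sketch (purely illustrative; this is the idea, not how Postgres actually stores a B-tree): build a sorted copy of the data keyed on the search column, then binary-search it instead of reading every row.

    import bisect

    # A toy "table": (id, word) rows, in no particular order.
    rows = [(3, "narwhal"), (1, "violet"), (4, "tandem"), (2, "narwhal")]

    # Full scan: look at every row to answer "which ids contain 'narwhal'?"
    scan_hits = [rid for rid, word in rows if word == "narwhal"]

    # Index: reorganize a copy of the data, sorted by the search key,
    # so lookups can seek (binary search) instead of scanning everything.
    index = sorted((word, rid) for rid, word in rows)
    keys = [word for word, _ in index]

    lo = bisect.bisect_left(keys, "narwhal")
    hi = bisect.bisect_right(keys, "narwhal")
    index_hits = [rid for _, rid in index[lo:hi]]   # O(log n) seek + O(matches)

    assert sorted(scan_hits) == sorted(index_hits)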
It's my understanding that index design and index compression were a primary differentiator for Google from the beginning. They could beat others at a fraction of the typical cost because they didn't need nearly as much hardware to store and query huge quantities of documents.
Seriously, there's no way even Google could intersect the sets of all crawled web documents containing those individual words in 30 seconds, much less two seconds.
>Seriously, there's no way even Google could intersect the sets of all crawled web documents containing those individual words in 30 seconds, much less two seconds.
I believe you're mistaken. My understanding is that for every word, Google keeps a list of every page that contains that word; they've flipped the database into an inverted index. I just tried searching for (without quotes) neanderthal violet narwhal obsequious tandem: it took 0.56 seconds, but Google dropped some of the words so it could still give me results. When I forced every term with plus signs, making the query +neanderthal +violet +narwhal +obsequious +tandem, it took 0.7 seconds to determine that in the entirety of the Internet there is not a single document containing those 5 words.
How do you think it determines, in 700 ms, that none of the sites it has indexed across the entire Internet contains those 5 words together?
The answer is that it has a rather short list of sites that contain the word narwhal, which it then intersects with the somewhat larger list of sites that contain obsequious, and so on. 700 ms is plenty of time when you take that approach.
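Here's a toy sketch of that approach in Python (obviously nothing like Google's real implementation, which uses sorted, compressed posting lists spread across many machines): intersect the posting lists starting with the rarest word, and stop the moment the running intersection goes empty. That early exit is how a "no results at all" answer can come back so quickly.

    # Toy inverted index: word -> set of document ids containing that word.
    # (Real engines use sorted, compressed posting lists, not Python sets.)
    inverted_index = {
        "narwhal":     {12, 57, 903},
        "obsequious":  {57, 88, 903, 1204, 5000},
        "neanderthal": {3, 57, 77, 88, 901, 903, 1204},
        "violet":      set(range(0, 100000, 7)),   # pretend this is a common word
        "tandem":      set(range(0, 100000, 5)),
    }

    def search_all(words):
        """Return ids of documents containing every word, rarest list first."""
        postings = sorted((inverted_index.get(w, set()) for w in words), key=len)
        result = set(postings[0])
        for p in postings[1:]:
            result &= p
            if not result:      # bail out early: no document can possibly match
                break
        return result

    print(search_all(["neanderthal", "violet", "narwhal", "obsequious", "tandem"]))  # set()
    print(search_all(["narwhal", "obsequious"]))                                     # {57, 903}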
That also explains why intersecting stop words, whose posting lists each contain billions of pages, takes so long: with stop words it's easy to construct queries that take one or two seconds each.
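A rough way to see why (toy sketch; the only assumption is that intersection cost scales with the size of the posting lists involved, which also holds for Python sets, where intersection walks the smaller set):

    import time

    # Two "stop word" posting lists: nearly every document contains them.
    the = set(range(0, 10_000_000, 2))
    of = set(range(0, 10_000_000, 3))
    # A rare word's posting list: only a handful of documents.
    narwhal = set(range(0, 10_000_000, 1_000_003))

    start = time.perf_counter()
    _ = the & of                  # stop word AND stop word: millions of entries to walk
    print("stop & stop:", round(time.perf_counter() - start, 3), "s")

    start = time.perf_counter()
    _ = narwhal & the             # rare word AND stop word: bounded by the tiny list
    print("rare & stop:", round(time.perf_counter() - start, 6), "s")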
[1] https://www.postgresql.org/docs/current/static/textsearch-in...