Applying BERT models to Search

colechristensen · on Oct 25, 2019

Maybe they're getting better at natural language, but for me Google searches have been getting gradually worse year after year. I want Google, not AskJeeves.

The "frustration" is "increasing" "when" I "have" to "quote" nearly every "word" to get Google to actually return results with what I searched for instead of what it thinks I meant to search for.

And there's the frustration, computing today tries as hard as it can to figure out what it thinks I actually meant. I don't know if it is worse that a person who knows what they want can't get it when the computer disagrees or if the computer is actually mostly right and its algorithms start to push your desires in it's own direction and whatever motive.

Facebook already does this really, radicalizing people by engineering the most dopamine-driving content to the top either towards self-obsession or an us v. them bubble.

In other words, I just want a fucking regular expression instead of our new data-science overloads ruining our minds with artificial non-intelligence (for profit).

whalabi · on Oct 25, 2019

I hear you, but if you look at the examples in the link, the before and after the update results, you'll see that there's far less of that kind of ignoring of keywords.

For example a search for "can you get medicine for someone pharmacy" used to just show generic information about getting a prescription filled, skipping over the "for someone" bit.

The new results understand what the query is actually asking, which is pretty impressive.

I'm kinda with you, I grew up with a ctrl-f Google so I sort of prefer that behaviour, I think because I don't want to rely on an unreliable NLP AI.

...I was going to say "but" but.. no I think I just don't want to rely on an unreliable NLP AI. It's so frustrating when it doesn't work, which is often.

londons_explore · on Oct 25, 2019

When you quote the words, do you find what you're looking for?

In my experience, the times google seems to have totally missed the point of what I'm looking for, it's usually the times that the answer I'm looking for isn't anywhere on the web. Things like "datasheet JK45690DFS" or "Types of asphalt available for local delivery today".

I wish Google had some way to understand your query and the results well enough to just be able to say "The answer isn't available on the internet".

pm215 · on Oct 25, 2019

Yes, usually it's because there's maybe 4 documents which match the search query. But if there are only 4 documents which match what I asked for then I want to see a page with those 4 documents! I don't want to see a page with 10 results which I have to manually scan through to see that actually more than half of them are irrelevant rubbish because they aren't hits for what I was searching for.

willis936 · on Oct 25, 2019

I usually fail to get hits on part numbers. Part of the issue is that part numbers seem to change, even while referring to the same device. Something that might have {8, 12, 16, 24} channels will have shared documentation, but the generic part number that documentation is under will be different than what’s on the package. If google really wants to show that they know better than me then they could identify these cases and show me what I want from a “bad” search term.

colechristensen · on Oct 25, 2019

I find some luck quoting individual words.

moultano · on Oct 25, 2019

If you can remember any of the queries that failed, I'd be happy to pass them along to debug. If you have it turned on you can look in your search history here: https://myactivity.google.com/myactivity?product=19

tempguy9999 · on Oct 25, 2019

I can't at the moment, but as a hint, if I search for X Y Z then I expect the first results to contain all those terms, not 2 out of 3. Like others complain, it seems necessary to quote each on "X" "Y" "Z" to get what I want.

If you have to have a clever read-my-mind search, then instead of blending it into the main search, have 2 search types of 'intelligent' and 'precise'.

claar · on Oct 25, 2019

Google already offers a "Verbatim" search type just as you describe -- it's the same as quoting all words automatically.

In Chrome, you can set your search engine to "https://www.google.com/search?q=%s&num=100&tbs=li:1" to get this behavior by default.

liveoneggs · on Oct 25, 2019

google silently replacing "interesting" with "cool" brings up advice on cooling things (temperature) instead of keeping things "interesting" as in "interesting"

uoaei · on Oct 25, 2019

Yikes, that sounds like a software architecture problem. The "infer which words are more likely to get good results" layer and the "search for things using words" layer are blind to each other's behaviors.

How do Google's engineering teams get away with this obvious error? For how much they're paid, you'd expect them to be better about things that passive observers readily notice. Don't they use their own product, anyway?

moultano · on Oct 25, 2019

Do you remember the full query?

liveoneggs · on Oct 25, 2019

how to keep an aquarium interesting

how to keep a cat interesting

looks like that pattern generally

moultano · on Oct 25, 2019

Awesome examples! Thank you!

DevoidSimo · on Oct 27, 2019

Duck Duck Go seems to handle that much better, although the first result is still about cooling. I assume it's synonym replacement

mattkrause · on Oct 26, 2019

Wow..that does fail spectacularly! The first few pages for me are all temperature-related.

colechristensen · on Oct 25, 2019

Here's one:

when are bonuses usually paid

I had great trouble finding results which were not advice for business owners.

TremendousJudge · on Oct 25, 2019

>I had great trouble finding results which were not advice for business owners.

This, so much. There are so many search queries nowadays that have been SEO'd to hell, and there are just pages upon pages of crappy 4 paragraph articles on B2B company blogs just parroting the same information over and over again.

moultano · on Oct 25, 2019

Thanks! Did you finally find a good page?

colechristensen · on Oct 25, 2019

I never really found the kind of thing I was looking for.

If you're looking for more search feedback, my email is in my profile.

riyadparvez · on Oct 25, 2019

Maybe this is a result of optimizing for most of the population, which probably decreases performance for tiny minority who search for niche things that are hard to optimize for because of lower amount of data.

huffmsa · on Oct 25, 2019

The ML layer is probably getting in the way of the end user getting to the smaller samples.

Used to be you'd get the best matches from the meta data on a page.

Now there's linear algebra both trying to determine what the meta data means and what the question means, so it's going to have grouping biases.

And do things like exclude seemingly random strings of numbers, because in the training data, that's usually trash, but for you, it's a part or serial number that you're looking for

dooglius · on Oct 25, 2019

I don't think that fully captures it, Google is an advertising company, and so its incentives are all out of whack. For instance, Google probably benefits from having the top results be slightly less useful as that makes the ads at the top more likely to get clicks.

thorwasdfasdf · on Oct 25, 2019

Yup, this is a huge problem with google search these days. It keeps trying to figure out what your intent was, and although it works about 70% of the time. the other 30% is a complete disaster.

oarabbus_ · on Oct 25, 2019

Especially when it comes to "less acceptable" things. Suppose, for instance, one listened to a rap song and then googled "lean and sprite" (codeine cough syrup and soda, commonly referenced by rappers as a way to get high).

Instead of returning search results of relevant rap songs, you get a bunch of links on drug abuse, rehabilitation centers, etc. Thanks for assuming I'm a harrowed drug addict, Google, but really I was looking for music.

mxuribe · on Oct 25, 2019

Yeah, when i used google more often in the past, my experience had been more like 60% (google works) vs 40% (disaster). Now that i use duckduckgo as my primary search engine, the resultant hits haven't really improved, but at least i feel like i have more privacy. ;-)

faizshah · on Oct 25, 2019

Yea, personally I would prefer google to have predictable results so I can fine tune my search. When I put something in quotes now they can still expand it by synonyms.

But my mom doesn’t type keyword searches like I do, she types out sentence/phrase questions. Maybe the average user benefits from this stuff?

petra · on Oct 25, 2019

What i want from the search engine isn't an easy life. I want a tool that would give me, the knowledgeable searcher(hopefully) the biggest possible advantage over non-experts.

On one hand, it seems that Google, by aiming for the least common denominator - greatly reduces that gap.

But not really. I still think there's a good advantage to having search knowledge. I hope it stays that way.

windsurfer · on Oct 25, 2019

This is why I really like the DDG system of !bang operators. Because each search engine works differently, if you're knowledgeable about the different engines, you can pick the best one for the job.

Just for fun, I thought I would try to figure out when the next solar eclipse would be: I'll use the simple phrase "Next solar eclipse"

I'd probably try Wolfram Alpha. On DDG I would type "Next solar eclipse !wa". It returns a date "Thursday, December 26, 2019 (2 months from now)" Nice.

Next I look at plain DDG: I knew it probably wouldn't be useful, but first result is a website that calculates the next solar eclipse. It requires another click, but it gets "Dec 26, 2019" right at the top of the page. Not bad.

Now for Google, who I knew wouldn't be sure (since their interpretation of the very exact phrase would be fuzzy): In big, prominant and confident letters it reads: "July 2, 2019".... thanks Google. Wrong.

The skill in searching is no longer in using the search engine to the best of it's ability, it's in picking the right place to look. This has sort of always the case, but has become more of a skill as Google has strayed away from improving for the knowledgeable searcher.

summerlight · on Oct 25, 2019

It's worth to note that the web has grown a lot recently. The number of web pages is more than 100 times larger than 10 years ago thus the size of information to get the exact page that you want should increase as well. Although Google did some works on this area, but it's possible (and very likely) that the speed of web growth simply outpaced Google's algorithmic improvements. I don't know if a single universal search engine can be the answer without very deep personalization plus pervasive tracking; maybe domain specific search engines can do a better job?

dmix · on Oct 26, 2019

This is what ruined Twitter for me.

They hired a thousand engineers to guess what I like when I already followed the people I want to see content from and I don’t (always) want it in some random order they think is best, mixed with liked tweets they never intended to publicly share, I just want a chronological list of actual Tweets from the people I followed again.

Not a dice roll the AI will get it right 51% of the time.

d1zzy · on Oct 25, 2019

Personally I was surprised how good the "intention search" algorithm is, most times I do not know the right exact terms to search for, especially when looking up things in a domain and the algorithm figures out what I actually wanted quite well. For the cases where I do need exact word match, like you said, quotes work fine.

freediver · on Oct 25, 2019

This is because in this example you are expecting Google to be Ctrl-F for Internet (and not even that because you alse expect it to weed out what you think is spam somehow in the process, which is not a feature of typical Ctrl-F).

This update addresses the other side of the search spectrum which is meaning. Google has a very tough job of moving the slider between exact keywords and meaning every time someone makes a search. This is a step in the right direction, but the fundamental problem still remains - Google interface is optimized for ad conversion, not user experience.

uoaei · on Oct 25, 2019

Such is life when we optimize for the majority. We end up optimizing to only about the top 80% of searches, which are sometimes the flavor of "what is that website where everyone is friends with each other and I can see my grandkids' photos?" or "word for ice in latin".

It would be great if the system could infer the level of specificity associated with the query. Some people are just exploring a topic while others want to get to a more detailed document sooner.

buboard · on Oct 25, 2019

is there a reason not to have a setting "i m not the majority" somewhere?

thesquib · on Oct 26, 2019

I am getting more and more annoyed as Google increasingly corrects my language to the wrong thing.

londons_explore · on Oct 25, 2019

Have you calculated how much compute time running a single regular expression over all content on the internet would cost?

$$$$$$'s I'd bet... One big tech company used to let employees do it on an internal internet mirror... It wasn't cheap.

colechristensen · on Oct 25, 2019

I say regular expressions tongue-in-cheek really meaning I want more mechanical and predictable machines that do what I tell them to in a straightforward way.

bratao · on Oct 25, 2019

BERT is truly amazing. Almost all inovation in NLP uses BERT and transformers somehow. ALBERT will be the next HUGE thing for the next months, as it show results better than BERT with a small fraction of parameters.

We did a "Semantic Similarity search" for some documents, where we represent a document as a vector using BERT, and had to look for documents close to a reference document.

The results where breathtaking. It really returned semantically similar documents. You can do it now using ElasticSearch(But you really should do it using Vespa.ai, it is much faster https://github.com/jobergum/dense-vector-ranking-performance )

calebkaiser · on Oct 25, 2019

The first project I ever put together involving (extremely trivial) ML used BERT, and something about seeing it just work opened my eyes to the ML world and got me excited to work in the space.

If anyone is interested in hacking around with BERT, I work on an open-source project called Cortex that handles model deployment, and we have full tutorial for deploying a sentiment classifier using BERT quickly and easily: https://github.com/cortexlabs/cortex/tree/master/examples/se...

woadwarrior01 · on Oct 25, 2019

That's very interesting! If you have the time for it, you should consider experimenting with swapping in SpanBERT[1] instead of BERT in your usecase. They train on full length length segments instead of masked half segments (as in BERT). I suspect that this, besides the improvements that SpanBERT brings over BERT should enable you to feed in bigger chunks (more sentences) to the model before the averaging step, leading to fewer vectors to average and as a result, perhaps better clustering.

[1]: https://arxiv.org/abs/1907.10529

bratao · on Oct 25, 2019

Thank you, I will read and try it. Looks very interesting!

binarymax · on Oct 25, 2019

I agree that it works very well for "more like this" document recommendations! But not great for user queries.

buboard · on Oct 25, 2019

but the question is how better they are compared to existing similarity measures. E.g. for documents in a domain, even simple cosine is pretty good.

bratao · on Oct 25, 2019

I did not understood. BERT it is not a similarity measure. For our use case we did use a simple cosine similarity to find the similar documents. But we have to represent those documents in a vector space. From our test representing the document as a MEAN of BERT Embeddings we got some very good results. Much better than BoW, Glove or the Lucene "More like this"

bitL · on Oct 25, 2019

Dude, you are probably releasing trade secrets ;-)

binarymax · on Oct 26, 2019

Nah, lots of people are trying stuff like this :)

buboard · on Oct 25, 2019

yes sorry. i meant a more naive, count-based vector representation instead of BERT embedding

binarymax · on Oct 25, 2019

I've been working on using BERT for search, for research and training development, with not so great results.

Note the quote "when it comes to ranking results, BERT will help Search better understand one in 10 searches". This is because of the "keywordese" point they noted earlier in the article. Most searches are 1 or 2 words - there isn't enough to grab onto for meaningful ranking with short queries and a similarity function for longer text documents.

Also, try keeping the systems afloat to handle search like this. BERT is not practical to use for search results by anyone without the scale of a company like Google. You need to have a server farm of GPUs to translate all your documents into tensors - and then keep them around somehow! A document of 10k text will balloon to ~1MB when converted to a multitoken vector representation. BERT uncased has 768 features - thats 768 floats per token you need to keep around. If you compress it using PCA or averaging across tokens, you lose all the juicy context that you need for the matching and ranking. Also, there currently isnt a good way to keep this stuff around yet (though there are active projects ongoing to get this into Lucene [1],[2])

I think this is definitely a great achievement in NLP - but it needs breakthroughs in other areas to be useable by product teams implementing search, with any reasonably large content size.

[1] https://arxiv.org/abs/1910.10208 & https://github.com/castorini/anserini/blob/master/docs/appro... [2] https://github.com/o19s/hangry

pheug · on Oct 25, 2019

Distillation is usually used today to tame its resource problems at scale - you run BERT to squeeze out maximum signal from your training data and then distill the model e.g. into cheap CNN for inference.

binarymax · on Oct 25, 2019

Distillation reduces accuracy and removes the contextual precision. For example reducing a whole document to some N (1k or so) dimensions have worked very poorly in my experiments for short queries - typically making the relevance worse than basic keyword search.

pheug · on Oct 26, 2019

You seem to be talking about dimensionality reduction, that's not what I was meant. Distillation is training a different model with a cheaper architecture (CNN, LSTM) on the outputs of an expensive teacher model like BERT. This has nothing to do with dimensions.

sdenton4 · on Oct 26, 2019

You might try vector quantization (instead of PCA) if you just need your 768 features to be smaller. ML features tend to be robust to some perturbation.

binarymax · on Oct 26, 2019

Well it’s one problem or another. If you compress too much you lose the value, and if you leave it too large you have the size problem.

Inverted indices are very efficient. How much of that can you give up at what trade off? If I’m only going to be better for 10% of queries, is that a cost effective solution? What if I spend the same amount of time tuning a traditional engine a bit more and get better accuracy for 5% of queries? Tradoffs rule the world of practical search implementations.

ma2rten · on Oct 26, 2019

Just an idea: Maybe you could either train a model or use heuristics to translate from keywordese to English?

binarymax · on Oct 26, 2019

Haha I wish! Too much fidelity has been lost already. The model would just be guessing.

The sniff test is if a person can’t do it, then a model can’t either. Lots of queries look fine for matching, but you really have no idea what the intent or information need of the searcher is.

ma2rten · on Oct 26, 2019

No, I mean before you feed it into the model.

binarymax · on Oct 26, 2019

I’m not sure what you mean. Keywords are keywords. The meaning behind what the user wants is in their head. You cant turn keywords into a sentence without guessing what they meant.

sargram01 · on Oct 25, 2019

Search and a lot of the AI based systems these days feels like talking to a hard of hearing grandparent; as long as you’re saying about what they expect then it’s fine, but if there’s any nuances or homonyms it turns into a comedy routine.

Pick-A-Hill2019 · on Oct 25, 2019

An interesting snippet from TFA - "..with this release, anyone in the world can train their own state-of-the-art question answering system (or a variety of other models) in about 30 minutes on a single Cloud TPU, or in a few hours using a single GPU." https://ai.googleblog.com/2018/11/open-sourcing-bert-state-o...

braindead_in · on Oct 25, 2019

How long till NLG becomes good enough that it can answer questions factually? I think integrating it with a knowledge graph might just make search obsolete.

ur-whale · on Oct 25, 2019

I might be mistaken, but I believe that's already what Google is doing for a lot of factual knowledge (as a matter of fact they do call it knowledge graph).

Try for example to type "when was jfk born" in Google, you should see a factual answer fished from a kg.

aasasd · on Oct 25, 2019

> In fact, that’s one of the reasons why people often use “keyword-ese,” typing strings of words that they think we’ll understand, but aren’t actually how they’d naturally ask a question.

Funny thing, I've seen people mention right here on HN that for DuckDuckGo you need to adjust the style of your queries. This notion puzzles me, probably because I kept the habit of ‘keywordese’ from the olden days. Most of the time, results are about the same for me in Google as they are in DDG and even in Yandex—with the exception that Google is better at grouping related or similar results, and also if there's one or two sources having the search phrase almost-verbatim then they're at the top in Google. Apparently, I already need to learn talking to the site like it's self-aware, to regret ditching it.

Now, if some wondertech helps me to home in on the answer to my exact software or programming troubles instead of hundreds of vaguely related SO posts—I could really dig that.

luckylion · on Oct 25, 2019

Funny that you mention Google's grouping and related SO posts ... I still regularly run into that annoyance that SO hasn't figured out how to do canonical URLs and Google then proceeds to put the identical content from SO on two consecutive spots in the results page. SO does not want to fix it (it has been an issue for years and has been pointed out on meta many times), but I'm confused by Google's fail to consider them duplicates when their content is nearly identical (the difference would be just "hot network questions" in the sidebar etc which depend on the time of the page request).

naringas · on Oct 25, 2019

I wish they would let us use the "keywordese" engine as it was 10 years ago if we want to instead of funneling us along with everybody else into the newer intent search engine.

binarymax · on Oct 25, 2019

This is the kind of search you see everywhere. 95% of queries I see for customers in most fields are only 1 or 2 words long (and they are typically a noun phrase).

6gvONxR4sf7o · on Oct 25, 2019

I wonder how this affects their costs of serving a query. BERT isn't exactly computationally cheap to evaluate.

summerlight · on Oct 25, 2019

I guess the performance impact could be controlled by applying some cheap heuristics, something like having more than 3~4 words with a preposition. They're giving a pretty specific number (one in 10 searches), so this might be the case.

tanilama · on Oct 25, 2019

It is google, it is probably their secret sauce. I won't be surprised that someday Google makes custom ASIC chip to just run transformer models.

6gvONxR4sf7o · on Oct 25, 2019

That's actually old news, and mentioned in OP. They build their own chips (TPUs) that are super cool. You can even use them as part of google cloud! Still, moving to BERT ain't cheap.

tanilama · on Oct 25, 2019

TPU is old news. I think what would the actual news will be a Chip that is customized/optimized to just run BERT.

The Transformer architecture itself stays mostly unchanged in the 2 years after it has been proposed, and with BERT/variants, most (competitive) NLP models are now Transformer based, it makes sense to make custom chips to just run Transformers, the same as CNNs.

sdenton4 · on Oct 26, 2019

Meh, not sure how much more there is to do to specialize for transformer specifically. TPU and GPU are mainly just fantastic matrix multipliers. And transformer is partly designed with this hardware in mind: the operations are basically the same operations you see in a CNN. And in fact, one of the nice parts of the transformer is that you can run it without RNN, making it even better optimized for the matrix multipliers.

Furthermore, tpus are a moving target themselves: as ML needs change, the team build new operations and optimizations into the next generation of chips.

lexpar · on Oct 25, 2019

Does anyone have a nice resource they recommend on what BERT does? I've gathered it was trained by trying to predict missing words in a sentence, but I don't have an intuition on how this is useful for downstream prediction (like, say, learning a word embedding is).

metasyn · on Oct 25, 2019

I appreciated this blog post: http://jalammar.github.io/illustrated-bert/

lexpar · on Oct 26, 2019

Thanks!

breadandcrumbel · on Oct 26, 2019

s there any public information on actually how BERT is being applied to IR?

For each of the scenarios they described they are just like "here's potential hard search query, and BERT adds magic language understanding which makes it all better ". It's non-obvious how BERT is actually being used though, especially at the scale and latency they need.

(I get that that this is Google's "secret sauce" and they might not saying anything in this particular use of BERT. But I'm curious if anyone had seen anything related.)

ma2rten · on Oct 26, 2019

This blogpost is very light on details. It doesn’t at all say at which stage in the search process BERT is used.

vagab0nd · on Oct 26, 2019

Totally off-topic. Is it just me, or is this "People also search for" the worst feature ever?

For those who haven't noticed, this is the box that shows up under a search result when you return to google from the page you've gone to. 50% of the time I click the back button, this freaking box shows up, the whole page shifts, and I click the wrong link.

Oh and did I mention I never ever, not even once, actually use it?

amiga-workbench · on Oct 25, 2019

>By using new neural networking techniques to better understand the intentions behind queries

Great, so the search results are going to get even worse?

dang · on Oct 25, 2019

"Please don't post shallow dismissals, especially of other people's work. A good critical comment teaches us something."

https://news.ycombinator.com/newsguidelines.html

The comments below are much better.

on Oct 25, 2019

[deleted]

dang · on Oct 25, 2019

Ok, but accurate-and-shallow is still shallow. What's needed is a substantive accurate statement. It's particularly the shallow dismissals which harm discussion—hence that guideline.

Pick-A-Hill2019 · on Oct 25, 2019

uhmm, weirdly enough it might explain why some (myself included) have found that Google's results were shit /sub-optimal. Turns out I might have been using Google wrong for the last few years. I am used to using specific boolean search parameters while it seems that Google have been optimizing for natural language. An example from the announcement may clarify - "Here’s a search for “2019 brazil traveler to usa need a visa.” The word “to” and its relationship to the other words in the query are particularly important to understanding the meaning. It’s about a Brazilian traveling to the U.S., and not the other way around. Previously, our algorithms wouldn't understand the importance of this connection, and we returned results about U.S. citizens traveling to Brazil. With BERT, Search is able to grasp this nuance and know that the very common word “to” actually matters a lot here, and we can provide a much more relevant result for this query." Personally I would never have formatted a search query in natural language. Perhaps I should have been.

amiga-workbench · on Oct 25, 2019

Yes, this is the exact problem I'm running into, I hammer out my search queries like its a wildcard CONTAINS SQL statement, because a simple inclusive search should bring back predictable results.

moultano · on Oct 25, 2019

If you can remember any of the queries that failed, I'd be happy to pass them along to debug. If you have it turned on you can look in your search history here: https://myactivity.google.com/myactivity?product=19

amiga-workbench · on Oct 25, 2019

"AC7X0_R3.2.0.7d_ENG_NB.exe"

I was after a specific legacy driver last night which the vendor no longer has available. Google returns zero results for this, bing returned a few relevent results (but sadly still didn't help me get what I needed) in the end I went mooching through the way back machine.

It's this class of search that bugs me the most, I know for a fact something with that exact filename is out there on the web, I just can't find where easily.

If I run into more issues or can recall anything else, I'll forward it on.

mankyd · on Oct 25, 2019

I am curious why you take the pessimistic view. Their own testing shows an improvement for at least 10% of their queries:

"Google says it can now offer more relevant results for about one in 10 searches in the U.S. in English"

luckylion · on Oct 25, 2019

The improvements are likely for non-power users. As they make up the majority of users, it's understandable that Google optimizes for them. Still, it gives power users worse results because they know what they are looking for and they (often) use certain words for a reason. Google's default assumption ("the users have no clue what they are doing") leads them to a paternal "I think I know what you're going to ask ..." approach which often fails and leads to shit results.

The method to combat that is to start wrapping every word in quotes which is annoying. I'm sure intermediate users will catch on at some point, start doing it too and Google will drop the quote modifier. Let's keep it a secret so that happens later rather than sooner.

minwcnt5 · on Oct 25, 2019

Natural language is a lot more expressive than boolean expressions of keywords. Perhaps experienced searchers who use the same approach to query formulation that worked well two decades ago are no longer the true power users. Maybe they are just dinosaurs.

mankyd · on Oct 25, 2019

This assumes a simplistic view of how its core search algorithm works, though. It's obvious through even some basic querying that it uses different strategies for recognizing intent based on the nature of the input: single words, addresses, arithmetic, business names, full text queries, etc.

There's no reason to think that they will enact this to the detriment of other queries. It _could_ happen, but I am skeptical - optimistic even. As they mention, this improves ~10% of queries. The other 90% likely represent different forms of query input and I would hope remain unaffected.

xbmcuser · on Oct 25, 2019

I think most of us technologically inclined users are in the habit of using specific words to get search results as we have been doing that for years. But I think a large portion of newer users actually ask full questions so google is optimizing for it. For better search results we might need to actually start asking exactly what we are searching for instead.

skyyler · on Oct 25, 2019

I struggle to find substance in your comment.

Could you perhaps expand with a few supporting details for your thesis?

naringas · on Oct 25, 2019

I agree with his sentiment based on my personal experience.

But we'll see, maybe this will actually be better?

also, maybe we should stop googling in "kewordese"?

wolco · on Oct 25, 2019

Each time they move away from what people type vs what profile data collected from various sources suggest we move away from the ideal.

Best example is ads targeted to keywords on the page vs shows ads based on past purchases.

amiga-workbench · on Oct 25, 2019

Google works well when you are searching for things you don't know much about, your input is imprecise and it generally points you in the right direction.

However for the opposite case, when you are trying to find something highly specific, even down to an exact substring match I find the results to be very poor.

skyyler · on Oct 29, 2019

Aww :-(

I suppose you'll just have to make your own...

pokoleo · on Oct 25, 2019

I just queried "I struggle to find substance in your comment."