Researchers confirm what we already knew: Google results are getting worse (theregister.com)
93 points by rntn 8 months ago | 66 comments




To be fair, this is much more approachable than a dense 16-page white paper.


Very fair. I posted because (1) the HN guidelines¹, and (2) there was a lot of discussion about it yesterday that may or may not happen again today.

¹ "Please submit the original source. If a post reports on something found on another site, submit the latter."


Makes perfect sense! Have a great day!


To be fair, if you read just the conclusion section of the white paper, you'll find that the research doesn't actually answer the question.


I'm not sure why people in comment sections have to be rude by default. The parent commenter was gracious and made a good point, and it was a friendly interaction. Neither of us asserted anything about the conclusion or the question. Maybe take a deep breath and count to 10 before commenting next time? It'd be really nice if at least one place were a reprieve from all of the toxic BS on the internet.


If you come to Hacker News for niceness and kindness, you'll be disappointed. Everyone here is direct to the point of acerbic, argumentative to the point of combative, and somewhat rude. It's a performative erudition that masks recklessness [0].

If you want a reprieve from toxicity, go offline and speak to people in real life.

[0]: https://www.newyorker.com/news/letter-from-silicon-valley/th...


Seems to be just reporting on the research already discussed yesterday https://news.ycombinator.com/item?id=39013497

HN Guideline:

> Please submit the original source. If a post reports on something found on another site, submit the latter.


I feel bad for kids growing up today, in some ways. When I was younger, the web was smaller, Google was newer, and it worked better. For the most part, if it existed on the internet, I just had to put a couple of words in a box and I'd get to it within a couple of minutes, and there wasn't that much to sift through even if I wasn't that accurate with my search terms.

Now whenever I use Google to find even the simplest of things, if it's not a local business with a Google Maps result, the results are terrible and there's far too much junk to dig through.

I watch our kids and their friends do all sorts of other things to find things. This is where the whole "just put reddit at the end of the search" thing comes from.


It's upsetting and troubling how appending "reddit" to a search is your only hope of getting honest information, especially about products.

Everything else is riddled with SEO spam and faux reviews with referral links.


The fun part about that is that this method gives better results than searching on Reddit.


Also gives results at all. Reddit search doesn't even do comments.


Does Reddit have any special controls to stop astroturfing or fake reviews?


Sure they do: the users, and the moderators. But it's getting worse.

Quite a few of the subreddits I liked went to shit over the summer when they made the API changes. The replacement mods aren't doing the same quality work as the original mods. What were nice subreddits previously are memified and gamified garbage now.


Users and (at least before the last run through the meat grinder) moderators.


They have human moderators who by and large care about maintaining the quality of the communities they moderate, and give those moderators buttons to delete posts and ban users.

Although FWIW, Reddit seems to have its fair share of astroturfing (at least, in my own anecdotal experience); it's just comparatively less egregious than rawdogging the Google results.


> This is where the whole "just put ~ at the end of the search" thing comes from.

shushhhh!!!!! you're giving spammers ideas!!!!!

next thing we know, we'll have spammers creating fake subs with bots for this spam


too late for that


Google caused this situation itself. Sites that only wanted to grab some clicks have always existed, but the scale is due to the modern advertising industry, which uses 20% of internet traffic just to serve mostly unwanted ads.


Given that Google owns Chrome, the most used browser by a wide margin, I am surprised they don't leverage this to improve their search results.

The consensus seems to be that SEO has won, that experts are stuffing websites with keywords and the search engines are simply unable to figure out what is good content anymore.

So why not use your own browser to measure how people interact with a website? How long do they stay on it? Do they just scroll and exit immediately? Do they seem to have a hard time finding information? Maybe randomly ask people from time to time whether a website is helpful. Then aggregate that info and score each website. This should cut down on websites that have nothing but fluff.
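
As a rough sketch of what that aggregation could look like (the field names and signals here are purely hypothetical, not anything Chrome actually reports):

    # Hypothetical per-visit telemetry: (site, dwell_seconds, bounced).
    # Just the shape of the idea, not any real Chrome data.
    from collections import defaultdict
    from statistics import median

    visits = [
        ("fluffy-recipes.example", 5.0, True),
        ("fluffy-recipes.example", 7.5, True),
        ("python.org", 95.0, False),
        ("python.org", 120.0, False),
    ]

    by_site = defaultdict(list)
    for site, dwell, bounced in visits:
        by_site[site].append((dwell, bounced))

    def score(records):
        dwells = [d for d, _ in records]
        bounce_rate = sum(b for _, b in records) / len(records)
        # Long median dwell is good; a high bounce rate drags the score down.
        return median(dwells) * (1.0 - bounce_rate)

    for site, records in sorted(by_site.items()):
        print(site, round(score(records), 1))

Though, as the reply below points out, the moment a metric like dwell time starts driving rank, sites optimize for the metric instead of the user.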


They don't need to abuse their control of the browser to do this. They could, and did, infer this info from people's click-throughs to and from Google to determine whether a page was useful or not.

The problem is that SEO outfits gamed that metric as well. The idea of determining whether a site is useful based on how long the user stayed on it only served to make sites hide information halfway down the page.

That idea is THE CAUSE of the meme about recipe sites telling you how your long-dead great-great-great grandma wrote this recipe for a special event and how it has become a family staple and some of the ingredients might not exist anymore in the same way they did, before they get around to giving you the recipe.

Recipe sites were HUGE in the heyday of the DotCom boom, and the only way to get your page to rank higher than every other online cookbook was to capture people's attention with a story before showing them what they came for. If they could just get what they came for and leave, you got downranked as not a useful site.


That's correct. Even under CCPA, analysis of anonymous aggregate data is considered a legitimate use and not really a risk to PII.


And now many recipe sites have a link to skip to the recipe, so it's becoming unclear what the benefit is.


The benefit is that the site gets to serve all the backstory to the search engine crawler to boost the ranking, while real people can go right to the recipe.


It might have to do with recipes, like fashion, not being copyrightable and therefore easily copied.


> I am surprised they don't leverage this to improve their search results.

Oh. Um. Well, hypothetically, maybe they do, but the metric they're optimizing is their profit instead of user-facing quality?

I just assumed that everyone already knew this. Of course Google could fix their results. They want them this way.


If you enable synchronization, even with encryption enabled, Chrome by default shares your browsing history for the purpose of "improving search". The opt-out option is right there in Chrome's settings, which is really creepy, actually.


I have to think that ad revenue is the prime metric, and Google is loath to change search non-trivially to improve UX according to the measures you list.

Read the emails released in the recent antitrust suit and you'll see intense pressure not to make changes to search, or even to Chrome flows that drive users to Google search results.

Bottom line: if ads are performing, what incentive is there to change?


Doesn't it seem like search quality everywhere has gone down? Above the fold it's usually only "sponsored" results (ads). Not just Google or the web: Amazon results are garbage now in exactly the same way. Either ad results, or "organic" results that are heavily SEO'd/gamed.

And as much as I love ChatGPT/GPT4, what Bing is doing with GPT4 is an abomination. It feels like Clippy's "Looks like you are writing a document" chiming in when you didn't ask, only not even as cute as Clippy. It is a terrible way to experience GPT4, and it makes using their site for plain web search way worse.


I'll throw in my recommendation for Kagi! Happily paying user. Perhaps my best recommendation is this: when I'm having trouble finding something and tack !g onto my query to use Google instead (a reflex from using DuckDuckGo), Kagi happily takes me to Google, where the results have been much worse every time I've done it.

Kagi is exactly what I want my search engine to be, and feels like what Google used to be, to me.


I second this wholeheartedly. At first I was skeptical about getting a subscription, but being liberated from the big G AND having excellent search is just worth every penny.


Dan Luu tested Kagi and found it no better than alternatives, but that it has the most inexplicably diehard users out of all the major options out there:

https://danluu.com/seo-spam/


That's pretty interesting - I'm able to replicate his results, even though they don't match my own experience.

I think it might be because I search for different stuff - for more technical subjects, I get better results. (A tiny example - searching "python dict" gives me python.org as the #1 result on Kagi, but the #1 result on Google is w3schools and python.org is #2.)


That's what TFA actually says in the body: the problem is general; none of the major search vendors has a "magic bullet" to cut through the noise right now. Leave it to the Register to headline it with one instance (the most popular one) for clickbait purposes.


Except, the major search vendors are the biggest problem. They sell the majority of the good search result spots as ads. They add more noise than anyone.


It somehow feels like our society is crumbling under its own noise.


I don't know why Google doesn't grasp the nettle and simply stop indexing pages like those super-low-quality ones (ones that no human is ever, ever going to want to view). They could surely infer them via a series of experiments: temporarily backing off on ads for a particular user and seeing which sites they click through to.

They needn't hit their ad revenue much at all, because each user would just be in an experiment briefly, and they would aggregate results to get a general view.
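
A minimal sketch of that experiment design (invented names and numbers, obviously not any real Google system): hold out a small random slice of users, briefly suppress ads for them, and tally which sites they click through to organically.

    import random
    from collections import Counter

    def in_experiment(user_id, fraction=0.01):
        # Seeding on the user ID gives a stable assignment, so only a
        # small fixed slice of users ever sees the ad-free experiment.
        return random.Random(user_id).random() < fraction

    organic_clicks = Counter()

    def record_click(user_id, site):
        if in_experiment(user_id):
            organic_clicks[site] += 1

    # Aggregate over many users; sites nobody clicks on without paid
    # placement propping them up become candidates for de-indexing.
    for uid in range(1000):
        record_click(uid, "python.org" if uid % 2 else "seo-sludge.example")
    print(organic_clicks.most_common())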


From Google's point of view, it's working exactly as intended. Once they've captured market share, there's no reason not to turn the screws and squeeze out every last drop of revenue. Google would love for the internet to consist entirely of ads. Other content is necessary only insofar as it lures people in to see the ads. And in Google's case, just the hope that they might find content is enough to suck a user into the ad vortex.

Perhaps this isn't a great long-term strategy, but that doesn't really matter because nobody involved is incentivized to think beyond the short-term metrics that determine their bonus/promotions.


What's truly amazing about this whole situation is that Brin and Page literally called it:

http://infolab.stanford.edu/pub/papers/google.pdf

Everyone who mattered (plus many others who didn't) knew exactly what would happen as Google went deeper and deeper into ads. But it made the line go up real well, so....


They used to. For most of the '00s Google'd noticeably exhibit a kind of wave pattern: webspam would creep onto page 1 of results, more and more, then abruptly vanish. Repeat. Cycle of, IDK, a few months.

To all external appearances, Google simply gave up on this cat-and-mouse game around '08 or '09. Big sites get ranked high (whatever changed seemed to give those a huge boost around that time); everything past that is mostly webspam.


What's the upside for Google? They get paid regardless via the injected top results (ads); as long as there's no competitor threatening their market share, the quality doesn't matter anymore.

They prefer to spend money to keep the monopoly going, such as with the Safari deal.


96% of the web is already not indexed. Clearly there is something wrong with the algorithm they use to determine what should be indexed and what shouldn't. All that AI doesn't seem to help them much.


If I may ask, where did you get this number? And how do the other search engines compare to Google on it?


I have no idea where they're getting that number from. The relevant search term is "Deep Web" (not to be confused with the Dark Web).

https://www.sciencedirect.com/science/article/abs/pii/S03064...

The abstract from that article has a good introductory summary.


For me, the problem is deciding what makes a webpage super low quality. The results I'm tired of seeing all contain the same watered-down language as the other top results, which they either copy outright or run through a thesaurus.


Because those crap pages are filled to the brim with Google Ads that make up most of their revenue, it's a conflict of interest.


How much of this is Google getting worse, and how much is the fact that people aren't creating content on the open web anymore?

People complain that you used to get forum posts, blog posts, and guides for specific questions. But most of those have long since migrated to reddit or other social media.

Not a fan of Google, but I think we should examine the other side of the coin on this one.


I follow many websites with my feed reader; the web is still very active if you're willing to step outside your social media bubble.

People still publish on the web; the problem is that the noise is unbearable due to the SEO crap. I predict a return of curated website directories and of "blog rolls".

Google could start banning SEO spammers to drive the cost and the risk up. The fundamental problem of the web is that spam is cheap, and the other problem is that Google's incentives are more aligned with the SEO spammers than they are with users.


> Google's incentives are more aligned with the SEO spammers than they are with users

100% this. Let's see a chart of Google search revenue vs result quality. My guess is that it would be an inverse relationship.


Why can't it show me the good old content on the internet instead of pages of blog spam made by AI? Obviously they used to show me that old content, but apparently users only want new content now…


It's the signal-to-noise-ratio problem. AI-generated spam is easier than ever to create and harder to algorithmically detect. This results in a rapid increase in the volume of webspam.


This does not excuse the fact that search results show me things that are unrelated to my search. Not ads, just pages that do not even contain any of my search terms anywhere on the page.

This isn’t a SNR issue at all.


Go to Google. Enter a search for something very specific like "red baby buggy bumpers." Google decides for some reason that I didn't really mean "red", so it excludes it from the search. Now all my results are only randomly relevant. I have to find the little link that says "Include results with 'red'."

Which... to the Google engineers: why do you think I typed that in the first place? Just to waste my own time? Who are you to drop words from my query? I'm not interested in a page of results, I'm interested in ONE RELEVANT thing.

They've intentionally made choices that make the product worse. This is just one, I could type up a whole chapter on this anti-pattern nonsense. All they care about is worthless KPIs that don't correctly measure "quality."


When Google decided that all user supplied search terms were optional, result quality fell off a cliff.
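
A toy contrast of the two behaviors (naive substring matching in plain Python; an illustration of the complaint, not Google's actual ranking):

    # Strict matching: every query term must appear. Optional matching:
    # any one term is enough, which is roughly what "dropping words" feels like.
    docs = [
        "red baby buggy bumpers for sale",
        "baby buggy accessories and bumpers",
        "buggy racing photos",
    ]

    def all_terms(query, docs):
        terms = query.lower().split()
        return [d for d in docs if all(t in d for t in terms)]

    def any_terms(query, docs):
        terms = query.lower().split()
        return [d for d in docs if any(t in d for t in terms)]

    q = "red baby buggy bumpers"
    print(all_terms(q, docs))  # only the document containing every term
    print(any_terms(q, docs))  # everything with even one term sneaks in

Putting a term in quotes ("red") is supposed to force the strict behavior, which is presumably what that "include results with" link toggles.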


I don't think that's going to change the fact that every time I search for some basic thing, there is a GitHub repository for it somewhere, and Google almost always shows me two to three pages of blog spam garbage and Stack Overflow copy sites before I can get to it.


This isn't surprising and it's not AI's fault.

Google lives with a structural weakness – they sell display ads directly on websites which get their traffic from SEO. This creates an incentive the size of the GDP of Cameroon* to direct people to websites which make Google money.

Suppose you search for green goddess salad dressing. Is that recipe website the best representation of that recipe? No, and it's not close. It is a measurably bad, and measurably profitable, search result.

Google is betting, based on trillions of data points, that we are collectively willing to put up with a mediocre experience.

--

* Cameroon's GDP is $45B, and Google's display ad business is something like $55B+. Two facts which I gathered by googling. ;)


It's really noticeable and really frustrating. A few days ago I found myself searching on DDG, then switching to Google since DDG gave me no useful results. The first page was blogspam and ads, with literally zero relevant links. I suspect most of the results were AI-generated.

I wonder how people still use Google to find what they want on the web.


> I wonder how people still use Google to find what they want on the web.

We append "reddit" to the query.


curious - can you share what you were searching for?


Yet, for the terms I search for (generally technical, journals, etc.), they're still better than Kagi, Bing, and DDG.


I decided to pay for Kagi for a month or two to see for myself what they have to offer. Now, after several months, I don't think I ever want to go back. Sure, I have to pay for it, but in return I get relevant results right at the top without having to double-check every time whether they might have been sponsored.

I must admit, occasionally I do still use "!g" but it's for some very specific things and very infrequently.


Kagi is miles better than Bing, Google, or DDG.


I wonder if the rise and fall of social media is just the way we as a market escape SEO blog spam hell.

I stopped using FB when the timeline became 1% about the people I cared about and 99% about clickbait. Market dynamics and the attention arms race seem to push every platform in that direction. Google is facing this.

So the response is to move our presence elsewhere: another forum, another search engine, just another platform that hasn't yet been terminally infested by the attention arms race. But with our move, sooner or later we also bring the infestation with us.

We are digital nomads, trying to escape plagues that we ourselves bring.


Something I've noticed over the past year or so is that I've been increasingly relying on talking to other people directly to answer questions or find interesting things, rather than relying on search engines.

I still use search engines (Kagi, specifically), but not to the degree I once did.


I find I have to end up searching through time in order to get the results I want, especially as a developer. A bug that had the same error 8 years ago may not be the same bug now. So I tell Google to only show results from the last year and, tada, I get what I want. Usually.


You don't need to be a researcher to notice that. All my recent Google searches have had a site name tacked onto the end.

I haven't used Google as my default for probably 5 years or so. First DuckDuckGo, now Kagi. I am more than OK with paying for a good search engine. Family plan.




