Google made me ruin a perfectly good website (2023) (theluddite.org)
1112 points by MrVandemar 13 days ago | hide | past | favorite | 529 comments





Last year a friend and I made a website about the Nurburgring. It provided basic info for first time visitors that we were missing on our first visit. My friend spent a lot of time creating a UI with a custom map for displaying locations and routes. I wrote a bunch of content that was thoroughly researched.

At a certain point we ended up being invited by one of the largest rental companies to see whether we could work together. They invited us because the content was incredibly useful for their visitors and they preferred our calendar over the official one for ease of use.

So clearly, our site was adding value for the target audience we had in mind. We were also consistently getting visitors through different search engines that were looking for the info we provided. The number of visitors was growing consistently and pretty much all the feedback we got was positive.

In March, Google rolled out a new algo which all but completely removed us from search results. Our visitors dropped about 80% and growth has disappeared. What was a fun project that we spent many hours on is now a waste of computing resources.

I hate that Google gatekeeps the internet.


Google be like: this website performs well on the ranking, does it have AdSense, no, downrank.

OP's website is heavy with affiliate link listicles, which have been recently heavily downranked in favor of forums and content like Reddit and Quora.

We have a single list with affiliate links on there. Which happened because a company proactively approached us to share their link on our site because they liked the content.

Edit: I do see how it may have seemed like we had a lot of referral links due to the setup of outgoing links. Changed that now, so there is pretty much just the one left.


OP complaining about getting delisted when he was gaming the algorithm in the first place.

Website link?

To the parent commenter - have you set up Google Search Console and reviewed which pages and keywords were affected? There could be other reasons why your pages aren't being indexed properly. If it's a small site, it could have been something as simple as changing an image or page title.

If something simple removes a useful site from the search results, Google should fix its algorithm, not the other way around.

Not saying that’s what happened here, but if people are searching for “images of X” and you remove your image of X, how is that Google’s fault?

Google's USP once was not falling for content farms and their word lists.

Are we back to the early days of search engines?


But this isn't about “images of X”. Nor is it, likely, about whatever "listicle" links OP's site has[1]. It's about people searching for info about, say, "Day at Nürburgring". And OP didn't remove that, now did they? And I bet most of them don't specify either with or without listicles.

So if people were searching for "Day at Nürburgring" and used to find OP's site, then Google changed their algorithm, and now when people search for "Day at Nürburgring" they don't find OP's site, how is that not Google’s fault?

______

[1]: And Bog knows Google serves me up enough sites with those on...


You can't say what it's about without seeing exactly where the traffic and rankings decreased and looking at the Google search console. I'm happy to take a look out of curiosity if the OP wants to share.

I presume when you say "remove from the search results", you mean removed from the first page or top n spots or whatever.

But what if there are 20 good resources? They can't all be on the first page, or in the top spot.

Perhaps the other sites improved, or were better in some way?

Being on top for a while is no guarantee you'll be on top forever. It's precisely Google's business to change the algorithm.


I agree with your general point that it should not be the responsibility of the website to change to make a search engine happy.

I disagree with your specific point that Google should fix anything. Google can choose their own behaviors and motivations. Users can choose to rely on Google or trust them at all, or not.

Remember when Google was a successful business competing against the likes of Yahoo and AltaVista? Back when it was ok to be a successful small business? When you didn't count a millions users as a failure?

Nobody seems to believe anymore that you can operate a successful search company without having a trillion dollar war chest. Few people are willing to give alternative engines a try. They'd rather stick to Google like glue.

I'd like to declare the 28th as "Try something different day." On the 28th of each month people should try a different browser for the entire day. Or a different search engine. Or a different toothpaste. Every month, one day a month, take a different route to work, or maybe even a different mode of transportation. Use a Colemak keyboard. Go to a different church. You don't have to do the same different thing each month. Just start building some experience with shaking up your routine.

When you try something different and it turns out ok, share your positive experience with others.

I have not used Google search or Bing for years and it has worked out pretty well. I started with DDG, but have had good experiences with other search engines, too. Some may argue that DDG is not different enough. So perhaps next month I should try qwant.com for a day.


I have never found search console useful for anything other than random 404's and one-off server errors. When a site is hit by an algorithmic penalty there is never any clue as to why.

This is my experience as well. If someone could mention how to examine the console to improve a site after an algo change, I would absolutely love to hear it .

I also find it mystifying and I’m not a SEO expert, but one way I use it is to compare ranking positions over time for specific pages and queries. That will help you identify where exactly the traffic came from and where you lost it. A lot of times this has to do with gaining or losing the #1 spot to a competitor.

If you don't see a ranking change, it could also indicate seasonality in your visitors or external events. If your website is brand new, this could be hard to detect otherwise. For example, recently I saw a traffic spike for a random blog article, and search console helped me see that people were searching for that topic because Elon Musk had tweeted about it the day before.

Another helpful feature is to inspect particular URLs and make sure they are indexed—sometimes if you have multiple similar pages and set the canonical URL incorrectly, Google will try to de-dupe the results.
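To make the canonical point concrete, here's a minimal stdlib-only sketch (the URL and markup are made up) of extracting which canonical a page actually declares, so near-duplicate pages can be checked for a consistent target:

```python
from html.parser import HTMLParser

class CanonicalFinder(HTMLParser):
    """Collect the href of any <link rel="canonical"> tag in a page."""
    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "link" and a.get("rel") == "canonical":
            self.canonical = a.get("href")

# Two near-duplicate pages should both declare one canonical URL,
# so the search engine de-dupes toward the page you chose.
page = """<html><head>
<link rel="canonical" href="https://example.com/guide">
</head><body>...</body></html>"""

finder = CanonicalFinder()
finder.feed(page)
print(finder.canonical)  # https://example.com/guide
```

Running something like this over every URL in your sitemap is a quick way to catch pages whose canonical accidentally points at the wrong variant.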

Hope that helps!


Very helpful, thank you.

> could have been something as simple as changing an image

Or something nefarious like Google skewing their algo to favor websites with more placeholders for google's ads.

The consistent allegations against Google on this topic from long-time insiders, combined with my personal experience of terrible search results, don't allow me to apply Occam's razor at all. Instead it's the inverse: assume malice.


Definitely. All of them were affected equally, pretty much.

If this were true then every site would implement adsense. Most don't.

This reminds me of "Bidding Rank" from Baidu decades ago (and I think it pretty much still applies to Baidu). Google was not only better technologically, but ethically, because its search results were not as profit-driven as "Bidding Rank", which was (and still is) very much despised. Now it seems Google only cares about profit and has started to do things more or less the same way.

Sick.

Disclosure - I was so pissed by the degradation in quality (and money-thirstiness) of Google's search results that I switched to a non-profit search engine as my default for both desktop and mobile. The daily search experience doesn't feel noticeably different to me. I admit the Google result could sometimes be better, but those occasions are quite rare for my needs, maybe once a week.


What's the search engine?

DuckDuckGo works great in my experience as of late.

If this were really the case, wouldn't it be a painfully obvious anticompetitive move?

Investigating and prosecuting anticompetitive behavior simply takes too long. By the time anything happens, the fast-moving nature of the tech industry usually means the issue is not nearly as relevant as when it arose. Then the company strikes a mea culpa deal with the appropriate governing body and makes changes that don't actually matter any more.

Are there reasons to believe this is what actually happens? Did anyone document this?

How might you see solving this problem? How could we distribute the task or goal of curation amongst individuals? How do we incentivize and enable discovery of maven/curators?

I've been through this thought-process many times.

1. Google isn't working well any more.

2. Therefore bring humans back into the system of flagging good and bad pages.

3. But the internet is too big - so we have to distribute the workload.

4. Oh, a distributed trust-based system at scale... it's going to be game-able by people with a financial incentive.

5. Forget it.

---

Edit: it's probably worth adding that whoever can solve the underlying problem of trust on the internet -- as in, you're definitely a human, and supported by this system I will award you a level of trust -- could be the next Google. :)


> 1. Google isn't working well any more.

This is so true. It's a pain to search for anything non-deterministic nowadays. I usually find myself putting double quotes around every single word I'm interested in, and Google still brings back unrelated results.


I'm curious. Why do you use Google if you don't like their results?

PageRank isn't working well anymore. I use DDG, but its quality is also flagging.

Yeah, DDG quality seems abysmal for many of my searches now. I'll then switch to Brave, which sometimes finds what I'm looking for. I rarely ever check Google or Bing as a fallback, but when I do it feels like a screenful of ads and it isn't any more helpful (except Google Image search).

Part of the problem seems like a recency bias in search results. I notice sites frequently update pages with new timestamps, but nothing of substance appears to have changed (e.g. a review of something that was released 4 years ago, but the page was supposedly updated last week). So if I do pretty much the same search that succeeded two months ago, but repeat it today, I might not find the useful result I remember coming across.

I'm sure there are a bunch of other issues related to search and SEO that are affecting search quality. It seems insane that the major search providers don't combat this trend by arming users with more tools to tailor their search, but rather steadily degrade the user experience with no recourse.

Honestly, I think if Google was wise, they'd have a skunkworks team rethinking search from scratch (not tied to AI/LLMs) that starts their own index and tries to come up with an alternative to the current Google Search. Maybe they have that already, though I doubt it. I'm sure if they do have such a team, it's intricately tied to existing infrastructure and team hierarchies which effectively nullifies any chance it has at success.


I would be happy with an even simpler solution: give me a way to blacklist domains or sites. There are 50 or so sites that I never want to see results from. It's not just one or two that I could exclude manually in my searches.

Second, give me a way to express the semantic meaning of something. If I'm searching for rust, let me choose the programming language, for example. I find myself adding various one-word tags to limit the search results.
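For the blacklist part, the `-site:` operator already does this per-query; it just doesn't scale to 50 domains typed by hand. A small sketch (the helper and blocklist are made up; note that engines cap query length, so a long blocklist may get truncated):

```python
# Expand a personal blocklist into search-operator exclusions.
# Most major engines support "-site:domain" to drop a domain.
BLOCKLIST = ["pinterest.com", "quora.com", "w3schools.com"]

def build_query(terms: str, blocklist=BLOCKLIST) -> str:
    exclusions = " ".join(f"-site:{d}" for d in blocklist)
    return f"{terms} {exclusions}"

print(build_query("rust borrow checker"))
# rust borrow checker -site:pinterest.com -site:quora.com -site:w3schools.com
```

Wiring something like this into a browser keyword search gets you most of a personal blacklist today, without waiting for the engine to offer one.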


Personally I have one use case left for Google: Product Search.

If I'm looking to buy something, I will frequently end up using Google. The engine that matches my product search to a relevant ad is excellent. Basically, any time you need to search for something that could lead to a purchase of a physical product, Google will be extremely useful. For services and software you can't use Google; you will get hustled by fake-review or "top 10 XYZ for 2024" sites.


I can't believe how terrible product search is on Instacart - I place orders on there pretty frequently for my mom, and Petco is the worst.

I will search for "wellness chicken cat food" - and Wellness has chicken cat food in a few different textures, so it seems like those should at least be on the page of search results, if not the top results. Not always so! At the very least I will have to scroll a ways down the page before I get anything even from Wellness.

And sometimes the top results aren't even cat food, they will be random other pet supplies.

Or she wants a few different flavors of the food, and I find one and then the other flavors I have to search a few different ways to pull them up and they don't show up on any "similar" displays.

It's painful. I hope Google doesn't go the same way - I think with Instacart it's because they want to promote whatever it is they put at the top, but even that doesn't explain how terrible some of the search results are.


I just think wellness aside you should get other food than cat food for your mom.

Meijer (grocery store) is the same. Until a short while ago, you even had to match the capitalization of the product you were searching for. And we are not even talking about different byte encodings of the same Unicode character (ê etc.).

I use it for deterministic things. If I’m looking for something specific or a well known thing, it simply gets me the results I’m looking for.

For anything that’s indeterministic, it just brings me garbage. The same with YouTube as well. I’m searching for a specific thing about a particular library and it’s showing me stupid definition blogs or useless garbage. I apologize for my crudeness but no other words can describe it.


Have you tried verbatim mode? I use it by default.

Could you explain how to use verbatim mode please? For anyone else reading in the future as well.
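In the normal UI it's under Tools > All results > Verbatim after you run a search. It also seems to correspond to a URL parameter, though that is undocumented and could change at any time; treat `tbs=li:1` below as an assumption, not official API:

```python
from urllib.parse import urlencode

# Build a search URL that requests verbatim (no synonym/spelling
# expansion) results. tbs=li:1 is unofficial and may change.
params = {"q": '"exact phrase"', "tbs": "li:1"}
url = "https://www.google.com/search?" + urlencode(params)
print(url)  # https://www.google.com/search?q=%22exact+phrase%22&tbs=li%3A1
```

Handy as a browser keyword bookmark so every search is verbatim by default.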

Google used to work closer to verbatim mode but would get common synonyms to give you comprehensive results. Now it stretches synonyms and alternate spellings to the point of uselessness.

They also changed "did you mean Y" into "showing results for Y". So annoying.

Try using kagi. The results are so much better.

I think the solution to this is both unique and trivial: you cannot trust something that is freely expendable, or that did not require some amount of stake from the other party. That stake can be anything: time, work, or money.

If you want to trust a review, it needs to have required a non-expendable resource from the reviewer. That amount of resource should be an optimum: what an average user would be willing to expend without missing it (so the barrier to reviewing stays low), while being prohibitively expensive for an actor who wants to cheat the system and generate millions of reviews.
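One concrete form of that stake is hashcash-style proof of work: each review carries a nonce that cost real compute to find, cheap for one review but expensive for millions. A sketch (difficulty and payload are purely illustrative):

```python
import hashlib

def mint_stamp(review: str, difficulty: int = 4) -> int:
    """Find a nonce whose hash of review:nonce starts with
    `difficulty` zero hex digits. Cost grows 16x per digit."""
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{review}:{nonce}".encode()).hexdigest()
        if digest.startswith("0" * difficulty):
            return nonce
        nonce += 1

def verify_stamp(review: str, nonce: int, difficulty: int = 4) -> bool:
    """Checking a stamp is one hash, regardless of minting cost."""
    digest = hashlib.sha256(f"{review}:{nonce}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)

nonce = mint_stamp("great track guide", difficulty=4)
print(verify_stamp("great track guide", nonce))  # True
```

The asymmetry is the point: verification is one hash, minting averages tens of thousands, and the difficulty knob sets where "barely noticeable for a human" meets "ruinous at spam scale".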


I like your thinking, but there's a middle ground before full automation: when humans are incentivized, one way or another, to provide biased reviews. This might be via straightforward employment of people in lower-cost places (e.g. via Mechanical Turk) or other incentives. For example, note how a proportion of Amazon reviews are gamed and unreliable.

At the moment, the only tasks (that I can think of) that come close to the 'time-consuming-enough to not scale, but not quite annoying enough to put off committed individuals' are the various forms of CAPTCHA - which is unsurprising, given that we're discussing a form of Completely Automated Public Turing test to tell Computers and Humans Apart. (And of course, there are CAPTCHA-solving farms.)

But would people invest time in a review system that required them to complete a form of CAPTCHA regularly?


> That stake can be anything, time, work or money.

I think you'll find that money should be removed from your list. There are some untrustworthy people that have tons of money. Sadly, I think trust must be EARNED, and that requires giving effort (work) and time. You cannot buy trust with simple money.


> You cannot buy trust with simple money.

Yet rich people are granted more upfront trust. Maybe because we assume they have less incentive to rob.


This clearly isn't true, since if you dress a rich person like a homeless person and make them smell the part, they simply won't be trusted in most parts of civilization.

No, what you're talking about is POWER and AGENCY. Rich people have the power to override trust through the fact that they can operate with near impunity; so you have very little agency to not trust them. If you choose to not show trust, you may invoke their wrath.


Then web of trust. Means SSO (as a way to link the review to the trust).

In order to prevent hacking trust the SSO again must ensure:

- unique human, or

- resource spend


> unique human

Here's the thing. A sovereign nation can "generate" as many "unique" humans as it wants (via printing "fake" but official identities). No one would be on to them until there were more users than probable people in the country.


Doesn't stop nation state troll armies though

How about Wikipedia's approach?

Of course Wikipedia is way smaller than the internet, but still one way to go could be by having themed "human curated niches"


> 3. But the internet is too big - so we have to distribute the workload.

> 4. Oh, a distributed trust-based system at scale... it's going to be game-able by people with a financial incentive.

These are solved by being transparent and surfacing the agent (maybe even the sub-agents) for ranking, and allowing us to choose.

This way, if someone/something is gaming the system, I can just say "this recommender is garbage", and consequently it and all its sub-choices are heavily downranked for me.

This'll make filter bubbles even worse, but that ship has sailed. And I'm sort of a progressive-libertarian-centrist (in the classical sense, not in the American sense). If I get put in a bubble with people who have similar balanced tastes: yes please!
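A minimal sketch of that "this recommender is garbage" idea (names, scores, and weights all invented): each result remembers the chain of recommenders that surfaced it, and distrusting one deflates everything it touched:

```python
# Personal trust weights per recommender; unknown ones get a
# cautious default below 1.0.
trust = {"alice": 1.0, "spamco": 1.0}

def score(result):
    """Multiply a result's base relevance by the trust of every
    recommender in the chain that surfaced it."""
    base, chain = result
    weight = 1.0
    for rec in chain:
        weight *= trust.get(rec, 0.5)
    return base * weight

results = [(0.9, ["spamco"]), (0.7, ["alice"])]

trust["spamco"] = 0.01  # "this recommender is garbage"
ranked = sorted(results, key=score, reverse=True)
print(ranked[0])  # (0.7, ['alice'])
```

The multiplication down the chain is what makes it transitive: downranking one recommender also downranks everything its sub-choices promoted, with no central authority involved.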


Freenet FMS; more specifically: Web of Trust.

IMO it's how all moderation should go: you subscribe to some default moderators' lists initially and then mutate those subscriptions and their trust levels. Mod actions are simply adding visibility options to content and not actually removing anything.


I think a distributed del.icio.us could work, a p2p version not based on crypto. Something like https://veilid.com would be perfect actually.

The really obvious thing to avoid is DMOZ which got captured by spammers immediately.

Reddit could've been it. This is my default search engine at the moment:

https://www.google.com/search?q=%s+site%3Areddit.com&tbs=qdr...


Obviously the solution is to create a centralized system to electronically ID every human on the planet, and track what they post, talk, think, which medications and food they consume, who they are friends with, who they fuck, how much of this decade's evil chemicals they exhale, where they spend their money, and their real-time location.

Or, you know, just make your own open-source search engine, with blackjack and hookers.


> How might you see solving this problem?

Break up google, disentangle AdSense from Search. Then the search division doesn't have incentives anymore to prioritize websites based on AdSense presence.


I don't think that's necessarily required. Google has always done well, even without pushing ads as hard as they are now. The problem is that Google / Alphabet has become too reliant on the money from ads to finance other non- or less-profitable products.

We've seen Google sell a profitable business, Google Domains, because it didn't make enough money for Google to bother with. For some reason Google believes that it is entitled to make some insane profit, rather than being content with having a reasonably profitable ad supported search business.

Google Search sucks because of the current financial model that they, and most of society, operate under.


How would the search division make money? Serious question.

Search can still serve ads on their own pages (which Google calls "AdWords"), and (iirc) makes the most revenue for Google.

AdSense is serving ads on third-party websites, which is very different.


Does it need to? How large does it need to be? Can it be treated as 'essential infrastructure'?

I don’t know, webrings?

we definitely need a return of webrings AND curated links pages on decent internet pages.

Webrings were a lot of people doing work for free or with little return. Nobody has time for that anymore, and the only ways to support such a project would be paid membership (hard to gain traction) or ads (back to square one).

Sorry that I sound pessimistic but I just am.


>> Nobody has time to do this anymore

By every measure don't people have more time than ever before? I think it's more a dramatic shift in how people use & view the internet. The (unfortunate in a lot of ways) answer is probably create something new and exclusive, but universal enough for the target audience that's going to do all the work to align incentives, try to pull along the good and leave behind the bad. There has been and will continue to be a lot of false starts & failed attempts. If successful it will eventually be co-opted and ruined too. So it goes.


> Nobody has time to do this anymore

Can you say more about that?


Thank you for the question. It really needs an explanation.

Times have changed. While there are significantly more open-source projects, online communities, and all kinds of available help than ever before, the proportion of active contributors relative to the number of people online has decreased. Consequently, the time required for each volunteer to process input for a project like a "webring" has increased significantly.

I also feel that it's become harder to motivate people, but not because they've become lazier. Rather, the tasks have become more demanding, so "nobody seems to have the time for them anymore".


I think people in general have more time now than they did 25 years ago, not less?

Yes.

The scale of the internet is too large for individual consumption. Search engines and word of mouth are the methods I see for distributing access: search engines need individual judgement to evaluate results, while word of mouth provides context clues and trust.

Business prefers search engines, to scale their monetization efforts, but the quality of the results is unknown.


IDK; the sites I regularly visit nowadays are all either one-off personal projects I learned about through word of mouth or user-centric portals like HN where I get that word of mouth. I guess it's kind of what digg used to be.

Pivot to paid search engines where you are the customer not the product.

Thanks @Kagi - I attribute three solutions last week directly to you.


Plurality of search engines?

The fix is a decentralized search network with nodes linked to people or legitimate businesses. You manually distribute trust to friends and businesses you like, and you can manually revoke that trust, with some network level trust effects occurring based on spammy/malicious behavior.

You know why you don't see systems like this at any scale?

Because they don't work.


Decentralized communication absolutely works, and we have plenty of decentralized tools now. Decentralized search is a legitimately hard problem, both because it needs strong protocols (which are getting there but still WIP) and user adoption (which isn't really happening).

Ask an email administrator how well decentralization works at scale.

It should be concerning that email has re-centralized to deal with all the problems that come from federation and decentralization.


That’s not the reason email has re-centralized. It’s because big-tech companies provide it for free in order to bind users to their services, and at least in the case of Google, also to collect data and drive ads.

As someone who administrates their own email server and works at an SMB who administrates their own email servers, it works quite well and doesn’t require centralization.


"When mice march with elephants, it's in the mice's best interest to be cautious."

I've done a lot of email administration over the years, on servers with hundreds of thousands of accounts, and you walk a very fine line in being able to communicate with the rest of the world. Your IPs must exist in certain blocks (or at least not in banned blocks). You must block outgoing spam at a much higher rate and quality than Google/Hotmail do. Get yourself blacklisted for any reason and expect to disappear from the internet; meanwhile, no one is going to block the largest services.


Mail servers aren't real people or real businesses.

A 50-year-old decentralized protocol has problems, so decentralized protocols can't possibly work. Just like a 50-year-old computer can't play Doom, so Doom doesn't exist.


Search systems aren't real people or real businesses either. Never mind details like how you define and enforce "real" in a decentralized way across N legal systems with wildly varying ideas of what it means to be a "legitimate" business. There is no system or ledger you can query to find out if a given person is real or a business is legitimate. This is exceptionally unlikely to change on a useful timescale.

The basic problem with email is that it assumed good-faith participation from the parties involved. It was assumed that only legitimate actors would have the resources to participate and they would always be well-behaved. This, it turns out, was flawed on several counts. For one, it assumed legitimacy could be assured. For another, it assumed legitimate users would be well-behaved and would never abuse services for gain. For a third, it assumed account takeovers or other impersonation attacks wouldn't happen.

Every de-centralized system that aspires to not have email's problems needs to take them seriously.


Decentralized communications may work, but decentralized internet search is probably always going to be impractical.

The actual logic involved in search is itself pretty simple and almost trivial to shard; the problem is dealing with a mutable dataset on the order of a hundred terabytes.

Search gets enormous benefits from data locality, such that distributed approaches are all significantly more expensive.


I think they would work fine if the surveillance and advertising industry were overnight christened criminal enterprises and users had to start paying for what was once '''free''' (cost of entry: pieces of your soul, bit by bit).

Decentralization has a small convenience cost, and the crime of surveilling everyone in order to advertise to them (among other uses) is too cheap for decentralization to come out on top. Note that I am an atheist when I say this: if the law doesn't hammer the scourge, those who get religious about this stuff are the ones who'll enjoy any modicum of sovereignty.


Except the whole fediverse... ahem...

Yep, create a small closed off community that shames outsiders and you can have what you want. Everything in it will be local and very much a bubble. It won't have the failure modes of a Google or Facebook though. So I guess pick your poison.

You’re writing this comment on a site with an upvote/downvote based algorithm.

The answer is simple: allow some level of user feedback from proven real users (for example, only people with Gmail accounts that are over 5 years old and who use them at least 3 times per week, to eliminate fakers, but keep this a secret) and apply it mildly as a ranking signal.

As long as it doesn’t become the only factor in ranking, you still retain strong incentives to do all the old SEO stuff, yet with a layer of human sanity on top.
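The "apply it mildly" part can be made concrete as a capped, low-weight blend; a sketch with invented weights:

```python
def blended_score(seo_score: float, feedback: float, weight: float = 0.2) -> float:
    """Blend classic relevance with a clamped human-feedback signal.
    With weight=0.2, feedback can nudge ranking but never dominate."""
    feedback = max(-1.0, min(1.0, feedback))  # cap proven-user votes
    return (1 - weight) * seo_score + weight * feedback

print(round(blended_score(0.8, 1.0), 2))   # 0.84
print(round(blended_score(0.8, -1.0), 2))  # 0.44
```

The clamp plus the small weight is what preserves the old SEO incentives: a vote-stuffing campaign can move a page by at most 2 * weight, so it's never worth more than genuinely relevant content.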


Google Maps reviews are working like that and are often gamed.

If you pay close attention, you can spot fake reviews because they usually come from “Local Guides” (so supposedly the most trusted users).

Reddit is somewhat better at ranking and filtering spam, due to local mods, like there were in the times of web directories and webrings.

One of the former bosses of Google search explained that the key metrics they follow to consider the success of “Search” are the number of page views and the total revenue.

So if a user doesn't find what he needs but keeps coming back, it's a win for them.


> for example, only people with gmail accounts that are over 5 years old and who use them at least 3 times per week to eliminate fakers—-but keep this a secret

tbh this is just a band-aid; it's just going to get botted once people discover the pattern (a lot of premium bot farms offer aged or hacked Gmail accounts anyway), and it's going to be worse for legitimate discovery.


Then you quietly change the rule, and get years of success until it's widely gamed again.

The point is, there's experimentation that could be done and there's absolutely solutions that could be found.

Early Google did tons of experimentation with the search algorithm to maintain the integrity of the results. There was definitely an active game of cat and mouse back then that Google actually cared about staying on top of.

But as a decades-entrenched monopoly, Google lost all incentive to tinker with anything anymore. The "operational" folks took over, and any change to the search algorithm is now a multi-year endeavor involving thousands of stakeholders.


> Then you quietly change the rule, and get years of success until it's widely gamed again.

You won't though, unfortunately. Too many people know the rules to keep it a secret, especially given that corruption exists.


We are working towards this if you'd like to get in touch. I feel like "web3" as an interface/discovery platform is a very good topic.

What are you working on?

This was already solved in the past. Don’t just build a web site, build a community. This way the community will advertise the site and attract more users.

How do you build a community if google prevents users from finding it?

You go to the myriad car communities and systematically promote your content about the Nurburgring. It's a niche site, and from the OP's description we can safely assume it's quite usable, since he has already been approached by a sponsor. And since this is a site about a specific circuit, you can go to the actual place, or make an arrangement with the organizers to promote your site. Hand out a leaflet, or build an accompanying mobile app.

I'm not saying that circumventing Google is the easiest thing in the world, or the cheapest, but it's not mission impossible either. I didn't find Hacker News through Google, nor the dozens of other tech communities I follow.


The existence of meatspace never stopped the early web from flourishing, so why should the existence of the modern web stop anybody from making a second web? The only reason that Google was useful is because it tapped into the trust network that already existed before it.

I feel like the social media churn has destroyed people's brains, because they're more interested in stopping people from doing things they don't like than doing something awesome themselves.


Before, people knew the web was vast and required digging through. Now people think Google is the web, so if it doesn't come up by the third search it might as well not exist.

You're not wrong, but this is also a great acid test. We need to work to help people understand that Google is most definitely not the internet, and where we fail or don't get through, leave them behind. I don't know what comes next, but there's not room for everyone, and I include many here (likely myself included) among those who won't make the transition. We love to imagine people physically leaving Earth for Mars and beyond, but what follows the internet is going to happen far sooner.

That's exactly it, the people are different now. They don't have the same time or energy as before.

Start local. With actual people. Advertise with posters on utility poles (oh well they do not exist anymore in most places so replace with a suitable alternative and go to your local church/mosque/dance club/whatever relevant). I'm not saying this will work, but it is how it used to work.

Same way people built a community before google?

it's not really a community if it relies on an influx of random strangers

facebook/twitter ads still works.

I know what you're gonna say, but... yeah... you knew.


Honestly I think Google needs to be broken up. It's not a novel idea but the more I think about it the more I like it.

So, Google becomes two orgs: Google indexing and Google search. Google indexing must offer its services to all search providers equally without preference to Google search. Now we can have competition in results ranking and monetisation, while 'google indexing' must compete on providing the most valuable signals for separating out spam.

It doesn't solve the problem directly (as others have noted, inbound links are no longer as strong a signal as they used to be) but maybe it gives us the building blocks to do so.

Perhaps also competition in the indexing space would mean that one seo strategy no longer works, disincentivising 'seo' over what we actually want, which is quality content.


I’m afraid the problem is not indexing but monetization. An alternative Google search will not be profitable (especially if you have to pay a share to Google indexing) because no one will buy ads there; even for Bing it is a challenge

The hope though is that by splitting indexing that puts search providers on an equal footing in terms of results quality (at least initially). Advertisers go to Google because users go to Google. But users go to Google because despite recent quality regressions, Google still gives consistently better results.

If search providers could at least match Google quality 'by default' that might help break the stranglehold wherein people like the GP are at the mercy of the whims of a single org


People go to Google, because it is default search engine in most browsers, they don't seem to change it.

> Google still gives consistently better results

How sure are you about that? I find them to be subpar when compared to Bing, especially for technical search topics (mostly, PHP, Go, and C related searches).


Not a bad idea, but there are lots of details that need to be filled in and, you know, the devil is in the details.

Google's index is so large that it's physically very hard to transfer out while being updated. Bandwidth cost is non-negligible outside Google's data centres. In terms of data structure, I can imagine it is arranged in a way that makes Google's own search easy.


A friend of mine owns _the_ best website by far on how to become a student in Czech Republic. 15 years of effort, hundreds of excellent articles, all the content is regularly updated, etc. Google's ranking for "education in Czech Republic" (in Russian)? Not even in the top 100.

The #1 website in Google's ranking belongs to a company that significantly overcharges future students and has outdated/incorrect information on their website.


I'm being a bit contrarian, but: it sounds like 80% of your traffic was coming, for free, from Google. Is the claim here that if you killed SEO, some more equitable, consistent method of content propagation would spring up to take its place? Because I have a feeling people - especially young people - are abandoning Google, but for more opaque, less equitable algos (like TikTok).

Tl;dr Google is imperfect but for a while it was helping people find your site. I worry there are darker paths in our future.


That would have been a good excuse/explanation in the days before Chrome existed. But since Chrome is THE browser, users have a hard time escaping Google. So, GP is right.

Windows is still the most popular desktop / laptop OS, and while it might come with a Chromium browser it defaults to Bing. Users who want Google search need to either change their browser settings, or install a new browser (two things this community claims that no average user would ever do on a platform where the default was Chrome and Google web search).

I know it's imperfect, I know it's getting worse, I know it's an obscenely profitable money making machine. But a lot of people seek it out because it's a functional product that (at least for me) is free and still outperforms the competition.

I don't want to like Google, but I'm not going to pretend the product sucks just because I'm unhappy with the business model and the decline in quality.


> outperforms the competition.

I've been using Chromium and Firefox side by side at work and play all day for about 3 years now. Indistinguishable, except Chromium uses more memory and crashes and hangs. I can keep hundreds of tabs open in Firefox for weeks and months; I reach about 50 before Chromium gets lethargic.

I used to do this under Ubuntu 18 and 20 with 32GB of RAM, now under Win11 with 64.

I don't understand the Chrome reality distortion field.


I think that's a slightly orthogonal issue - I'm talking Google Search vs other search providers. I doubt there's a significant gap between Chrome and Chromium.

> Users who want Google search need to either change their browser settings, or install a new browser (two things this community claims that no average user would ever do on a platform where the default was Chrome and Google web search).

Did you miss the part where Google would directly advertise and ask if you wanted to use Chrome instead on Google's search page? Or how it would be bundled with every installer under the sun? Chrome isn't the most popular browser because we collectively decided it's the best. It's because they leveraged their position as the world's search engine and advertiser.

I've worked with tons of your average PC user. They don't even know what a browser is or what a search engine is. If Google asks them if they want to install Chrome, they will always answer yes because why not. It's Google.


Bing on Windows does the exact same thing. M$ and Google have roughly equivalent resources and audience reach to push their products. Google still comes out on top because, via both reputation and the average use case, its quality is better than Bing's.

This is not likely to change unless OpenAI finds a way to break the monopoly. It's the only currently existing search product that can claim to be better than Google, which is why Google is pushing Gemini so hard.


Are you saying that people would search for congressional apportionment on TikTok?

Probably not, but I reckon they'd have a crack for Nürburgring holiday planning.

The only thing you find on social media is influencers showing expensive cars. Good luck planning a holiday based on that.

I'm curious now, can you give me a link to the site?


Love the slanted design, works great given the context

Hi, OP's friend here. Happy to hear you like the design I've made!

Even with the latest update, blogspam from India still dominates my niche. Welcome to the internet of the 2020s, where investing in your product means jackshit because WordPress idiots can press a button and Yoast an article.

If you're a creator type, why on earth would you ever build a web product in this kind of environment? Join a corp, or create trash and ride the wave -- at least then you'll have some semblance of a normal life instead of living like a starving artist into your 30s


Did you have any problem building a site on a proprietary subject like the Nurburgring? It's private. Don't they have copyright, etc.?

Did you get an agreement with them, or is it not an issue in Germany?


You can write a website about Coca-Cola if you want. If it's just factual information, that's all within fair use.

As long as we don't use the trademarked name 'Nürburgring' or their logo or an outline of the track in branding, it's all fair game. If we were to start selling t-shirts it would be a bit more tricky and we'd have to be pretty careful.


Can you mention the word "Nürburgring" anywhere on your site, just not in the website name/URL? Please don't be angry, I am honestly trying to understand the limits of what I can do in my own field of interest.

Are you for real? Honest question.

Absolutely real. I want to know because there are large privately owned infrastructures in every country. Can one set up a website discussing them, and can one show pictures, maps, venues, tickets, etc., without being sued? (Or is only text and numbers permitted?)

I think mine is a relevant question.


To be fair, it’s not Google who is gatekeeping the Internet, but the “dumb masses” who are using Google.

Google is just gatekeeping Google.


Do you have a god alternative? I am interested.

I've had some luck with excessive consumption of alcohol

To what? Search engines or using the Internet in general?

You know there is this nice idea of hyperlinks. These don't have to come from Google. Just as an example, I use HN as a source of new sites to discover quite often.


I've been using DDG for years; it works just fine. My remaining usage of Google as a search engine is for sport results, because they have that nice widget to see matches/tables, and on mobile.

> I hate that Google gatekeeps the internet.

They really don't. People reach enormous audiences through social media.


It's been a long time since I thought about any of this, but last time I did, social media traffic was some of the worst quality. Search traffic was seen as golden because it came with intent. Social media traffic was wandering and aimless, so converted poorly even if it was 1000x search. Search still got more conversions and it was no contest. Did something change?

Commercially I agree with you, search traffic is much better than social media traffic.

But people are reaching huge audiences through social media without even having any domain that can be indexed by Google. That is also the world wide web.


We made this site for people who search. We were those people, searching Google for where to watch cars racing on the Nürburgring and then getting nothing but cryptic route descriptions on forums.

So we created a map with actual walking routes and people were finding them. Now they're not finding them and they're back to lots of searching.

We get some traction on social media too. But it's people who already go there and know a lot of what we provide already. People don't search Instagram for walking routes to POIs.


What type of site took your place?

This situation is made even worse by us. Yes, us. Inbound links used to be a good quality signal: the more people link to you, the more important your site is. And there were always link farms and SEO lowlives that abused the system. But these days it is nearly impossible to get any legitimate inbound links, because people don't have web pages and web sites anymore, instead entering all the information into silos like Twitter, Facebook, etc. These tag your links as nofollow/ugc, so they don't count towards SEO.

The net effect is that pretty much the only link signal is from link farms and paid media. If you don't crap over the internet with shady tactics, you will not appear in search results.

We lost our vote, by our own choice.
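To make that concrete, here's a toy PageRank-style sketch (hypothetical graph, nothing like Google's real pipeline): when silos mark every outbound edge nofollow, those links vanish from the graph, and whatever links remain (often the farmed and paid ones) carry all the rank.

```python
# Toy PageRank: links maps each page to the pages it links to, with
# nofollow edges already removed. Pure illustration, not a real ranker.
def pagerank(links, damping=0.85, iters=50):
    pages = set(links) | {t for outs in links.values() for t in outs}
    rank = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iters):
        # Everyone gets a small base rank, plus shares from inbound links.
        new = {p: (1 - damping) / len(pages) for p in pages}
        for src, outs in links.items():
            if outs:
                share = damping * rank[src] / len(outs)
                for dst in outs:
                    new[dst] += share
        rank = new
    return rank

# A legit site linked from a social silo, and a spam site fed by a farm:
with_social = pagerank({"silo": ["legit"], "farm": ["spam"], "legit": [], "spam": []})
# Same web after the silo marks its link nofollow -- the legit site's
# only inbound edge disappears, while the farm's edge survives:
no_social = pagerank({"farm": ["spam"], "legit": [], "spam": []})
```

In the second graph the spam page outranks the legit one purely because the farm's link is the only signal left, which is the point the parent is making.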


Are you sure that search engines don’t count nofollow links at all? I know that’s kind of the purpose but I would be surprised if they would really completely ignore them.

Edit: On this page https://developers.google.com/search/blog/2019/09/evolving-n... it even says that they use all links to rank websites, even if you set them to nofollow.


It's not nofollow links. It's content behind a login wall. If the content can't be found with a crawler, it might as well not exist. So sites like Facebook and Xitter that block a significant amount of content for non-logged-in users are the problem.

I mean, if the only signal is from link farms, that's still pretty good correlation with crap content, just not the way Google thinks.

So if I have a content site and some SEO company offers me money to publish an article am I supposed to refuse for the well being of Google’s SERP?

No, you refuse because affiliating with spammers is a bad thing to do.

Who said they're spammers?

You did. In your above post.

An SEO company represents a plastic surgeon. They offer money to a news site to publish an advertorial. How's that spam? And that article about how to properly wash your clothes that suggests a few washing machines is also paid. That's how news media has worked for ages. So by your definition all these legitimate companies are spammers? Would you care to inform me how PR should be performed then?

Correct.

If you're offering me money to link you, then presumably you believe your content is not of high enough quality for me to link you otherwise.


It's not even hypothetical. I've seen these offers. Every single article is hot garbage.

What do you mean, "legitimate companies"?

You seem to have misunderstood the difference between an advertising agency and an SEO company. What you're describing is what advertising agencies do; SEO companies are spammers that sell access to their spamming tools. It's similar to the difference between having an agency practicing law send a cease-and-desist, and buying a DDoS.

Not that I'm particularly fond of any of that, though I have lawyers practicing in another area as customers.


A plastic surgeon is a legitimate business. How on earth would a business like that promote its site and gain visitors? By paying exorbitant fees in keyword auctions on AdWords? It's way more affordable to just pay a PR firm or an SEO company to publish an advertorial on a news site. And no, not all companies that sell SEO are spammers. Grow up, please.

Some of you guys need to get your heads out of your butts and realize how the real world works. It's not just black and white all the time.


Personally I don't think for-profit surgery, or for-profit medicine generally, is a legitimate business. It might open up as a possibility when every person has access to the medical care they need, but that seems far off.

Email spam is cheaper than newspaper adverts; does that make it "legitimate"? Because that seems to be your argument here.


> Personally I don't think for-profit surgery, or for-profit medicine generally, is a legitimate business.

Sorry, but I'm not willing to have this discussion. As I said, you'd better grow up and face the realities of the world you live in.


Why did you start it then? A lack of self control?

>A plastic surgeon is a legitimate business.

At this point, the argument has lost good faith.

ADDENDUM: for comprehension, not plastic surgeons.


Never thought I'd see a pro-advertising industry "hacker" on this site.

I'm not affiliated with SEO in any way. I just understand how things work. Advertorial articles existed decades before the emergence of the web.

You refuse because visibility of an article shouldn’t be based on who can pay the most money to push its visibility.

I suggest you read the following. Because it's pretty obvious that a lot of you guys live in some parallel universe where things are just black and white.

https://paulgraham.com/submarine.html


I disagree about having PR other than on the entity’s own website, newsletters and such. There is nothing parallel-universe about that.

You should refuse for the simple fact that this company is unlikely to be only contacting you. They will have a large footprint that you don't want to be part of.

The internet is an SEO landfill (https://news.ycombinator.com/item?id=20256764)

This is a related discussion from about 5 years ago about how SEO is ruining search. Google still seems to have a thick enough skin and a monopoly to get away with crap even after so many years of ruining search.


Calling it a landfill seems accurate. I just searched (on DDG) for the tap size for a 5/16-24 bolt. I got garbage like this: https://shuntool.com/article/what-size-drill-os-used-for-a-5...

This isn't even the worst example, since it does at least have the correct info buried amongst tons of AI-generated garbage, but I can't use it for reference, since it tells me 4 different drill sizes. I've had to switch back to a paper copy of the Machinist's Handbook, since I can't trust the internet to give me accurate information anymore. 10 years ago, I could easily search for the clearance hole for a 10-24 fastener; now I get AI junk that I can't trust.

How have we regressed to the point that I'm better off using a paper book than online charts for things that don't change?
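For what it's worth, the number those spam pages bury is one line of arithmetic. This is the rule-of-thumb formula as I remember it from the handbook (the 0.01299 constant and 75% thread engagement are the usual defaults; double-check against a real chart before cutting metal):

```python
# Rule-of-thumb tap drill size for unified inch threads:
#   drill = major diameter - (0.01299 * percent of full thread) / TPI
def tap_drill(major_dia_in, tpi, percent_thread=75):
    """Approximate tap drill diameter in inches for a given thread."""
    return major_dia_in - (0.01299 * percent_thread) / tpi

d = tap_drill(5 / 16, 24)  # the 5/16-24 bolt from the search above
print(round(d, 3))         # -> 0.272, i.e. a letter "I" drill
```

That one function replaces the entire listicle, which is rather the point.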


I find myself using yandex more and more. They’re like old Google, but obviously based in Russia.

https://www.americanfastener.com/tap-and-drill-size-chart/

That was the first result.


Unfortunately, Yandex is destined to fade into irrelevance for reasons that have nothing to do with the tech.

Can you elaborate?

Half of the web in Russia is blocked. Literally, the powers that be think of Russian tech companies as their servants and nothing more. Yandex basically sold its main asset, its domain name, to another entity.

If there is any chance I’ll use some web content again, I generally copy and paste the bit I want into the notes app on iOS.

You know it’s bad when you trust Apple’s search function over Google.


This. I am a terrible note taker. For years a huge part of my knowledge and skills relied on "if I found that information once, I'll find it again". My brain compressed the information by memorizing the path to retrieve it again.

Now that does not work anymore. You know some information is out there, you found it once when google worked, now it's lost in the noise.

I'm learning to take notes again and organize them so I can search them easily.


Yep, I print to PDF a lot now.

Googling "tap size for a 5/16-24 bolt" gives the drill size in the first line of the results page.

For queries like that I now turn to Gemini / ChatGPT first. Of course, this is only a good idea if I have some way of sanity checking the answer. If I doubt the answer I get back I try Google search instead.

I really like Kagi's approach to this, which is to give a list of references. There's still no guarantee that the answer is correct, but you can at least check the references :).

https://kagi.com/search?q=what+is+the+tap+size+for+a+5%2F16-...


You can ask a model to provide an analysis of its answer, including a probability that it is correct, as part of the prompt; it helps with double-checking a lot.

Is there any evidence that these probabilities are based on any real calculated heuristic?

They're consistent within the model, particularly if you ask the model to rationalize its rating. You will get plenty of hallucinated answers that the model can recognize as hallucinations and give a low rating to in the same response.

If the model can properly and consistently recognize hallucinations, why does it return said hallucinations in the first place?

Models can get caught by what they start to say early. So if the model goes down a path that seems like a likely answer early on, and that ends up being a false lead or dead end, it will end up making up something plausible-sounding to try and finish that line of thought, even if it's wrong. This is why chain of thought and other "pre-answer" techniques improve results.

Because of the way transformers work, they have very good hindsight, so they can realize that they've just said things that are incorrect much more often than they can avoid saying incorrect things.


You’re right back at square one hoping you can trust the analysis is correct.

No, you absolutely are not. It's like an extra bit of parity, so you have more information than before.

Does that extra information come from a separate process from the LLM network? If not, then, assuming the same output is not guaranteed from the same input as per usual, all bets are off, correct?

Sorry for the late reply, but if you read this: there is research showing that prompting an LLM to take a variety of perspectives on a problem (IIRC it was demonstrated with code) and then finding the most-common-ground answer improves benchmark scores significantly. So, for example, if you ask it to provide a brief review and likelihood of the answer, and repeat that process from several different perspectives, you can get some very solid data.
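If anyone wants to play with the idea, here's a minimal self-consistency sketch. `ask_model` is a hypothetical stub standing in for a real LLM API call (the canned answers are made up), so the majority-vote part is the only thing actually demonstrated:

```python
from collections import Counter

def ask_model(prompt, perspective):
    """Hypothetical stand-in for an LLM call. A real version would send
    the prompt to a model with the perspective prepended; this stub just
    returns canned answers so the sketch is self-contained."""
    canned = {"skeptic": "42", "optimist": "42", "pedant": "41"}
    return canned[perspective]

def self_consistent_answer(prompt, perspectives=("skeptic", "optimist", "pedant")):
    # Ask the same question from several angles and keep the answer
    # that the perspectives agree on most often (majority vote).
    answers = [ask_model(prompt, p) for p in perspectives]
    return Counter(answers).most_common(1)[0][0]

print(self_consistent_answer("What is 6 * 7?"))  # -> 42
```

With a real model behind `ask_model`, the hope is that hallucinations disagree with each other while correct answers converge, which is what the voting exploits.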

> How have we regressed to the point that I'm better off using a paper book than online charts for things that don't change?

because products that require iteration lend themselves to subscription models, which in turn mean recurring revenue, which is deemed superior to one-time payments for a 'finished product'.


We need to collectively stop using Google, but the alternatives are just not as good for some things.

The best one is probably Kagi, but let's be real: "normal" people would never pay for a search engine service. Well, "normal" people don't even know the difference between Google, Google Chrome and probably the internet.


I say this as someone who doesn't yet use Kagi but is increasingly warming to the idea: I think normal people may pay for a search engine, one day. People used to think the bottom had irrevocably fallen out of paying for media once internet piracy became a thing. Streaming services may be in an unappealing state now, but they at least showed that people can be persuaded to pay for something if it makes access to the things they love easier. We might be years away from it, but I wouldn't say never. And $10 to find things on the internet again seems like a more persuasive offer than what people are currently paying for streaming.

I'd pay 10€ for DDG

I wonder if there is a market for a search engine where sites pay to be listed. Not a substantial amount like they would for ads, but a small compute fee for the spider plus a contribution for being indexed.

People will pay for clicks. If a small search engine with pay-per-listing sends a lot of visitors, website owners will for sure pay. The key, I think, is to have verified and unverified listings, and to give people a free trial of being verified before bumping them out, so they know what they're paying for.

I’d love to go full Kagi. I feel the family pricing, as a new user, is just a little too high for me.

A family of four, two kids and two adults; 20 dollars ex GST (excluding tax) is.. gosh.

With the cost of living growing so quickly.. It’s quite a defeating experience.

So, in conclusion, I expand your criteria above: even “not-normal” people may not pay for a search engine, though for financial reasons in my case.


My default is Duckduckgo but in the recent weeks they have massively upped how abrasive their ads are. Nothing wrong with there being search related ads but the sheer volume is getting out of control.

Hi, I work at DuckDuckGo - thanks for the feedback. Nothing much has changed in the past few weeks on our side, and so would love some more info so we can investigate. Do you mind sharing what country you are searching from, and whether you noticed on desktop or mobile? Also just FYI — you can turn ads off completely in settings if you want.

I am based in Australia. I am not sure if the quantity has increased, but their size definitely has. I usually have to scroll a full screen down before I start seeing non-paid results. I am on a desktop machine; mobile has been absolutely fine.

Also didn't know you could turn them off. I don't mind ads to support businesses when it is reasonable and DDG is a decent business. I just hope you folks don't try to keep up with the giants by playing their game. ;)


Got it, thanks for the follow-up, and for your support! It should be pretty rare that you would see no organic content above the fold. But we'd like to take a closer look, and if you have any particular queries where the ads are surprisingly tall, it would be useful to know.

Google was a decent search engine until the gold rush years.

Try kagi

With the forthcoming winter of synthetic content, we may easily find ourselves, in the coming few years, forced to resort once again to directories à la AltaVista and Mozilla's. I really see no way Google would stop their ads business, as it provides the financial backbone.

In a sense we've resorted to the searchable message board, once a university homework assignment, in the form of HN here.


Rest assured the existing tech monopolies who are flooding the internet with hallucinated AI garbage will be there to sell us our own "Verified Human-Written Content" back to us.

Mind you, this was even before the recent article alleging that Ben Gomes was pushed out of Google[0]. My feeling, reading that post, was that search had been getting worse since before 2019.

[0] https://news.ycombinator.com/item?id=40133976


Seems like Larry Page and Sergey Brin should return to Stanford to do a postdoc. Irony intended.

The earlier 2019 discussion about how Google deliberately worsened search results to make people see more ads

https://news.ycombinator.com/item?id=40138486


It's a fair point about how awful recipe sites look without ad blockers, but this part is just plain incorrect:

> You can tell just by looking at the URLs that those sites are going to be worthless blogspam.

At least two of the three results in the screenshot are from legitimate baking sites (Cookie and Kate, Sally's Baking Addiction) which are generally trusted sources online. I don't know anything about the third. But Google seems to have actually done a good job of highlighting recipes from reliable blogs.

The points about the compromised experience on those sites due to intrusive ads remain.


I just looked up Cookie and Kate. On my iPad I had to flick 7 times to get past the exposition on Crispy Roasted Chickpeas and find the actual ingredients. When I found the ingredients, they occupied a small squeezed sliver of the page. As I was counting the number of simultaneous ads surrounding the ingredient list (4 separate ads), a pop up covered them all and suggested I sign up for her newsletter.

The recipe looks good (chickpeas, olive oil, salt, spices, oh shit I stole her blog post). I also think the site counts as "worthless blogspam".


The problem is that Google started weighing time spent on page very heavily in their ranking algorithm - I don't remember at what point this happened but it must be about a decade ago by now. Every time a user clicks a Google result without using "Open in New Tab" and clicks the back button, Google gets a signal about how long they spent on the page. The longer a user spends on the site, the stronger the signal. Once all the SEO vampires figured it out, everyone started to pile on prologues to all their content, not just recipe sites. In my experience that was the beginning of the end.

Any recipe site that survived had to adopt the tactic or die, leaving only the spammers and the odd outlier with actual content to write about like Serious Eats. Same thing happened to Youtube and their preview photos; even the legit content creators had to start making those stupid bug eye images.


Yup. This is the Long Click metric.

Evaluating search is difficult because it's a tension: if users click a lot, is it because they find many valuable things, or because they didn't find what they were looking for?

If a user clicked just once, is it because they found what they were looking for or just that the rest of the results were so bad the user gave up?

The long click (user clicked, then didn't click again for a while) is a better metric, but also not ideal: did they stay because they found what they were looking for, or was the result so confusing that they had to stay to comprehend whether it was the right thing? Most often it's because they found what they were looking for, but the pathological cases hide in the middle: many similar correct results, where the winner is the one that makes the user a little slower.

(This has nothing to do with tabs or back buttons, by the way. It happens any time they can detect subsequent clicks on the search result page.)

I've worked in the search space (though on less evil projects than Google) and I still struggle with the question on how to evaluate search. If you have ideas, let me know!
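For the curious, the long-click idea itself is simple enough to sketch. This is a toy version under my own assumptions (a 30-second dwell threshold, and clicks only observable from the results page), not any real engine's implementation:

```python
# Toy "long click" classifier. Input: a user's clicks on one results
# page as (timestamp_seconds, url) pairs, in order. A click counts as
# "long" if the user didn't click anything else for `threshold` seconds.
# The 30s threshold is an assumption; real systems tune it and must
# handle session ends, tabs, and back-button noise.
def long_clicks(clicks, threshold=30):
    longs = []
    for (t, url), nxt in zip(clicks, clicks[1:] + [None]):
        # Dwell time = gap until the next observed click; for the final
        # click no return is observed, so treat dwell as unbounded.
        dwell = (nxt[0] - t) if nxt else float("inf")
        if dwell >= threshold:
            longs.append(url)
    return longs

session = [(0, "a.com"), (5, "b.com"), (90, "c.com")]
print(long_clicks(session))  # -> ['b.com', 'c.com']
```

Note how the last click is always "long" by construction, which is exactly the ambiguity described above: giving up looks identical to success.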


One idea, but people will probably hate me for it: If you return to e.g. the google search site (hence: when the long click metric would be triggered) have a dialog on top saying ‘result great / OK / bad-or-confusing’. Can probably be gamed (bot nets trying to destroy the reputation of others) but at least a long time would not automatically mean ‘great result’. (In the arms race to combat destruction, it could be so that a ‘bad-or-confusing’ click would not actually push a value down, just not make it go higher).

Kind regards, Roel


This was tried with a +1 button around the time of Google Plus's launch.

> if users click a lot, is it because they find many valuable things, or because they didn't find what they were looking for?

Why do you care as a search engine? This is a natural human problem that can't be solved with technology, only by humans.

It used to be that I went to page 5 of Google instantly, because that was where the real results were. The first few pages were people who knew more SEO than sense.

These days, that doesn't work since "semantic search" because now it appears to be sorted by some relevance metric and by about page 5 you start getting into "marginally related to some definition of what you typed in but still knows too much SEO to be useful."

The point is, this was already a solved problem if you knew to go to about page 4-5. Then people started trying to use a technical solution to a very human problem.


> Why do you care as a search engine?

Wait, are you really asking why a search engine would care how well it finds what the user is looking for?

Granted, there are a lot of search engines that sell themselves on other metrics ("it's fast!" or "it uses AI!" or "it's in the cloud!") but any serious search engine player strives to learn how good it is -- in practice -- at helping the user find what they are looking for. That's ultimately the purpose of a search engine.


> Wait, are you really asking why a search engine would care how well it finds what the user is looking for?

While a useful metric, it's an unknowable metric.

1. You have no idea if the user even knows what they are looking for, so how would you know that they found it?

2. You have no idea if the user found what they are looking for, maybe what they are looking for isn't on the internet?

3. You have no idea if the user is even looking for something, maybe it was just a cat running across the keyboard?

The only way to learn the answer is to have humans talk to humans. You can't game your way through it by using metrics.

It reminds me of this one time the CEO asked our team to add a metric for "successful websites" (we were a hosting provider) and we rebuffed with "define successful." They immediately mentioned page views, which we replied "what about a restaurant with a downloadable menu that google links to directly?" and back and forth with "successful" never being defined for all verticals and all cases. It just isn't possible to define using heuristics.


I disagree. It's unfortunate that some users don't know what they want, some want things that don't exist, and that some are cats. But most users are humans with a rough idea of an existing thing they are looking for. It's worth it for a search solution to find out how good it is at helping them. The cats add noise to that measurement, they don't invalidate it.

Do you philosophically agree there are websites that are more successful than others? If yes, then there are tangible qualities that distinguish this group from the other. They may be subjective, fuzzy, and hard to pin down, but they're still there. If no, a success measure is irrelevant to you but other people might disagree, and once thoroughly investigated, you sort of have to agree the measurement coming out of it reflects their idea of success.

In none of this am I saying it's simple or easy (I started this subthread by saying it's difficult!) but fundamentally knowable.

Yes, humans talking to humans is definitely the start. But then I'm positivistically inclined enough that I think, with effort, we can extract theories from these human interactions.


I didn’t go into all the problems with “successful websites” but it really is impossible to measure. For me, my business site is successful when I capture leads, my blog is successful when I write posts, a restaurant is successful when people show up to eat. There’s no way of knowing what variables and metrics constitute success without asking the person.

I had a CEO who searched for the related business search terms every morning. No clicks, he just wanted to see the ranking. The other day, I was searching for an open NOC page that I knew existed but couldn't remember the search terms. Eventually I gave up, but I'm 90% sure I left the tab open to a random promising search result that had nothing to do with what I was really searching for. There's a pdf that archive.org fought over, and simply mentioning it results in a DMCA notice. You can find it now, but for nearly 20 years you could only find rumors of it on the internet, and a paper copy was the only way you could read it.

Even when I know exactly what I'm looking for, I sometimes open a bunch of tabs to search results and check all of them (this is actually the vast majority of my non-mobile searches), especially because the search results are often wrong or miss some important caveats, particularly when searching for error messages.

The only way you could find out these searches were unsuccessful (or successful) is to ask. There’s no magic metrics to track that will tell you whether or not my personal experience found the search successful.


I feel like the problem is trying to turn human experience into a metric. Probably the better approach would be to have a well staffed QA team.

We should be mad at Yahoo for having fucked up. If anything, they could have spun out the search part and been remembered for it.

I honestly don't think it's possible to have a QA team large enough to handle the gajillions of websites that come up and disappear every day. They just have to come up with better and better metrics until they find one that approximates the human experience the best.

You don't have to cover the long tail... Maybe just top 10% of topics would be a big improvement.

Google also massively reduced AdSense payouts over the years as well.

Result? Adsense-based websites started jamming in more ads per page to maintain their old revenue levels. Pages became longer so that more ads could be thrown in.


Why did people continue to engage with such trashy sites?

Where do you find out about metrics like this?

There are SEO industry nerds that scour Google patents for clues (this long click metric was an early 2010s patent that was granted in 2015), and Google lets information slip from time to time, either officially or unofficially.
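The "long click" idea mentioned above is usually described as: a click is good when the user doesn't bounce back to the results page quickly. As a purely illustrative sketch (the 30-second dwell-time cutoff and the data shape are my assumptions, not anything from Google's patent), it might look like:

```python
from dataclasses import dataclass

@dataclass
class Click:
    url: str
    dwell_seconds: float  # time on the page before returning to the results

def long_click_rate(clicks, threshold=30.0):
    """Fraction of clicks where the user stayed past the dwell threshold.

    A "long click" is taken as a proxy for a satisfied searcher; short
    clicks suggest the result didn't answer the query.
    """
    if not clicks:
        return 0.0
    long_clicks = sum(1 for c in clicks if c.dwell_seconds >= threshold)
    return long_clicks / len(clicks)
```

Of course, as the rest of this thread argues, dwell time is an ambiguous signal: a tab left open out of frustration looks identical to a tab left open out of satisfaction.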

The first site "cookieandkate" might look like blogspam but it wasn't.

After going through some random archived posts from 2011 & 2016, I think it probably fell into the same trap the article mentioned, which kind of proves how needless SEO spam ruins websites.

[1] is a link to a recipe on the same site from back in 2011. It has some content at the top giving personal context and plenty of normal pictures of the actual recipe, not those fancy artistic photos. It has that personal, no-hidden-agenda feel.

[2] is a link to another recipe from 2016. The content and format are more or less the same as in 2011, with a bit more long-form content.

Compare that with current posts on the site. The content looks similar, but there is a lot of needless use of bold/emphasised content, probably for SEO. Every paragraph is worded like it has some call to action or agenda.

[1]. https://web.archive.org/web/20120109080425/http://cookieandk...

[2]. https://web.archive.org/web/20160108100019/http://cookieandk...


That's pretty depressing. I don't really do any kind of content marketing work these days and haven't really been around that industry for a decade, but I can only imagine how disappointing it must have been to start seeing your traffic drop off, seeing which results were winning in search compared to your own site, seeing how they were winning, and then having to add more and more shit to your own site in order to climb back up the rankings.

I got so fed up with this that I made a browser extension for it. It's in the Chrome Web Store and Firefox as well, but you'll have to build the xcode project in the Safari directory if that's your preferred browser.

https://github.com/sean-public/RecipeFilter


That's not entirely fair.

The problem is that Google forces actual good cooks to make their recipes look like worthless blogspam, but a good original recipe is not actually worthless blogspam, even when disguised in the way Google requires.


When it looks and acts like the spam sites, then what difference is there really? If I have to scroll 4 pages to find the ingredients and then scroll around like crazy to find the instructions (then scroll back and forth while cooking/baking) then it does not matter how good the recipe is, the page killed it for me.

I'd argue that most web users have a higher tolerance for ads than HN users, so they put up with the scrolling. And if it results in a tasty recipe, then they'll do it next time too, since that's seemingly the (tolerable) price to be paid for good food.

But lots of recipe sites now have a "jump to recipe" link at the top, so they've realised the junk is annoying for some fraction of their users. Although page junk is a pain, shortcuts for low-tolerance users seem like a good compromise.


Look it's not OK to milk humans like this. It's manipulative and rapey. Just because the NPC meme is true does not mean you get to hack their programming for a buck and call yourself a good community member and businessman.

Enough has to be enough!


Nobody forces you to put ads on anything.

The idea that every website or tool with lots of visitors should be monetized is sad.

Original author made a tool, why do you have to make money on it?

Perhaps it's sad that websites without ads aren't ranked higher.


Because websites aren't free to build or run. No one is obligated to put ads on their site, sure. They're also not obligated to work for many hours to provide you with free content or pay $X/mo to serve it to you.

But they can also have a separate job that doesn't ruin the internet and, like some of us, produce free content out of generosity that is not spam-ridden.

Also web hosting doesn't cost much when your website is well made with some frugality in mind.

And there are also better, cleaner ways to make money on the internet: getting rid of the ads and spam and having the content accessible to paid members.


While it is admirable that you are willing to produce content out of your own generosity, it seems a little optimistic to assume that everyone making content on the internet is both willing and able to share it for free.

I am somewhat curious to hear more about the better and cleaner ways to make money on the internet, but I have a suspicion that in some circumstances (such as recipes) they may put you at a competitive disadvantage. I certainly have no desire to pay to access recipes I find via Google searches.


Not engaging in fraud also puts you at a competitive disadvantage to those that do. Doesn't mean we have to be happy to be defrauded.

We need to find a metric for anti-profitability. I think that index could yield much higher quality results.

Detect sales/commercial language and structure,* and specifically target that for removal from results as if sales-oriented sites were hardcore porn and the child safety filter is turned on.

*Buy and cart buttons/functions, tables containing prices with descriptions that don't look like long-form reviews (which would be its own filterable tag), etc; and domains trying to obfuscate are blacklisted permanently.
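The filter proposed above could start life as a crude keyword heuristic. This is a minimal sketch under my own assumptions: the signal patterns, weights, and 0.5 threshold are illustrative, not a real ranking specification.

```python
import re

# Illustrative commercial-intent signals, per the parent comment's list:
# buy/cart controls, price tables, promo language.
COMMERCIAL_PATTERNS = {
    "cart_or_buy": re.compile(r"\b(add to cart|buy now|checkout|order now)\b", re.I),
    "price_tag": re.compile(r"[$€£]\s?\d+(?:[.,]\d{2})?"),
    "promo": re.compile(r"\b(promo code|coupon|% off|free shipping)\b", re.I),
}

def commercial_score(text: str) -> float:
    """Return 0..1: the fraction of signal categories present in the text."""
    hits = sum(1 for p in COMMERCIAL_PATTERNS.values() if p.search(text))
    return hits / len(COMMERCIAL_PATTERNS)

def filter_results(results, threshold=0.5):
    """Drop results whose page text looks predominantly sales-oriented."""
    return [r for r in results if commercial_score(r["text"]) < threshold]
```

A real engine would need far more than regexes (structure, link graphs, rendered DOM), but even this toy version shows the shape of the idea: score sales signals, then treat a high score like a safe-search exclusion.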


Really just removing all sites with ads would be a huge improvement. Regular old websites trying to sell you something are usually not nearly as bad as those that want to monetize you while pretending to be free.

Nowadays, there are numerous free hosting services for static sites.

Websites are practically free to build and run (if you treat it as a hobby and don’t count your time). I agree on the rest though.

The thing is, even if you don't put ads on a page or tool, Google will sometimes not index it because it doesn't think there's 'enough' content, no matter how little sense that makes. At least half the issues with recipe sites and company sites come from them trying to get a site that doesn't need reams of text content indexed by a search engine that seems to blindly value the quantity of content and time spent on the page over all else.

The people who have bad content are the ones who get money, while those who have good content are not. The logical result is that people with good content stop producing it, while people with bad content continue producing it and being rewarded for it.

Look I hate these SEO-laden pages just as much as the next guy, but I think the binary classification of "good content" and "bad content" lacks nuance. I would refer to it instead as "bad packaging" of (often) good content. As much as I loathe having to hunt for the "jump to recipe" button on my phone each time I open one of these pages, I also appreciate being able to freely view recipes which I enjoy and cook regularly.

I just stopped looking for recipes online if I can avoid it. It became literally faster and easier to search an old-school cookbook. And there was a period when I considered those completely outdated.

An earnest writer and a spammer might arrive at the same format in different ways, but the result is still blogspam.

>I also think the site counts as "worthless blogspam".

This is a strange complaint. You're visiting the blog of a woman who writes about cooking. Can't speak to the ads (I block them), but her site looks pretty good. Why do you think she should list her recipe like some kind of index? Perhaps she blogs for her own enjoyment, not for yours?

Have you ever read popular cook books? They aren't simply listings of ingredients, either.


You should try viewing the site without your ad blocker turned on. Here's a preview: https://imgur.com/a/FDI0L6i. The red arrow is where ad #4 was when I checked it out last night.

Edit: real cookbooks was basically my answer to this problem to be honest. Some of them actually have fun stories in them. Most of them have a standard-ish "recipe on one page, photo on the opposite page" format. But none of them have promo codes for shoes, supplements, or terrible Canadian coffee chains in them.


I created this simple site exactly for this: https://recipebotpro.com/

You enter the name of your desired dish and have a plain recipe with steps in 5 seconds. No ads etc


I suspect you’ve bitten off more than you can chew.

I checked four recipes. One was a joke made out of genital references. Three began with near identical “embark on a journey of flavour” pseudo-SEO bullshit.


FWIW, I tried a few recipes too and they came out just fine, without the usual clutter. I further anticipate that this is the direction we'll be going in general, "search" as we know it was a ~30 year period where Google reigned supreme. The world since moved on.

Yeah, but the new gatekeepers and tech are going to be worse: AI companies, where you never see original human content any more, just what the company's AI shows you.

lol. "Cups"... No serious recipes there.

I generally use https://www.taste.com.au. No bullshit prologue about how a distant relative used to make the recipe in question. Just an overview, photo, ingredients and steps. Everything else is secondary and usually worthless.

Why, when I try to click that link, does it redirect me to tags.news.com.ua? My DNS filters are blocking it.

Hmmm. First one I clicked from their home page:

https://www.taste.com.au/baking/galleries/autumn-cakes/p6d5x...

> When the weather starts to finally cool down and the evenings ...

Just No.


That's not a recipe, it's a short intro to a list of recipes. Just Learn To Read.

"Just Learn To Read" adds nothing to the sentence that precedes it. The point was already made correctly and well. You should avoid, when possible, starting a comment you want read with an insult or ending it with a snap. It degrades the quality of the conversation.

I really don't care.

> it's a [...] intro to a list of recipes

That's exactly the point. It doesn't need to be there, doesn't add any value whatsoever, etc. ;)


I laughed out loud at this. You haven’t looked up many recipes in the last few years, have you? 95% of recipe results are nonsense and ads. It can take a few minutes of searching just to identify ingredients sometimes. My wife and I have been improvising recipes lately to avoid digging through all the junk. I actually recommend this: you can sort of make stuff up based on prior experience and things turn out pretty well sometimes.

Or, put your simplified recipes in a binder near the kitchen

Anything to avoid going to google to find a recipe

