Compare Google, Bing, Marginalia, Kagi, Mwmbl, and ChatGPT (danluu.com)
950 points by 882542F3884314B 9 months ago | 465 comments



While I've made huge improvements to the algorithm recently, I do think Marginalia Search got a bit lucky with the sample queries, as it is still IMO far more hit-and-miss than many alternatives. But that also speaks to how hard evaluating search quality is.

Its efficacy is also strongly dependent on understanding that it's a keyword search engine with no semantic understanding.


> Its efficacy is also strongly dependent on understanding that it's a keyword search engine with no semantic understanding.

Good. I love keyword search.

"Semantic understanding" can be so biased and ... just shady sometimes.


It's tricky though. I think a lot of people think they want raw keyword search, but what they really want is a search experience that makes intuitive sense.

If you lean too much into embeddings and so on, it's easy to get errors that don't make sense to a human being. It's extremely frustrating when you experience "I typed X, why am I getting results about Y?!"

That said, I think there's a sweet spot with some magic, where it genuinely just makes search better. But it's like perfume: if it's immediately obvious that it's there, it's probably a fair bit too much.


Keyword search leads to things like every website stuffing meaningless words into its meta tags so it would get picked up by AltaVista


Not if it's done right. Source: I made my own search engine


> [...] but that also speaks for how hard evaluating search quality is.

Would you be able to share some of your personal highlights regarding this?

I've partially kept up to date with the DIY, non-corporate search space (YaCy and friends). I'd love to understand a bit more behind the engineering decisions made when creating a search engine; it seems like a very hard problem to solve.

P.S. Marginalia is a very impressive piece of work, overall -- I've heard nothing but positive remarks from users on here. I've been meaning to try it for a while, but time constraints have... well, constrained, thus far.


I just tested Marginalia and it was completely unable to lead me to a Wikipedia or IMDb page when searching for "driver ryan gosling" and variations. It just listed lots of random articles.


That... is kind of the point of this particular search engine.

> This is an independent DIY search engine that focuses on non-commercial content, and attempts to show you sites you perhaps weren't aware of in favor of the sort of sites you probably already knew existed.


Well that makes sense, but I wanted to push against the result that the OP seems to take away from their test, which was that Marginalia seems to work well for the common user.


There's also a known bug with Wikipedia in particular, I do index it but the results are never ranked particularly high. I haven't fixed it because I don't want Wikipedia to be the #1 result for every search. Feels like most people are aware of Wikipedia and don't need help finding it.


I often do a Google search, and then go directly to the Wikipedia result. My reasoning is that during the initial search, I don't know if there's a Wikipedia page about that topic, and I might need a fallback option.


Unless it's something related to medicine; then you have to explicitly add "wiki" to the query. Some public health thing to discourage hypochondriacs I guess, but it's very annoying.


Thanks for your work!

I have a suggestion for the “About” section at the top of Marginalia’s landing page. I think it would read better like this:

> This is an independent DIY search engine that focuses on non-commercial content, and attempts to show you sites you perhaps weren't aware of [instead] of the sort of sites you probably already knew existed.

Showing one thing “in favor of” another seems contradictory in this case.


Honestly I understand it well enough that I see it is surprisingly hard, but not enough to have good solutions...


Just my feedback after finally trying to get a sense of what it is exactly.

I tried to find Marginalia on DDG; it's not on the first page. Google has it after some garbage. If I go to marginalia.nu I get an SSL error; search.marginalia.nu works.

If I search on Marginalia for duckduckgo, the first link is somewhat relevant but is about the app; all the other links are related to DDG but of dubious relevance.

If I search for ublacklist mentioned above, I do not see anything directly relevant.


Hmm, what's your browser? I renewed the cert today... Only thing I can think of is that it might not like a wildcard cert for the bare marginalia.nu domain.


Safari doesn't like https://marginalia.nu. Probably because *.marginalia.nu is not valid for the base domain. Add it as a Subject Alternative Name.
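
For anyone debugging something similar, a minimal sketch that pulls the served certificate and prints its SAN entries (standard library plus the third-party cryptography package; the hostname is just an example):

    import socket
    import ssl

    from cryptography import x509  # pip install cryptography

    hostname = "marginalia.nu"  # example target

    # Skip verification so we can fetch the cert even when it's misconfigured.
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE

    with socket.create_connection((hostname, 443)) as sock:
        with ctx.wrap_socket(sock, server_hostname=hostname) as tls:
            der = tls.getpeercert(binary_form=True)

    cert = x509.load_der_x509_certificate(der)
    san = cert.extensions.get_extension_for_class(x509.SubjectAlternativeName)
    # If the bare domain isn't in this list, Safari (correctly) rejects it:
    print(san.value.get_values_for_type(x509.DNSName))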


Try now?


Looks like you've fixed your bug.


Hi, your encyclopedia experiment(?) is also very inspiring. I really think it works; it makes it much easier to read the articles.


Yeah, it's pretty cool. I wish I had more time to polish it up a bit. At least make the mobile experience work a bit better.


Firefox on Android


Hmm, can't reproduce it myself, but firefox has a nasty habit of quietly "repairing" these types of misconfigurations by redirecting from one subdomain to another. I've added marginalia.nu as a SAN, should hopefully work now.


It does indeed work. Thanks for the quick fix, and I realize that there is more than a search engine here!


I notice you completely avoid the question on how a single developer can do so well ;)

I do think that search has gotten much worse but my ability to know the magic words like “ublock origin” instead of “Adblock” and “yt-dlp” instead of “download YouTube” and phrase my search has gotten better.

We’ve all been doing prompt engineering against the Internet-wide LLM that is the spam houses.


> I notice you completely avoid the question on how a single developer can do so well ;)

As much as I enjoy the notion of somehow being a 10,000X developer, it's probably mostly that modern search is a filtering problem, and MS does filtering fairly well.


I reckon these days search is pretty difficult and everyone knows how to game it. I recommend using a search engine that lets you effectively change which sites are shown. You can do this with Kagi, or with Google's Programmable Search Engines - I'm sure there are more too.

In particular I block Youtube, not because they aren't sometimes correct, but because I don't want videos polluting the regular results - it just takes too long to get info from videos.

An ability to upvote results for a given query seems tantalizing but I bet it would be gamed too. The DIY approach seems to be the only tractable one.

In my case I only allow results from domains I believe are correct. The whitelist approach does have downsides. Usually I'll vet new potential domains through social means like Reddit and this site, rather than identifying them through the search results. I believe there's an inherent tradeoff between discoverability and the gameability of the results.

Though I do sympathize with folks who reminisce about 2008 Google Search results, there was probably orders of magnitude less content out there, and widespread ignorance of how valuable your placement in the results is to your business, and thus little SEO.

I also personally disagree that yt-dlp is the "correct" result for the average user when they search "YouTube download". I highly doubt the average user would know or care to use the command line. A website front end would be more actionable for them.


> In particular I block Youtube, not because they aren't sometimes correct, but because I don't want videos polluting the regular results - it just takes too long to get info from videos.

Funnily enough, lately I've been prioritizing YT videos more when searching. So many sites now are just regurgitated SEO farms with minimal quality, and it's easy to see why: they take minimal effort to produce and are cheap to host. But making a video takes time and effort, so it has a much higher barrier to use as a click farm.

More than once when traditional search failed me, I went to YT and found some video from 2009 clearly and eloquently explaining what I'm looking for in detail, and without any distractions because the person authoring the video clearly didn't specialize in the media format or show interest in experimenting.

I've found it to also be a better source when looking for a product to buy. Want to know which fan to get? Turns out there's a channel from a dedicated guy who keeps finding ways to test different fans and their utility, with multiple videos demonstrating his approach and findings. The mainstream channels aren't all that useful, but there's a ton of "old web" style videos (some even recent) passionately providing details for almost anything you'd think to search. And they're a gold mine.


> But making a video takes time and effort, so has a much higher barrier to use as a click farm.

> The mainstream channels aren't all that useful, but there's a ton of "old web" style videos (some even recent) passionately providing details for almost anything you'd think to search. And they're a gold mine.

This won't be the case for long. YT is already starting to be polluted with spam and AI generated content, which will get more and more common. The same thing that happened to the web in text form, will happen to videos.

I think the only solutions are using allowlists for specific domains, and ironically enough more AI to filter specific results. Or just straight up LLMs instead of web search, assuming they're not trained on spam data themselves.


Yeah. I was recently looking for videos comparing two smartphones, and among the top-ranked videos there were ones that just show the phones side by side with their specs, and ones that are just LLM-generated text added to the video with TTS.


One critical difference is the date attached to youtube videos. It's easy to verify that a video was made before this tech was available, but you can't do that with websites, or search engine result pages.

It does limit utility for more modern needs, unfortunately.


Note that the problem of filtering bad data out of learning material isn’t inherently easier than filtering same out of search results.


Would a browser feature that skipped to the relevant parts of the video, based on closed captioning and understanding search intent, be useful? It seems like this would be a good way for Google to fight to stay relevant in UX versus the chatbots just quickly spitting out a readable answer. Hunting through ad-laden webpages is annoying. Seeking to the relevant section of the video is a solvable problem, especially for videos above some viewership threshold.


> Seeking to the relevant section of the video is a solvable problem

...and it has already been solved, though partially: SponsorBlock allows people to add a "Highlight" section to a video, which denotes the part of the video which the user most likely wanted to see (sans the "what's up guys", "like and subscribe", etc.)

Of course, it's not perfect: it relies upon humans doing the work, though some may see that as a positive over something more computerized.
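
SponsorBlock also exposes a public API for those segments. A rough sketch of querying it (endpoint and category names are my reading of the docs, so treat them as assumptions):

    import requests  # pip install requests

    # "poi_highlight" is the category used for the user-submitted
    # "Highlight" marker; "sponsor" marks sponsor segments.
    resp = requests.get(
        "https://sponsor.ajay.app/api/skipSegments",
        params={
            "videoID": "dQw4w9WgXcQ",  # example video ID
            "categories": '["sponsor", "poi_highlight"]',
        },
        timeout=10,
    )
    if resp.status_code == 404:
        print("no segments submitted for this video")
    else:
        for seg in resp.json():
            start, end = seg["segment"]
            print(f'{seg["category"]}: {start:.1f}s to {end:.1f}s')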


I've definitely seen Google do this already: https://searchengineland.com/google-tests-suggested-clip-sea...


Google seems to be taking much more advantage of YouTube's transcription feature lately. The first addition was the (ok, gimmicky) animation on the Subscribe button when someone says the dreaded "like and subscribe". Hopefully a sign of things to come.

Overall AI summaries are very welcome for a certain subset of YouTube which is sadly dominated by sponsored, clickbait, and ad-driven content.


Didn’t Google try this already? It seems useful to me, at least. IMO the next frontier of search is not better hypertext, it’s podcasts, audio, and video.


Do you have some tips for finding concise videos that answer the question you are asking? I am finding more and more obvious LLM bullshit in results, so I am willing to try some other tactics. But I am not ready to spend minutes watching videos, always artificially long to increase ad revenue, just to see whether they are actually relevant or a waste of time.


For me, it really depends on the type of video. For fixing cars, I'm usually looking for something specific enough that there isn't a lot of chaff. It was probably recorded on a phone and edited just to splice the clips together, probably with the default thumbnail that YouTube extracted from the video.

For product videos, if Project Farm did it, look there first. Otherwise, I look for someone who has a lot of videos for competing products with basically the same format, not over 10 minutes.

Tech videos are the hardest, I often still prefer text. Maybe look for links to the docs in the description? I still get duds though.


I don’t know much about fixing cars, but yeah, YouTube is a treasure trove for tacit knowledge.


Wish I did, but here you're at the algorithm's mercy, unfortunately. One possibility is subbing/accruing watch time on channels that you find provide you the right value, so that the algorithm might recommend similar channels on other subject matters.


That's curious, I generally hate video due to inability to glance over content, and the few attempts I made to actually find useful information I searched for resulted in... spammy extra low effort video content that did not answer my questions.


Depends on what you’re looking for. A blog post about how to play Search and Destroy by The Stooges is not as useful as a video of James Williamson himself showing you the riffs!


Well, I don't think I'd be able to learn much just from watching the concert: teaching is fundamentally different from doing.

So I think even that example does not universally hold. I'd still appreciate a write up with tips on what's important and if there are any transitions to focus on with only the bits on video where some of that is demonstrated.

Now, I can barely contort my fingers into one riff, so I lack the knowledge to understand what I am missing, but I'd still have a hard time learning that from video.


I’m not talking about concert footage, I’m talking about James breaking the song down and showing you the riffs at quarter speed.

Until a recent YouTube video I was playing the song incorrectly. It’s blazing fast and the mix is sort of insane so it’s very hard to hear exactly what is going on. And the tablature isn’t going to let you see how his body fits into the groove.

This is tacit knowledge we’re talking about, not book learning. Guitar instruction is always hands on.


Almost everything is hands-on (everything apart from things you really can't do hands-on, like exploring black holes): I don't remember seeing someone come out of reading a book on programming and being a master programmer.

But video is not hands-on any more so than text: if it was, live concerts and sports games and other performances would not be such a big deal. Sure, video is richer in some signals (audio/video), but poorer in others (introspection, pacing and focus...).

That does not mean I can't read to understand a new topic or to be prepared to look for subtleties in a hands-on performance.

If anything, to a great student, they should be complementary, but still, each student will have one or the other contribute more to their learning, and that depends both on the teacher, but also on the student.


> Though I do sympathize with folks who reminisce about 2008 Google Search results, there were probably orders of magnitude less content out there and a complete ignorance to how valuable your place is on your business and thus no SEO.

That was a decade after Google was created; people certainly understood SEO, and Google was constantly updating its algorithm to punish people who were trying to game it.

The Wikipedia page on "link farming", for example, references it happening as early as 1999, targeting SEO on Inktomi:

https://en.wikipedia.org/wiki/Link_farm

I remember some internal presentations at Amazon around ~2004 about how boosting Google SEO on Amazon web pages increased traffic and revenue (and Amazon was honestly a bit behind-the-curve due to a kind of NIH syndrome).


At the time it seemed like Google was winning, though. SEO seems to have gotten really good, or maybe Google just gave up.


I have a hard time believing it's so difficult for a search engine to distinguish between a credible, respected website that has been around a while and some generated garbage that exists only to be a search result. We humans can tell them apart, so in principle, computers can too.


Yes, this should be table stakes for a classifier - a company with the resources of Google could definitely solve that problem if they weren't themselves in the business of spam (advertising) and didn't benefit from spam sites (as they often include Google ads/analytics).
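
To make "table stakes" concrete: even a toy text classifier picks up on the tell-tale vocabulary of SEO spam. A minimal sketch (training snippets are invented for illustration; real systems would use far richer signals like link graphs and tracker counts):

    # pip install scikit-learn
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    # Hypothetical labeled page snippets: 1 = SEO spam, 0 = legitimate.
    pages = [
        "best top 10 free youtube downloader 2023 download now no virus",
        "we measured static pressure and noise for five 140mm fans",
        "click here best adblock free download best adblock best 2023",
        "notes on how the index partitions documents across shards",
    ]
    labels = [1, 0, 1, 0]

    # Bag-of-ngrams plus logistic regression: about as simple as it gets.
    clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
    clf.fit(pages, labels)

    print(clf.predict(["free download best converter 2023 top"]))  # likely [1]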


Google is quite quick in plugging holes in AdSense, but not in AdWords.


> table stakes

Always “table stakes”. Do you think in buzzwords also? I’ve always wondered this. Or do you think normal words and then translate it into this bandwagoning / membership proving garbage ?


Hey, please don't cross into personal attack on HN. It's not what this site is for, and destroys what it is for.

If you wouldn't mind reviewing https://news.ycombinator.com/newsguidelines.html and taking the intended spirit of the site more to heart, we'd be grateful.

Edit: unfortunately your account has been breaking the site guidelines in a lot of other places too—here are some recent examples:

https://news.ycombinator.com/item?id=38825624

https://news.ycombinator.com/item?id=38825543

https://news.ycombinator.com/item?id=38783196

We eventually have to ban accounts that post like this, so if you'd please stop doing that, we'd appreciate it. On HN the idea is: if you have a substantive point, make it thoughtfully; if not, please don't comment until you do.


I guess this brings up the question of how good are humans at doing this across a wide number of domains on average?

The other question I have is how long do these garbage results stay up for a particular query on average?


Google's PSE is neat but there isn't a good way to manage switching between them. They could easily add a little dropdown to let you select which one to use as part of the public link UI they provide for each one individually. Giggle[1] gives me this ability and I run it locally (alongside Kagi) for more specific things to target domain lists I've been building over the years.

1. https://github.com/dan-lovelace/giggle


I'm a big fan of the non-commercial search engines because of the gaming aspect: if you're not generating revenue from the clicks, the game mostly goes away.

I'm not saying people aren't entitled to make some money, but it clearly incentivizes user hostile behavior.

Maybe make it an option because legitimate sites like journalism also use this model.


Subscription model like Kagi seems to work pretty well against gaming the results.

Their only remaining incentive is to be good enough that people keep paying for the service.


It works not because they're somehow smarter or have more resources than Google at detecting spam/SEO, it's because unlike Google (and other ad-supported search engines), they make money from result quality and have an interest in blocking spam.

Google on the other hand makes money off ads (whether on the search results page itself or on the spam sites), so spam sites are at best considered neutral and at worst considered beneficial (since they can embed Google ads/analytics, and make the ads on the search results page look relatively good compared to the spam).

Black-hat SEO has been around since the early days of search engines and they managed to keep it at bay just fine. What changed isn't that there was some sudden breakthrough in malicious SEO, it's that it was more profitable to keep the spammers around than to fight them, and with the entire tech industry settling on advertising/"engagement" as its business model, the risk of competition was nil because competitors with the same business model would end up making the same decision.

The same reason is behind the neutering of advanced search features. These have nothing to do with the supposed war on spam/SEO, so why were they removed? Oh yeah because you'd spend less time on the search results page and are less likely to click on an ad/sponsored result, so it's against Google's interests and was removed too.


Kagi works because there is no incentive for SEO manipulators to target it since their market share is so small.

Super tinfoil hat to believe Google wants to send users to blog-spam websites (i.e. that this is beneficial to Google).

Anytime there is money to be made, there is an effectively infinite amount of people trying to game the system.


That's why taking the money out of the click is effective.

There can be other models for making money, but methods that rely on casting a wide net and driving low-quality traffic are the thing that shouldn't be indexed, or at least should be labeled as such.


Google is a complex system, so "want" can just mean: we are making money from the blog spam, and while we don't like it, other things take priority over fighting it as effectively as we could.


It's never tinfoil-hat to assume that a corporation is, at very least, making sure not to fight too hard against any activity that brings it more revenue.


But the author tried Kagi and the results don't appear to be noticeably different, filled with scammy adspam just like Google and Bing. Kagi's results seem to mostly aggregate existing search engines [1], so this isn't much of a surprise. Perhaps a subscription-based service that operates an index at Google's scale might help, but no such thing exists to my knowledge.

[1] https://help.kagi.com/kagi/search-details/search-sources.htm...


Right, but Kagi has built in tools to make it easy to fix that. Blocking those spammy sites from ever showing up again. Moving certain sites up the ranking, and so on. These features mean that over time my Kagi results have become nearly perfect for myself.


This is addressed in the article. As Hacker News readers and expert computer users, we have a bag of tricks that we can reach into in order to make our searches perform better. With a similar level of effort and an expert user's intuition you can get good results out of any search engine. Not so for the average user. In fact, again paraphrasing the article, Google's original claim to fame was that you didn't have to spend a lot of time doing exact keyword matching and fancy tricks in order to get good results.


> it just takes too long to get info from videos.

I can’t wait until video transcripts get fed into LLMs just to eliminate the whole “This video is sponsored by something-completely-unrelated, more about them later. What’s up YouTube, remember to like, share, subscribe…” routine. Five entire minutes pass on similar drivel before the actual thing you want, which is itself stretched out to an agonizing length.


You need SponsorBlock.

Usually people leave a "highlight" marker which tells you where you're supposed to jump to. Along with the regular "This video was brought to you by <insert>VPN".


Re: Kagi, I heard about it on HN, tried it for 100 searches, then subscribed. When I search for random JS and CSS things, MDN is the first result, and if it isn't, I can downrank whatever spammy site(s) are on top.

---

I wish I had a local LLM trained to detect clickbait and or low-effort content. I imagine searching YouTube and having all the clickbait collapsed together (just like Kagi condenses listicles), with the remainder being potentially high-quality content. Don't know how feasible this is right now.


Just use the Kagi Summarizer on YouTube videos and you don’t have to waste time watching trash. It’s a great life hack.


How does that work? Does it scrape the auto-generated captions?


Yeah, it uses the auto-generated transcript.
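
If you want to roll your own version of this, the moving parts are small. A rough sketch using the third-party youtube-transcript-api package (API shown as of the versions I've used, so double-check; and this is my guess at the approach, not how Kagi actually implements it):

    # pip install youtube-transcript-api
    from youtube_transcript_api import YouTubeTranscriptApi

    video_id = "dQw4w9WgXcQ"  # example video ID

    # Each segment is a dict with "text", "start", and "duration".
    segments = YouTubeTranscriptApi.get_transcript(video_id)
    transcript = " ".join(seg["text"] for seg in segments)

    # From here, feed `transcript` to whatever LLM you like, with a
    # prompt such as "Summarize this video transcript in five bullets."
    print(transcript[:500])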


I became a huge fan of Kagi after seeing it on hacker news too. It's amazing how good a search engine can be when it's not full of ads.


Yeah. At first I primarily used Kagi to move away from Google as a company, hoping for results that were equally good. But Google search actually feels crappy now in comparison.


Been paying for Kagi for 6+ months and very happy with it. I’m pretty anti-subscription, so that’s saying a lot for a service that is otherwise free.

I do have to drop into Google for local searches every once in a while, but otherwise I'm happy with it.


I keep Google Maps around for a similar reason; Apple Maps works well, but things like business hours are wrong often enough for me to double-check in Google Maps.


Yep, same. The Yelp info is not nearly as good in Apple Maps, but for in-town directions it works great for me.


You use yelp for directions?


No, for business info. Apple pulls directly from Yelp. The issue is when you want more info, or want a closer look at the business and tap any of its images: it takes you straight to the App Store to download Yelp.


Current Kagi results for those without an account to compare:

youtube downloader

https://kagi.com/search?q=youtube+downloader&r=us&sh=_szITdy...

ad blocker

https://kagi.com/search?q=Ad+blocker&r=us&sh=-BHzV2ZoCDpmgOu...

download Firefox

https://kagi.com/search?q=Download+Firefox&r=us&sh=zkkmc_EQX...

why do wider tires have better grip?

https://kagi.com/search?q=Why+do+wider+tires+have+better+gri...

why do they keep making cpu transistors smaller?

https://kagi.com/search?q=Why+do+they+keep+making+cpu+transi...

vancouver snow forecast winter 2023

https://kagi.com/search?q=Vancouver+snow+forecast+winter+202...

I agree with the author that there is too much spam on the web. I think Kagi in general does a pretty good job at downranking it (number of ads/trackers is a negative ranking signal on Kagi) but we can always do better. Kagi has special search modes like "Small Web" which virtually eliminates spam.

I welcome such scrutiny from the community. Please continue to keep us honest.


Kagi gives me websites that require more clicking; Google just gives me reasonable answers and I don't see spam in your examples.

"why do wider tires have better grip?"

Wider tires provide more grip due to a larger contact patch with the road. While it's true that friction is not directly dependent on surface area, a larger contact patch allows for more even weight distribution and better traction, particularly during cornering. This can result in improved handling and stability.

"why do they keep making cpu transistors smaller?"

Smaller transistors can do more calculations without overheating, which makes them more power efficient. It also allows for smaller die sizes, which reduce costs and can increase density, allowing more cores per chip.

"vancouver snow forecast winter 2023"

The forecast for the 2023/2024 season suggests that we can expect another winter marked by ample snowfall and temperatures hovering both slightly above and below the freezing mark. Be prepared ahead of time.


That first result re: tires is simply wrong. Wider tires don't have a larger contact patch; the size of the contact patch is determined by the weight of the car and the air pressure in the tires:

    A = W / P
So the reason wider tires improve handling is more complex and subtle. Also, FTA:

    Assuming a baseline of a moderately wide tire for the wheel size.
      - Scaling both of these to make both wider than the OEM tire (but still running a setup that fits in the car without serious modifications) generally gives better dry braking and better lap times.
      - In wet conditions, wider setups often have better braking distances (though this depends a lot on the specific setup) and better lap times, but also aquaplane at lower speeds.
      - Just increasing the wheel width and using the same tire generally gives you better lap times, within reason.
      - Just increasing the tire width and leaving wheel width fixed generally results in worse lap times.
A full accounting of the effects of changing tire width should explain all of these effects.
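
To make the A = W / P point concrete, a quick back-of-the-envelope sketch (numbers invented purely for illustration):

    # Contact patch area depends on load and pressure, not tire width.
    weight_per_tire_lb = 1000   # e.g. a 4000 lb car, evenly distributed
    pressure_psi = 32           # typical street pressure

    area_sq_in = weight_per_tire_lb / pressure_psi
    print(f"{area_sq_in:.1f} sq in")  # ~31.3 sq in, wide tire or narrow

    # A wider tire reshapes that same area: shorter front-to-back and
    # wider side-to-side, which changes slip behavior, heat, etc.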


So, why is it?

Such a nerd snipe this one. 400+ comments and still could not get the answer.


As a partial explanation, maybe it's not the case that area = weight/tire pressure as an absolute rule? In which case we get back to the layman knowledge of "wider tires have a greater contact area."

As an extreme example, run flats at atmospheric tire pressure don't drastically change their area.

https://web.archive.org/web/20090327161537/http://performanc...

From https://www.6speedonline.com/forums/996-turbo-gt2/242759-imp...


My only gripe with these quick answers or excerpts is that sometimes they don’t fully give the context. After they’ve given me a couple of wrong results, I stop trusting them and always read the source. And if I have to read the source, I might as well click into the website and read the whole thing.


I often default to using Kagi’s FastGPT when I want direct information instead of a starting point for web browsing. It combines web searches with LLM summarization to produce an integrated, sourced answer. Kinda like “New Bing” with a better interface. You can trigger this mode by adding “!fast” to your query, going to https://kagi.com/fastgpt, or using their API. Here’s what it returned for me on those specific queries:

"why do wider tires have better grip?"

Wider tires provide better grip due to having a larger contact patch with the road surface. A larger contact patch means the tire can disperse the weight of the vehicle over a wider area, reducing pressure. This allows for more friction between the tire and road, improving traction and cornering ability. However, wider tires do not necessarily provide better grip in wet conditions. [1][2][3][4][5]

1. wheels - Does a wider tire always mean more grip/traction - Motor ...

2. Do Wider Tires Give You More Grip? Not All the Time

3. Learn Me: Wider tires = more Grip? = Faster cornering?| Grassroots ...

4. Wide or narrow tires? Here's how your choice will affect driving ...

5. VIDEO: Do wider tires offer more grip? - BMW M3 and BMW M4 Forum

"why do they keep making cpu transistors smaller?"

There are a few key reasons why CPU manufacturers keep making transistors smaller:

1. Smaller transistors allow more transistors to be placed on a single chip, enabling increased processing power. More transistors results in more capabilities and performance. [1]

2. Smaller transistors use less power and produce less heat. This allows CPUs to operate faster and with more cores/threads within the same power and thermal envelope. [2]

3. Moore's Law, an observation made by Intel co-founder Gordon Moore in 1965, predicted that the number of transistors on integrated circuits doubles approximately every two years. Following this trend has driven the semiconductor industry to continue shrinking transistors. [3]

4. Smaller transistors can switch on and off faster, enabling increased clock speeds. This directly leads to increased performance. [1]

However, we are rapidly approaching the fundamental physical limits of transistor shrinkage. Transistors cannot be smaller than an atom, which is around 0.2 nanometers for silicon. [4] Alternative technologies like carbon nanotube transistors may be needed to continue performance growth beyond this point. [5]

1. Why does computer chip process size have to keep getting smaller ...

2. Why does smaller = less heat? - CPUs, Motherboards, and Memory ...

3. What Is Moore's Law and Is It Still True?

4. Transistors are an essential building block used in almost every ...

5. These Transistor Gates Are Just One Carbon Atom Thick - IEEE ...

"vancouver snow forecast winter 2023"

Based on the context provided: The winter of 2023-2024 in Vancouver is expected to see less snow and milder temperatures than typical winters. Forecasts predict a below average snowfall season with warmer weather [1][2][3][4][5][6]. Long range forecasts through December 2023 for southern British Columbia call for showers along the coast and less precipitation inland [7]. Metro Vancouver's fall and winter is also predicted to be a potentially 'down' snow year [4]. Confidence is high for a mild winter in western Canada with reduced snowfall amounts [5]. However, specific snowfall totals are uncertain given the long lead time [6].

1. Vancouver winter weather: Less snow, milder temperatures ...

2. 2023-2024 British Columbia Winter Forecast Preview | OpenSnow

3. Snow Prediction Vancouver Winter 2023/24 — Alblaster Snow ...

4. Metro Vancouver's fall, winter forecast | CityNews Vancouver

5. What will this winter be like? Grab the hot cocoa — here's your 2023 ...

6. Canada's Winter Forecast: El Niño a critical factor for the season ...

7. 60-Day Extended Weather Forecast for Vancouver, BC | Almanac.com
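
For those curious about the API route mentioned above, a minimal sketch based on my reading of Kagi's public API docs (the endpoint, header format, and response shape are my understanding, so verify before relying on them):

    import requests  # pip install requests

    KAGI_API_KEY = "YOUR_API_KEY"  # placeholder

    resp = requests.post(
        "https://kagi.com/api/v0/fastgpt",
        headers={"Authorization": f"Bot {KAGI_API_KEY}"},
        json={"query": "why do wider tires have better grip?"},
        timeout=30,
    )
    resp.raise_for_status()
    data = resp.json()["data"]

    print(data["output"])  # the synthesized, cited answer
    for ref in data.get("references", []):
        print("-", ref["title"], ref["url"])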


I really don't understand why anyone writing articles about ChatGPT uses 3.5. It's pretty misleading as to the results you can get out of (the best available version of) ChatGPT.

For comparison, here are all the author's questions posed against GPT4:

https://chat.openai.com/share/ed8695cf-132e-45f3-ad27-600da7...


> I really don't understand why anyone writing articles about ChatGPT uses 3.5.

Because that’s what most people have access to. It’s absolutely worthless to most readers to talk about something they’ll never pay for and it’s not the job of random third-parties to incentivise others to send money to OpenAI.

What I really don’t understand is why anyone gets so hung up about it and blames the writer. If you’re bothered by people using 3.5 you should complain to OpenAI, not the people using the service they make freely available.

Anecdotally, I find this excessive fawning about 4 VS 3.5 to be unwarranted.

https://news.ycombinator.com/item?id=38304184


> Because that’s what most people have access to.

I’d agree with this rationale if the author clearly communicated their choice of model and the consequences of that choice upfront.

In this post the table of results and the text of the post itself simply reads “ChatGPT” with no mention of 3.5 until the middle of a paragraph of text in the appendix.

> It’s absolutely worthless to most readers to talk about something they’ll never pay for and it’s not the job of random third-parties to incentivise others to send money to OpenAI.

The “worth” is in communicating an accurate representation of the capabilities of the technology being evaluated. If you’re using the less capable free version, then make that clear upfront, and there’s no problem.

If you were to write an article reviewing any other piece of software that has a much less capable free version available in addition to a paid version, then you would be expected to be clear upfront (not in a single sentence all the way down in the appendix) about which version you’re using, and if you’re using the free version what its limitations may be. To do otherwise would be misleading.

If you simply say “ChatGPT” it’s reasonable to infer that you’re evaluating the best possible version of “ChatGPT”, not the worst.

Accurate communication is literally the job of the author if they’re making money off the article (this one has a Patreon solicitation at the top of the page).

Whether or not "most readers" are ever going to pay for the software is totally orthogonal.

If using GPT4 vs 3.5 would create results so distinct from one another that it would serve to incentivize people to give money to OpenAI, well then that precisely supports the argument that the author’s approach is misleading when presenting their results as representative of the capabilities of “ChatGPT”.

> What I really don’t understand is why anyone gets so hung up about it and blames the writer.

Again, if they’re making money off their readers it’s their job to provide them with an accurate representation of the tech.

> Anecdotally, I find this excessive fawning about 4 VS 3.5 to be unwarranted. https://news.ycombinator.com/item?id=38304184

Did some part of my comment come across as “excessive fawning”? Regardless, if this “excessive fawning” is truly unwarranted, this would again undermine your statement that using GPT4 would “incentivize others to send money to OpenAI”.

In regards to your link, I’ll highlight what another commenter replied to you. What should ChatGPT say when prompted about various religious beliefs? Should it confidently tell the user that these beliefs are rooted in fantastical nonsense?

It seems in this case you’re holding ChatGPT to an arbitrary standard, not to mention one that the majority of humanity, including many of its brightest members, would fail to meet.


> I’d agree with this rationale if the author clearly communicated their choice of model and the consequences of that choice upfront. (…) with no mention of 3.5 until the middle of a paragraph of text in the appendix.

You’re moving the goalposts. You went from criticising anyone using 3.5 and writing about it to saying it would’ve been OK if they had mentioned it where you think it’s acceptable. It’s debatable if the information needed to be more prominent; it is not debatable it is present.

> If you simply say “ChatGPT” it’s reasonable to infer that you’re evaluating the best possible version of “ChatGPT”, not the worst.

Alternatively, if you simply say “ChatGPT” it’s reasonable to infer that you’re evaluating the version most people have access to and can “play along” with the author.

> If using GPT4 vs 3.5 would create results so distinct from one another that it would serve to incentivize people to give money to OpenAI

Those are your words, not mine. I argued for the exact opposite.

> Again, if they’re making money off their readers it’s their job to provide them with an accurate representation of the tech.

I agree they should strive to provide accurate information. But I disagree that being paid has anything to do with it, and that their representation of the tech was inaccurate. Incomplete, maybe.

> Regardless, if this “excessive fawning” is truly unwarranted, this would again undermine your statement that using GPT4 would “incentivize others to send money to OpenAI”.

Again, I did not argue that, I argued the opposite. What I meant is that even if you believe that to be true, that still doesn’t mean random third-parties would have any obligation to do it.

> I’ll highlight what another commenter replied to you.

That comment has a reply, by another person, to which I didn’t feel the need to add.

> It seems in this case you’re holding ChatGPT to an arbitrary standard, not to mention one that the majority of humanity, including many of its brightest members, would fail to meet.

Machines and humans are not the same, not judged the same, don’t work the same, are not interpreted the same. Let’s please stop pretending there’s an equivalence.

Here’s a simple example: If someone tells you they can multiply any two numbers in their head and you give them 324543 and 976985, when they reply “317073642855” you’ll take out a calculator to confirm. If you had done the calculation first on a computer, you wouldn’t turn to the nearest human for them to confirm it in their head.

The problem with ChatGPT being wrong and misleading isn’t the information itself, but that people are taking it as correct because that’s what they’re used to and expect from machines. In addition, you don’t know when an answer is bullshit or not. With a human, not only can you catch clues regarding reliability of the information, you learn which human to trust with each information.

Everyone’s standard for ChatGPT, be it absolute omniscience, utter failure, or anything in between, is arbitrary. Comparing it to “the majority of humanity, including many of its brightest members” is certainly not an objective measurable standard.


> You’re moving the goalposts. You went from criticising anyone using 3.5 and writing about it to saying it would’ve been OK if they had mentioned it where you think it’s acceptable.

There are no goalposts being moved. My original comment was "I really don't understand why anyone writing articles about ChatGPT uses 3.5. It's pretty misleading as to the results you can get out of (the best available version of) ChatGPT."

This is still the position I'm arguing. It's a criticism of authors who use the older, inferior version of ChatGPT, do not make that abundantly clear to their readers, and then use that to make statements about the capabilities of "ChatGPT", which ultimately misleads those readers as to the current capabilities of "ChatGPT".

> It’s debatable if the information needed to be more prominent; it is not debatable it is present.

I'm not debating whether or not it is present in the article, I'm the one who highlighted its presence. What I'm arguing is that omitting this information from every reference to ChatGPT in the entire body of the text, and the tables front and center representing the data, and then burying this extremely important detail in a single sentence in the middle of a paragraph in the appendix, is effectively misleading.

> Alternatively, it you simply say “ChatGPT” it’s reasonable to infer that you’re evaluating the version most people have access to and can “play along” with the author.

It's even more reasonable to infer that when you're evaluating the performance of "ChatGPT", you're using the latest version.

If you review a video game, you don't play the free demo then tell the audience that the game is too short and lacking in a ton of features.

If you're reviewing Microsoft Word, you're not going to leave out the all-important detail that you're actually evaluating Word version 6.0.

> Those are your words, not mine. I argued for the exact opposite.

Then I misunderstood your line "incentivize others to give money to OpenAI".

> I agree they should strive to provide accurate information. But I disagree that being paid has anything to do with it, and that their representation of the tech was inaccurate. Incomplete, maybe.

Agreed that all of humanity should strive for accuracy and honesty in all their communication with others, but I do feel this responsibility is even more explicit when you are a professional making money off your writing for ostensibly providing an objective assessment of some thing.

I maintain that it's inaccurate, misleading, etc etc to present these results as representative of the performance of ChatGPT without making it abundantly clear to the reader that it's 3.5, which is significantly less performant than the latest version.

> Again, I did not argue that, I argued the opposite. What I meant is that even if you believe that to be true, that still doesn’t mean random third-parties would have any obligation to do it.

Again, I'm confused by what you're saying here about third parties.

Are you arguing the opposite that GPT 4 is not far more capable than 3.5? Are you arguing that it is more capable but that advanced capability would not make it a more compelling product? I admit I don't understand either of these positions.

That 4 is far better than 3.5 is something you can readily observe yourself, find measured on countless metrics, and/or find support for through countless anecdotes. If you do believe it is better, then that seems like it would automatically make it a more compelling product than 3.5, whether or not you want to argue that ChatGPT as a class of products is anywhere from hardly compelling at all to God's Own Perfect Product.

> That comment has a reply, by another person, to which I didn’t feel the need to add.

Ah, I somehow missed that.

So, I went ahead and asked GPT 4 your ghost question verbatim, and the first bullet point it gave me urged me to consider rational explanations for the phenomena.

I then went ahead and asked it a question about sin and God phrased with the implication that I was believer. Then a direct, neutral question about whether or not God exists.

I think it performed well in all these cases, and the nuance that is being glossed over is that it matters whether you are expressing an implied belief in something supernatural or asking in a neutral fashion about the topic.

It's clear to me that a universal policy of responding to all queries involving topics of faith by first encouraging the user to question the validity of their faith would be the wrong way to go, so again I see this as an exceptionally arbitrary standard that I don't feel could be satisfactorily defended as a standard nor actually met by most people to the satisfaction of most people.

https://chat.openai.com/share/2dc2d6eb-b3f6-4571-a75b-af698f...

> Machines and humans are not the same, not judged the same, don’t work the same, are not interpreted the same. Let’s please stop pretending there’s an equivalence.

The purpose of comparison is precisely to draw attention to the similarities and differences between two different things, nobody ever said there was an equivalence.

> Here’s a simple example: If someone tells you they can multiply any two numbers in their head and you give them 324543 and 976985, when they reply “317073642855” you’ll take out a calculator to confirm. If you had done the calculation first on a computer, you wouldn’t turn to the nearest human for them to confirm it in their head.

This is a perfectly defined problem with exactly one correct and easily verifiable answer. The other topics we were talking about are nothing like this.

> The problem with ChatGPT being wrong and misleading isn’t the information itself, but that people are taking it as correct because that’s what they’re used to and expect from machines. In addition, you don’t know when an answer is bullshit or not. With a human, not only can you catch clues regarding reliability of the information, you learn which human to trust with each information.

I completely agree that people need to be skeptical when using ChatGPT, and that this distrust of seemingly omniscient "AI" that can confidently and plausibly provide bullshit answers to any query is something that will need to be cultivated in humanity.

Is that the point of using 3.5 to make ChatGPT look worse than it is though? Should we achieve this cultivation by being intentionally misleading? Maybe the ends justify the means but I'm not sure this is a compelling argument. I'd much rather look at the most powerful version available and point out the very real flaws with it, there are no shortage and no need to get stuck on older generations of the tech.

> Everyone’s standard for ChatGPT, be it absolute omniscience, utter failure, or anything in between, is arbitrary. Comparing it to “the majority of humanity, including many of its brightest members” is certainly not an objective measurable standard.

I mean, yeah, but there is a spectrum of arbitrariness. Asking it to answer arithmetic accurately could be reasonably argued to be on the end of the spectrum labeled "objectively the right way to do this" and expecting it to know the one correct way to answer queries regarding fundamentally unknowable topics of faith that are mythically sensitive and controversial for the majority of humanity would be closer to the other end.

----

Look, I'm so tired of online debates like this at my age. I likely wouldn't even have engaged except your first response struck me as unnecessarily abrasive with phrases like "absolutely worthless" and "excessive fawning" which are an irresistible call to arms to my inner keyboard warrior.

I'd really like to not spend the rest of my life writing essays at each other on this topic so I'm happy to agree to disagree here.

Also, this has all left me with the impression that this is largely a branding issue. OpenAI does call all of their ChatGPT versions "ChatGPT". If they made unmistakable distinctions through their product line that would go a long way in addressing any confusion.


Why does OpenAI continue to offer chatgpt 3.5 if it's so bad?


GPT 4 is THIRTY (30) times more expensive.

In the llm-assisted search spaces I'm involved in, a lot of folks are trying to build solutions based on fine tuning and support software surrounding 3.5, which is economical for a massive userbase, using 4 only as a testing judge for quality control.


ChatGPT 3.5 is good enough if you can give context in the query.


Cheaper and faster.


It's a bit hard to use for most: either $20/month fixed for a limited # of messages, or you need to be able to reason through how to get an API key, or get another 3rd-party service with similar cost & limits.


You can use GPT-4 for free via Bing - though I find it a little hard to explain to people how they can do that because I'm never sure what the rules are with regards to creating Microsoft accounts, whether you can use any browser or have to use Edge, what countries it's available in etc.

Actually maybe the recommendation should be to use GPT-4 for free via https://copilot.microsoft.com/ instead now.

(Except I can't tell which version of GPT that's using yet - there was a story on 5th December that said GPT-4 Turbo was "coming soon", not sure when "soon" is though: https://blogs.microsoft.com/blog/2023/12/05/celebrating-the-... )


FYI: Balanced doesn't run pure GPT4. Balanced uses a combination of multiple models. Precise and Creative are pure GPT4.

About GPT4 Turbo: to check if you are on Turbo, press Ctrl+U, then Ctrl+F, and check if "dlgpt4t" exists. If it exists, you are running Turbo.

You can also double-check by, well, asking stuff after 2021 knowledge cut-off as well ("What are the oscar winners?") with search disabled.

But you'll notice because turbo is much faster on bing (and better too).


But that GPT-4 says it can't code.


IMHO TBF the "limited # of messages" is continuously increasing, to the point I hardly remember it exists these days


Try uBlacklist, it's like uBlock, but for search results.

https://addons.mozilla.org/en-US/firefox/addon/ublacklist/

https://chromewebstore.google.com/detail/ublacklist/pncfbmia...

You can sync the settings and your personal blocklist to either Dropbox or Google Drive. It also has the ability to subscribe to blocklists. Mind, you need to manually turn on search engines and subscribe to lists. The uBlacklist subscriptions setting doesn't have any built-in feeds yet though. :(

edit: There are some feeds on the uBlacklist site though. https://iorate.github.io/ublacklist/subscriptions

edit edit: Found an even better list of feeds. https://github.com/quenhus/uBlock-Origin-dev-filter#other-fi...
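
For reference, uBlacklist entries can be match patterns or /regex/ rules. A small, purely hypothetical personal blocklist might look like:

    *://*.pinterest.com/*
    *://www.quora.com/*
    /^https?:\/\/[^/]+\.fandom\.com\//

(The domains are just examples of sites people commonly block, not a recommendation.)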


This is a feature of Kagi already. You can promote or blacklist domains in your search results.


But I can't do regexes, wildcards or anything like that as far as I can see, like I can in uBlacklist

And it seems like they also have a 1000 domain limit?


Kagi is just the best, it feels like Google did before a decade+ of enshittification and ad tech.


Did anyone notice that Kagi showed as barely better than Google in the article?


Yeah, for me the results of Kagi are so much better than anything else that it makes me wonder how objective one can be when measuring search results.

I use Google on a client’s computer and it’s just horrible.

But it could also be a factor of the customizations I’ve made in my Kagi: banning quite a few paywalled sites, always putting Wikipedia articles on top, preferring blogs over Stack Overflow stuff…


Even without customization (I have not changed anything yet), Kagi provides far superior results to Google.


I just tried it (free account) and it felt underwhelming: not many search results, or particularly interesting ones, for the image and video stuff I searched.

There was little to no spam, though, but not much to look at either. Maybe it might be useful when searching for stuff that usually has a high amount of confusing spam, but otherwise not really useful for me...


Kagi is still very weak for searching for videos and images. For those, I still use Google.

Kagi really shines when you are doing a standard search, though, which is what most people do most of the time.


uBlacklist is absolutely excellent: I've been using it for a few years now, with absolutely no problems.

Quick tip: turn on the 'Skip the "Block this site" dialog', and disable 'Hide the "Block this site" links' settings -- they make it much quicker to block spam websites (of which there are many on regular search engines).


Just today I was looking for an extension just to block Quora from search results. (Talk about a useless site that seems to uselessly outrank Wikipedia on google lately — what on earth is Google up to?) I’m thankful I saw your and your parent’s post.


When Quora was new I followed some topics, got to read interesting answers to interesting questions, but then some kind of enshittification happened. I've blocked it in Kagi now.


This is amazing, I was maintaining my own custom solution that did this.


Appreciate you sharing this; I've been searching for something similar for quite some time.


I use uBlacklist with my own blacklists and Google has been pretty usable, it's great.


Does this exist for DDG?


Yes, it works for most search engines.


The addon you linked (on the Firefox version) only requests permissions on google.* sites so I don't think it will work for DDG. Is there a separate extension, or am I misunderstanding something?


uBlacklist has a button to enable other search engines, like DuckDuckGo. Press it!


I'm in the camp of those who think Google's results are still very good. I admit I use adblock (uBlock Origin) and won't even try to disable it.

I understand the author's point of turning off their ad blocker "to get the non-expert browsing experience" but then they could make a different test with uBlock on for every query and see how it goes.

It's also a bit inconsistent to expect results for downloading videos mentioning yt-dlp while trying to emulate "the non-expert browsing experience"... Yt-dlp is a command-line Python utility. Talk about non-expert! Most people don't know that videos are files that can be downloaded; of those who do, most don't know about the command line or Python.
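
(Tangentially, for the experts: yt-dlp can also be driven from Python rather than the shell. A minimal sketch, with a placeholder URL:)

    # pip install yt-dlp
    from yt_dlp import YoutubeDL

    url = "https://www.youtube.com/watch?v=VIDEO_ID"  # placeholder

    # Download the best pre-merged format into the current directory.
    with YoutubeDL({"format": "best"}) as ydl:
        ydl.download([url])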

Yet when searching for "how to download youtube videos" the first result I get on Google is a link to a service called "savefrom.net", which appears to work well and does not seem to be a scam. This would qualify as "very good" in my book.

When searching for "how to download youtube videos from the command line" the first few results are about youtube-dl, including links to github and superuser. Granted they don't mention yt-dlp, but youtube-dl is a good start.


When I do a Google search in an Incognito tab for "how to download youtube videos", the first two results I get are the following.

- https://msunduziassociation.online/perfect-online-videos/

- https://gssaction.org/program-all-in-one-media-solutions/

I would certainly put those in the "Terrible" category like the author.


My top 2 (incognito) are blog posts from pcmag.com and zdnet.com listing 5 ways to download YT videos. Maybe it's blogspam, but the listed services seem valid at first glance.

savefrom.net is the 5th result (2nd page underneath 5 youtube videos)

Edit: This is from the US. If I had to guess, these are regional differences. What country are you in?


I got similar to you; I'm in Australia.


I'm curious: what is the rationale for "in an incognito tab" being part of the test harness?

It seems pretty arbitrary to me to disable one of the key features - in this case personalization - of the software being evaluated.

Or is the evaluation not between "search engines" but rather "search engines without personalization"? If so, then this restriction does make sense. But that is not the evaluation that "normal users" are interested in.


> I'm curious: what is the rationale for "in an incognito tab" being part of the test harness?

It's the closest we can easily get to the 'average user experience'. Someone who has a long account/cookie history with Google has plausibly trained the site to return more relevant results through implicit user-curation of avoiding obvious-to-them SEO-spam on other queries.

If we posit that every user eventually trains Google to avoid SEO spam, then this begs the question of why Google(/Bing) don't eliminate the SEO spam in the first place.

Besides that, it's not obvious why search engine personalization should dramatically change the basic utility of search results. We should expect personalization to mostly address ambiguities: is 'the best way to set up tables' asking about furniture assembly/carpentry or SQL? None of the author's queries for this article supported such ambiguities, and besides that the results returned (see the final appendix) aren't[†] valid answers to a different interpretation of the question.

[†] -- I think I'd quibble about the 'adblock' question, since a reasonable person might still find an adblocker that works but participates in the 'acceptable ads program' to be sufficient.


> It's the closest we can easily get to the 'average user experience'.

Maybe it's the closest we can get (though I doubt it), but it definitely isn't close enough to tell us anything about the "average user experience".

The average user has been using google for years, without taking any steps to avoid personalization. An incognito session (on a browser / machine / network that is probably fingerprinted...) is pretty much the opposite of that typical usage pattern.

I recognize that just writing a blog post or comment on HN is not a research project, so one needs to do something quick, but I think it mostly invalidates the experiment. What would get closer would be to devise a few user personas and attempt to search and browse for a while within those personas before trying the experiment. Or, much better yet, put together a focus group of real people within the personas you're interested in, and run the experiment using their real accounts.

> If we posit that every user eventually trains Google to avoid SEO spam

I don't think it's that, I think it's that every user trains it to return results more likely to improve the metric of "more likely to click one of the links", and I think that makes it more, not less, likely that they see what most of us here consider to be spam.

But I don't know! Maybe that's not what this experimental setup would show. But it would be a lot more enlightening than a setup using a fresh incognito window, which reflects the usage pattern of a proportion of search queries that is a tiny rounding error above zero.


Why are you assuming all users are logged in to google all the time?


Because it is objectively the case that the "average user" of the internet has a google cookie in their browser. It doesn't require that they be logged in - though I believe it's likely also the case that the "average user" is indeed logged into a google account - it just requires that they use google search without turning off cookies or specifically blocking google's. Essentially everybody uses google search and essentially nobody cares enough (or would know how) to turn off cookies or block google's cookie.

If this doesn't describe most people you know, you're in a very small bubble. (I'm somewhat in that bubble too, but I still have lots of family and friends who use the internet the normal way.)


Google has billions of user accounts...


> It's the closest we can easily get to the 'average user experience'

You wouldn’t really be taking the average here though, would you? You would be capturing the experience someone might have if they were in incognito, using Google for the very first time, or using Google on another device for the very first time, but not the “average experience”.


Google gets paid when you click on an ad. It's reasonable to guess you're not going to click on many scam software ads with your software engineer profile, so naturally you'll be shown fewer of them.

In this thread we can see people getting different results even in incognito tabs; comparisons will only get harder if they are using personalized results.


I get savefrom.net in both Incognito and normal tabs, uBlock or not. I have no idea why you get crap results that are somehow different. uBlock doesn't change google results in Firefox for me at all. It seems you get crap added, not removed.


I searched with Chrome, perhaps that's the difference. Firefox also blocks some ads out-of-the-box even without uBlock, so maybe it was already blocked.

It could also be related to targeting, like time zone, location, IP address, age group etc.


I get the same search result in Edge as in Firefox. Can't test in Chrome, but something seems strange.


savefrom.net is a crap result.


You seem to be missing the point of the discussion here, which was to compare results returned. Not to rate if one site is better than another.


Did you click either of those links?

Both seem to do the job of downloading a youtube link to mp4 for free.


Did you click either of those links? They are not YouTube video downloaders, they just link to another downloader. There is nowhere on those links to even put a YouTube URL.

Are you seriously suggesting that a website with the following "About us" with only a link to another YouTube video downloader is itself a good YouTube video downloader?

> Good Samaritan Support Action is to reawaken the Body of Christ to receiving the extravagant love of The Father, as well as our call to respond to this love by loving God with all of our hearts, souls, strengths, and minds. In order for people’s hearts to be linked to the heart of our Heavenly Father, we want to foster and facilitate the establishment of a culture of love in our churches and ministries.


So, there is one extra click... But for the user, the site does the job and takes an extra second.

Ideal? No. But it does the trick.


Not GP, but navigating to an unrelated scammy site just having a link to the actual site is a terrible and unethical job by Google. Imagine if you search "youtube" and the top result is not YouTube but some scammy site just having a link to YouTube. It's not about click counts, if the youtube downloader has bad UX and requires extra clicks, it's a bit inconvenient but ok.


Those are both garbage/scam sites


cross-posted: Did you try using savefrom.net? You can type "https://www.youtube.com/watch?v=IkYVmtgxebU" into the text box and hit "Download". Then you'll get a new tab that tries to get you to install malware. If you decline to install it, the new tab takes you to the malware's homepage. If you close the tab and go back to the original tab, savefrom.net presents you with an error message saying "The download link not found." and does not help you download the video.


savefrom.net used to be good, but it seems they've switched their MO. Plenty of decent alternatives filled the gap, though.


Can you name the alternatives, and are they present in the search results?


An adblocker is necessary, and IMO a script blocker as well. I feel vaguely like search has gotten worse over time, but it is not a huge problem—usually a good site is on the first page or two, and so I can just go check them out.

But if clicking a site meant I would be under attack, that really increases the stakes, I start to care strongly about the absence of bad sites, not just the existence of a good one.

Other than that, people need to be trained to not download programs from websites in general. I think this has gotten better over time? This is just a human mistake. Maybe Google could suppress sites that link to executables. It must, right?


It would suppress links to malware executables, but for general programs I don't see why it would.


By the time you know enough about a site to download some random executable off it and run it, you know more than enough to just enter the URL, so there’s no point to having it show up in search results.


Put me in the camp that finds Google and the rest horrible for all but very specific/unique technical terms, e.g. weak neutral currents. Anything more "everyday life" is an exercise in futility, sorting through trash, often without even the terms you are looking for. And good luck with "verbatim" searches: either ignored or zero results.


> they could make a different test

The takeaway I got from the article is everyone can make their own test, as opposed to relying on other people's sentiments and memes about X is bad or Y is good.

Trying to emulate a non-expert experience without workarounds is not the common usage pattern, since everyone familiar with their favorite tools has ways to get more value out of them. But this article presents a way of constructing an experiment (this is why I chose these queries, this is how I ranked scams, etc.), and I think people should follow this same spirit to evaluate whether they are stuck in a local optimum with their current choice of tools.


Yeah, the author seems to heavily conflate his own needs with general needs. But given what Google et al know about me, the results could indeed be more precise. I have developed a habit of appending “GitHub” to the search query when I am actually looking for source code, versus just trying to find a page that downloads me a video.


I'm also in the camp that thinks Google's search results are very good, but ChatGPT-based search with RAG is better, granted it's a paid version. The latter, however, is still somewhat experimental. Personally, I would love to see another column for ChatGPT with RAG (Bing); the fact that the author ignored RAG is rather strange.


For those (like me) wondering what RAG means: “Retrieval Augmented Generation (RAG) represents a groundbreaking approach in information retrieval, where the accuracy of search results directly influences the quality of generated answers. In essence, RAG combines traditional search mechanisms with Large Language Model's ability to understand and generate answers.”

(https://www.linkedin.com/pulse/how-we-increased-search-accur....)
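To make that concrete, here's a minimal, self-contained sketch of the retrieve-then-generate loop, using numpy. embed() and generate() are hypothetical stand-ins for a real embedding model and a real LLM API; the point is only the shape of the pipeline:

    import numpy as np

    def embed(text: str) -> np.ndarray:
        # hypothetical stand-in: a real system would call an embedding model here
        rng = np.random.default_rng(abs(hash(text)) % (2**32))
        return rng.standard_normal(8)

    def generate(prompt: str) -> str:
        # hypothetical stand-in: a real system would call an LLM here
        return f"(answer conditioned on: {prompt[:60]}...)"

    docs = ["yt-dlp is a command-line YouTube downloader.",
            "Wider tires have a larger contact patch.",
            "Vancouver rarely gets heavy snow."]
    doc_vecs = np.array([embed(d) for d in docs])

    def rag_answer(question: str, k: int = 2) -> str:
        q = embed(question)
        # rank documents by cosine similarity to the query embedding
        sims = doc_vecs @ q / (np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q))
        context = "\n".join(docs[i] for i in np.argsort(sims)[::-1][:k])
        # "augment" generation by stuffing the retrieved context into the prompt
        return generate(f"Context:\n{context}\n\nQuestion: {question}")

    print(rag_answer("how do I download a youtube video?"))

Retrieve the most plausibly relevant documents first, then let the model answer with those documents in its prompt; the quality of the retrieval step bounds the quality of the answer, which is what that quote means by accuracy of search influencing the generated answers.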


If you like Bing (ChatGPT with RAG), then also give perplexity.ai a try - similar concept, but IMO better executed.

https://www.perplexity.ai/


The topic of control (in ChatGPT like models) explained: https://arxiv.org/pdf/2311.11701.pdf


This explains so much about why people think search results are bad. The author's "Great" result for "Download youtube videos" is "Ideally, the top hit would be yt-dlp or a thin, graphical, wrapper around yt-dlp".

Just give me a website where I can plug in the DL link and download it to my hard drive. I don't care what package they are using (I don't worry about malware like I did in the 90s). 99.999% of people are not programming tinkerers.

Just makes me realize how subjective search results are. All of their "Great" results are my "Terrible" results.


Malware, or well, the actual viruses, in the '90s were a joke, especially because a computer was an isolated thing. Connected computers were the exception.


In the early 90s, yes. By the turn of the century the current industry we see today existed in basic form: malware stole credit cards, compromised PCs were used to send spam as part of botnets, etc. The only major advance was when cryptocurrencies made it much easier to launder money and the professionalism went up accordingly.


The first result on Kagi is exactly this, just tried it a moment ago. It processed and downloaded the video extremely fast. Why would any reasonable person prefer youtube-dl?


IMO, if you're capable of running yt-dlp, it's far better than any website.

It's pretty simple to run these download tools as website, but it's expensive in terms of bandwidth and tends to attract legal attention. So a lot of websites go up supporting it, but even if they were started with good intentions, they will virtually all eventually add intrusive ads or other types of monetization just to break even. So there's never going to be a reliable website for it. If you're lucky, a search engine will send you to one that's working okay right now, but even odds you'll be fighting through a dozen malware nests.

Meanwhile, yt-dlp just works every time, with only an occasional pip upgrade to keep it up to date.
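For anyone who hasn't tried it, this is roughly all there is to it, assuming yt-dlp was installed with pip (the CLI equivalent is just yt-dlp followed by the URL):

    # assumes: pip install --upgrade yt-dlp
    from yt_dlp import YoutubeDL

    url = "https://www.youtube.com/watch?v=IkYVmtgxebU"  # example video from elsewhere in this thread
    with YoutubeDL() as ydl:  # default options; pass a dict to pick formats, output paths, etc.
        ydl.download([url])   # download() takes a list of URLs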


Totally. As the sibling said, it is the same using Google. I am not sure why anyone would want a programming package to accomplish a task that could be done in under 10 seconds.

But again, I guess that's why search is so hard: you have to parse that intent from 3 words.


Over here the first result on Kagi is savefrom.net which variously tries to install malware or sell a paid subscription and does not download videos.


It was the same domain for me, but maybe I should have tried it without adblockers? The page downloaded the YT video completely fine for me.


It is the same using Google.


What always confuses me about the "search has gotten so bad" mentality is that it is often based on anecdotal evidence at best, and anecdotal recollection at worst.

Like, sure, I have the impression that search got worse over the last years, but .. has it really? How could you tell?

And, honestly, this should be a verifiable claim; you can just try the top N search terms from Google trends or whatever and see how they perform. It should be easy to make a benchmark, and yet no one (who complains about this issue) ever bothers to make one.

Dan at least started to provide actual evidence and criteria by which he would score results, but even he only looked at 5 examples. Which really is a small sample size to make any general claims.

So I am left to wonder why there are so many posts about the sentiment that search got worse without anyone ever verifying that claim.


I think the point he's trying to make is that the search results pages from the mainstream search engines are a minefield of scams that a regular person would have difficulty navigating safely.

If he was looking at relevance, yours would be a solid point, but since most of the emphasis is on harm, a smaller sample works. Like "we found used needles in 3 out of 5 playgrounds" doesn't typically garner requests for p-values and error bars.


I think this is a good illustration of my frustration with this discussion: I don't think search has gotten bad, I think the web has gotten bad. It's weird to even conceptualize it as a big graph of useful hypertext documents. That's just wikipedia. The broader web is this much noisier and dubious thing now.

That's bad for google though! Their model is very much predicated on the web having a lot of signal that they can find within the noise. But if it just ... doesn't actually have much signal, then what?


The web has gotten bad because of what big search engines have encouraged. If they stopped incentivizing publishing complete garbage (by ruthlessly delisting low quality sites regardless of their ad quantity, etc) then maybe we'd see a resurgence of good content.


I don't think so. I think it's the inevitable outcome of giving all of humanity the ability to broadcast without curation.

Or maybe we're saying essentially the same thing, but you think search engines should be doing that curation. But that was never my conception of what search engines are for.


I think we are indeed saying the same thing. However, I would like search engines to do some curation -- specifically, to remove results that deliver malware, are clones of other sites, and are just entirely content free (eg Microsoft's forums).

I'll give Google credit: I haven't seen gitmemory or SO clones in a while. It took a few years but they seem to have dealt with them.


I disagree, the bad sites people are talking about are spam, not bad personal takes. They are written by people being paid to churn out content. This is now being done with AI. This is a result of search engines listing them.


I don't think the definition of "spam" is nearly as objective as this suggests it is.


The web is bad because it is both popular and commercial. Every now and then I fantasize that just finding a sufficiently user-hostile corner would suffice to recreate the early internet experience of an online world nearly exclusively populated by anticommercial geeks.


I understand this is the tactic the Gemini folks are using.


But there's still plenty of signal. It isn't as if there are no working YouTube downloaders, or factually correct explanations of how transistors work. It's just that search engines don't know how to (or don't care enough to) disambiguate these good results from the mountains of spam or malware.


I think that both of you are correct. The internet has much more "noise" than in the past (partially due to websites gaming SEO to show up higher in Google's search results). As a result, Google's algorithm returns more "noise" per query now than it used to. It is a less effective filter through the noise.

Imagine Google were like a water filter you install on your kitchen faucet to filter out unwanted chemicals from your drinking water. If as the years progress your municipal tap water starts to contain a higher baseline of unwanted chemicals, and as a result the filter begins to let through more chemicals than it did before, you'd consider your filter pretty cruddy for its use case. At the bare minimum you'd call it outdated. That is what is happening to Google search.


On the one hand, I'm not sure the data corroborates that. If this is a web problem and not a search engine problem, then I'd expect every search engine to have the same pattern of scam results.

I'd also argue that finding relevant results among a sea of irrelevant results is the primary function of a search engine. This was as true in 1998 as it is today. In fact, it was Google's "killer feature", unlike Altavista and the likes it showed you far more relevant results.


Relevance is a difficult concept to agree on. In 1998 it was more about X != Y, that is, being shown legit pages that simply weren't on the correct topic.

These days the results are apt to be on the correct topic, but optimized for some metric other than what the user wants -- for example, getting you to download malware or showing as many crypto ads as possible.

I don't expect every search engine to have the same scam results. Scammers target individual search engines with particular methodologies. Google does a lot of work to prevent crap on their engine; the issue is that the scammers in total do far more.


If the web is being polluted by a nefarious search engine provider that is excluding the polluted pages from their algorithm, you wouldn't see the same pattern across search engines

Not saying or even suggesting that's happening, but the logic isn't airtight


Well, there's always the Münchhausen trilemma, by which no reasoning is airtight.


> I think the point he's trying to make that the search results page from the mainstream search engines are a minefield of scams that a regular person would have difficulty navigating safely.

Yes, and he makes the point well. It also means if you are part of the 0.49% of people who use Firefox on Android, he isn't talking about your experience. I find Firefox mobile remaining at 0.49% utterly inexplicable, which I guess just goes to show how out of touch with the mainstream I (and I assume most other people here) are.

It's not just ad blockers. My first attempt at a tyre width query got relevant results, mostly because "tyre grip" looked so bad as a search term that I used "traction" instead. Meanwhile, friends my age (60s) can't get an internet search for public toilets to return results they can understand. When I try to help them, their eyes glaze over in short order and they wave me away in frustration. These mind games with google hold no interest for them.

I am regularly bitten by one thing he mentions: finding old results is hard, and getting harder. It makes answering questions about historical trends ("am I wrong about what it was like back then?") really difficult.


I agree we can say "this is a minefield of scams" without doing a comparison.

There still is a question about when it got bad -- I think Dan mentions 2016 as a point of comparison, and there were plenty of scams back then, so you might wonder whether there ever were days when a query wouldn't return many scams.

If you go back far enough, then there wasn't the same kind of SEO, and Internet scams were much smaller/less organized, but that's a long time ago.


I think the major change is the automation tools for scams. In the distant past it was humans doing this; now I'm guessing there are a few larger businesses, and likely nation states, with a point-and-click interface that removes 99% of the past work.


I don't think this is a fair criticism.

1) The step where you evaluate "how they perform" is necessarily subjective.

2) You could design a study and recruit participants, but that isn't something a blogger is going to do.

3) He does link to polls where people agree with the idea that results have gotten worse. Yeah, there are sampling problems with a poll, but it's better than nothing.

In this case especially, the writer is answering the question: "Whose results are best according to my tastes?"


> What always confuses me about the "search has gotten so bad" mentality is that it is often based on anecdotal evidence at best, and anecdotal recollection at worst.

I can't speak for anybody else, just trying to find stuff online, not writing a treatise about it or writing my own engine to outcompete Google. It's been asked many times here over the years and the answer was always explanations, never solutions.

Shittification does not happen overnight, but along many years. It started with Google deciding that some search terms weren't so popular: "did you mean...?" (forcing a second click to do what you intended to do in the first place) and went downhill when qualifiers to override that crap got ignored.

For me enough was enough when I realized that a simple query with three words, chosen carefully to point to the desired page, gave thousands of results, none of them relevant. YMMV.


Dan approached the problem from a qualitative perspective. Perhaps if more people took this approach over quantitative maximalism we would actually have products that don’t drive us fucking insane.

All that matters is the overwhelming sentiment that search has gotten worse, not the same fucking spreadsheet that got us here in the first place!


To do this you would need to have a comprehensive definition of "quality", and that's anything but easy, and it will be at least partly subjective. It's also hard to include omissions in your definition of "quality" (and again, what should or should not be omitted is subjective as well).

For example, let's say I search for "Gaza"; at one extreme, some engines might focus only on recent events, whereas others may ignore recent events and include only general information. Is one higher "quality" than the other? Not really – it depends what you're looking for, innit?

All you can really do is make a subjective list of things you find important and rate things according to that, and this is basically just the same thing as an anecdotal account but with extra steps.


Some things are easily quantifiable, but very few -- such as the number of ads per search. Back in the day Google had at most one ad, and it was visibly distinct from the rest of the links.

Otherwise, yeah, maybe search didn't degrade but the internet got more spammy. Or maybe users just got wiser and can see through the smoke screen better. Who knows...

Doesn't change the fact that today one has to know how to filter through pages of generic results made by low-effort content farms -- results of dubious validity, which at best simply waste your time -- or through clones of other websites (e.g. Stack Overflow clones).

Search engines can choose to help with that (kagi certainly puts in the effort and I love it for that), or they can ignore the problem and milk you for ad clicks.

Anecdotal evidence is good enough for me.


> Dan at least started to provide actual evidence and criteria by which he would score results, but even he only looked at 5 examples. Which really is a small sample size to make any general claims.

US NIST, in their annual TREC evaluation of search systems in the scientific/academic world, use sets of 25 or 50 queries (confusingly called "topics" in the jargon).

For each, a mandated data collection is searched by retired intelligence analysts to find (almost) all relevant results, which are represented by document ID for general search and by a regular expression that matches the relevant answer for question answering (when that was evaluated, 1998-2006).

Such an approach is expensive but has the advantage of being reusable.


So you're confused about why other people aren't doing research for you, and when they do provide some evidence, you dismiss it because it's not a large-scale scientific inquiry into search quality? Get a frickin' grip.


Every time I encounter an egregiously poor result in DDG, I document it with images. I have a directory of them from the last few years. I encounter so many now, whereas when I first began using DDG, just a couple of years prior to that, it was less of an issue (and I fully switched at the time). So yeah, I don't have before/after comparisons, but it's a little more solid than the 'I feel the results are worse' being characterized here.

There are particular search parameters that DDG changed the behavior of, including exclusion and double quoting, which are now, according to even their own docs, more a hint of the direction results should go rather than any explicit/literal command (ime these virtually never work, which was a motivation for documenting failures, and they actually removed them from their docs temporarily at one point earlier this year).


Yes, to get an accurate comparison we would need results from queries made 10 years ago.

I still remember often having to go to page 3 or beyond of Google searches to find things, even really early on.

I think it has never been good; it got a bit better before SEO farms took the gains back. That's my feeling, with nothing to back it.


> So I am left to wonder why there are so many posts about the sentiment that search got worse without anyone ever verifying that claim.

I suspect it has gotten worse, so posts complaining about it resonate. But, it is not really a huge problem, and anyway it isn’t as if there’s much I can do about it, so I’m not going to bother collecting statistically valid data.

I think this is generally true about a lot of things. We should be OK with admitting that we aren’t all that data-driven and lots of our beliefs are based on anecdotes bouncing around in conversations. Lots of things are not really very important. And IMO we should better signal that our preferences and opinions aren’t facts; far too many people mix up the two from what I’ve seen.


When it comes to human psychology, what we believe tends to matter more than what actually is, at least for predicting our future actions. If people think search sucks, then they'll likely use it less in the future, and that opens up companies like Google to disruption.


Internet Archive remembers. https://web.archive.org/web/*/google.com/search/%2A

Find a query of interest, see for yourself (and take a snapshot of the present state for posterity).

The api enables more powerful queries, https://web.archive.org/cdx/search/cdx?url=google.co.jp*&pag...

Also try other search engines and languages.
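A small sketch of scripting the CDX endpoint, using only parameters from the public API docs (adjust url and limit to taste):

    import requests

    resp = requests.get(
        "https://web.archive.org/cdx/search/cdx",
        params={"url": "google.com/search*", "output": "json", "limit": 5},
        timeout=30,
    )
    rows = resp.json()
    for row in rows[1:]:  # the first row is the field header
        timestamp, original = row[1], row[2]
        print(f"https://web.archive.org/web/{timestamp}/{original}")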


Even without looking at the subjective quality of search results, the sheer user hostility of the design of the Google search results page is an obvious, objective instance of how search has enshittified.

That is, in the early days, Google used to highlight that "search position couldn't be gamed/bought" as one of their primary differentiators, ads were clearly displayed with a distinct yellow background, and there weren't that many ads. Nowadays, when I do any remotely commercial search the entire first page and a half at least on mobile is ads, and the only thing that differentiates ads from organic results is a tiny piece of "Sponsored" text.


> has it really? How could you tell?

Yes it has and for a certain class of queries it's not even open for debate, because Google themselves have stated they deliberately made it worse. And they really did, it's very noticeable.

This class of queries is for anything related to any perspective deemed "non authoritative". Try to find information that contradicts the US Government on medical questions, for example, and even when you know what page you're looking for you won't be able to find it except via the most specific forcing e.g. exact quoted substrings.

Likewise, try finding stories that are mostly covered by Breitbart on Google and you won't be able to. They suppress conservative news sites to stop them ranking.

15 years ago Google wasn't doing that. It would usually return what you were looking for regardless of topic. There are now many topics - which specifically is a secret - on which the result quality is deliberately trashed because they'd prefer to show you the wrong results in an attempt to change your mind about something, than the results you actually asked for.


Probably for the same reason that, on any topic, there are so many more posts that make claims than posts that explore evidence systematically, especially when the people making the posts stand to gain nothing by spending their time that way.

I encounter claims that "protobuf is faster than json" pretty regularly but it seems like nobody has actually benchmarked this. Typical protobuf decoder benchmarks say that protobuf decodes ~5x slower than json, and I don't think it's ~5x smaller for the same document, but I'm also not dedicating my weekend to convincing other people about this.
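For what it's worth, the JSON half of such a micro-benchmark is only a few lines; the protobuf half needs a schema compiled with protoc first, so it's only sketched in comments here (message_pb2 is a hypothetical generated module, not a real package):

    import json
    import timeit

    doc = {"id": 123, "name": "example", "tags": ["a", "b", "c"]}
    raw_json = json.dumps(doc).encode()

    t = timeit.timeit(lambda: json.loads(raw_json), number=100_000)
    print(f"json decode x100k: {t:.3f}s")

    # Hypothetical protobuf side, after generating message_pb2 with protoc:
    # import message_pb2
    # raw_pb = message_pb2.Doc(id=123, name="example", tags=["a", "b", "c"]).SerializeToString()
    # t = timeit.timeit(lambda: message_pb2.Doc.FromString(raw_pb), number=100_000)

Of course, the outcome depends heavily on which decoder each side uses, which is part of why the claim rarely gets pinned down.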


The problem with benchmarking that claim is there's no one true "json decoder" that everyone uses. You choose one based on your language -- JSON.parse if you're using JS, serde_json if you're using Rust, etc.

So what people are actually saying is, a typical protobuf implementation decodes faster than a typical JSON implementation for a typical serialized object -- and that's true in my experience.

Tying this back into the thread topic of search engine results, I googled "protobuf json benchmark" and the first result is this Golang benchmark which seems relevant. https://shijuvar.medium.com/benchmarking-protocol-buffers-js... Results for specific languages like "rust protobuf json benchmark" also look nice and relevant, but I'm not gonna click on all these links to verify.

In my experience programming searches tend to get much better results than other types of searches, so I think the article's claim still holds.


I agree. You wouldn't use encoding/json or serde-json if you had to deserialize a lot of json and you cared about latency, throughput, or power costs. A typical protobuf decoder would be better.


If you wanna know why Google (or any search engine) sucks, just look at how it measures its own search results. Most search companies do this “at scale” according to very specific guidelines, like what the author did here but on steroids. For example, take a look at Google’s 168-page instruction manual for search quality raters:

https://static.googleusercontent.com/media/guidelines.raterh...

It talks about figuring out a query’s meaning(s), judging the user’s intent (were they looking for some specific answer, etc.), evaluating the “quality” of a website, rating the site’s usefulness in relation to the query’s meaning/intent, etc.

All this is to say, it's not that search companies don't do exactly what the author did here; it's just that they have different standards than the author. And I'd venture their standards match their users' better than the author's do -- but maybe not, or not forever, anyway.


I really don't think that's true. For example, page 29 of your link describes "Lowest Quality Content." Most of the search results that the author rated as spammy or scammy clearly fit these guidelines, which means that either (1) the raters aren't knowledgeable enough about the subject matter to determine that the website they're rating is harmful or misleading; or (2) the raters are rating these sites correctly, but it still isn't having the desired effect.


> If you wanna know why Google (or any search engine) sucks

While I obviously don't know, it may be related to how Google believes a "normal" person searches. I have come to view Google as a product search engine/price comparison site; that's what it's great at. Google can find you the most relevant products for any purchase you may consider, so maybe that's what Google has optimized for. The majority of my searches are related to IT, programming, software and computers in general, but what do "normal" people search for? They search for products, news, opening hours for a store. Google is pretty decent at that, but the money is in the "go buy something". The ads on a product search on Google are always way more accurate than the actual search results.

I think Google has optimized for selling products.


Why would an average user want blog spam search results?

My hope is that as LLMs improve, they can be more discriminating about the results returned.


> Why would an average user want blog spam search results?

I didn’t say they would :)

In fact, I can’t figure out how your comment relates to mine. Are you claiming that Google doesn’t factor blog spamminess into its evaluation of search results? If so, that’s quickly put to bed by the document I linked, pretty much section 4.6. Excerpt:

> Creating an abundance of content with little effort or originality with no editing or manual curation is often the defining attribute of spammy websites.

You could claim that they fail to capture some essential quality of “blog spamitude” or that they don’t weight it heavily enough in their eval but to say they just, like, don’t know about blogspam over there, is pretty far fetched IMO.


I was responding to this part, “And I’d venture their standards match their users’ better than the author’s”, which I understood to mean that people like these SEO’d results.


I'm sorry, but the very first request is evaluated completely wrong. When people search for a YouTube downloader, they want a website that lets them download a YouTube video, not a command-line tool. And the first results given by Google do that. I'm one of the people who think Google search became bad, but it's not because of this kind of search.


That's the tricky mind-reading aspect about search intent.

Different people have varying expectations as to what they want to find with the same query. I'd definitely want yt-dlp in favor of some website.


It's easy: just append "command line" to the query, like you would append "android app" if you wanted an Android app.


That is a solution from the user's POV; I was speaking from the search engine's POV.


Based on your handle, I suspect you have much better insight into this than the rest of us!

But can the search engine mind-read by assuming Windows users don’t want to use a command-line utility?


They can based on user tracking and profiling, but that's murky waters I personally don't want to dip into.


I assume you meant to say you don't want to! :)


Yeah I accidentally a word.


I think that Newpipe would be better than some website or even yt-dlp...


They do not do that, have you tried using them?


The two I tested were downloading the videos.


I'm not able to reproduce the author's bad results in Kagi, at all. What I'm seeing when searching the same terms is fantastic in comparison. I don't know what went wrong there.

In the Youtube Downloader search, NortonSafeWeb is nowhere to be found. I get a couple of legit downloader websites, and some articles from reputable tech newspapers on how to use them or command line tools.

In the Adblock search, ublock Origin is #3, followed by some blogs about ad blocking ethics debates and the bullshit Google has been pulling recently.

In the wider tires grip search, #3 is a physics blog that dives deep into the topic.

In the transistors search, the first reddit link directly answers the question in very similar wording to the hypothetical correct answer spelled out in the rubric. 4/5 of the reddit results are on the correct topic, followed by two SuperUser questions also on the correct topic, then some Linus Tech Tips and Tom's Hardware articles, also on the correct topic. No Quora questions.

In the vancouver winter snow search, the first several results are from local news papers talking about the anticipated effects of el nino on snowfall, and then a couple of high-quality blogs and weather sites.

Really wondering how Dan got such bad results.

------

Aside from that, the way that the author expects all the results to return the same kind of thing is just... weird? Like, that's not how search engines are supposed to work. A search that gives you 10 links to fundamentally the same thing is a bad search. Search results should cover a breadth of reasonable guesses for what you should be looking for given a query. If you search for "download firefox", and you scroll past the first 5 download links, then you're probably not actually looking for a download link and a blog post about firefox is not "irrelevant" and shouldn't be points against.

This opinion is even borne out in search engine quality metrics that have been industry-standard for decades, like mean reciprocal rank and discounted cumulative gain. What matters is how far you have to scroll to get to a good result, not what proportion of the first N results are good.
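For concreteness, a tiny sketch of both metrics; rels is a list of 0/1 relevance labels for one query's ranked results, top first:

    import math

    def reciprocal_rank(rels):
        # 1/rank of the first relevant result; 0 if there is none
        for i, r in enumerate(rels, start=1):
            if r:
                return 1.0 / i
        return 0.0

    def dcg(rels):
        # discounted cumulative gain: relevant results count for less the lower they rank
        return sum(r / math.log2(i + 1) for i, r in enumerate(rels, start=1))

    per_query = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
    print(sum(reciprocal_rank(q) for q in per_query) / len(per_query))  # mean reciprocal rank
    print([round(dcg(q), 3) for q in per_query])

Both reward a relevant result near the top and barely penalize junk further down, which matches how people actually scan results.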


What region? I get similarly bad results with international (and a quick check with region US also didn’t improve things) and uBo at only #5, and ytdl at #12. And I already have github on "raise" and a bunch of domains blocked (not many though)

For the transistor query, it’s a very "googly" way of writing a query; when I saw the results I instantly felt like rewriting it, and the first try gave much better results with "Why keep cpu transistors getting smaller?". Caveat: the results look better and more topical, but I don’t know what a good answer would be, which is also why I didn’t evaluate the tires or Vancouver weather queries (I tried a local search for my city's weather, and while the first result was unrelated, the 2nd was okay).

edit: This whole thread made me finally create a file for documenting bad searches on Kagi. The issue for me is usually that they drop very important search terms from the query and give me unrelated results. But switching to verbatim or "forced terms" also prevents any kind of error correction of the search. This used to be one of my main annoyances with DDG back then, and Kagi did not have that issue during the early days.


I have a new Kagi account with no custom rankings and I see the same terrible results. Basically the same as what he describes. yt-dlp is not found at all, the 2010 link to youtube-dl, and a bunch of spam sites.


Same here. I was curious about Kagi's low ranking and couldn't replicate the search results. I also saw uBlock Origin at #3, good results for tires, transistors and snow, etc. I've never used any of Kagi's search result weighting features.

Ctrl+F on the page for "System prompt" doesn't show any hits. Given how important those are for ChatGPT (another thought - was the author testing GPT3.5 or 4?) I'm not sure how much weight to put into the ChatGPT results either.

Not sure how much I can take away from this comparison.


I asked GPT-4 about Youtube Downloader and it rambled on about how downloading videos is against Youtube’s TOS and I should buy YouTube premium which has the download feature.

Getting any useful data from GPT-4 about anything even remotely “illegal” is a waste of time.


With a better prompt, you can get it to list some, but it’s very annoying to do so.

Mistral showed that their medium model is far better (yet not good), and the same prompt as in the article gives only one instead of 3 paragraphs of rambling about copyright, and then lists 3 categories of options with examples for each (not good, because ytdl is not one of those listed).

Funnily enough, both mistral and GPT4 apologize profoundly and almost with the same wording when asked "Why did you not mention the very popular, free and open source "youtube-dl" software?" and then mention how/where to get it and how to use it.


> Funnily enough, both mistral and GPT4 apologize profoundly and almost with the same wording when asked "Why did you not mention the very popular, free and open source "youtube-dl" software?"

Likely because they were optimized for the general population, which would not have a use for a command-line Python utility.


I’m clear on why they didn’t include it; I wanted them to tell me why, though. And I thought that both of them apologizing in almost the same way was funny.


It's plausible that mistral trained on GPT-4 output and therefore has similar mannerisms.



The author already alludes to the fact that you can probably prompt-engineer around this and indeed, as soon as I added a blurb like "these are my own videos that I own the copyright to" it did suggest a bunch of third-party tools and let me ask it about what third-party tools I could use.

It suggested '4K Video Downloader', 'YTD Video Downloader', 'JDownloader' and 'Clipgrab' at first and when I asked for cli tools it came with 'youtube-dl', 'yt-dlp', and 'ffmpeg'

Those seem pretty reasonable results to me but I'll readily admit I don't know (yet) if 'most users' would ask these follow-up questions.


So it has also become one of the glitterati. That didn't take long.


claude.ai produced pretty reasonable results.


I'll second the chorus of those curious to hear how you've customized the search engine. I was able to reproduce the lackluster results, and was sadly disappointed. I expected what you seem to have found, that Kagi would outperform.

A specific example: for "ad blocker" the first result was some paid ad blocker and ublock was down the page below the fold.


I use Kagi because I'm trying to remove Google from my life, but their text search is worse than Google in my experience, and the image search is abysmal. I'm wondering how long I can keep this up. I already revert to Google for image search, and am finding myself using either Google or ChatGPT over Kagi more and more for text as well.


Kagi had a pretty substantial image search update just a few days ago [1]. Do you still see the issues with it?

[1] https://kagi.com/changelog#2793


Good info - will experiment!

It's already performing better on a (n=1) test I tried.

"Talos Principle 2". (Video game sequel) Previously (~5 days ago), Google returned various screenshots etc from the game `The Talos Principle 2`. Kagi returned mostly results from `The Talos Principle (1)`. Now the latest Kagi results are a mix, mostly from 2. So, it does look like it fixed this query.


have you customized your results and lowered or raised many domains?


Kagi is awesome for me too. I only realize I'm using Google somewhere else because of the shit results.


The issue with traditional search engines is that keyword-first algorithms are extremely gameable.

Try https://search.metaphor.systems - it's fully neural embeddings-based search. No keywords, only an embedding of what the actual content of a webpage is.

So in the mentioned example of searching for Youtube downloaders, with Metaphor you'll get only Youtube downloaders (https://search.metaphor.systems/search?q=This%20is%20the%20b...)

Full disclosure - I work there :p


How is that different from keywords? Embeddings aren't magic, they're just page content. Content is trivial to game since it's controlled by the website owner.

edit: From my quick QA, the results are also not that great. Searching for "what is the best mouse to buy" leads to links to buy random mice rather than review summaries or online discussions of mice. One of the recommended queries, "Here is a great fun concert in San Francisco", leads to some really bizarre results in non-English languages that have nothing to do with either SF or concerts.

edit2: Also, Google has been using LLMs part of their search since at least 2018 so definitely not just keyword matching there.


Yup, definitely still gameable but if the model learns what high quality content is like and what high quality webpages there are (which it does), then the only way to game would be to be great :)

For your search - I would recommend turning autoprompt off and searching something like "Here is a great summary of the best computer mice to use:".

Our embeddings model is trained on how links are talked about on the Internet, if that helps with querying. So you have to query like how someone would refer to a link before sharing it


> Our embeddings model is trained on how links are talked about on the Internet, if that helps with querying. So you have to query like how someone would refer to a link before sharing it

So it's not high quality web pages but web pages that people talk about a lot which is expected since no one has an oracle that says what high quality is. The embeddings are merely a proxy and generalization for "how links are talked about on the Internet." That can be gamed at scale just like every other signal any popular search engine has been based off of.


That's true, although it should be much harder.


The first result vtubego.com is a 144MB downloader app. The page contains "Pricing Plans Lorem ipsum dolor sit amet, placerat verterem luptatum phaedrum vis, impetus mandamus id vix fabulas vim." above its 3 paid plans (there is no free plan).

I haven't installed the downloader app, so I'm not sure if it lets me download youtube videos for free.

The second result "ytder.com" is a redirect to "https://poperblocker.com/edge/" which seems to be a browser extension for Microsoft Edge that protects the user from the Holy See. I'm not using Edge and I'm trying to download a Youtube video.

The third result download-video.net says that it can download videos from a list of sites. Youtube is not in the list, but let's try anyway. If you put "https://www.youtube.com/watch?v=IkYVmtgxebU" into the text box and click "download" you get "500 SyntaxError: Unexpected token '<', ""

At this point I gave up, but please let me know if any of the results work.


This is excellent!

Definitely excited to see how it holds up to daily use.

So far it gave me exactly what I wanted at the top for all of my test queries that were well formed.

As for asking “ignorant” questions, both your service and the goog failed where phind gave me an actionable starting point (after a prodding follow-up question: https://www.phind.com/search?cache=hmul4znpn7y4ei6qa64fosmc )

“max-height like css property for top and left”

Unsure if this sort of thing is even a goal of your project, but you won over a new user.

Wish you and your team all the best.


> with Metaphor you'll get only Youtube downloaders

I clicked into the top 5 results, none of them were real youtube downloaders that worked, so I clicked the next 5 results, then I finally got one single (really slow) downloader that worked. 1 out of 10 top results


https://getthatvideo.com/ Is the first result for downloading YouTube videos. Seems super sus (especially since the site doesn’t load).

Auto-prompted to: "Here's a helpful website for downloading YouTube videos:"

Also, this result is horrible:

“What does it mean if someone is not covered in nfl football?”


>it's fully neural embeddings-based search. No keywords, only an embedding of what the actual content of a webpage is.

What prevents websites from gaming their embedding? Switching to a similarity search doesn't prevent the results from being gamed.


So far so good. I'll try using this first from now on, and see how it does. Good luck!


How do you deal with dynamically/contextually generated content? And how about paywalls and login-required content?


We do our best to get the right content.

For paywalls/login, we play it pretty straight: we always obey robots.txt, etc.


Just wait until the content farms adapt


I noticed that the author uses ChatGPT 3.5 rather than 4, which is a rather large difference. I don't have the knowledge to re-rank all the questions the author asked, but I will say that a test with ChatGPT 4 leads me directly to youtube-dl, which is better than every other search engine listed.


That was the first thing I checked when reading the article. Although the argument would be that 3.5 is free - any comparison of systems against ChatGPT that isn't using ChatGPT 4 can be dismissed almost out of hand; there is not much point talking about ChatGPT if it's not using ChatGPT 4 and making proper use of its capabilities.

That is not to say that there aren't valid criticisms of and shortcomings in ChatGPT 4 - just that it's not useful to say ChatGPT when it's referring to 3.5


This is silly, most people aren't going to pay for ChatGPT, just like they won't pay for Google or DDG. So using 3.5 in this case is perfectly acceptable when we're talking about free software.


Kagi isn't free, that's on the list


>any comparison of systems against ChatGPT that isn't using ChatGPT 4 can be dismissed almost out of hand

Does everyone, or even most people, use ChatGPT 4? The most used version is, of course, by far the most relevant.


ChatGPT 3.5 was great, until 4 came out and now it is garbage in comparison.

But I suppose what I really want is for everyone who includes ChatGPT in comparisons to explicitly say which version they are using (and, if they are using 3.5 in their comparison I hope they at least try 4 first) and definitely not just say "ChatGPT" when they only mean 3.5. The difference really is that stark.


He gives the full queries - do you have ChatGPT 4 that you can run them against?


Sure. Bear in mind I have custom instructions active - which, if you want to make full and proper use of ChatGPT, you should configure, along with customised GPTs - so I get lots of dot-point descriptions, because that's what I've asked for.

Also I would not normally write ChatGPT queries the same as I write them for search engines but for the sake of comparison, I'll use their queries verbatim except where my custom instructions affect the context too much.

> download youtube videos

https://chat.openai.com/share/3e18e4f0-5527-4479-8a2f-ef17bd...

I got - good results. They got - "Very bad results (fails to return any kind of useful result) ChatGPT: basically refuses to answer the question, although you can probably prompt engineer your way to an answer if you don't just naively ask the question you want answered".

> [What] ad blocker [can I use?]

https://chat.openai.com/share/e1985d7a-c89f-4b5e-bb59-70bd11...

Looks good to me

> download firefox

https://chat.openai.com/share/3a62e5ae-8dbd-4179-8eb0-cc38ee...

Also good

> Why do wider tires have better grip?

> [Provide links to scientific sites that describe] why wider tires have better grip?

https://chat.openai.com/share/8cbcd1dc-b23f-41f3-83ad-f43f3d...

Honestly, I have no idea if this is a good answer or not. But I don't use ChatGPT for answers that I don't have confidence that I can determine its veracity; if I needed to know this with certainty, I'd use ChatGPT as a jumping off point for my own research.

> Why do they keep making cpu transistors smaller?

> [Provide links to scientific sites that describe] why do they keep making cpu transistors smaller?

https://chat.openai.com/share/dbb97ac0-840c-402c-a917-657af6...

> vancouver snow forecast winter 2023

> Environment Canada winter 2023

https://chat.openai.com/share/aab017d7-f86b-49c9-b5c0-86a0b1...

I don't know if almanac.com is any good but giving it the specific "Environment Canada winter 2023" query gave the expected very good result.

I think ChatGPT 4 generally provided very good results for the test queries, if you tailor the queries just slightly for the format


> I will say that a test of ChatGPT 4 leads me directly to youtube-dl

And yet to other people it starts rambling about how that’s wrong and you shouldn’t do it and doesn’t give a usable answer.

https://news.ycombinator.com/item?id=38822040

It boggles the mind the extent to which people salivate over a system that cannot decide between a correct straight answer, something wrong but plausible, something wrong and impossible, or outright refusing to answer.


That's GPT 3.5. It sounds like you have a bit of an axe to grind with ChatGPT, but if you're going to do so, do try to grind it on the correct version.


The comment says it’s v4. Since there’s no information on the page either way (funny, considering the original complaint), I took them at their word. If you don’t believe them, that’s up to you.

For what it’s worth, I do have access to v4 and it did give me an answer right now. But since I also know even v4 can give you wildly different answers to the same question even if you ask them one right after another, that doesn’t prove it either way.


I’ve come to recognize that any article that uses 3.5 has an agenda.


I also suspect as much, but obviously can't know for sure. IMHO it's intellectually lazy if not dishonest to benchmark against 3.5 and not make that fact clearly known upfront

A better benchmark would have had two entries for ChatGPT, showing both 3.5 and 4 results


The agenda of not wanting to pay for something just to test it out when there is a free version?


The agenda of using the significantly shitty version to try to paint it in a poor light.


> Here's a fun experiment to try. Take an open source project such as yt-dlp and try to find it from a very generic term like "youtube downloader". You won't be able to find it because of all of the content farms that try to rank at the top for that term. Even though yt-dlp is probably actually what you want for a tool to download video from YouTube.

Is that true? Do most people want to install a command line tool to download youtube videos?


No. They want sites like savefrom.net - which is hit number one on Google.


Did you try using savefrom.net? You can type "https://www.youtube.com/watch?v=IkYVmtgxebU" into the text box and hit "Download". Then you'll get a new tab that tries to get you to install malware. If you decline to install it, the new tab takes you to the malware's homepage. If you close the tab and go back to the original tab, savefrom.net presents you with an error message saying "The download link not found." and does not help you download the video.


I tried this. I went to savefrom.net. First thing it does is ask permission to send notifications.

After that there is a popup asking me if I want to continue in the browser or download their app. If I click download, it downloads a file called download_helper_2.3.27.apk.

Instead of downloading their app, if I paste a YouTube link, it tells me I can wait or download their APK to skip waiting. The download link downloads an older version called download_helper_2.3.19.apk.

When I do the process again, instead of the older APK link it gives me a Chrome extension link. But if you look at the instructions you see that it's not a Chrome extension, but a minified userscript. And it has `@include https://*` so it can basically run on any website regardless of clicking on an extension icon like regular browser extensions.

If I try to ignore all the distractions and wait for the download link, I can click it and it downloads the MP4 file. But it also opens a popunder with the domain https://refpamjeql.top/.

Not the best experience, and seems like a high risk of getting malware, but it does get an MP4 file at some point.


Interesting! I tried again and got completely different results this time. Now there's no malware tab, and instead it tries to get me to pay for a subscription to download high-quality videos or MP3s. If I click the barely-visible "Just let me download in my browser with low quality" below the paid subscription button, I get the same error as before.

Edit: the paid subscription payment flow says I'm actually buying "Televzr Premium Max Subscription for 1 Month_mp -- Televzr helps get wireless access to the media library on the computer from the mobile phone"

So it purports to be something unrelated to downloading youtube videos. I didn't pay 1400 yen for it, so I won't get to find out if it helps me download youtube videos.


You would think that a browser extension would be better than a website.... that's what I do with Soundcloud anyways....


Search was the biggest feature of the web in the early '00s. Now it's such a mess. I can't imagine Search will ever be amazing again, given all the complexity of providing quality while still avoiding all the crap.


Is it actually more complex to provide good results, or is it just more profitable not to?

I have a hard time believing an organization like Google doesn't have the resources to provide a search engine that's just as usable as what they had 6 years ago (around the time I feel like the decay really set in). Seems a lot more likely that it's just more profitable to serve up garbage sponsored content.


Definitely more profitable not to. Especially as Google is an ad company, not a search company.

I’d rather see a world with numerous paid/subscription search engines, that are motivated to do nothing but return search results well. I expect you would see some of the SEO crap getting solved.


i can't remember where i read this, but something about how google ranks sites that have google ads higher than sites that don't. makes sense; it's evil, but it makes sense. that's why we get all this scraped spam. is there any more info on this?


Intentionally ranking sites with Google ads higher would be a huge antitrust liability, so no way they're doing that.

On the other hand, they can achieve virtually the same outcome while keeping plausible deniability by just not doing anything that would downrank sites with ads (of which a significant chunk is likely to be Google's).

Spam sites often include ads.


i don't think they publicly disclose that fact


It doesn't need to be public to become an antitrust liability. Internal written material can still come up during discovery, potentially even in unrelated cases.

Therefore the safest option is to never openly discuss it or intentionally do it and instead use other means to achieve the objective (don't intentionally rank spam higher, just defund/cancel any projects that would make it rank lower).


> Internal written material can still come up during discovery, potentially even in unrelated cases.

Yup, and I think we've seen how careless and thoughtless Google is, as an organization, with internal comms in the Epic case. It would be shocking if it hadn't been discovered in that, or a prior, lawsuit.


this is like focusing on one single problem as the cause of the decline of the United States. It's actually a lot of things combining, and there's not going to be one fix


wtf decline of the United States are you talking about


Google, or Alphabet, is not a search engine company. It's an ads company, and that's what they are optimizing for.


To me it is all down to the ads: Google and Bing return nothing but ads on the first page. Plus, for the joy of seeing these ads, I need to go through a CAPTCHA that takes multiple tries.

But all in all, a very good article


Search probably hasn't changed much, but the internet is very different.


Yeah, the problem is that there is so much low-quality content that search doesn't (or can't) do a good job of surfacing the good material above the noise. There is still some signal left, but it's such a small fraction that it's much more difficult to pick out.

Having said that, I’m usually still able to find what I’m looking for, if I know that it likely exists, and know the keywords to use to find it. But it’s much harder nowadays for sure.


i have a radio that can "hear" down to -130 dBm; i've proven this empirically. Cellular signals work at 12 dB or more below the noise floor, and WSPR works even lower than that. Lightning is broadband noise, and yet i can still use digital stuff when there are lightning storms.

I don't buy the signal-to-noise argument. For example, whenever i get on youtube and get fed some content, i can immediately tell if it's had AI involved anywhere, and i thumb it down. I won't recommend it, and i've called people out for linking such tripe to me (or others).

Hear me out - google got bad about 11 years ago when the dorking stopped being effective, right around the time of the spotlight search results and the sponsored junk taking the top results. Around this time, various agencies (news, etc) started gaming the SEO to respond to any remotely related search with whatever the news was currently. Google chose not to "fix" this, because we're not the customer. DDG was better for a few years for real results, too, but that has gone downhill as well.

The current zeitgeist uses stuff like tiktok and facebook for "web searches" - "food trucks near Austin, TX" or so. No one really uses web search like people on this site do, and google couldn't care less if we don't like the search results.


I wonder how much influence Google had in lowering content quality over the years? After all, most SEO spam was a direct response to all the ludicrous requirements they've forced on the whole web, which eventually only SEO spammers were willing to commit to.

I also wonder if google just stopped existing, would the web heal over time?


The problem is that even if providers of the service are 100% trying to provide a great service, everyone on the web will always be min-maxing to appear on top.

So it's inevitably going to become crap.


The golden era of search results is very much over. Welcome to the pot-metal era.


I would love to see Perplexity.ai in the benchmark. It has completely replaced Google/DDG for information questions for me. I still use DDG when I want to do a navigational query (e.g. find the URL for a blog whose name I partially recall).


While Kagi was the product that brought me the most joy in 2022, perplexity.ai has been the one for 2023, even though I only recently started using it. It's just been a joy to be able to iteratively discuss most of my searches.

EDIT: here's a search for "tire" (I don't know anything about tires, so maybe there are much better links out there, but this is pretty much what I was expecting. Not an ad or SEO in sight.) https://www.perplexity.ai/search/tire-3iuI9T6BQUSvu2tAhgsRmA...


I am wondering if you can use AI chat exclusively for your search needs? If not, what does the perfect integration look like?


I've been really enjoying Perplexity as well. It's a much better Internet/search focused experience than ChatGPT, Bing, or Bard. For anyone interested, until the new year (~20 more hours?) there's a code for 2mo free Pro: https://twitter.com/perplexity_ai/status/1738255102191022359 (more file uploads, choose your model including GPT4)


Me too. I only heard about it this morning and it looks kinda perfect so far.


I think the result grading is too opinionated here.

For example, the first query is "download YouTube videos", for which Google is ranked "terrible" for not showing a command-line, open-source program. But the literal first result is an ad-supported site where I can paste in a YouTube link and download it right from the browser. That seems like exactly what most people would want, as opposed to the CLI tool the author was searching for. The author seemed to be judging results by the absence of ads more than by search relevance.

Search is a heavily gamed system with a lot of SEO-spam results, but I think a much better analysis could be done for more meaningful results. Also, I recreated some of the searches and got very different results (including uBlock Origin in the top three). Again, a more scientific ranking system could help uncover better data on searches.


The author describes that site as such, which seems fair to rate as "terrible":

> Some youtube downloader site. Has lots of assurances that the website and the tool are safe because they've been checked by "Norton SafeWeb". Interacting with the site at all prompts you to install a browser extension and enable notifications. Trying to download any video gives you a full page pop-over for extension installation for something called CyberShield. There appears to be no way to dismiss the popover without clicking on something to try to install it. After going through the links but then choosing not to install CyberShield, no video downloads. Googling "cybershield chrome extension" returns a knowledge card with "Cyber Shield is a browser extension that claims to be a popup blocker but instead displays advertisements in the browser. When installed, this extension will open new tabs in the browser that display advertisements trying to sell software, push fake software updates, and tech support scams.", so CyberShield appears to be badware.


That's how he described it, but I tried it myself and found it perfectly functional for downloading a video, with different options for size/quality. It has ads, but not nearly as bad as described.

It's a service that is quasi-illegal and explicitly breaks the YouTube terms of service. I think the search engine did a good job surfacing what was searched for; there just aren't going to be any free online YouTube downloaders without advertising.


Which web site did you use to successfully download a youtube video? Which youtube video did you download?


It'd be useful to know what site you used to verify - but if we're talking about the same site, IMO a website that presents Dan's experience sometimes, and your experience sometimes, is actively harmful.


Yeah, if one typed "YouTube downloader cli" you'd get the results the author was thinking of.

It seems like the author wants search to read their mind without specifying what kind of YouTube downloader they want


I really don't agree with some of the expectations around results.

> Download youtube videos

> Ideally, the top hit would be yt-dlp or a thin, graphical, wrapper around yt-dlp. Links to youtube-dl or other less frequently updated projects would also be ok.

That's not what a random person expects. yt-dlp or youtube-dl have no meaning to a normie. The first result is an online downloader, and that's what an average person is after. I checked the first result in Kagi and it's a valid YouTube downloader.

If you're after a commandline tool, ask for it: "commandline tool download youtube videos" gives youtube-dl as the top result with valid options afterwards: https://kagi.com/search?q=commandline+tool+Download+youtube+...

"Ad blocker" seems to ignore other options exist. Yes, ublock would be preferable for most, but ABP is not "very bad". Kagi mentions ABP at position 1 and ublock at position 8: https://kagi.com/search?q=Ad+blocker&r=au&sh=4VHApDrTEfuxMOt... (But for a query like that, I'd be happy with a wikipedia article about adblockers, because why not?)

I'm not disagreeing that results have been getting worse for years, but... this is a really bad scoring system. It feels like that one very new person on SO posting something like "syntax error: if 1 {" - what are you even asking for? (To be honest, the search engines could also give you the equivalent of "this is very vague, would you like to specify what you're actually after? here are some suggestions: ...", but that's beyond the scope here.) A search returning not the exact thing you want to see for a super-generic query, but a valid answer to the question, is not "very bad".


If you try using it, the first result doesn't help you download a youtube video and does try to get you to install malware.



Interesting, over here that one gives me a 403 at "rr2---sn-p5qlsn7l.googlevideo.com" when I click "Download".


My thoughts exactly.


Weird article. Basically, the author thinks that anything that is not yt-dlp is a bad search result, which is pretty insane.

Like, for me at least, I already know yt-dlp exists. When I search "youtube downloader", it's exactly because I want a website where I can download YouTube videos.


The author would probably accept any result that helps them download youtube videos. Did you find any and successfully use it to download a youtube video? Could you provide a link to the one you used?


If I search "Download youtube video" on bing, literally the first result works fine for me (y2meta.net).


Thanks, y2meta.net seems to work over here too.


Kagi really shines on topics that are SEO-spammed on other search engines, e.g. when travelling to a tourist city, searching for a recipe, or researching basically any product you want to buy. I actually get "search anxiety" searching these topics, as I know I will have to navigate a lot of SEO spam, content that is artificially blown up, and the core information purposefully hidden somewhere on the page - if it's there at all. Plus the multitude of cookie consent banners and newsletter subscription popups on each link...

I've been using Kagi's FastGPT [0] now for these searches, it basically removes all the bullshit and gives verifiable sources for any answers.

[0]: https://kagi.com/fastgpt


Yeah that’s my go-to as well. Interestingly, I often find that “Fast” mode results are as good or better than “Expert” mode for simpler tasks.


Ironically I had to use a search engine to discover what "Mwmbl" was. It's apparently a search engine. But, visiting the front page, I see something akin to a git commit log?! I'm not sure I'd have guessed that this was a SE if Brave Search did not tell me it was (even then I'm not convinced yet).

https://mwmbl.org/

Added: Interesting. Apparently you're allowed to edit the SERPs there. Which means I'm out, because I've got a feeling I know which kind of Internet Entrepreneurs this factoid will appeal to


For me the problem is not just that searching on Google is bad, but that sometimes it COMPLETELY hides exactly what I'm looking for, for no good reason.

For instance, I wrote an R ggplot2 package called "fedplot" (following the convention of naming the package after the figure style it replicates, as in "bbplot" for BBC-style charts).

Try searching for it on Google: "github" "fedplot" doesn't get you anywhere. Meanwhile, every other search engine gives you exactly what you want if you just type "fedplot". I even tried to add the relevant websites through google's suggested tools, and nothing happened :|


Searching for "fedplot" looking for https://github.com/sergiocorreia in the results:

Qwant: Result 1

Bing: Result 1

Google: Result 2

Marginalia: Zero results

ChatGPT 3.5: Some Federal Reserve dot plot nonsense and no useful results.


I would say Google has zero results, as it does not find https://github.com/sergiocorreia/fedplot nor https://sergiocorreia.github.io/fedplot/ ; even with the advantage of the latter being manually added to the Google Admin console.

Meanwhile, both Bing and Qwant give me exactly what I want


I don't know what customization or personalization is going on in your search compared to mine, but I followed the GP's directions and posted my results. Searching today (this time not in Incognito and with extensions turned on in Firefox), I see the correct result as number 1 on google.com, so one better than last time.


You're never going to find github results on Marginalia as long as they block 3rd party crawlers :-/


Well, zero results are better than spam ;-)


Their black box semantic guesser has been told not to feed the radicalizing conspiracy theorist fires about federal plots.

Who needs to know anything about government owned land anyway?


I have found appending site:edu remarkably improves google results.

For both the tire question and with respect to a youtube dowloader, the first results were on the nose with the addition of site:edu on Google.

Why this is needed, and whether a noncommercial, information-rich web portal should exist, are questions for another thread.


Wide tires by Jason of Engineering Explained: https://www.youtube.com/watch?v=kNa2gZNqmT8

Better answer: learn the differential equations in this book:

https://ftp.idu.ac.id/wp-content/uploads/ebook/tdg/TERRAMECH...


Kagi is great; it's now my daily driver for search. This is after I got tired of DDG and moved to Google (through StartPage), but the results were spammy or just irrelevant... and sometimes there weren't any results at all, even for the most trivial search. So I recently switched to Kagi, and so far it's been smooth sailing and a real time saver.


I use serpapi for my hot RAG and the results are fine.

Brave search API is obscenely overpriced. I hope someone is working on Search because Google has become a singularly garbage company. Propping up DEI is sinful enough but just failing to compete is lame. /shrug


No mention of https://www.qwant.com


Meta: Since the text on the page is so dense, I tried reading it in Chrome's reading mode. Which was fine until the Appendix. All the results are missing, leading to confusion.


I also was overwhelmed by the amount of data. I came back here to find the cliff notes :)


mostly my search is now Wikipedia.

I'm probably in a very small group who have the entirety of English wikipedia (without images) on my Android (via Kiwix), and I just search that. 99% of the time that's all I need.

the only exceptions are super current things like weather (Windy), or travel (Navan work travel system gives me enough to just go direct to airlines, hotels, etc), and local (OSM via Organic Maps).

I've almost completely degoogled (not intentionally, but driven gradually by Google becoming crappy incrementally), but didn't really find a single generic replacement as much as I found far better single purpose tools.

I'm reminded of that Craigslist image showing how many startups were each competing against specific parts of Craigslist https://cbi-blog.s3.amazonaws.com/blog/wp-content/uploads/20... , and this is what it feels like is happening to Google.. they're being beaten in specific areas, but at the same time spam and crap is diluting their core product.


The intro query "youtube downloader" already showed me relevant results (some website where you paste a URL and bam, download). I think there's a big-tech bias in the whole post (how relevant is a Mastodon poll, for real).

Not saying the current landscape doesn't suck with ads everywhere and incentives to not give exactly relevant results at times, but I think google is pretty good still.


Which web site did you use to successfully download a youtube video, and which youtube video did you download?


What's most shocking to me is how much malware there is in all of this. The fact that Google et al aren't constantly in trouble for directly forwarding unwitting users to malware distributors indicates to me just how far our standards have fallen for a "good" search engine. I feel like we'd be happier with search engines that adhered to "first, do no harm" principles.


I have recently started using kagi after seeing a recommendation here.

From what I understand, it aggregates results from multiple sources rather than having their own indexer.

The results aren’t really any better, but the lack of ads and videos in the results makes for a cleaner experience.

I also haven’t yet taken advantage of the extra features to block certain websites from results.

Personally, I pay the $5 mostly in an attempt to support another competitor in the space.


Pretty sure the reason Kagi is better isn't because they use multiple sources, it's just because they can use the presence of ads as a negative ranking signal, something that none of the major public search engines will ever do as it goes against their own business model.
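
As a toy illustration of what "ads as a negative ranking signal" could mean (entirely hypothetical scoring function and weights; this says nothing about what Kagi actually does):

    // Hypothetical: subtract a fixed penalty per detected ad unit.
    function score(relevance: number, adCount: number): number {
      const AD_PENALTY = 0.15; // made-up weight, for illustration only
      return relevance - AD_PENALTY * adCount;
    }
    // A page with relevance 0.9 and 4 ads scores 0.3, so it would rank
    // below an ad-free page with relevance 0.7.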


Kagi is awesome, so much better experience than Google!

Start using bangs, lenses and customized results ASAP, that makes a big difference.


I actually find myself using bangs way more since I switched to Kagi from DDG. I think it's the AI bangs like !chat and !expert that got me in the habit of using bangs besides !g (which I never actually use anymore).


I'm not sure youtube-dl is a good answer unless you're a nerd.

Which is a similar phenomenon to search. If you have sufficient tech skills there's a whole world of freely available software out there to complete your task.

If you're not then you are at the mercy of a range of commercial offerings (some built on the free software) that range from arguably scams to outright scams.


I don't understand the praise of Marginalia.

When I search for "Steve Jobs" on Marginalia, I got blogs about his speech in 2011 and some mailing list from 2007.

When I search for my own name I get nothing. In Google it's just me.

It's cool that one person built all this of course but... that's not a good search result compared to Google?

Maybe I'm missing something, maybe I'm using it wrong


What do you expect when you search for Steve Jobs? Also, which filter did you use?


By the way, please don't take this as if I'm putting you down or something

It's amazing what you built; it's just not a Google killer? Or at least I don't see it


It's really not supposed to be either. Like it's designed to be the search engine you use when you can't find something elsewhere, so it's largely designed to show you different results than the ones you get on Google and Bing.

In general a lot of the complaints seem to be "I'm not getting what I expect from Google". Well... yeah. That's the point. If someone wants the same results as Google, they should arguably use Google.


I don't know if I used any filters? I don't know what the filters are, sorry

I expect the Wikipedia article on Jobs as a baseline.


Ah, I downrank Wikipedia pretty hard :P


If you just want the Wikipedia article, you can go straight there. Why are you searching for it?


> When I tried running the query from the paper, "cellular phone" (no quotes) and, the top result was a Google Store link to buy Google's own Pixel 7, with the rest of the top results being various Android phones sold on Amazon.

Interestingly, if you add "before:2001-01-01" to the query, the paper that Brin and Page referenced shows up as the third result.

That this query now ranks phones you can buy higher than information about phones makes sense, since the web is much bigger these days and cell phones are much more widely accessible than they were back then.

> Although Google doesn't publicly provide the ability to see what was historically returned for queries, many people remember when straightforward queries generally returned good results.

See above. Sort of.

---

I wish Dan spent more time talking about Kagi. I, too, have found it terrible for searching for things to buy and some images but excellent otherwise.


The appendix describing the individual search results is both entertaining and scary e.g.

"Two of the top three hits are how to install the extension and the rest of the top hits are how to remove this badware. Many of the removal links are themselves scams that install other badware."


> However, there's a sizable group of vocal folks who claim that search results are still great.

I think that this very sentence shows the author's bias, because I feel that Google's search results are not just great, but better than what it was 10 years ago.


You must be kidding, Google is becoming worse every day. Still better than useless Bing though.


Consider yourself part of the sizeable group of vocal folk then.


The thoughts about building a better search engine than Google are interesting.

Unlike the author, I think that building a better search engine than Google is possible. But it's going to be rather expensive. And the only proven way to monetize it is selling ads, which will degrade the quality of the search results fast. For potential investors, there are probably many better ways to invest money than by building a search engine.

This leaves us with only one viable alternative: build it in the open like Wikipedia and source donations from people and from Google competitors like Amazon or Apple.


Okay, so all search engines suck. Yeah, that matches my experience


> It's common to criticize ChatGPT for its hallucinations and, while I don't think that's unfair, as we noted in this 2015, pre-LLM post on AI, I find this general class of criticism to be overrated in that humans and traditional computer systems make the exact same mistakes.

Finally someone said it. We are unnecessarily harsh on hallucinations. LLMs don't intentionally 'lie'; to say they do is a wrongful anthropomorphism.


To excuse hallucinations is itself anthropomorphism. Tools shouldn't make stuff up.

I don't second-guess Python's math results. If the result is wrong, that's my fault for coding it wrong, never Python's for hallucinating


It's also a wrongful anthropomorphism to claim that human beings "make the exact same mistakes" as LLMs, because they don't. Humans don't confabulate the way LLMs do unless they have a severe mental illness. A human doctor isn't as likely to simply make up diseases, or symptoms, or medications, whereas an LLM will do so routinely, because they don't understand anything like human anatomy, disease, chemistry or medicine, only the stochastic matching of text tokens.

We're not unnecessarily harsh on hallucinations, it's absolutely necessary because of how effective LLMs are at convincing people that because they can generate language, they are capable of sentient thought, self-awareness and reason. Acting as if humans and LLMs are basically equally trustworthy, or worse, that LLMs are more trustworthy, is dangerous. If we accept this as axiomatic, shit will break and people will die.


I hear what you’re saying. Yes, of course we should aim to make our LLMs as trustworthy as possible. I think my argument was more philosophical than practical. I meant that directing real anger at them seems misguided; after all, humans lie with intent or real negligence, while LLMs are random number generators. And 'don't believe everything you read on the internet' persists regardless of AI-generated content; we shouldn't expect to lower our guard any time soon. But yes, I agree strongly that the danger arises because people DON'T treat LLMs the same and get lulled into a false sense of trust.


Formatted with minimal CSS: https://ddanluu.com/seo-spam


Speaking of bad software, is anyone else getting a huge amount of horizontal scroll on mobile on this blog post? What should I add to my bag of tricks to work around that?


Reader mode might do the job.


I am not (Chrome on iOS).


I got different results for Google on "ad block".

And changing the query to "ad blocker" like Google suggested raised ublock origin way up in the results


"Going back to the debate between folks like Xe, who believe that straightforward search queries are inundated with crap, and our thought leader, who believes that "the rending of garments about how even google search is terrible now is pretty overblown", it appears that Xe is correct."

Also, the article tested Mwmbl as well, not mentioned in the title here.


While I think the article is interesting, I disagree with its results regarding Kagi. I like Kagi and rarely use anything else. Kagi's results are decent and I can blacklist sites like Amazon.com so they never show up in my search results.


Honestly, if you have to search something remotely technical, try HN's search function with comments enabled.

If the topic has ever come up, the discussion and links are likely to be more relevant and better than your average wiki article


There’s something incredibly entertaining to me about even this well-researched article struggling to find a reason why wider tyres have more grip.

As I understand it, this is because tyres are still somewhat of a mystery, and anyone outside of a laboratory really doesn’t know shit. The best explanation I can think of is tyre load sensitivity. The friction coefficient of rubber decreases with normal force (e.g., a heavily loaded tyre has a lower friction coefficient), which is a pretty well-accepted fact; this is one of the methods engineers use to tune the handling of cars. This means a wider tyre has a lower force per unit area of the contact patch, which means it’ll have a higher friction coefficient.

Now that sounds plausible to me, but that’s just my best guess explanation.
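
One way to write that guess down (a toy model, assuming purely for illustration that the friction coefficient falls linearly with contact pressure; real tyre behaviour is far messier):

    \mu(p) = \mu_0 - k p, \qquad p = N / A
    F_{\mathrm{grip}} = \mu(p) \, N = \left( \mu_0 - \frac{k N}{A} \right) N

So for a fixed normal load N, a larger contact patch area A means lower pressure, a higher effective friction coefficient, and therefore more grip.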


https://www.bicyclerollingresistance.com/

gives good tyre advice (obviously not car tyres, but info is there)


I will admit that I can't read between the lines here and will just go ahead and ask: what is "bluesky thought leader" supposed to mean? (1) Any guesses who this may be? Why is he not quoted directly? (btw, the term is used 3 times, presumably to refer to the same person.)

1: my reading is that this is a sarcastic designation for someone who is supposed to be an innovation thought leader but is actually just defending the broken search landscape status quo.


Or it's some vocal guy on Bluesky, which didn't allow public links until recently.


Have you tried perplexity.ai? It's like ChatGPT and Google had a baby. Looks very promising and I'm seeing a lot of tech leaders (example Toby of Shopify) moving to it.


Aren't Bing Chat and Kagi FastGPT the same in effect?


No, FastGPT is GPT-2 based. I actually prefer FastGPT because it's fast (duh!), it gives very concise answers, and the generated response carries footnotes with links to the sources.


Just to correct, FastGPT uses claude-instant.


Honestly, this is depressing. Back in the day, AltaVista and AskJeeves existed but returned terrible results, and Google showed up to disrupt them all. It seems like we should be on the verge of repeating this cycle.

Maybe LLMs will help, but I can’t shake the nagging feeling that the situation will simply get worse with LLMs, not better, due to hallucinations and the apparent “gullibility” of LLMs: I would not be surprised if SEOing an LLM turns out to be easier than SEOing Google.


I’d love to see this extended a little.

Searx and Yandex.

Specifically… if I need something even slightly “gray”, Yandex is the only option anymore. Torrent search on google et al is just awful.


> Continuing with the theme of running simple, naive, queries, we used the free version of ChatGPT for this post, which means the queries were run through ChatGPT 3.5.

why


I have a small page that modifies my GET requests to Google by adding -site:… for a bunch of the most annoying content farms, for stuff I search often (docs)
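
Something like the sketch below, perhaps (the domain list and names here are hypothetical placeholders, not the actual page):

    // Hypothetical blocklist of content farms to exclude.
    const BLOCKED: string[] = [
      "example-content-farm.com",
      "another-seo-mill.net",
    ];

    // Build a Google search URL with -site: exclusions appended.
    function buildSearchUrl(query: string): string {
      const exclusions = BLOCKED.map((d) => `-site:${d}`).join(" ");
      return "https://www.google.com/search?q=" +
        encodeURIComponent(`${query} ${exclusions}`);
    }

    // e.g. wired to a text box: window.location.href = buildSearchUrl(q);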


Have you tried uBlacklist?


For the ad blocker results, the author judges the search engines by how they rank the best result (uBlock Origin), but I think search results that point to Adblock Plus or AdBlock are good enough. Sure, they do not block all ads, and they take money from advertisers to let certain types of ads through, but they still block ads in general, and 'acceptable ads' can be disabled in the settings. So I would consider these 'good results', rather than the 'bad results'/'very bad results' the author assigns.


Without labor to run their circus, 99% of business would disappear overnight.

Without business, spam would disappear.

So if you remove the labor you remove the spam.

So the best spam filter is UBI.


The GitHub link is my top result on Google. Clearly a mix of uBlock and Privacy Badger is more powerful than most appreciate.


I would kinda have liked side-by-side screenshots so I could see for myself, rather than a wall of text


More incorrect usage of "hallucinated" for simply made-up or inaccurate results.


Is this from desktop? What region?

uBlock Origin as the very top result for an iOS device is simply a bad search result page. Maybe fourth position is tolerable, after three different working options. Maybe it should be lower; I doubt myself, wondering if my point of view is too elitist.

Yt-dlp is subject to all sorts of takedown requests in different jurisdictions.


Look at the source for that page. Is it hand-coded? (I think it's great.)


Using Phind most of the time. It would be interesting to add it.


Was GPT-4 used (with a paid subscription)?


I wonder if this aggregate enshittification of computers (be it search, social media, video games) etc. is actually a good thing for humans in general.

I feel like today's digital spaces don't have as strong a grip on the minds of people - I think folks have started rediscovering the value of genuine human interaction and hobbies that do not involve a computer screen.

For example, I haven't seen the equivalent of 2000s-2010s Facebook addicts (or WoW addicts in the gaming space) to such an extent, with parasocial media such as TikTok, YouTube, or Twitch having replaced social media, and social video gaming such as MMOs having lost a lot of popularity.


Pretty biased selection of queries. The article avoids the things that ChatGPT and the others without fresh data can't answer. Look at the trending searches on Google. They are all for fresh info that none of the others can answer. Sports scores. Google probably judges quality weighted by the questions their users actually ask, not this nerd bullshit.


How is a youtube downloader biased to fresh results? Seems to cover a pretty broad test.


It selects a "right answer" that suits a stale index, assuming that there can't have been a right-er answer discovered after ChatGPT's training horizon.


Wouldn't any selection of queries be biased? Even what you're saying is biased: you're arguing Google would be better for the cases it optimizes for, which is even weirder. That is like saying you want to compare highly optimized code using some C libraries vs some native Python code.


I had to stop reading this because I found it too depressing and it triggered a lot of anger about how big tech combined with the incentives of capitalism is basically fucking up the world.


Can someone tell me why Bing, and thus DDG, has switched to prioritizing local results? I'll search the most inane things, like lyrics to a song, and get results for local businesses containing maybe one word in common.

It's most frustrating with phone numbers. I picked up the habit of searching the random numbers that called me, to try and find out if they were possibly important. I used to get a bunch of spam sites that clearly existed to profit off me making those searches.

Both Google and DDG have removed those spam sites, even though they were useful at times. Google will tell me the number is in some random PDF that contains a few of the digits, then no other results. DDG will say the top result is my local police department, something that freaked me out the first few times.


Man, thank you for saying this. Stuffing results with geolocated local junk despite explicitly opting out by choosing “All regions” is so frustrating. This wasn’t happening a year or two ago. I submit negative feedback about it constantly, but I guess not enough people are doing that for anyone to notice or care.

I’ve also noticed a significant increase in attempts to stuff news into regular search results. I really do not appreciate being force-fed mental health poison. I don’t need it ever, but I especially don’t need it when I’m searching for some specific technical thing and then get emotionally sabotaged by some clickbait headline because … why? Some bullshit KPI? Why are tech companies so obsessed with pushing news into every orifice?


Hah, calling the news "mental health poison" is the most accurate thing I've read all day.


Yeah, I've noticed this as well with DDG recently: even with the localised checkbox disabled it still prioritises them, which often is very frustrating as the results are then almost totally useless.

However, more generally, I've personally found that DDG's (and maybe therefore Bing's?) localised results are just really bad, and have been for the multiple years I've been using DDG with this feature: I'm in New Zealand, and enabling localised / region-based search still often returns pages with TLDs like ".co.uk", ".ca" and ".pl" (the latter are really common for generated-content spam in my experience), which I just can't understand...

Unfortunately, I have found that Google's results are usually a lot better in terms of being "location-aware" than DDG, at least when that's what you want...


It's a bit surprising that you're seeing spam sites with .ca; those are illegal here, and all .ca domains must be registered by someone in Canada.

You can report them: https://ised-isde.canada.ca/site/canada-anti-spam-legislatio...


I have the same experience from Germany. There's the slider, but it's not doing much.


DDG is just repackaged Bing. Always has been. I remember looking into them when I was ready to job-hop many years ago, and they asked for dedication to their search engine as their foremost requirement for employment. It's the "drop-shipping" equivalent of search engines.


hope kagi takes ddg's place in terms of adoption. never really liked ddg even though i always care about privacy.


I really don't get that sentiment. Currently Kagi is just as dependent on Google as DuckDuckGo is on Bing. That might only be temporary of course and Kagi does seem to be working on a search engine of their own.

Rather than wanting Kagi to take the place of DuckDuckGo, it would be better if Kagi could take users from Google, and then, when ready, drop Google as a search provider.


DDG used to be the HN darling and you would get downvoted for saying anything negative or even insinuating that they are relying on Bing. Now the spot has been overtaken by Kagi but it looks like it suffers from the same problems. The counterargument that they have their own index as well is the same that was used for DDG, when the reality was that it was only used for widgets and other fluff. Let's see how it plays out for Kagi.


Kagi mixes google, bing, some non-profit small-web SE, and their own index.


I don't think they use Bing, but yes, Google, Marginalia, Yandex, Brave and others. I still fail to see how that's different to DuckDuckGo, who also run their own crawler. It's really weird that people are almost hating on DuckDuckGo for how they run their search engine, while applauding Kagi, for doing the same, but with a different business model.


Only if they changed that (which they might have as part of their cost-optimization). They said they mixed bing and google results back then.


I also assume that Kagi uses some shady residential IP proxies and similar tricks to scrape Google, while DDG has access to the Bing API.


You can buy access to the Google Search API, which is what I assume Kagi does. Building your product on being able to circumvent some Google restrictions seems like a bad business move, if you can buy the same service for a reasonable price.


Where can I buy it?


https://developers.google.com/custom-search

It's been available for ages. We used it to power the company internal search for a large enterprise I worked at 17 or 18 years ago.


Yes, but this isn't an API for making a generic search engine.


Kagi should hire the Marginalia author.


We already include Marginalia results in Kagi [1]

https://help.kagi.com/kagi/search-details/search-sources.htm...


> Bing, and thus DDG, has switched to prioritizing local results

From what I can tell this is an issue with the Bing API that DDG uses that the DDG folks have been unable to resolve. I've tried many identical queries between DDG and Bing and while Bing does occasionally return incorrect local results, the completely irrelevant local results that appear on almost every DDG search do not seem to happen with Bing itself.

From what I understand, DDG is aware of the issue. I don't know why it isn't more of a priority.


Long-time DDG user (>10 years) here, and it's astounding to me that they haven't prioritized building their own independent index to switch off Bing. I would have expected them to do it like 5 years ago, but there's afaik no initiative to do so. It's unfortunate, and I am now trying other engines like Brave Search.


I also occasionally try Brave search when a DDG search fails. Sometimes Brave finds what I want, but I frequently get Captcha (and now proof of work) challenges that are quite annoying. I don't get this with any other search provider (though StartPage would frequently do this a while back). I hope this is just a phase, because I would likely use Brave Search more if not for this issue.


Hi, Brave engineer here,

Are you by any chance using a VPN while using Brave Search? (ProtonVPN?)

Thanks for your help, we're working on ways to reduce the number of captchas shown to VPN users and your feedback is very useful.


Sorry, just saw your response. I indeed use ProtonVPN.


Thanks for confirming, we've just deployed some improvements to better handle VPN traffic and penalize legit users less. I hope this improves your QoL on Brave Search.


> I'll search the most inane things, like lyrics to a song, and get results for local businesses

Query: “I’m coming out of my cage…”

Result (Ad): “You’ll be doing just fine with these amazing year-end closeout prices at Al’s Discount Car Barn. Gotta come down—you’ll want it all!”


Ads would make sense, but there's no way my local city council is paying Bing and they are the most frequently listed result.


It was only a list, how did it end up like this?


Nearly every local search is a leading indicator of buying intent, and is therefore worth more money when answered with local-business results instead of an authoritative answer.


If you’re going to search for phone numbers, you'll want to enable verbatim searching under Tools on Google, and put the number in quotes, perhaps in both "xxx-xxx-xxxx" OR "(xxx) xxx-xxxx" forms. Many of the sites you mention are fake sites with fake contacts just for ad serving, and I've read that in some cases the scammers seeded the spoofed numbers they appear to call from onto the sites they control, to see who googles their phone numbers.


Reverse spoof the numbers of FTC investigators and Google employees?


Maybe it was an attempt to improve their local results?

Searching for results from my country in DDG (picking the country in the drop-down below the search box) still returned results from the USA or other countries, even when searching in the local language. Maybe they tried to fix that because it really sucked, so much that I never used it again for searching local websites.


This is the one area where it still ignores my location. I live in a town named after a UK city, and there are several bigger towns in the US with the same name. I just searched "McDonalds city name." I got results for locations at least half the US away from me, as well as Uber Eats GB.


I’m confused, you are searching for, specifically, a local phone number and you are upset that the machine interprets that as you looking for a local result? That’s what most people expect from a local number search.

Perhaps the incorrect thing is not your internet search results, but actually your phone carrier for lying to you and telling you that a caller has a local number?


The number is local, and occasionally I've searched and found the number was a local clinic or business that had a legitimate reason to call me but not leave a message. In those scenarios, all ten digits of the number are found on the page.

The top result being my local police department because it shares the same area code and has maybe one other number in common is clearly a bad result. It does this even if the phone carrier isn't lying to me and the caller does have a local number, like the increasingly common political spam calls.


If I search for a ten digit number, it is not helpful to return a local business that shares the last four digits.


I suspect it's a failure to distinguish mobile searches (where people are legitimately looking for a business) from desktop searches.


In my country (Colombia) Google still has not removed those spam sites that just generate all possible numbers.


you can use true person search for numbers


I am not sure what the intention of this post is. In my handpicked results Kagi far outperforms Marginalia.

#1 "Gordon ramsey" (misspelled "Gordon Ramsay"). Marginalia shows "The Life I Imagine: are my cheeks red?". Kagi corrects to Gordon Ramsay and shows relevant results.

#2 "Ukraine war". Marginalia shows an article about the Russian Orthodox church and a Substack post about the war. Kagi shows Wikipedia, Al Jazeera, etc up-to-date summaries about the war.

#3 "Dildo". Top post on Marginalia is "Students for Concealed Carry Embraces UT Dildos | Students for Concealed Carry". Top posts on Kagi are Wikipedia (read) and Amazon (buy).

> How is Marginalia, a search engine built by a single person, so good?

Because it's not good?


I don't disagree with your assessment in full, but I don't exactly consider Wikipedia and Amazon good results. They are big enough that if that's the result I want, I can go to them directly. So they aren't bad or wrong, but I can see the case for excluding them. Should something like Webster's dictionary be a top result?


I think for single word queries like that Wikipedia covers more ground than a dictionary. Personal preference, perhaps. If I need a definition I search for "define dildo" (Kagi shows Merriam-Webster, Oxford, etc dictionary entries).


Marginalia supports the old Google syntax, e.g. "define:dildo"


Thanks! If you are that "single person" who built Marginalia... I hope you are not taking my criticism personally. I am more annoyed by this blog post, which uses a few handpicked queries to present generalized, long-winded conclusions that are completely disproven when using a different set of queries.


Yeah, it's me, and to be fair I made a comment to a similar effect myself. Assessing search result quality is very hard, and this is definitely a pretty flattering selection of queries.


On the plus side - in addition to Marginalia's own success, you can take partial credit for how good Kagi search results are (IIRC Marginalia's index is one of the sources for Kagi search results). So... thank you for that!


Marginalia Search isn't trying to be a universal knowledge engine, it's just a website finder.

That's bad if you're looking for a simple answer or basic fact, and good if you're looking for a few hours of reading.


I had a similar experience when testing Kagi after reading this. The top result for the “wider car tires” query on Kagi was a link to Physics StackExchange with some marginally informative answers [0], which would be easy to expand on in future searches. The second result was Reddit. Then a couple of incorrect/irrelevant pages but they don’t look like scams

[0]: https://physics.stackexchange.com/questions/29903/why-do-peo...

Edit: I did just realize that I have StackExchange customized to be up-ranked. So that probably helps. But yeah, I guess this is why I usually get good results, which is something that generally still fails with Google for me.


It seems to me that the name "marginalia" is not just a random set of syllables. It sounds like it's doing what it says on the tin, which is gooder than not doing what it says on the tin. (distinct from whether what it says on the tin is something you want)


Blah blah blah. Could you lay this article out any worse? What are the queries you used to test? I want to try them too. Buried in here somewhere.

Using an adblocker is not expert anything.

That you've defined your own opinion for what some of the results should be blows the thing up.

Searching youtube downloader, many people would be fine with some of the ad covered but totally functional sites that pop up on Google. I use some of them every day for quick conversion tasks. I don't want any youtube-dl result. The average users don't either.

Download firefox? What's that? All the top links are fine? No one's looking at the 7th listing for a simple query to download a program.

Why do wider tires have better grip? .. what, sites like roadandtrack, prioritytire, reddit, some physics and stackexchange sites aren't good enough? they are.

The Vancouver snow report one also. Lots of major news sites. Some weathernetwork and almanacs. All totally acceptable results for a sort of variable question.

blah blah this is just a hate on for Google and a HN/nerd view of the world that the average user is nowhere near living in.


> Download firefox? What's that? All the top links are fine? No one's looking at the 7th listing for a simple query to download a program.

They are if the first six results are SEO bullshit. Which is the de facto state of affairs for Google today: advertising masquerading as search.


heh, they're not. They're all variations of mozilla download pages and site posts.


For whatever it's worth, I think your comment would be a whole lot more convincing without its first and last lines, which had the effect of making you sound (at least to me) like you're shallowly dismissing the article.


Which web site did you use to successfully download a youtube video, and which youtube video did you download?


Completely agree. I personally found searching "Vancouver snow report" extremely strange. Just search zip code or city name and weather. Two words. That's all you need to get results. What the hell is a snow report? Do you even think you can trust weather reports 10+ days out?

Whole article is rambling and silly and assuming.


Well, "weather" is "it's snowing", "snow report" is "amount and quality of snow", I guess. Relevant for skiing and/or driving safely.


Do search engines censor political topics these days? If you search "truthsocial" on ddg, the truthsocial.com website is the first hit. But if you search "trump truthsocial", it doesn't give you trump's truthsocial page, and doesn't even give you truthsocial.com within the first few pages of search results.

Since ddg uses bing, does anyone know what is happening here at bing? It looks like google results are similar.


DuckDuckGo (and by extension perhaps Bing, assuming identical upstream results) has some terrible results when trying to filter by all kinds of domains.

There's a power tools review/news site that returns zero hits for its actual domain when you search its name (which is the same as its .com address). And for some domains, even searching with the `site:` parameter gives far fewer results when paired with a query than just searching the domain name + query sans the TLD (the router firmware site openwrt.org is one such).

It's a mess, and reporting it hasn't made any difference in my experience over the past 3 years. So I'd be reluctant to say irrelevant results are due to censorship unless there was more evidence.


I doubt you're seeing censorship. If you search for "truthsocial trump" on ddg, you'll see his profile, for better or worse.


Oh, interesting. So it depends on the order of the terms:

- "truthsocial trump" works

- "trump truthsocial" doesn't work


I have concluded that Google definitely censors search results relating to the Ukraine war, after vainly searching for articles about documented Ukrainian war crimes (reported in mainstream Western media like NYT/WaPo).


I'm not seeing this. I Googled "war crimes by ukrainian soldiers" and the top link was an Amnesty International Article, "Ukraine: Ukrainian fighting tactics endanger civilians".

https://www.amnesty.org/en/latest/news/2022/08/ukraine-ukrai...

I use Google as little as possible because I don't like surveillance advertising but fair is fair.


You're right: I just checked and there are several hits for events that happened over a year ago that I couldn't find at all with Google back then. Shame on me for not checking before I posted. I have no idea what happened but apparently it's now fixed.


I feel like you could reboot the Yahoo Directory and have more utility than most searches.


The !bang directory for Kagi is honestly pretty good, found some cool sites there: https://duckduckgo.com/bangs


Did you mean to say Kagi or Bing?

Anyway, here’s Kagi’s bangs:

https://help.kagi.com/kagi/features/bangs.html


> Note that Kagi supports all DuckDuckGo-style bangs.

You can also make your own bangs.

That said, my point was that the bang directory has a bunch of the most useful sites in each category.


The return of something like Yahoo Directory would be most welcome. There is great utility in having more than one approach into a data space. That we have been stuck with essentially one way in for over a decade means that there is a great deal out there which would be great to access but which has been rendered invisible.



this is awesome! thanks


Nice. Thanks!


Categorization sounds like a good job for AI. Yahoo execs, are you paying attention? :)


On a side note: would it kill the author of the site to use a stylesheet?


it's the same as my choice to only use lowercase letters, it is designed to make you upset that i am not following conventions. that's as far as i have been able to figure for why i started doing this, and by extension, why tech bois love to drop some vital feature of communication to signal being an 'insider'


ok ee cummings

Omitting readable styling doesn't read as "techboi rebellion", it reads as ineptitude and lack of respect for people whose attention you're seeking.


what about lowercase makes you feel a need to attack me as a person? is lowercase text an assault on you in some way? i have not had someone insult me based on a lack of key pressing, and the fact you can argue with me means you have no issues understanding or responding to me


Search engines are not designed to give you the information you desire. They are designed to sell ads or metadata. "Result quality" is of no consequence.

If you actually wanted accurate results you wouldn't use a tool that is literally attempting to read your mind like a fortune teller. It is impossible to know what you want just by the word "snow". Jesus Christ engineers are so dumb.



