I actually experienced this the other day. Bought the new Baldur's Gate and was wondering what items to keep or sell (don't judge me, I'm a pack rat in games!)
I had found some silver ingots. The top search result for "bg3 silver ingot" is a content farm article that very confidently claims you can use them at a workbench in Act 3 to upgrade your weapons.
Except this is a complete fabrication: silver ingots exist only to sell, and there is no workbench. There is no mechanic (short of mods) that allows you to change a weapon's stats.
I'm pretty sure an LLM "helped" write the article, because it's a lot of trouble to go through just to be straight-up wrong - if you're a low-effort content farm, why in the world would you go to the trouble of fabricating an entire game mechanic instead of taking the low-effort "they exist only to be sold" road?
This experience has caused me to start checking the date of search results: if it's 2022 and before, at least it's written by a human. If it's 2023 and on, I dust off my 90's "everything on the World Wide Web is wrong" glasses.
I had a similar experience recently looking up some stuff for Starfield. Content farms are obviously switching to LLM-generated (mostly hallucinated) articles and Google seems to be ranking them pretty highly atm.
Kagi's ability to manually downrank/remove those kinds of results from your searches (and their return to flat rate pricing) finally tipped the scales for me for subscribing/switching search.
> In Baldur's Gate 3, silver ingots are a common miscellaneous item that can be found in various locations such as chests, shops, and dropped by enemies.[1] Each silver ingot can be exchanged for 50 gold at merchants or traders.[2] While silver ingots do not have any crafting or upgrade uses currently in early access, they provide a reliable source of income early in the game before other money-making options become available.[3]
I actually found my first AI YouTube channel today looking for Starfield videos. I noticed the narrator sounded like text-to-speech. I went to the channel and it's all just random videos for tons of different games with no real theme. Was a surreal experience.
I had mine the other day when I bought an audiobook and noticed the price was way cheaper and sure enough, it was a very robotic voice with uncanny pauses reading it. Refunded instantly
Finding out about their flat-rate pricing makes it a complete no-brainer for me to switch if I ever hit >100 searches per month (haven't gotten there... Yet).
Probably in the future people will only trust sources of info that can't be monetised. If you want to know the answer to a game question you just go to the reddit or discord and ask, since there is no point autogenerating crap for discord when you can't put ads next to it and the mods can remove you.
Platforms will be happy to run bots they can portray as real humans to bolster engagement, make spaces seem popular and dynamic, trick advertisers and investors, etc.
Similarly, if it costs basically nothing to work your way into communities to astroturf with bots, it'll happen. You don't have to post about great sites to get free Viagra right away, you can build reputation and subtly astroturf. And you can use additional bots to build/portray consensus.
Reddit is already a problem because of actual humans doing the latter. It'll just get worse when it's automated further.
Yes, but it's reddit monetizing, not the users. The users can be astroturfing, but there's no point astroturfing some types of content like game guides I hope.
Some users monetize it by selling their karma-rich accounts to those astroturfers you mentioned - or to spammers.
They currently manufacture these karma rich accounts by reposting popular posts and comments. LLMs will soon be (or already are) another way to karma farm.
Discord has raised so much money, and is one of those inflated-valuation unicorns, so I sadly wouldn't be surprised if they somewhat get forced into ads.
It's only a matter of time before things change with Discord monetization. On reddit there is an incentive to create LLM-powered fake accounts with high karma and sell them. It's true that on Discord this incentive doesn't exist right now because no karma equivalent is associated with Discord accounts, but eventually that's going to change as Discord, as a company, will try to monetize their user data in various ways.
It's the typical overvalued VC-backed company dilemma that needs investor returns. Quora, Medium, and so on.
I guess we need to again (and again) establish that having good information isn't necessarily cheap: could be a high quality vendor, policing a forum, maintaining search engine integrity, etc.
It's funny that one argument openai used to keep their models closed and centralized is so they could prevent things like this. And yet they're doing basically nothing to stop it (and letting the web deteriorate) now that profit has come into play.
Not saying they should, but if they wanted to they could have an API that allows you to check whether some text was generated by them or not. Then Google would be able to check search results and downrank.
It's not that simple. Originally OpenAI released a model to try and detect whether some content was generated by an LLM or not. They later dropped the service as it wasn't accurate. Today's models are so good at text generation it's not possible in most cases to differentiate between a human and machine generated text.
Well they could just not allow prompts that seem to participate in blogspam. If they wanted to stop it they definitely could.
Their argument is that since it's centralized, things like that are possible (whereas with llama2 they aren't) - they do "patch" things all the time. But since blogspam is contributing to paying back the billions Microsoft expects, they're not going to.
It would be easy to workaround using other open source models. You use GPT-4 to generate content and then LLAMA-2 or sth else to change the style slightly.
Also, it would require OpenAI to store the history of everything that its API has produced. That would be in contrast with their privacy policy and privacy protections.
If it's a straightforward hash, that's easy to evade by modifying the output slightly (even programmatically).
If it's a perceptual hash, that's easy to _exploit_: just ask the AI to repeat something back to you, and it typically does so with few errors. Now you can mark anything you like as "AI-generated". (Compare to Schneier's take on SmartWater at https://www.schneier.com/blog/archives/2008/03/the_security_...).
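As a toy illustration of the exact-hash problem (the sentences here are made up), any one-character tweak changes the digest completely:

    import hashlib

    original = "Silver ingots can be used at a workbench in Act 3."
    tweaked = "Silver ingots can be used at a workbench in act 3."  # one character changed

    print(hashlib.sha256(original.encode()).hexdigest())
    print(hashlib.sha256(tweaked.encode()).hexdigest())
    # The two digests share nothing, so an exact-hash registry of generated
    # text is defeated by any trivial, even programmatic, rewording.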
10 years ago I was a huge fan of GameFAQs, which, for any major game, had at least one and often several highly detailed documents describing a game, not just a walkthrough but a set of tables where you would find items.
Back then I was playing Hyperdimension Neptunia and almost tried “applying AI” in the old sense to the problem of “What do I have to do to craft item X?”. Those games all had good FAQs and extracting a knowledge graph and feeding it into some engine that could resolve dependencies wouldn’t have been too hard.
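Even a crude sketch gets you most of the way there - something like this, with a made-up recipe graph standing in for the data you'd extract from a FAQ:

    # Toy crafting knowledge graph; item names are invented for illustration.
    recipes = {
        "Healing Tonic": ["Herb", "Spring Water"],
        "Spring Water": ["Empty Bottle"],
    }

    def materials_needed(item, acc=None):
        """Recursively expand an item into the raw materials required to craft it."""
        if acc is None:
            acc = []
        for part in recipes.get(item, []):
            if part in recipes:
                materials_needed(part, acc)
            else:
                acc.append(part)
        return acc

    print(materials_needed("Healing Tonic"))  # ['Herb', 'Empty Bottle']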
Today I am playing Atelier Sophie, which has the same mechanic but is very cozy and doesn't pose complex dependency problems, and the FAQs for this game are atrocious, consisting of a walkthrough that is way too prescriptive. If you ask some question like "Where do I get a Night Crystal?" on Google, this is likely to turn up a question/answer pair on a forum, which isn't quite as good as having the structured FAQ.
YouTube walkthroughs really seemed to kill text walkthroughs; sometimes these are better (like when there is a jump in a dungeon that doesn't look like you could make it but you can) but sometimes they are much worse (there are 75 hour-long videos in a walkthrough; you have to find that the answer is in video #33 and that you have to seek to 20:55).
Maybe the proliferation of trash sites will motivate the creation of high quality FAQs but you’d better believe that the creator of these FAQs will be horribly afraid of being ripped off.
To be fair, it was already the case before the LLM boom, to the point that I was obliged to add "reddit" or, worse, "youtube" to my queries. Of course LLMs make it easier and faster, so I guess the web search engines will have to be smarter if they want to keep being used.
Back in the 90s when the Internet became a thing, it was common knowledge that because normal people made websites, that you should take things with a grain of salt. There was a bit of an overreaction to this, as the general feeling at the time was to trust nothing on the Internet.
In the 00s and 10s, the quality of discoverable content improved: reddit and Stack Exchange had experts (at a higher rate than the rest of the net, at least). It was the era where search was good, Google wasn't evil (good results clearly separated from the ads are entirely why they won against AskJeeves and Yahoo), and SEO was still gestating in Adam Smith's wastebasket.
Now Google and Bing are polluted with SEO-optimized content farms designed to waste your time and show you as many ads as possible. They hunger for new content (for the SEO gods demand regular posts), and the cheapest way to do this is an underpaid "author" spewing out a GPT-created firehose of content.
SEO has ruined search, and content farms have made what few usable results there are even less trustworthy.
So yes, the Internet has fundamentally changed in the last 9 months.
Reddit never had experts. Maybe for half a minute. It became an echo chamber fast: fake internet points to be gained for saying what got upvoted last week, or to be lost for saying anything different.
If all you do is browse (default) home or all, then sure, it's just a stupid echo chamber obsessed with hating the things it's cool to hate. That's not where the value is on reddit, and it's not what people are referring to when they say they search reddit for answers. If you're looking for product reviews, it's not perfect, but it's tough to find anywhere better unless you happen to know of exactly the right hidden gem of a forum to visit for your particular subtopic (and the link to that hidden gem of a forum is probably easier to find on the relevant subreddit than it is on Google).
Reddit may not have experts per se, but in the right subreddits it definitely has enthusiasts. In ages gone by, you'd find the same people on message boards or forums, talking up and comparing the minute details of this or that. There's obviously the same risk of cargo culting that there's always been, but there's genuinely useful information available from people who spend way more time than the common man on their area of interest.
I think, at least on programming language subreddits, there are people who deserve to be labeled experts. r/cpp has some frequent users who work on standards proposals or compiler features. There are also subreddits dedicated just to communicating with experts, like r/askdocs
Two things can be true at the same time: "Reddit is prone to karma-driven bullshittery" and "Reddit content is generally significantly higher-quality than SEO content farms".
With Reddit you might get inane arguments and bandwagoning about what the best game strategy is, but you're exceedingly unlikely to read about a game mechanic that was straight-up hallucinated by a LLM.
Some subreddits (like askscience) at least asked for a copy of your diploma (in a science field) if you wanted flair. It was actually an awesome subreddit.
Prior to LLMs, generating plausible-sounding misinformation took actual effort - not much effort, but the marginal cost was reasonably above free. With LLMs making the cost of bullshit vanishingly close to free we're going to tip into an era where uncurated LLM confabulation is going to dominate free information.
There's "one loony had a blog" levels of wrong, and then there's "industrial scale bullshit" levels of wrong and we are not prepared for the latter.
Because hiring humans to write misinformation costs more money than $20/mo? Like what are you even trying to say?
Before LLMs, a $3000 camera had fake reviews on Amazon, and you got fake news about politicians. But you can safely assume "bg3 silver ingot" information is likely real, since hiring someone to make things up about silver ingots will never make the money back.
GP is literally reminding you of a time when online misinformation was rampant, but before search engines (temporarily) did a better job than overwhelmed curators.
I wonder if there will be a human information/knowledge equivalent of low-background steel (pre-WWII/nukes). Data from before a certain point won't be 'contaminated' with LLM stuff, but it'll be everywhere after that.
I suspect in the coming years the Wayback Machine at archive.org will become ever more important - always assuming it's not lost as collateral damage in their copyright battles. Indexing that dataset and making it searchable would massively increase its value.
My inner conspiracy theorist can't help wonder if the continued reduction in search usefulness isn't part of an ongoing deliberate disempowerment of everyday people - but my rational side says it's merely an unfortunate emergent behaviour of the systems we've built.
It's a consequence of running search as a for profit system. When you're optimizing for revenue, the system priorities are different than if you were optimizing for user experience or true knowledge. Arguably that means that search should be a public utility but then you have to be able to trust your government with your searches and privacy.
Honestly, I would love to see a LLM based plugin that would take a website, remove all the tracking garbage and filler, and just give me back a no-frills static html+CSS site that looks like it was made in 1995.
"Oh, you want a guide to writing your own loss function for Tensorflow? Here's an FAQ that could have existed on comp.lang.python3.tensorflow"
Those simple web 1.0 sites made by college professors are a gold-standard in my book. I always enjoy finding them in search results. Although they are becoming increasingly rare.
Paul's Notes remains the absolute best textbook for Calculus and intro DiffEq! Not sure if it's been updated since 2005, but I mean, it's not like they're discovering new Calc II methods! Given that it's not a $200 textbook rehashing the same stuff as the last 10 editions, it's easily one of my favorite websites.
Don't forget how Google will now drop search terms if it thinks you mean something else, or add unrelated synonyms to your search (presumably to "help" folks who aren't good at writing queries)
I heard that before and have trouble believing this is the cause, at least for Internet recipes. Sure, for a recipe book in 1950, but are recipe content farms going to sue each other? Aren't the lawyer costs way more than could be gained?
They're even better than black text on white backgrounds. They're unstyled and use your browser's default styling. Granted, it's rare anyone configures those, so it's almost always black on white... but for people who do specify their own preferences, it's really nice to have them respected and not rely on hacking in my own CSS or JS to override theirs.
I have a website that is just a few black-text-on-white HTML files I maintain in whatever text editor I have at hand. It loads lightning fast, and if you cannot view it, it's not a web browser. Last I checked, the total site size was about 60KB.
As time goes on, even the amount of text I am putting out gets trimmed down. Make the words count, don't count the words.
Unfortunately, that's a trivial signal to emulate.
At a minimum, you'd have to validate them by confirming existence in the Wayback Machine.
Otherwise agreed that those are indeed high-signal documents. Increasing reliance on integrated educational software means that even such things as online syllabi are increasingly rare.
The type of sites GP is talking about are typically hosted on .edu servers, under faculty webhosting (often featuring a "/~profname/" in the url). That's a non-trivial signal.
Pages at extant domains might variously be available to undergraduate or graduate students, faculty, staff, and adjuncts. Those might either directly host emulative material or be convinced or compromised into hosting content.
If there's one thing that the Internet's history to date has proved, it's that perverse incentives lead to perverse consequences.
People are vastly overestimating how unique this problem of hallucinations is.
It seems to me it relies mostly on discounting just how much we've already had to deal with this same problem in humans over the millennia.
The problem of proliferation of bad information might be getting worse, but this isn't native to generative AI. The entire informational ecosystem has to deal with this. GPTs compound the issue, but as far as I can tell, nowhere near what social media has forced us to deal with.
The thing is when you call a human on bullshit, they usually can't back it up well enough to pass the smell test. When you call an AI on bullshit it can instantly fabricate plausible, credible seeming sources/evidence.
A human's lie is different than an AI's hallucination, since it's still based on (distorting) the truth, whereas the hallucination is based on an invented reality (yes I know it's applied statistics and there's no true model of the world in there, but it can report as if there is)
Intelligent people can fill the void of ignorance with plausible-sounding but factually incorrect information. They are apt to engage cognitive biases in such a way that the biases produce assertions that are deeply indistinguishable from factual assertions. They fool themselves in this way and they fool others. This happens all the time.
Sure, people lie and spread misinformation (wittingly or unwittingly). But coming up with plausible-sounding lies takes effort.
LLMs on the other hand are amazing and prolific liars and can produce a lot of bullshit for a price that’s effectively free - in fact it’s cheaper to create a LLM that’s inaccurate than one that’s … less inaccurate. The truth to lie ratio on the internet is about to take a huge hit. LLMs really are a Pandora’s Box.
It’s not a big deal, there are many ways to handle it. It just has some overhead costs.
LLMs that are offered to the general public are more of a POC, and they are making sure to use as few resources as possible.
Uh, no. Generative AI does not have a standard of truth. It generates text according to the probabilities given what it learned from its data set. There is no systematic modelling of a world that it can test its assertions against. What are called hallucinations are a fundamental property of the approach. The hallucinations are probable -- just not true. The model has succeeded but the assertions are false. This has to be understood or the models will mess stuff up. A lot of checking is required.
Humans always assemble information according to a standard of truth. It is a big part of how humans learn. No method is perfect but the human method results in fewer routine hallucinations.
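A toy example of what "probable but not true" means in practice: a model that only knows which word tends to follow which has no notion of facts at all (the corpus here is invented):

    import random
    from collections import defaultdict

    corpus = ("silver ingots can be sold to traders . "
              "iron ingots can be used at a workbench .").split()

    follows = defaultdict(list)
    for a, b in zip(corpus, corpus[1:]):
        follows[a].append(b)

    word, out = "silver", ["silver"]
    for _ in range(8):
        word = random.choice(follows.get(word, ["."]))
        out.append(word)
    print(" ".join(out))
    # A run can easily produce "silver ingots can be used at a workbench",
    # which is perfectly probable under the model and simply false in this
    # toy corpus - only iron has that property.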
> the human method results in fewer routine hallucinations.
I'd love you to produce data to back this up.
My guess is that you are wrong, on the basis of how often I discover that I'm full of shit and how often I discover other people are full of shit.
Humans are built for being wrong just as much as being right. We wouldn't have such complicated institutions and social structures built around controlling for those symmetric capacities if it weren't the case.
Even then, we find ourselves surrounded and overcome by falsehoods of our own design.
You could probably just look up data on mental illness and/or pathological lying? Most people will say "I don't really know the details of that" rather than write you an essay of seemingly plausible but completely made up nonsense. Not all people, sure, but most.
The kind of incorrect information I am referencing is not pathological, but the type that is generated by cognitive bias and woven into the functional fabric of everyday life. It is pervasive and most people don't notice it (even when aware of specific bias types).
A standard of truth is doing a lot of heavy lifting. Myself I would say we assemble information based on the limitations of the body we exist in. For example gravity is important to us because if we disobey its laws we may very well die.
Embodiment and multi modal AI will likely provide such filters or limits to AI in which it can derive truth.
Because everyone has at least tried to lie - not necessarily for malicious reasons, e.g. white lies - a few times in their lives and knows it's not easy coming up with plausible-sounding lies. It takes effort. Sure, it might be effortless for some psychopaths, but there are a limited number of such people. Overall the internet was surprisingly usable despite all the liars - even now with SEO and content farms; I think even SEOs realize that it's better to provide information that's true than to lie, if it costs them the same.
LLMs change all that. They can produce content on a massive scale that can drown out everything else. It doesn't help that more inaccurate LLMs are cheaper and easier to create and run - think about all the ChatGPT3-level LLMs vs ChatGPT4-level LLMs; guess which ones SEOs will gravitate towards.
> Because everyone has at least tried to lie - not necessarily for malicious reasons, e.g. white lies - a few times in their lives and knows it's not easy coming up with plausible-sounding lies.
Regardless, the amount of false information on the internet used to be limited by the number of people producing it. Some of it is deliberately produced misinformation. Some of it is just mistakes due to carelessness, others due to willful negligence - an example of the latter would be content farms who make their money off flooding search engine result pages and pushing ads in your face when you visit their website, and who couldn't care less whether you got what you came for.
Despite all that the internet was still useful. There are enough people posting accurate information such that the signal to noise ratio is good enough.
LLMs will change all that. They have the potential to flood the internet with hallucinated falsehoods. And because bad LLMs are cheaper and easier to create and operate, those will be the majority of LLMs used by the aforementioned content farms.
Well, you can be sure the SEO creating content farms will pick the path of least resistance.
They won't make up a lie when telling the truth is far easier, and when they do have to lie, it's only a slight bit more effort.
But that's a moot point with LLMs ...
If/when they use LLMs, they will pick the cheapest LLM possible, which will produce the most garbage - they might not even bother keeping it up to date; why bother, when hallucinated BS sounds just as convincing?
If there's an upside here, it's that humans will now be forced to refine their BS detectors.
In doing so the species would improve critical thinking skills which can be applied to all information regardless of source. Which, I agree, was often BS to begin with. But in theory would be more difficult to skirt on by without notice if humanity upgraded their critical thinking.
Yeah if you think about it, there is no history for example, as all we have in that domain is just someone’s perspective on some events. They may or may not have agenda but that’s beside the point.
That soft data could never have been trusted; the information that can be verified (calculations etc.) seems safe from LLMs.
Not one single solitary soul has ever made the claim that misinformation didn't exist before AI so it's not clear who you're arguing with. People are rightly concerned about the scale of misinformation that AI is unlocking.
What I'm responding to is the strong tendency to discount our very long history of dealing with factually incorrect information and the ascertainment of truth from sources both dubious and trustworthy.
Entire institutions are set up in order to handle these very real problems, the set of which currently dwarfs the problem of hallucinations in GPT.
From a social perspective, non-GPT falsehoods are even more insidious, because we are inclined to trust and believe those whom we like and are like us.
Again, people are in the habit of discounting just how much we are wrong in our everyday lives. The hallucinations therefore appear more singular than they actually are.
There will be a web of trust, with a valuation of nodes by trustworthiness. And people will get only one id for this. One's name is one's value, and a reputation will be a hard-earned thing again.
This was how the "internet" functioned in the book "Ender's Game".
There is a small sub-plot about how he had to give a fake persona credibility on the untrusted network in order to be able to leverage it into creating a fake account on the trusted network.
I love that interpretation, but in today's retweet-driven world of political commentary, I actually find it quite plausible that pseudonymous kids with no grasp of the real world, who think rational political debate is the nonsensical slogans they're spouting on the internet, become major Twitter influencers that actual politicians want to court for their "authenticity" and "willingness to say the unsayable", and maybe their dank memes.
The conceit of Ender's Game was that thoughtful discourse would be influential online.
Reality has largely demonstrated that what spreads is far more thoughtless: the propaganda of the Big Lie and the Firehose of Falsehood associated with Russia, the floods of irrelevance which tend to bury more significant stories favoured by China, and the outrage / hot-button topics common in US-centric media, though that's a timeless technique.
Memes and simple messages attract attention and spread. Complex narratives and analyses ... not so much.
But yes, voices that deserve no attention whatsoever have dominated the media landscape of the past decade or so. Not that this is entirely novel.
But yeah, maybe the idea that you can even 1% trust random content on the Internet without having a source doesn't really make sense if you think about it, IMHO. Either you do this web of trust, coming from a well-known real-world source, or be Wikipedia-like with linked reliable sources for the viewer to check.
By the way, wasn't this how Google ranked pages back in the day? Ranking pages that get linked to higher? And even before that there were P2P web rings.
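That was PageRank, roughly. A minimal power-iteration sketch over a toy link graph, just to show the idea that pages linked to by well-ranked pages end up ranked higher themselves:

    links = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
    pages = list(links)
    rank = {p: 1.0 / len(pages) for p in pages}
    d = 0.85  # damping factor

    for _ in range(50):
        new = {p: (1 - d) / len(pages) for p in pages}
        for p, outs in links.items():
            for q in outs:
                new[q] += d * rank[p] / len(outs)
        rank = new

    # 'c' (the most linked-to page) comes out on top, followed by 'a'.
    print(sorted(rank.items(), key=lambda kv: -kv[1]))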
In some ways it already is that way. If I come across an artist I suspect is passing off AI generated stuff as their own (without using the tagging features the site has to indicate as much), an easy test is to just check if they've been posting since before ~2020. If they have, and the style has recognizable similarities, it's clear that it's honestly human made or at most blends characteristics of both together.
I've speculated as much about one extra incentive (besides retraining cost) for why OpenAI has kept its training data frozen at pre-2022. I think it's trying to gather as much data as possible from before it has itself contaminated its training set. They'll effectively have a monopoly over it; it's interestingly a rare example of "we can only do this once, then it's forever degraded". You really want a clean & discrete start.
That being said, how many people write blogs with Grammarly or ChatGPT these days? The temptation to use these technologies all the time is too strong for even self-preservation of your own (writer's) voice.
My sense is that if you use this technology you might be happy with the results at first, but on later review you just notice something off in some sentences and maybe it just doesn't flow right. I'm not convinced that it will replace writers' jobs yet. Especially when you want to create something authentic and unique.
> The temptation to use these technologies all the time is too strong for even self-preservation of your own (writer's) voice.
I don't know about that. I have played with ChatGPT/Copilot/etc enough to know what they're capable of doing. But the thing is, I enjoy programming. I enjoy breaking down a problem and solving it with code. I enjoy crafting elegant code. So I don't use AI even though I'm fully aware it could save me hours on projects. Why? Because I enjoy those hours very much.
Why am I telling you all this? Because I suspect many writers are the same and personal blogs are their canvas. They enjoy communicating. They enjoy crafting articles. They might have AI proof-read them, but they won't let them write everything. So, to me, there is hope that personal blogs will maintain their human element, as opposed to news websites or tabloids or learning platforms.
> So I don't use AI even though I'm fully aware it could save me hours on projects.
Enjoy this luxury while it lasts. Based on what I have seen in performance review committees for software developers, your peers who drive results faster than you do because they use AI will be rewarded more and will be more likely to survive rounds of layoffs when they inevitably happen.
That's fine. I genuinely wouldn't want to continue working in an industry that worked like that anyway, so I'd just quit and keep on programming with my own projects. So that luxury will last as long as I want it to.
Agree. I've never even looked at any of these AI tools. I enjoy the process and the challenge of programming, and the rewards of doing it well. I have no desire for someone or something else to write code for me.
Sometimes the value is specifically because my voice won't come through. When I'm stressed and being asked for unreasonable things at work, I know that I tend toward passive aggression. But professionally, that isn't the way I want my message to come across.
I use ChatGPT all the time to suggest how I could make sure something isn't passive-aggressive. It'll point out parts that sound aggressive and suggest changes. It can be for a short Slack message, or a many-paragraph message.
I have definitely read "blogs" written by stitching together LLM outputs. For years people were advised that a technical blog "looks good on a resume" so we saw lots of lightly rewritten Stackoverflow content. Now it's gotten easier.
I wonder how we'd test for AI contamination. And would there be attempts to sell a larger data set, one that pretends to be human generated, but instead is padded with some AI content.
Does this mean we'd end up with a finite set of verified human only data?
Would people start going through all kinds of offline archives via AI-gapped means, trying to uncover and document new sources of human input?
Assuming that semi-convincing misinformation spreads everywhere, people will finally have to find the original source of a certain statement, verify their "knowledge supply chain", and maybe use logic to evaluate every single statement made.
I think it'd be kind of neat in a backward way if we went back to the 'specialized encyclopedia' days of the 90s.
Web directories, 'Who's Who in Engineering' type lists, etc.
It's a step back from universal search engines being able to find stuff, but it's a step forward with regards to curation and quality of results; so I'm not sure if it's entirely a downgrade.
The early 90s 'website phonebook' type encyclopedias were interesting[0], but I always had to remind my mom "No, this isn't the entire internet, it's just a bunch of places that people like; the secret ones are 'unlisted'."
Note: I never said this is better than a search engine; it's just an interesting end result after search engines got polluted and modified to the point of uselessness that we're at now with Google.
It's already kind of like that for me, in that almost all of my searches fall into these categories:
- Wikipedia
- Online documentation for whatever language/framework/tool I'm using
- Stack Overflow / Stack Exchange for most technical questions
- Reddit if SO/SE doesn't work, and for opinionated questions (e.g. r/BuyItForLife)
- Hacker news for software recommendations and technical opinionated questions
- Arxiv or the ACM library if it's a research paper (99% of the time, whenever I google something niche the only relevant results are papers)
- Other sites like caniuse.com, university sites for health and nutritional info, old-style forums for specific software
For these searches I'm just using Google to bring me to the specific site I want, because it's faster than using the site's own search functionality. Then there are the times I literally just type a website's name into the search box instead of the URL bar (e.g. "instacart"), or when I use Google Maps, Images, or reviews.
I'm always wary when Google returns an unfamiliar site because I'm skeptical of the results. ~70% of the time it's some blogspam which is at best accurate but overly wordy, and at worst inaccurate; sometimes it's a blog from some random individual who for whatever reason went into a deep dive trying to understand what I'm searching for, that actually turns out to be useful; the rest, idk.
Recently an article came out where someone said that the company I work for is a big user of WebAssembly, but the reality is that we don't use it.
After finding the contributed article (on a well-known news site, not Wired though), it looks like a tech founder might've been using ChatGPT to write an article about the uses for WASM. The arguments were generally sound, but I don't think that anyone did the work to manually check any of the facts they presented in it.
This is kind of like the advent of spellcheck, where a whole class of errors started to appear regularly in almost every article because publishers stopped paying for the human labor to manually review for things like homonym or word ordering errors. Except much worse, because it could allow spurious or even harmful facts to accrue and spread instead of just grammatical mistakes.
> Except much worse, because it could allow spurious or even harmful facts to accrue
It already did, even in the "purely human" era. I think LLM text will gradually become more trustworthy than a random website by consistency filtering the training set.
More amusing and frightening is when people search about themselves and turn up AI-generated crap. Googling yourself was always a lucky grab bag, with the possibility of long-forgotten embarrassments being dragged up. But at least you'd have to face facts.
Now I hear of people discovering they're in prison, married to random people they've never met, or are actually already dead.
What is this going to do to recon on individuals (for example by employers, border agents or potential romantic partners) when there's a good chance the reputation raffle will report you as a serial rapist, kiddy-fiddler or Tory politician?
This is a new way to be anonymous too. Someone posts something true but nasty about you? Have LLMs cook up dozens of preposterous stories - you're secretly a rodeo clown, you write children's books, you built a castle in Rome, you once drank a goldfish, etc.
This is essentially the service Reputation.com claims to provide. Jon Ronson's "So You've Been Publicly Shamed" describes how the site games SEO to flood the search results of controversial figures with banal nothing posts[1]. The difference being that actual humans had to create that content.
In the near future, the web could become opaque with LLM schlock, but at least it may grant people a right to be forgotten.
I think Boris Johnson tried that by saying, out of the blue, that he makes model buses. There was some thinking at the time that he didn't want the Brexit bus to show up in searches and was trying to game search results.
> What is this going to do to recon on individuals (for example by employers, border agents or potential romantic partners) when there's a good chance the reputation raffle will report you as a serial rapist, kiddy-fiddler or Tory politician?
I feel terrible for the first dozen people this happens to. That said, I look forward to this being the average case for most people. Bury the real malfeasance in AI-generated noise. Let employers and background checkers get dunked on.
The SEO garbage has been poisoning search for years. Even before the chatbots it got to the point where most top results are crap. The LLMs can surely make it much worse, though.
I was trying to do something with delegates in C# but couldn't remember what the various bits were called since it's been so long, so didn't have the magic words for Google to work. ChatGPT sorted me out with my vague question and one followup.
Ultimately, I would like to see more about the other side. If generative AI can make blog spam, then I think it can recognise blog spam. How far are we from implementing a reliable filter of useless spam sites from search results? I don't expect Google has a monetary incentive for this, but maybe someone else does. But from my story above maybe "search" is a thing of the past and for functional queries, we will just talk to the gatekeeper. Reading the actual words written will only be for leisure.
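Even a crude classifier can pick up the tells. A hand-wavy sketch with scikit-learn and made-up labelled snippets (a real filter would need vastly better labels and features):

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    # Invented page snippets standing in for crawled text.
    pages = [
        "In this article we will explore the top 10 best ways to use silver ingots...",
        "Silver ingots have no crafting use; sell them to any trader for gold.",
        "As an avid gamer, you may be wondering whether the Act 3 workbench...",
        "Patch 4.1.2: fixed a crash when splitting item stacks in the trade UI.",
    ]
    labels = [1, 0, 1, 0]  # 1 = spam/filler, 0 = substantive

    clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
    clf.fit(pages, labels)
    print(clf.predict(["You may be wondering about the top 10 best uses for iron ore..."]))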
I started to go down a line of thinking where I think we might see a return to books in the next 3-5 years. The reason is that with a book it’s a big collection of knowledge and people can post reviews about the quality of the book whereas on the web you have no way of knowing what quality of an article will be anymore.
Here's what people don't understand: this is mostly good for google.
The worse organic results are, the more people will click on paid links. This is WHY everyone on HN is complaining about search results, because google doesn't really have an incentive to give you really good results. They only need to be good enough to keep 95% of the population still using google, but mostly expecting the good results to be ads.
Google ads are the equivalent of verification on FB and X. They just call it something different. The verified, high quality results will be paid.
The correct term is spamming. People are using these text generators to spam everyone and everything under the sun. It will be detrimental to the internet, as many people will just give up on this huge pile of ... spam.
Without naming the company, I have seen specific examples of blog posts being written by AI, hallucinating a "fact", and then that "fact" re-surfacing inside of Bard.
I have to say--the opening paragraph doesn't describe a reality I'm familiar with:
>Web search is such a routine part of daily life that it’s easy to forget how marvelous it is. Type into a little text box and a complex array of technologies—vast data centers, ravenous web crawlers, and stacks of algorithms that poke and parse a query—spring into action to serve you a simple set of relevant results.
Web search has, for me, become a nasty twisted hall of mirrors well before generative AI. I almost never get fed relevant results; I almost always have to go back and quote all my search terms because the search engine decided it didn't really need to use all of them (usually just one). The only difference is the poison was human-generated. Generative AI will simply erase the 5% of results that might give me an answer quickly.
I experience that when I try to google for technical problems I'm having at work, but otherwise searches still go pretty well for me.
I just had to google a bunch of races that I wanted to run. The top result was always the event's own web site.
When I google some news, relevant news articles always come up.
The last search I did was for how to display a ket vector in LaTeX. The top result was the StackExchange article with the right answer.
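(For reference, the usual ways to typeset a ket - the physics package's \ket macro, or plain delimiters with no extra packages:)

    % with \usepackage{physics}
    \ket{\psi}

    % or with no extra packages
    \left| \psi \right\rangle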
From what I see, certain domains seem to be targeted for exploitation. Programming questions seem to be high up on the list. I wonder if that skews HN readers' perceptions.
Google search to retrieve specific factual information is pretty good.
Google search to retrieve anything opinion related has been horrible and infested with blogspam for years (hence people searching Reddit to get that kind of info).
Really? I've been finding it doesn't even find stuff it used to in certain documentation (I'm talking about things it found maybe a year ago), even "searching in quotes for this stuff" - things that other search engines (Bing, Kagi) are indexing just fine. And since I've switched to using those engines more when I'm searching things for programming work, they've definitely been a lot more helpful than Google, which often just seems to be missing a ton now.
I suppose it never occurs to me to search for opinions. I'm not even sure how I'd go about it, even if search weren't broken. Blogspam is what I'd expect to see.
I'm more likely to start at a place that aggregates reviews and try to hallucinate which ones were written by people who know what they're talking about. That usually seems to work.
I imagine that somewhere out there is a person who bought the product and reviewed it on their blog or made some enthusiastic social media post about it, and that's what you'd want to locate were it not for the spam. But I don't expect any search engine to be able to find it for me.
Google search to retrieve product marketing pages is pretty good. Specific factual information searches lead to product marketing pages. Opinion searches lead to product marketing pages.
Google is a giant adware tool that’s been taken over by adware SEO sites. The example given - find the product marketing pages for some races - falls directly in its sweet spot. If you venture outside it’ll do its best to get you back into the product marketing sweet spot, and the SEO companies of the world take care of the rest.
As someone who has been using search engines since the 90's, I've found that the "old-school" way of formatting your search almost like a database query has gotten significantly worse. It seems like search engines are geared more towards natural language queries now; probably because the old Google-Fu way of doing things wasn't very friendly for people who didn't use computers regularly.
My understanding is that google went from a more traditional database style which supported such queries, to a newer "n-gram" index with a layer of semantic similarity. Notably, you can no longer put a sentence in quotes to only find pages that contain that exact phrase. Also, the order of words matters more now than it used to (where the old search engines treated a space as AND, so order was irrelevant outside of quotes)
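A toy version of the old model - an inverted index where a space really did mean AND and word order didn't matter (page contents invented for the example):

    from collections import defaultdict

    pages = {
        "a.html": "silver ingot sell trader gold",
        "b.html": "workbench upgrade weapon act three",
        "c.html": "silver ingot workbench upgrade",
    }

    index = defaultdict(set)
    for url, text in pages.items():
        for term in text.split():
            index[term].add(url)

    def and_query(query):
        results = set(pages)
        for term in query.split():
            results &= index.get(term, set())
        return results

    print(and_query("ingot silver"))      # order irrelevant: a.html and c.html
    print(and_query("silver workbench"))  # only c.html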
Agreed that web search quality has been deteriorating since much earlier than LLMs gaining popularity.
Interestingly, we are in a spot right now where I feel that for certain types of queries LLMs can outperform search engines. But from what is shown in the article, it seems like that state might only be temporary, and that in the same way that shitty content farms mastered SEO and polluted search results, we might see the same happening with LLMs that have access to the Internet.
I've had the exact same experience. That said, when I do add all the right quotes and conditions to the query to filter out the blog/newsspam drivel, I still - usually - eventually - get pretty good results. Sometimes I have to switch to Bing or even Yandex, but it's rare.
Adding "reddit" to queries can be pretty useful. You're prone to get terrible, inaccurate information since it's just random people on an internet forum, but at least it's (usually) actual humans and not blogs trying to SEO-game. (Though one big caveat is searching for products/services. Lots of threads full of bot accounts writing "[link] has been the best [thing], in my experience". They're usually easy to spot, but sometimes they do seem pretty natural until you check the post history.)
> You're prone to get terrible, inaccurate information since it's just random people on an internet forum, but at least it's (usually) actual humans and not blogs trying to SEO-game.
Less and less so. Reddit has always had a bot problem, but it seems to be getting exponentially worse lately. Not just article reposters, but comment reposters, bots that reverse images and videos just to repost, seems like it's at least 75% bot content now.
Not only that, but you're also left with the issue of parsing what someone else has written. Even when using answers I find from web searches, I often drop results into ChatGPT so I can get a rough idea of what the person is trying to say first, or check if it agrees with my understanding of what's being said.
The signal to noise ratio of web search results has been trending toward utter uselessness for years; so while AI content will make it worse, it won't make it dramatically worse. We'll just advance toward useless at a higher rate.
To me, "advance towards useless at a (possibly exponentially) higher rate" means that it makes it dramatically worse. Also, the convergence point is difference. At some point human-generated spam content is no longer worth it because the costs exceed the profits. With AI making the process cheaper, you get to a much higher ratio of spam.
You should never have had been trusting what you find on the web, let alone other media. I hope widespread usage of generative AIs including deepfakes will finally force the masses to start thinking more critically.
This is an absurd proposition. How is one supposed to think critically when there's nowhere to go for reliable information? We'll be left just taking shots in the dark, what little amount of informed reasoning we had before will be replaced by pure conjecture. There won't be any increase in critical thinking.
It has always been this way, we just didn't know. Now we can be sure.
However, there is always one sufficiently reliable/checkable fact: "a specific media outlet wrote...". Whatever they actually wrote is not a fact, but the fact that they wrote it is. This said, you can then ask yourself why they most likely did, keeping in mind other observations of yours.
«Could»? Google has already been doing this for quite some time, at least in my region (Norway), and I’d say more than half of the suggestions Google provides as top results are false.
I think the insufficient accuracy in the output of LLMs is going to lead them to be a lot more niche than the current hype is hoping for. I think most people care that someone is taking accountability for what they are reading - not that it is necessarily correct, but at least that someone thinks it is correct (and that someone can be taken to task if it is inaccurate).
If LLM usage in media becomes widespread, I'd pay for a service that identifies and hides the LLM shit for me.
It's great to hear that someone finds Replit's AI capabilities useful for accelerating their learning... However, I must point out that the post sounds like an advertisement for an overpriced product. Not knocking AI assistance in coding - it definitely can be valuable - but the effectiveness vs. cost-efficiency varies. 20 dollars per month for coding a chatbot is not worth it for me...
Surely, just as content farms have gradually trashed the quality of search results on major platforms. There's also an over-reliance on raw quantities; I often get irrelevant and unwanted news articles from India simply because the huge population of that country coupled with widespread use of English outweighs US content on the social graph.
This is a pretty intractable problem and my app is in the alpha-est of stages, but I built something for this purpose. It maps creators on a 2D grid (using React-flow) based on subject and lets users vote on their trustworthiness. https://www.graphting.org
So search engines in their traditional sense will be obsolete anyway.
1) GPT-4 and other such LLMs will generate textbooks and manuals for every conceivable topic.
2) These textbooks will be 'dehallucinated' and curated by known experts on particular topics, who have reputations to maintain. The experts' names will be advertised by the LLM provider.
3) People will search for stuff by chatting with the LLMs, which will in turn provide citations for the chat output from the curated textbooks.
Well, have you ever had to use Microsoft documentation? Imagine the whole Internet presenting content like that, with random flashes of animations and HTML5 whistles.
Just another reason that I consider generative AI to be a lot like crypto. A lot of talk about it being the future but really only turns out to be dangerous or useless. I find it incredibly irresponsible that companies are shoving their latest AI tech into all their products when it's still unproven.
AI has so completely disrupted search that it's destroyed leading platforms' effectiveness in a matter of months.
But because of its current lack of optimization for accuracy, we shouldn’t consider it disruptive because it’s not yet proven technology?
You can call it dangerous but you can’t call it useless. It’s also only going towards improvement from here, including drastic reductions in hallucinations.
You have to remember too that AI models are generally attempting to interpret the intent behind the prompt, so many of these crazy articles are happening because people aren’t yet good at writing clear instructions for AI and AI isn’t yet mature enough to disambiguate poor instructions in its output and is trying to deliver on unclear instructional intents.
>> You can call it dangerous but you can’t call it useless. It’s also only going towards improvement from here, including drastic reductions in hallucinations.
> Why?
A pseudo-religious belief in progress, especially the technological kind.
It's bullshit. If it were true, 2022 pre-LLM Google would have been better than 2010-era Google, but it most definitely wasn't. Consumer printer technology, for the most part, has been getting worse for decades at this point.
If it cannot be trusted to return accurate information without hallucinations, then it shouldn't be publicly available in a system meant to be used for finding accurate information, like web search. I could see its value in generating office documents, but even for summarizing them you still get the same problem of hallucinations.
Misinformation and disinformation are already a problem on the web, and thrusting unproven technology like generative AI, which has a tendency towards misinformation, into the mix is opening a Pandora's box. But as long as Microsoft and Google and Meta can make their money...
I don't necessarily see a problem with it for personal use by a tech-savvy person who is aware of its problems and limitations. Releasing it upon the general public where mis/disinformation is already widespread on the internet is a terrible, terrible idea.
One thing I've noticed about simple one word searches on Bing now - a lot of times it just errors out and closes the Bing app tab you've opened with no explanation to the user. This only started happening after they pushed the AI driven search narrative to make you use it in the app, so apparently single word searches are too much somehow for their version of AI to handle.
It’s especially terrifying that misinformation compounds multiplicatively with AI because it happens in 2 layers - once at the retrieval layer (where AI-generated content is worsening the problem of bad SEO content) and again at the retrieval augmented generation (RAG) LLM layer.
(shameless plug) At Metaphor (https://platform.metaphor.systems/), we’re building a search engine that avoids SEO content by relying on human curation + neural embeddings for our index + retrieval algorithm. Our mission is to ensure that the information we receive is as high quality and truthful as possible as AI adoption marches onwards. You (or your LLM) can feel free to give it a try :)
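To be concrete about what "neural embeddings for our index" means at its core (a toy sketch, not our actual system - random vectors stand in for a real embedding model): rank documents by cosine similarity to the query vector.

    import numpy as np

    rng = np.random.default_rng(0)
    docs = ["doc about silver ingots", "doc about workbenches", "doc about traders"]
    doc_vecs = rng.normal(size=(len(docs), 8))          # stand-in for real embeddings
    query_vec = doc_vecs[0] + 0.1 * rng.normal(size=8)  # a query "near" the first doc

    def cosine(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

    ranked = sorted(zip(docs, doc_vecs), key=lambda dv: cosine(query_vec, dv[1]), reverse=True)
    print([d for d, _ in ranked])  # the first doc should rank highest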
+1. Increasing training data quality should also hopefully help with hallucination. But becomes increasingly hard in a world where more and more online content is crap.
I'm not an expert in training LLMs, but I've heard that some people use reinforcement algorithms to train and align LLM behaviors with human preferences. When it comes to designing a loss function for training, I wonder if it's possible to assign an extremely high loss value to hallucinated content during training. This approach might encourage the model to refrain from generating inaccurate content.
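Roughly what that could look like as a supervised sketch in PyTorch: weight the per-token cross-entropy by a (hypothetical) hallucination mask. The hard part, of course, is producing that mask at all.

    import torch
    import torch.nn.functional as F

    def penalized_loss(logits, targets, hallucination_mask, penalty=10.0):
        # logits: (batch, seq, vocab); targets, hallucination_mask: (batch, seq)
        # hallucination_mask is 1.0 where a token was labelled as hallucinated.
        per_token = F.cross_entropy(
            logits.reshape(-1, logits.size(-1)),
            targets.reshape(-1),
            reduction="none",
        ).reshape(targets.shape)
        weights = 1.0 + penalty * hallucination_mask
        return (weights * per_token).mean()

    # Toy usage with random tensors.
    logits = torch.randn(2, 5, 100)
    targets = torch.randint(0, 100, (2, 5))
    mask = torch.zeros(2, 5)
    mask[0, 3] = 1.0  # pretend this token was flagged as hallucinated
    print(penalized_loss(logits, targets, mask))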
At first I thought the article was going to be about human-led misinformation, but I wonder whether, with both hallucinations and human-fed misinformation (AI-helped or not!), we can use AI to fact-check/self-check results (both AI-generated and human ones) and prompt us about potential misinformation and link to relevant sources. That way AI could actually help solve the trust issue.
Why are people calling them hallucinations and not just errors, flaws or bugs? You can't hallucinate if all of your perception is one internal state. Chatbots don't dream of electric sheep.
Personally, I think confabulations would be a better term. To the best of my understanding, these AI rely on a model similar to the reconstructive theory of memory in humans. The connotation of the word confabulation indicates no maliciousness while highlighting the erroneous nature of the action.
To be honest, I really am scared how quickly my mind loses some ability once it gets automated. I have already almost stopped looking in the mirror while rear parking because one of my cars has rear parking sensors. I'm scared of driving a car with front parking sensors...
Leaving aside the article to discuss the source for a moment. When did Wired become so antitech?
There are good critical viewpoints but most of the articles they are putting out at this point read like bitter diatribes. Which is a shame because they used to be an excellent publication.
The academic internet of the 90s is so far gone and while we're seeing a lot of magic lately, it's magic available to literally everybody for any and every purpose.
We're rapidly seeing how boring and disappointing that is :(
You can find a plethora of critical viewpoints on Hacker News and the various blogs it links to which are well cognizant of the dangers of the tech industry.
The problem isn’t that Wired is critical, it’s that they’ve gone weirdly reactionary and their writing has gone so mass market dumbed down that Some Random Guy’s Blog is likely to have a better written and researched viewpoint.
I saw a comment describing this increasing luddite response to new tech developments and it resonated with me.
Essentially, the comment made the point that tech is advancing so fast these days that most people are unable to keep up with the pace of these radical changes. And the natural reaction to that for many is to reject these "advancements" or at least look upon them with cynicism and skepticism.
You don't think it has anything to do with the fact that 'advancements' these days come with mandatory accounts/subscriptions, egregious data harvesting, and user-hostility to keep people in line with what companies want rather than what people want?
I don't think it's anything to do with the pace of advancement, it's more that many many of these 'advancements' look very obviously harmful, not 'potentially' harmful but actively and immediately harmful in the manner of a car with brakes that only work 90% of the time.