
Funny, we're going to have to make a very clear divider between the pre-2022 and post-2022 internet, kind of like low-background steel: anything smelted after the nuclear tests of 1945 is contaminated, so the old stuff becomes precious.

Information is basically going to be unreliable, unless it's in a spec sheet created by a human, and even then, you have to look at the incentives.




If you think that's crazy, think again. Just yesterday I was trying to learn more about Chinese medicine and landed on this page, which I read thoroughly before noticing the disclaimer at the top.

"The articles on this database are automatically generated by our AI system" https://www.digicomply.com/dietary-supplements-database/pana...

Is the information on that page correct? I'm not sure, but as soon as I noticed it was AI-generated I lost all trust. And that's only because they bothered to include the warning.


You shouldn't have had any trust to begin with; I don't know why we are so quick to hold up humans as bastions of truth and integrity.

This is stereotypical Gell-Mann amnesia - you have to validate information, for yourself, within your own model of the world. You need the tools to be able to verify information that's important to you, whether it's research or knowing which experts or sources are likely to be trustworthy.

With AI video and audio on the horizon, you're left with having to determine for yourself whether to trust any given piece of media, and the only thing you'll know for sure is your own experience of events in the real world.

That doesn't mean you need to discard all information online as untrustworthy. It just means we're going to need better tools and webs of trust based on repeated good-faith interactions.
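
To make "webs of trust" a little more concrete: it could be as simple as tracking how often a source's claims check out for you, and discounting trust you inherit through intermediaries. A toy sketch in Python - the names and scoring rule here are made up for illustration, not any existing system:

    from collections import defaultdict

    class TrustLedger:
        """Toy web of trust: trust grows with repeated good-faith
        interactions and drops when a source's claims fail verification."""

        def __init__(self):
            # (observer, source) -> list of outcomes, True = claim checked out
            self.history = defaultdict(list)

        def record(self, observer, source, claim_verified):
            self.history[(observer, source)].append(claim_verified)

        def direct_trust(self, observer, source):
            outcomes = self.history[(observer, source)]
            if not outcomes:
                return 0.5  # no history yet: neutral prior
            # Laplace-smoothed fraction of claims that checked out
            return (sum(outcomes) + 1) / (len(outcomes) + 2)

        def transitive_trust(self, observer, source, via):
            # trust inherited through an intermediary is discounted by
            # how much the observer trusts that intermediary
            return self.direct_trust(observer, via) * self.direct_trust(via, source)

    ledger = TrustLedger()
    ledger.record("me", "hn_regular", True)
    ledger.record("me", "hn_regular", True)
    ledger.record("me", "content_farm", False)
    ledger.record("hn_regular", "some_blog", True)
    print(ledger.direct_trust("me", "hn_regular"))                      # 0.75
    print(ledger.direct_trust("me", "content_farm"))                    # ~0.33
    print(ledger.transitive_trust("me", "some_blog", via="hn_regular")) # 0.5

The point isn't the math; it's that repeated good-faith interactions are measurable, and tools could do that bookkeeping for us.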

It's likely I can trust that information posted by individuals on HN will be of higher quality than the comments section on YouTube or some random newspaper site. I don't need more than a superficial confirmation that information provided here is true - but if it's important, then I will want corroboration from many sources, with validation by an actual human expert.

There's no downside to trusting information provided by AI just as much as any piece of information provided by a human, if you're reasonable about it. Right now, these models are as bad as they'll ever be, and all sorts of development is going into making them more reliable, factual, and verifiable, with appropriately sourced validation.

Based on my own knowledge of ginseng and a superficial verification of what that site says, it's more or less as correct as any copy produced by a human copywriter would be. It tracks with Wikipedia and numerous other sources.

All that said, however, I think the killer app for AI will be e-butlers that interface with content for us: extracting meaningful information, identifying biases, ulterior motives, and political and commercial influences, providing background research, and maintaining a local index, so that we can offload much of the uncertainty and work required to sift the content we want from the SEO boilerplate garbage pit that is the internet.
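
As a crude first pass, such a butler might just surface signals alongside a page: promotional language, hedged claims, whether anything is cited at all. Nothing below is a real product - just a sketch of the kind of checks I mean, in Python:

    import re

    # Hypothetical "e-butler" pass over a chunk of page text; the patterns
    # and thresholds are illustrative, not any real system's heuristics.
    PROMO_PATTERNS = [r"\bbuy now\b", r"\blimited time\b", r"\bmiracle\b",
                      r"\bsponsored\b", r"\baffiliate\b"]
    HEDGE_PATTERNS = [r"\bmay\b", r"\bmight\b", r"\bsome studies\b"]

    def screen_content(text: str) -> dict:
        """Return crude signals a butler could show next to a page."""
        lowered = text.lower()
        promo_hits = [p for p in PROMO_PATTERNS if re.search(p, lowered)]
        hedge_hits = [p for p in HEDGE_PATTERNS if re.search(p, lowered)]
        # very rough "does this cite anything" check: [1], (2024), or a DOI
        has_citations = bool(re.search(r"\[\d+\]|\(\d{4}\)|doi\.org", lowered))
        return {
            "promotional_language": promo_hits,
            "hedged_claims": hedge_hits,
            "cites_sources": has_citations,
            "word_count": len(lowered.split()),
        }

    sample = ("Panax ginseng may support energy levels [1]. "
              "Buy now for a limited time from our affiliate partner!")
    print(screen_content(sample))

A real version would lean on models rather than regexes, but the shape of the output is the point: signals you can check, instead of prose you have to trust.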


> This is stereotypical Gell-Mann amnesia - you have to validate information, for yourself, within your own model of the world. You need the tools to be able to verify information that's important to you, whether it's research or knowing which experts or sources are likely to be trustworthy.

Except, anthropologically speaking, we still live in a trust-based society. We trust water to be available. We trust the grocery stores to be stocked. We trust that our government institutions are always going to be there.

All this to say, we have a moral obligation not to let AI spam off the hook as "trust but verify". It is fucked up that people make money abusing the innate trust-based mechanisms that society depends on to be a society.


Oh, for sure - I'm not saying don't do anything about it. I'm just saying you should have been treating all information online like this anyway.

The lesson from Gell-Mann is that you should bring the same level of skepticism to bear on any source of information that you would on an article where you have expertise and can identify bad information, sloppy thinking, or other significant problems you're particularly qualified to spot.

The mistake was not treating "trust but verify" as the default mode all along. AI is just scaling the problem up, but then again, millions of online bots and troll farms aren't exactly new, either.

So yes, don't let AI off the hook, but also, if AI is used for good purposes, with repeatable positive results, then don't dismiss something merely because AI is involved. AI being in the pipeline isn't a good proxy for quality or authenticity, and AI is only going to get better than it is now.


And most importantly, we trust money to be more than just paper or bits.


To be clear, information on the internet has always been assumed unreliable. It isn't like you typically click on only the very first Google link because 1) Google is that good (they aren't) or 2) the data is reliable without corroboration (it isn't).


> It isn't like you typically click on only the very first Google link because 1) Google is that good (they aren't)

I know it's popular to hate Google around here, but yes, they are. It's their core competency. You can argue that they're doing a bad job of it, or get bogged down in an argument about SEO, or the morality and economics of AdWords, but outside of our bubble here, there are billions of people who type Facebook into Google to get to the Facebook login screen, and pick that first result. Or Bank of America, or $city property taxes. (Probably not those, specifically, because the majority of the world's population speaks languages other than English.)


It's not a binary reliable/unreliable.

AI just introduces another layer of mistrust to a system with a lot of perverse incentives.

In other words, even if the information was already unreliable in the past, that doesn't mean it can't get much worse in the future.

At some point, even experts will be overwhelmed with the amount of data to sift through, because the generated data is going to be optimized for "looking" correct, not "being" correct.


This is a matter of signal-to-noise. What people are saying when they complain about this is that the cost of producing noise that looks like signal has gone down dramatically.


Depends on what your personal filters are - I've always felt like a large amount of what I see on the internet is clearly shaped in some artificial way.

Either by a "raid" from some organized group seeking to shape discourse, or just accidentally, by someone creating the right conditions via entertainment. With enough digging into names/phrases you can backtrack to the source.

LLMs trained on these sources are gonna have the same biases inherently. This is before considering the idea that the people training these things could just obfuscate a particularly biased node and claim innocence.


I was thinking the exact same thing last month[1]! It's really interesting what the implications of this might be, and how valuable human-derived content might become. There's still this idea of model collapse, whereby the output of LLMs trained repeatedly on artificial content descends into what we think is gibberish, so however realistic ChatGPT appears, there are still significant differences between its writing and ours.

[1]: https://www.glfharris.com/posts/2024/low-background-lexicogr...
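
For anyone who hasn't seen model collapse in miniature: fit a distribution to some data, sample a synthetic "corpus" from the fit, refit on that, and repeat. With finite samples, later generations drift and lose the variance of the original data. A toy Gaussian version in Python - nothing to do with any real training run, just the general dynamic:

    import random
    import statistics

    # Toy model-collapse loop: each "generation" is trained only on the
    # synthetic output of the previous generation's fit.
    random.seed(0)
    mu, sigma = 0.0, 1.0                                   # the "real" distribution
    corpus = [random.gauss(mu, sigma) for _ in range(200)] # original human data

    for generation in range(1, 11):
        mu_hat = statistics.mean(corpus)
        sigma_hat = statistics.pstdev(corpus)
        # next generation sees only samples drawn from the previous fit
        corpus = [random.gauss(mu_hat, sigma_hat) for _ in range(200)]
        print(f"gen {generation:2d}: mean={mu_hat:+.3f} stdev={sigma_hat:.3f}")

The standard deviation tends to shrink generation after generation, which is the toy version of the "descends into gibberish" (or rather, descends into sameness) problem.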


> and even then, you have to look at the incentives.

This has always been true, but I think you're right that there has been a clear division between pre- and post-2022.



