Hacker News

I wonder if the main problem holding AI back will be the sheer amount of nonsense on the internet: we'll struggle to build useful bots once we can no longer implicitly trust that most information is true, or at least useful.



I would say this is what we get for asking non-expert data sources for information we want to present as authoritative.

Let's say we go back to before the internet and ask 100,000 random individuals for factual information on random subjects. You'd have a corpus of facts, but you'd also have tons of old wives' tales and information that is just wrong.

The internet democratized posting information, but I would say it did the same for stupidity. Random sites, reddit posts, and stuff we read on Hacker News don't have to have anything at all to do with the truth.

Maybe pushing models toward factual information bases that are weighted more heavily will help, but I don't see how AI in its current form will come off any better than a person who reads tons of bad information and buys into it.
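The "weighted heavier" idea above could look something like this, a minimal sketch with hypothetical source labels and weights (none of these names or values come from any real training pipeline): per-example losses are averaged with trust-based weights, so curated reference material counts more than uncurated scraped text.

```python
# Hypothetical trust weights per data source (assumed values, for illustration).
SOURCE_TRUST = {
    "reference": 3.0,  # e.g. curated encyclopedias, textbooks
    "forum": 1.0,      # e.g. reddit posts, HN comments
    "scraped": 0.5,    # uncurated web text
}

def weighted_loss(examples):
    """Average per-example losses, weighting each by its source's trust.

    examples: list of (per_example_loss, source_label) pairs.
    """
    total = sum(loss * SOURCE_TRUST[src] for loss, src in examples)
    weight = sum(SOURCE_TRUST[src] for _, src in examples)
    return total / weight
```

With weights like these, an example from a "reference" source pulls the average three times as hard as one from a forum, which is one crude way to bias training toward sources believed to be factual.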


The problem is we could be going backwards. ChatGPT was trained on a pre-LLM internet, and it works surprisingly well. If we scrape the internet again in five years, could we even get a model as good as today's?



