
Sounds like he's pretty well informed then. That's good. A 0% error rate is unrealistic in AI models; humans must curate.



Honestly, I'm kind of impressed that ChatGPT (and its competitors) are as accurate as they are, considering that they are still fundamentally fancy autocomplete [1]. The fact that when I ask ChatGPT a question the answer is generally "accurate enough" is really quite impressive.

I'm not claiming it's perfect, and like Tim Cook I'm not 100% sure it's actually possible to get to a 0% error rate, particularly considering these models are trained on data written by humans on the internet. Humans write a lot of really dumb shit on the internet all the time, and human curation can't possibly stop all of it. It's easy to tell the scraper not to crawl obviously sketchy sites like Infowars, and maybe to block some specific subreddits, but there will always be very dumb people posting very dumb opinions [2] in the "mainstream" areas as well.

It's not unusual to see really stupid stuff on actual news websites like CNN talking about "ghost sightings", and I'm not sure how you correct for that in training. You could block CNN, but then you lose a pretty big repository of news as training data, and moreover how do you block a news website impartially?

[1] Not to undermine the cool stuff that's being done in the space, just my rough understanding of how the algo actually works.

[2] My HN history is probably included in this.


A good AI system is one where humans can intervene easily and the intervention makes the system learn and improve.


A 0% error rate is unrealistic in humans as well. Impossible really.


But the mistakes made will be different, and historically you’d have a source to consider. Here there is just one global source telling you to add glue to your pizza to make the cheese stick.


There are models that can provide references if you’d like


And they'll happily make up those references as well


Nope, that's not how it works. Those references aren't generated in such systems; they are retrieved. They might not provide references for all the sources, of course, same as humans.


Exactly. Right now if I google something (AI overview aside), I'm linked to a source. That source may or may not cite its own sources, but its provenance tells me a lot. If I'm reading information linked from the Mayo Clinic, their reputation leads me to judge it as high quality. If they start publishing a bunch of garbage, their reputation gets shot and I'll look elsewhere. With LLMs there is no such choice: they spew everything from high-quality to low-quality (to dangerously wrong) info.


LLMs can provide references. There’s no limitation on that. Even GPT4 includes references sometimes when it deems them beneficial.


An LLM itself cannot provide references with any integrity. They're autoregressive probabilistic models. They'll happily make something up, and you can even try to train them to emit references, but as the article states this is very, very far from a guarantee. What you can do is a kind of RAG setup where you have some existing database whose contents you include in the prompt to ground it, but that's not inherent to the model itself.
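
A rough sketch of that grounding step, with a toy in-memory document dict and a keyword-overlap retriever standing in for a real search index (every name and snippet here is made up):

    # Retrieve passages from an external store, then paste them, with their
    # source labels, into the prompt so the model can only "cite" what it was given.
    documents = {
        "mayo-clinic/measles": "Measles is a highly contagious viral illness spread by airborne droplets.",
        "cdc/measles-vaccine": "Two doses of MMR vaccine are about 97% effective against measles.",
    }

    def retrieve(query, k=2):
        # Toy keyword overlap standing in for a real retriever / vector search.
        scored = [(sum(w in text.lower() for w in query.lower().split()), src)
                  for src, text in documents.items()]
        return [src for score, src in sorted(scored, reverse=True)[:k] if score > 0]

    def build_prompt(query):
        sources = retrieve(query)
        context = "\n".join(f"[{src}] {documents[src]}" for src in sources)
        return ("Answer using only the sources below and cite them by label.\n"
                + context + "\n\nQuestion: " + query)

    print(build_prompt("How effective is the measles vaccine?"))

The point is that the labels come from the retrieval step, not from the model's weights.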


Sure, but it can semantically query a vector DB on the side and use the results to generate a "grounded" response.
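
As a toy illustration, with hand-written 3-d vectors standing in for real embeddings and a dict standing in for the vector DB:

    import math

    # Made-up "embeddings"; a real setup would use an embedding model
    # and an actual vector store.
    store = {
        "passage about pizza cheese": [0.9, 0.1, 0.0],
        "passage about glue safety":  [0.7, 0.3, 0.1],
        "passage about checklists":   [0.0, 0.1, 0.9],
    }

    def cosine(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
        return dot / norm

    def nearest(query_vec, k=2):
        return sorted(store, key=lambda p: cosine(query_vec, store[p]), reverse=True)[:k]

    grounding = nearest([0.8, 0.2, 0.0])   # pretend this is the embedded user question
    print(grounding)                       # these passages get prepended to the prompt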


“LLMs can provide references. There’s no limitation on that.”

That’s not really an LLM providing references, but a separate db as an extra step providing the references.


All outputs of an LLM are generated by the LLM. LLMs today can and do use external data sources. Applied to humans, what you're saying is like claiming that it's not humans who provide references because they copy the BibTeX for them from arXiv.


But if you're using an external data source and putting it into the context, then it's the external data source that's providing the reference; the LLM is just asked to regurgitate it. The large language model, pretrained on trillions of tokens of text, is unable to provide those references on its own.

If I take llama3, for example, and ask it to provide a reference, it will just make something up. Sometimes those references happen to exist; often they don't. And that's the fundamental problem: they hallucinate. This is well understood.


Those aren't sources. They aren't where the model got the information (that info isn't stored in an LLM).

Rather, such AI systems have two parts: an LLM that writes whatever, and a separate system that searches for links related to what the LLM wrote.
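
Roughly this shape, with stub functions standing in for both halves (all names here are hypothetical):

    def draft_answer(question):
        # stands in for the LLM call; it writes first, with no sources in hand
        return "Model-written paragraph about " + question

    def find_related_links(text):
        # stands in for the separate search system keyed on what was written
        return ["https://example.org/a", "https://example.org/b"]

    def answer_with_links(question):
        answer = draft_answer(question)
        links = find_related_links(answer)   # attached after the fact
        return answer + "\n\nRelated links:\n" + "\n".join(links)

    print(answer_with_links("measles vaccine effectiveness"))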


It's encouraging that we can get domains like aviation to 5 or 6 nines of reliability with human-scale systematization. To me, that implies we can get there with AI systems, at least in certain domains.


Could you elaborate on why that implication might be true?

I can think of a counter-point:

Humans can deal with error cases pretty readily. The effort for humans to get those last few 9s might be linear: one extra checklist might be all that was needed, or simply adding a whole extra human to catch the problem. OTOH, for computers to get that last 0.001% correct might take orders of magnitude more effort. Human effort and computer effort do not scale at the same rate, so why should we think that something humans can do with 2x effort wouldn't require 2000x better AI?
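
As a back-of-envelope illustration of the "one extra checklist" idea, assuming (unrealistically) that each added check fails independently of the others:

    import math

    base_error = 1e-2                     # one operator wrong 1% of the time
    for checks in range(1, 5):
        combined = base_error ** checks   # only valid if failures are independent
        nines = -math.log10(combined)
        print(f"{checks} check(s): error rate {combined:.0e} (~{nines:.0f} nines)")

Under that (generous) independence assumption, each added check buys more nines at roughly linear cost.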

There are certainly cases where the inverse is true, where the human effort to reach better reliability would be greater than what a computer needs (monitoring and control systems are good examples; e.g. radar operators, nuclear power plants). Though in cases where computer effort scales better than human effort, it's very likely those tasks have already been automated. The high level of reliability in aviation is likely thanks in part to automating the tasks that computers are good at.


> 2000x better AI?

Even if it does, that puts AI ahead in what, 22 years, assuming 2x improvements every 2 years? The simple problem with us humans is that we haven't notably improved in intelligence for the last 100,000 years, and we'll be beaten eventually. It's not even a question, barring some end-of-the-world event; we already know it's completely possible, because we are living proof that 20 W can be at least this smart.
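
Quick sanity check on that figure, taking the 2x-every-2-years assumption at face value (the 2000x is the hypothetical gap from the comment above):

    import math

    gap = 2000                        # hypothetical "2000x better AI" from upthread
    doublings = math.log2(gap)        # ~11 doublings needed to close that gap
    years = doublings * 2             # at one doubling every 2 years
    print(f"{doublings:.1f} doublings -> about {years:.0f} years")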

And it's really just the upfront R&D that's expensive; silicon is cheap. Humans are ultra expensive all around, constantly, so the longer you use the result, the more that seemingly ludicrous initial cost amortizes to near zero.


If I can paraphrase: you believe that humans are at "intelligence level" 100, and AI is currently somewhere around 10 or 20 and doubling every 2 years.

First, is AI actually improving 2x every 2 years? Can you link to some studies or benchmarks that show this? AFAIK ChatGPT was something like an 8 or 10 year project at OpenAI; its seemingly "out of nowhere" release really biases the perception that things are happening faster than they actually are.

Second, are human intelligence and AI even in the same domain? How many dimensions of intelligence are there? Out of those dimensions, which ones have AI at a zero so far? Which of those dimensions are entirely impossible for AI? If the answer is that AI can be just as smart as humans in every way, while we still don't understand that much about human intelligence and cognition, let alone that of other animals... I'm skeptical. (Consider science's view of animal intelligence: for a long time the thought was that animals are biological automatons, which I think shows we don't even understand intelligence yet, let alone how to build AGI.)

Next, even if intelligence were a single dimension and this really were the same sport and playing field, what says the exponential growth you describe will stay consistent? Could it not be the case that going from 1000x to 1001x is just as hard as all of the first 1000x? What says the complexity doesn't also grow exponentially, or even combinatorially?

> 20 W can be at least this smart.

20 W? I'm not familiar with it.


Because humans have a 0% error rate? Whether or not the error function can be reduced to zero, I anticipate that human meddling will soon be more of a wrench in the gears of machine cognition than a source of error correction. We already see this with the lobotomization of GPT models over "safety"/copyright concerns.


A key difference is that humans can validate and perform orthogonal checks; we can prove things. An LLM, which is essentially just an NLP model, is picking a probable continuation: "what should follow this word, given a question that 'looks' like this?" Once the answer is chosen, the AI so far is left with no other options. If someone says the choice is wrong, what can the AI do? Choose a less likely option?
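
A toy picture of that selection step, with made-up scores for a handful of candidate tokens (not any real model's vocabulary or logits):

    import math, random

    # Invented scores for candidate next tokens after the prompt "5 times 8 is"
    logits = {"40": 4.0, "45": 2.0, "56": 1.5, "forty": 1.0}

    def softmax(scores):
        exps = {tok: math.exp(s) for tok, s in scores.items()}
        total = sum(exps.values())
        return {tok: e / total for tok, e in exps.items()}

    probs = softmax(logits)
    print(probs, "-> picks", max(probs, key=probs.get))

    # If that pick is rejected, the only move left is to sample again,
    # drifting toward *less* likely tokens; nothing here checks the arithmetic.
    retry = random.choices(list(probs), weights=list(probs.values()))[0]
    print("retry:", retry)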

For example, humans can prove that 5 times 8 is 40 in a variety of ways. You might slip up in the arithmetic, but you can check your answer. An LLM can't check its answer; it does not know when it is wrong (it picked the answer it 'thought' was right, so it has no ability to treat that as a wrong answer, otherwise it would have chosen a different one).
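
For instance, a few of those independent checks on 5 times 8, spelled out:

    claim = 40                       # "5 times 8 is 40"
    checks = [
        sum([5] * 8) == claim,       # repeated addition
        8 * 5 == claim,              # swap the operands
        claim / 5 == 8,              # invert the operation with division
    ]
    print(all(checks))               # True only if every independent check agrees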



