
Sounds like he's pretty well informed then. That's good. A 0% error rate is unrealistic in AI models; humans must curate.



Honestly, I'm kind of impressed that ChatGPT (and its competitors) are as accurate as they are, considering that they are still fundamentally fancy autocomplete [1]. The fact that when I ask ChatGPT a question the answer is generally "accurate enough" is really quite impressive.

I'm not claiming it's perfect, and like Tim Cook I'm not 100% sure it's actually possible to get to a 0% error rate, particularly considering these models are trained on data written by humans on the internet. Humans write a lot of really dumb shit on the internet all the time, and human curation can't possibly stop all of it. It's easy to tell the scraper not to crawl obviously sketchy sites like Infowars, and maybe to block some specific subreddits, but there will always be very dumb people posting very dumb opinions [2] in the "mainstream" areas as well.

It's not unusual to see really stupid stuff on actual news websites like CNN talking about "ghost sightings", and I'm not sure how you correct for that in training. You could block CNN, but then you lose a pretty big repository of news as training data, and moreover how do you block a news website impartially?

[1] Not to undermine the cool stuff that's being done in the space, just my rough understanding of how the algo actually works.

[2] My HN history is probably included in this.


A good AI system is one where humans can intervene easily and the intervention makes the system learn and improve.


A 0% error rate is unrealistic in humans as well. Impossible really.


But the mistakes made will be different, and historically you’d have a source to consider. Here there is just one global source telling you to add glue to your pizza to make the cheese stick.


There are models that can provide references if you’d like


And they'll happily make up those references as well


Nope, that's not how it works. Those references aren't generated in such systems; they are retrieved. They might not provide references for all the sources, of course, same as humans.


Exactly. Right now if I google something (AI overview aside), I'm linked to a source. That source may or may not cite its own sources, but its provenance tells me a lot. If I'm reading information linked from the Mayo Clinic, their reputation leads me to judge it as high quality. If they start publishing a bunch of garbage, their reputation gets shot and I'll look elsewhere. With LLMs there is no such choice: they spew everything from high-quality to low-quality (to dangerously wrong) info.


LLMs can provide references. There’s no limitation on that. Even GPT4 includes references sometimes when it deems them beneficial.


An LLM itself cannot provide references with any integrity. They're autoregressive probabilistic models. They'll happily make something up, and you can even try to train them to emit references, but as the article states this is very, very far from a guarantee. What you can do is a kind of RAG setup where you have some existing database whose contents you include in the prompt to ground it, but that's not inherent to the model itself.
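
A rough sketch of that grounding step, with a toy in-memory document dict and a keyword-overlap retriever standing in for a real search index (every name and snippet here is made up):

    # Retrieve passages from an external store, then paste them, with their
    # source labels, into the prompt so the model can only "cite" what it was given.
    documents = {
        "mayo-clinic/measles": "Measles is a highly contagious viral illness spread by airborne droplets.",
        "cdc/measles-vaccine": "Two doses of MMR vaccine are about 97% effective against measles.",
    }

    def retrieve(query, k=2):
        # Toy keyword overlap standing in for a real retriever / vector search.
        scored = [(sum(w in text.lower() for w in query.lower().split()), src)
                  for src, text in documents.items()]
        return [src for score, src in sorted(scored, reverse=True)[:k] if score > 0]

    def build_prompt(query):
        sources = retrieve(query)
        context = "\n".join(f"[{src}] {documents[src]}" for src in sources)
        return ("Answer using only the sources below and cite them by label.\n"
                + context + "\n\nQuestion: " + query)

    print(build_prompt("How effective is the measles vaccine?"))

The point is that the labels come from the retrieval step, not from the model's weights.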


Sure, but it can semantically query a vector DB on the side and use the results to generate a "grounded" response.
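
As a toy illustration, with hand-written 3-d vectors standing in for real embeddings and a dict standing in for the vector DB:

    import math

    # Made-up "embeddings"; a real setup would use an embedding model
    # and an actual vector store.
    store = {
        "passage about pizza cheese": [0.9, 0.1, 0.0],
        "passage about glue safety":  [0.7, 0.3, 0.1],
        "passage about checklists":   [0.0, 0.1, 0.9],
    }

    def cosine(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
        return dot / norm

    def nearest(query_vec, k=2):
        return sorted(store, key=lambda p: cosine(query_vec, store[p]), reverse=True)[:k]

    grounding = nearest([0.8, 0.2, 0.0])   # pretend this is the embedded user question
    print(grounding)                       # these passages get prepended to the prompt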


“LLMs can provide references. There’s no limitation on that.”

That’s not really an LLM providing references, but a separate db as an extra step providing the references.


All outputs of an LLM are generated by the LLM. LLMs today can and do use external data sources. Applied to humans, what you're saying is like claiming that it's not humans who provide references because they copy the BibTeX for them from arXiv.


But if you're using an external data source and putting it into the context, then it's the external data source that's providing the reference; the LLM is just asked to regurgitate it. The large language model, pretrained on trillions of tokens of text, is unable to provide those references on its own.

If I take llama3, for example, and ask it to provide a reference, it will just make something up. Sometimes those references happen to exist; often they don't. And that's the fundamental problem: they hallucinate. This is well understood.


Those aren't sources. They aren't where the model got the information (that info isn't stored in an LLM).

Rather, such AI systems have two parts: an LLM that writes whatever, and a separate system that searches for links related to what the LLM wrote.
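
Roughly this shape, with stub functions standing in for both halves (all names here are hypothetical):

    def draft_answer(question):
        # stands in for the LLM call; it writes first, with no sources in hand
        return "Model-written paragraph about " + question

    def find_related_links(text):
        # stands in for the separate search system keyed on what was written
        return ["https://example.org/a", "https://example.org/b"]

    def answer_with_links(question):
        answer = draft_answer(question)
        links = find_related_links(answer)   # attached after the fact
        return answer + "\n\nRelated links:\n" + "\n".join(links)

    print(answer_with_links("measles vaccine effectiveness"))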


It's encouraging that we can get domains like aviation to 5 or 6 nines of reliability with human-scale systematization. To me, that implies we can get there with AI systems, at least in certain domains.


Could you elaborate on why that implication might be true?

I can think of a counter-point:

Humans can deal with error cases pretty readily. The effort for humans to get those last few 9s might be linear: one extra checklist might be all that was needed, or simply adding a whole extra human to catch the problem. OTOH, for computers to get that last 0.001% correct might take orders of magnitude more effort. Human effort and computer effort do not scale at the same rate, so why should we think that something humans can do with 2x effort wouldn't require 2000x better AI?
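
As a back-of-envelope illustration of the "one extra checklist" idea, assuming (unrealistically) that each added check fails independently of the others:

    import math

    base_error = 1e-2                     # one operator wrong 1% of the time
    for checks in range(1, 5):
        combined = base_error ** checks   # only valid if failures are independent
        nines = -math.log10(combined)
        print(f"{checks} check(s): error rate {combined:.0e} (~{nines:.0f} nines)")

Under that (generous) independence assumption, each added check buys more nines at roughly linear cost.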

There are certainly cases where the inverse is true, where the human effort to reach better reliability would be greater than what a computer needs (monitoring and control systems are good examples; e.g. radar operators, nuclear power plants). Though in cases where computer effort scales better than human effort, it's very likely those tasks have already been automated. The high level of reliability in aviation is likely thanks in part to automating the tasks that computers are good at.


> 2000x better AI?

Even if it does, that puts AI ahead in what, 22 years, assuming 2x improvements every 2 years? The simple problem with us humans is that we haven't notably improved in intelligence for the last 100,000 years, and we'll be beaten eventually. It's not even a question, barring some end-of-the-world event; we already know it's completely possible, because we are living proof that 20 W can be at least this smart.
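
Quick sanity check on that figure, taking the 2x-every-2-years assumption at face value (the 2000x is the hypothetical gap from the comment above):

    import math

    gap = 2000                        # hypothetical "2000x better AI" from upthread
    doublings = math.log2(gap)        # ~11 doublings needed to close that gap
    years = doublings * 2             # at one doubling every 2 years
    print(f"{doublings:.1f} doublings -> about {years:.0f} years")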

And it's really just the upfront R&D that's expensive; silicon is cheap. Humans are ultra expensive all around, constantly, so the longer you use the result, the more that seemingly ludicrous initial cost amortizes to near zero.


If I can paraphrase: you believe that humans are at "intelligence level" 100, and AI is currently somewhere around 10 or 20 and doubling every 2 years.

First, is AI actually improving 2x every 2 years? Can you link to some studies or benchmarks that show this? AFAIK ChatGPT was something like an 8 or 10 year project at OpenAI; its seemingly "out of nowhere" release really biases the perception that things are happening faster than they actually are.

Second, are human intelligence and AI even in the same domain? How many dimensions of intelligence are there? Out of those dimensions, which ones have AI at a zero so far? Which of those dimensions are entirely impossible for AI? If the answer is that AI can be just as smart as humans in every way, while we still don't understand that much about human intelligence and cognition, let alone that of other animals... I'm skeptical. (Consider science's view of animal intelligence: for a long time the thought was that animals are biological automatons, which I think shows we don't even understand intelligence yet, let alone how to build AGI.)

Next, even if intelligence were a single dimension and this really were the same sport and playing field, what says the exponential growth you describe will stay consistent? Could it not be the case that going from 1000x to 1001x is just as hard as all of the first 1000x? What says the complexity doesn't also grow exponentially, or even combinatorially?

> 20 W can be at least this smart.

20 W? I'm not familiar with it.


Because humans have a 0% error rate? Whether or not the error function can be reduced to zero, I anticipate that human meddling will soon be more of a wrench in the gears of machine cognition than a source of error correction. We already see this with the lobotomization of GPT models over "safety"/copyright concerns.


A key difference is that humans can validate and perform orthogonal checks; we can prove things. An LLM, which is essentially just an NLP model, is picking a probable continuation: "what should follow this word, given a question that 'looks' like this?" Once the answer is chosen, the AI so far is left with no other options. If someone says the choice is wrong, what can the AI do? Choose a less likely option?
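
A toy picture of that selection step, with made-up scores for a handful of candidate tokens (not any real model's vocabulary or logits):

    import math, random

    # Invented scores for candidate next tokens after the prompt "5 times 8 is"
    logits = {"40": 4.0, "45": 2.0, "56": 1.5, "forty": 1.0}

    def softmax(scores):
        exps = {tok: math.exp(s) for tok, s in scores.items()}
        total = sum(exps.values())
        return {tok: e / total for tok, e in exps.items()}

    probs = softmax(logits)
    print(probs, "-> picks", max(probs, key=probs.get))

    # If that pick is rejected, the only move left is to sample again,
    # drifting toward *less* likely tokens; nothing here checks the arithmetic.
    retry = random.choices(list(probs), weights=list(probs.values()))[0]
    print("retry:", retry)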

For example, humans can prove that 5 times 8 is 40 in a variety of ways. You might slip up in the arithmetic, but you can check your answer. An LLM can't check its answer; it does not know when it is wrong (it picked the answer it 'thought' was right, so it has no ability to treat that as a wrong answer, otherwise it would have chosen a different one).
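
For instance, a few of those independent checks on 5 times 8, spelled out:

    claim = 40                       # "5 times 8 is 40"
    checks = [
        sum([5] * 8) == claim,       # repeated addition
        8 * 5 == claim,              # swap the operands
        claim / 5 == 8,              # invert the operation with division
    ]
    print(all(checks))               # True only if every independent check agrees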



