> Ilya claims the transformer architecture, with some modification for efficiency...

tinco · on Nov 18, 2023

Humans don't work this way either. You don't need the LLM to do the logic, you just need the LLM to prepare the information so it can be fed into a logic engine. Just like humans do when they shut down their system 1 brain and go into system 2 slow mode.

I'm definitely in the "ready for AGI" camp. But it's not going to be a single model that does the AGI magic trick; it's going to be an engineered system consisting of multiple communicating models hooked up using traditional engineering techniques.
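A minimal sketch of that split, assuming the LLM's only job is to turn a question into a formal expression and a small deterministic evaluator does the actual logic. `llm_formalize` is a hypothetical stub with a hard-coded answer, not a real model call, and the arithmetic evaluator stands in for whatever "logic engine" the system would use.

```python
import ast
import operator

# Only these four binary operators are accepted by the evaluator.
OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def llm_formalize(question: str) -> str:
    """Hypothetical stand-in for the LLM: returns a hard-coded expression."""
    return "(17 * 3) + 4"


def evaluate(expr: str):
    """Deterministic 'logic engine': walks the parsed expression tree."""
    def walk(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            return OPS[type(node.op)](walk(node.left), walk(node.right))
        raise ValueError("unsupported expression")

    return walk(ast.parse(expr, mode="eval").body)


question = "I have 17 boxes of 3 apples and 4 loose apples. How many apples?"
answer = evaluate(llm_formalize(question))
```

The point of the design is that the model never computes the result itself; anything it emits outside the evaluator's small grammar is rejected rather than trusted.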

denton-scratch · on Nov 18, 2023

> You don't need the LLM to do the logic, you just need the LLM to prepare the information so it can be fed into a logic engine.

This is my view!

Expert systems went nowhere, because you have to sit a domain expert down with a knowledge engineer for months, encoding the expertise. And then you get a system that is expert in one specific domain. So if you can get an LLM to distil a corpus (library, or whatever) into a collection of "facts" attributed to specific authors, you could stream those facts into an expert system that could make deductions and explain its reasoning.

So I don't think these LLMs lead directly to AGI (or any kind of AI). They are text-retrieval systems, a bit like search engines but cleverer. But used as an input-filter for a reasoning engine such as an expert system, you could end up with a system that starts to approach what I'd call "intelligence".
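As a rough sketch of that pipeline, assuming the LLM step is replaced by a hard-coded stub: attributed facts go into a tiny forward-chaining engine that derives new facts and records why. `Fact`, `extract_facts`, `forward_chain`, the single rule, and the example facts and authors are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    subject: str
    relation: str
    obj: str
    source: str  # an author, or the derivation that produced the fact


def extract_facts(corpus: str) -> list:
    """Hypothetical stand-in for the LLM: returns hard-coded attributed facts."""
    return [
        Fact("Socrates", "is_a", "human", "Author A"),
        Fact("human", "subclass_of", "mortal", "Author B"),
    ]


def forward_chain(facts: list) -> list:
    """Apply one rule until nothing new appears:
    X is_a Y  and  Y subclass_of Z   =>   X is_a Z
    """
    known = {(f.subject, f.relation, f.obj): f for f in facts}
    changed = True
    while changed:
        changed = False
        for a in list(known.values()):
            for b in list(known.values()):
                if (a.relation, b.relation) != ("is_a", "subclass_of"):
                    continue
                if a.obj != b.subject:
                    continue
                key = (a.subject, "is_a", b.obj)
                if key not in known:
                    # Keep the explanation alongside the derived fact.
                    why = (
                        f"derived from '{a.subject} is_a {a.obj}' ({a.source}) "
                        f"and '{b.subject} subclass_of {b.obj}' ({b.source})"
                    )
                    known[key] = Fact(a.subject, "is_a", b.obj, why)
                    changed = True
    return list(known.values())


derived = forward_chain(extract_facts("some corpus"))
for fact in derived:
    print(f"{fact.subject} {fact.relation} {fact.obj}  <- {fact.source}")
```

The expensive part of the old expert-system workflow, getting facts out of experts, is what the stub pretends the LLM has done; the deduction and the explanation trail stay in ordinary code.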

If someone is trying to develop such a system, I'd like to know.

lijok · on Nov 18, 2023

They should fire Ilya and get you in there

MediumD · on Nov 18, 2023

> We provide evidence for the Reversal Curse by finetuning GPT-3 and Llama-1 on fictitious statements such as "Uriah Hawthorne is the composer of 'Abyssal Melodies'" and showing that they fail to correctly answer "Who composed 'Abyssal Melodies?'". The Reversal Curse is robust across model sizes and model families and is not alleviated by data augmentation.

This just proves that the LLMs available to them, with the training and augmentation methods they employed, aren't able to generalize. It doesn't prove that future LLMs, or novel training and augmentation techniques, will be unable to generalize.

visarga · on Nov 18, 2023

No. If you read this article, it shows there were some issues with the way they tested.

> The claim that GPT-4 can’t make B to A generalizations is false. And not what the authors were claiming. They were talking about these kinds of generalizations from pre and post training.

> When you divide data into prompt and completion pairs and the completions never reference the prompts or even hint at it, you’ve successfully trained a prompt completion A is B model but not one that will readily go from B is A. LLMs trained on “A is B” fail to learn “B is A” when the training data is split into prompt and completion pairs.

Simple fix: put the prompt and completion together and compute gradients on the prompt tokens too, not just on the completion. Or just make sure the model trains on data going in both directions by augmenting the data before training.

https://andrewmayne.com/2023/11/14/is-the-reversal-curse-rea...
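The two fixes suggested above can be sketched roughly as follows, with no claim about how the paper or the article actually set up training. `build_labels` and `both_directions` are hypothetical helpers, and the token ids are made up. The only outside convention assumed is that `-100` is the default `ignore_index` of PyTorch's `CrossEntropyLoss`, so positions labelled `-100` contribute no gradient.

```python
IGNORE_INDEX = -100  # label value skipped by PyTorch's CrossEntropyLoss by default


def build_labels(prompt_ids, completion_ids, train_on_prompt):
    """Concatenate prompt and completion, then choose which tokens get a loss.

    train_on_prompt=False: completion-only loss (prompt tokens are masked).
    train_on_prompt=True:  loss on every token, prompt included.
    """
    input_ids = list(prompt_ids) + list(completion_ids)
    if train_on_prompt:
        labels = list(input_ids)
    else:
        labels = [IGNORE_INDEX] * len(prompt_ids) + list(completion_ids)
    return input_ids, labels


def both_directions(name, description):
    """Augment an 'A is B' statement with its 'B is A' counterpart."""
    forward = f"{name} is {description}."
    reverse = f"{description[0].upper()}{description[1:]} is {name}."
    return [forward, reverse]


# Made-up token ids, for illustration only.
prompt, completion = [11, 12, 13], [21, 22]
_, masked = build_labels(prompt, completion, train_on_prompt=False)
_, full = build_labels(prompt, completion, train_on_prompt=True)

pairs = both_directions("Uriah Hawthorne", "the composer of 'Abyssal Melodies'")
```

With `train_on_prompt=False` the model is only ever trained to predict the completion given the prompt; with `True` it also has to model the prompt text itself, and the augmented pair gives it the statement in both orders.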

razodactyl · on Nov 22, 2023

LLMs can't generalise, no, but the meta-architecture around them can 100%.

Think about the RLHF component that trains LLMs. It's the training itself that generalises - not the final model that becomes a static component.

justrealist · on Nov 18, 2023

[flagged]

cedws · on Nov 18, 2023

There is a lack of humility in making such an assertion about a path to AGI.

cscurmudgeon · on Nov 18, 2023

It is funny to believe we have AGI and yet resort to ad hominem to prove/defend that!

At some point in time Ilya was a nobody going against the gods of AI/ML. Just slightly over a decade ago neural networks were a joke in AI.