
A bit of nitpicking: I do not think it is quite right to say that current large language models learn; rather, we infuse them with knowledge. On the one hand, it is almost just a technicality that using a large language model and training it are two separate processes; on the other hand, it is a really important limitation. If you tell a large language model something new, it is forgotten as soon as that information leaves the context window, perhaps to be added back later during a training run that uses the conversation as training data.
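
To make the separation concrete, here is a minimal PyTorch sketch (a toy next-token model with made-up sizes, nothing like a real LLM): at inference time nothing updates the weights, so whatever you told the model is gone once it leaves the context; only a separate training step actually changes the parameters.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    # Toy stand-in for an LLM: embed CTX token ids, predict the next one.
    # Vocabulary size and dimensions are made up for illustration.
    VOCAB, DIM, CTX = 100, 16, 4

    model = nn.Sequential(
        nn.Embedding(VOCAB, DIM),
        nn.Flatten(),                  # (batch, CTX, DIM) -> (batch, CTX * DIM)
        nn.Linear(CTX * DIM, VOCAB),
    )

    prompt = torch.randint(0, VOCAB, (1, CTX))  # the "something new" you tell it
    next_token = torch.randint(0, VOCAB, (1,))

    # Inference: weights are frozen. The prompt influences this one output
    # and is then gone; no parameter changes anywhere.
    with torch.no_grad():
        logits = model(prompt)

    # Training: a separate process that actually changes the weights,
    # e.g. a later run using the logged conversation as data.
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    loss = F.cross_entropy(model(prompt), next_token)
    opt.zero_grad()
    loss.backward()
    opt.step()  # only now has the model "learned" anything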

Building an AI that can actually learn the way humans learn, instead of slightly nudging the output in one direction with countless examples, would be a major leap forward, I would guess. I have no good idea how far away we are from that, but it does not seem like the easiest thing to do with the way we currently build these systems. Or maybe the way we currently train these models turns out to be good enough and there is not much to be gained from a more human-like learning process.

LLMs need a lot of GPU power to learn. I'm not sure it's correct to say that they don't learn; it's more that, on presently available and economical hardware, they cannot learn anything in real time beyond what fits in a fairly small context window. But if you had GPUs with terabytes of VRAM and fed experience into the model, it would learn. It's still questionable whether that's enough for true AGI, but I think the inability to learn in real time is clearly a hardware limitation.
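
For a sense of scale, a rough estimate under common rules of thumb (fp16 weights at ~2 bytes per parameter for inference; Adam-style training at ~16 bytes per parameter for weights, gradients, and two optimizer moments, activations not counted). These are ballpark figures, not measurements:

    # Back-of-envelope VRAM estimate for why real-time weight updates
    # are a hardware problem. Rules of thumb, not measurements.

    def vram_gb(params, bytes_per_param):
        return params * bytes_per_param / 1e9

    for billions in (7, 70):
        p = billions * 1e9
        print(f"{billions}B params: "
              f"inference ~{vram_gb(p, 2):,.0f} GB, "
              f"training ~{vram_gb(p, 16):,.0f} GB")

    # 7B params: inference ~14 GB, training ~112 GB
    # 70B params: inference ~140 GB, training ~1,120 GB

So keeping a large model continuously trainable costs roughly an order of magnitude more memory than merely serving it, before you even account for activations or throughput.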
