
The biggest problem with LLMs is that they are simply regressions over the writings of millions of humans on the internet. It's very much a machine trained to mimic how humans write, nothing more.

The bigger concern is that we don't have the training data required for it to mimic something smarter than a human (assuming that a human is given sufficient time to work on the same problem).

I have no doubt ML will exceed human intelligence, but the current bottleneck is figuring out how to get a model to train itself against its own output without human intervention, instead of doing a regression against the entire world's collective writings. Remember, Ramanujan, with no formal training and access to only a few math books, achieved brilliant discoveries in mathematics. In terms of the training data used by ML models, that is basically nothing.
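A minimal sketch of the kind of loop I mean, with everything here hypothetical (the random guesser stands in for the model, the multiplication check stands in for a real verifier): generate outputs, keep only the ones an automatic check accepts, and treat those as new training data, with no human in the loop.

    import random

    def verifier(a, b, answer):
        # Automated ground truth; no human annotation required.
        return answer == a * b

    training_data = []  # grows only from self-generated, verified output
    for step in range(1000):
        a, b = random.randint(2, 9), random.randint(2, 9)
        guess = random.randint(4, 81)  # the "model": pure noise at first
        if verifier(a, b, guess):
            training_data.append(((a, b), guess))

    print(len(training_data), "self-labeled examples, zero human annotations")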




LLMs are increasingly trained on multimodal data beyond human-written text, such as videos showing complex real-world phenomena that humans may observe but not understand, or time-series data that humans may observe but be incapable of predicting. I don't see a fundamental reason why an LLM trained on such data may not already become a superhuman predictor of, e.g., time-series data, given the right model architecture, amount of data, and compute.

In a way, next-token prediction is itself a time-series prediction problem, and arguably a harder one than producing the time series/text in the first place: to predict what a human will say about a topic, you don't just need to model the topic, you also need to model the mental processes of the human talking about it, adding another layer of abstraction.
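To make that equivalence concrete, here is a toy sketch (my own illustration, not anyone's production setup): the same bigram next-token counter works unchanged whether the tokens are English words or discretized sensor readings.

    from collections import Counter, defaultdict

    # One next-token model, two kinds of "tokens": words and a
    # discretized time series. Purely illustrative.
    def fit(seq):
        counts = defaultdict(Counter)
        for cur, nxt in zip(seq, seq[1:]):
            counts[cur][nxt] += 1
        return counts

    def predict(counts, cur):
        return counts[cur].most_common(1)[0][0]

    text = "the cat sat on the mat the cat sat".split()
    series = [round(v) for v in (1.1, 2.0, 2.9, 2.1, 1.0, 2.2, 3.0, 2.0, 1.2)]

    print(predict(fit(text), "the"))  # -> 'cat'
    print(predict(fit(series), 2))    # -> 3, the most frequent successor of 2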


In this case, what are we hoping to get from the LLM that, say, a simulation or domain-specific analysis would not already give us? Also why does it need to be an LLM that is doing these things?

Wouldn't it make more sense to delegate such pattern finding to a specialized intelligence? Why does the LLM have to do everything these days?


> Also why does it need to be an LLM that is doing these things?

Well, if you ask me, it's pretty clear why: we are in the hype cycle of a very specific new type of hammer, so of course everything must be a nail. Another reason is that our understanding of LLMs is limited, which makes them seem limitless and constantly shape- and purpose-shifting for those whose job it is to sell them to us. As an engineer, I don't know what to do with them: they have no "datasheet" describing their field and range of applicability, and I suspect many of us are annoyed and tired of being constantly pulled into discussions about how to use them at all costs.

(Yes, I understand it was rhetorical)


> Why does the LLM have to do everything these days?

Because why not? It feels like we've stumbled on the first actually general ML model, and we're testing the limits of the approach, throwing more and more complex tasks at these models. And so far, surprisingly, LLMs seem to manage. The more they do, the more interesting it is to see how far they can be pushed before they finally break.

> Wouldn't it make more sense to delegate such pattern finding to a specialized intelligence?

Maybe. But then, if an LLM can do that specific thing comparably well, and it can do a hundred other similarly specific tasks with acceptable results, then there's also a whole space between those tasks, where they can be combined and blended, which none of the specialized models can access. Call it "synergy", "cross-pollination", "being a generalist".

EDIT:

As a random example: speech recognition models were really bad until recently[0], because it turns out that an actual understanding of the language is extremely helpful for recognizing speech correctly. That's why the LLM (or its multimodal variant, or some future, better general-purpose model) has to do everything: seemingly separate skills reinforce each other.

--

[0] - Direct and powerful evidence: compare your experience with voice assistants and dictation keyboards vs. the conversation mode in the ChatGPT app. The latter can easily handle casual speech with strange accents, delivered outside on a windy day near a busy street, with near-100% accuracy. It's a really spectacular jump in capabilities.


> Because why not? It feels like we've stumbled on the first actually general ML model, and we're testing the limits of the approach, throwing more and more complex tasks at these models. And so far, surprisingly, LLMs seem to manage.

We live in completely different worlds. Every LLM I've tried manages nothing except spouting bullshit. If your job is to create bullshit, an LLM is useful. If your job requires anything approximating correctness, LLMs are useless.


> may not already become a superhuman predictor of, e.g., time-series data, given the right model architecture, amount of data, and compute.

We already have that (to the degree it's possible at all) with non-LLM analysis: all kinds of regression, machine-learning techniques, etc. An LLM might not even be applicable here, or have any edge over something specialized.

And usually this breaks down because the world (e.g. weather or stock-market prices) is not predictable enough at the granularity we're interested in; it is full of chaotic behavior and non-linearities.
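For reference, the kind of specialized baseline meant here can be a few lines. This is only a sketch on synthetic data, not a claim about any real market or weather series: an order-p autoregression fit by least squares.

    import numpy as np

    # Classical AR(p) baseline: predict x[t] from the previous p values
    # via ordinary least squares. Synthetic data, purely illustrative.
    rng = np.random.default_rng(0)
    x = np.sin(np.arange(300) * 0.1) + 0.1 * rng.standard_normal(300)

    p = 5
    X = np.column_stack([x[i : len(x) - p + i] for i in range(p)])  # lagged windows
    y = x[p:]                                                       # next values
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)

    print("one-step-ahead forecast:", x[-p:] @ coef)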


Right. I think the insight that LLMs "create", though, comes from compressing knowledge in new ways.

It's like teaching one undergraduate (reading some blog on the internet and training on next-word prediction), then another undergraduate (training on another blog), then another, and so on, while finding a way to store what all these undergraduates have in common. Over time it starts to see commonalities and becomes more powerful than a human, even a human who had taught a million undergraduates. And it isn't stuck at the undergraduate level: it finds the most abstract, efficient way to represent the essence of their ideas.

And in math that's the whole ball game.


> the current bottleneck is figuring out how to get a model to train itself against its own output without human intervention

Isn't that what AlphaZero was known for? After all, the 'Zero' in the name referred to learning from zero human data, i.e. pure self-play against its own output, which is the concept you're talking about. So we already have a proof of existence, and recent papers have shown progress in some mathematical domains (simplified ones so far, such as Euclidean geometry, which does nonetheless feature complex proofs).


Yes, but AlphaZero is based on reinforcement learning, where there is a simple cost function to optimize. There hasn't been much progress in applying reinforcement learning to LLMs to get them to self-improve. I agree with the quote that this will be necessary to get superhuman performance in mathematics, and Lean may very well play a role there, since it can provide a cost function by checking correctness objectively.
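Schematically, the role a proof checker could play is to turn open-ended generation into an objective reward, which is exactly the signal RL needs. A toy sketch; check_proof here is a placeholder, not a real Lean binding:

    # check_proof stands in for an external verifier such as Lean;
    # this string test is a placeholder, NOT a real Lean API.
    def check_proof(proof: str) -> bool:
        return proof.strip().endswith("qed")

    def reward(proof: str) -> float:
        # Binary, objective, human-free: the kind of cost function RL wants.
        return 1.0 if check_proof(proof) else 0.0

    for candidate in ["intro n ... qed", "sorry", "by magic"]:
        print(repr(candidate), "->", reward(candidate))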

> but the current bottleneck is figuring out how to get a model to train itself against its own output without human intervention

This is sort of humorous, because the entire current generation of Transformer-based models was built specifically to get "a model to train itself without human intervention": that is what self-supervised pre-training is.

ref. "A Cookbook of Self-Supervised Learning", arXiv:2304.12210v2 [cs.LG], 28 Jun 2023
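"Self-supervised" just means the labels come from the data itself, e.g. hide a token and predict it. A minimal sketch of the idea (my illustration, not taken from the cookbook):

    # Self-supervision: the input provides its own targets by hiding
    # part of itself (here, masked-word prediction). No human labels.
    sentence = "models train on targets derived from the data itself".split()

    examples = []
    for i, word in enumerate(sentence):
        masked = sentence[:i] + ["[MASK]"] + sentence[i + 1:]
        examples.append((" ".join(masked), word))  # (input, label) pairs, for free

    for inp, label in examples[:2]:
        print(inp, "->", label)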


Yes, thank you!! I have been trying to make this point for years. It seems self-evident to me, but it's hard to convince people of this.

But I think it's worth mentioning that what we have is already useful for research and for making people smarter. Even Terence Tao has blogged that there were times when an LLM suggested an approach to a problem that hadn't occurred to him. This has also happened to me once, enabling me to solve a problem I'd failed to figure out on my own.


Does that mean ChatGPT is able to prove something, or summarize something, simply because some of its training material proves or summarizes similar things? What if the thing I asked it to summarize never appeared in the training material? Is it capable of finding similar materials and applying the same summarization?


Ramanujan attributed his success in no small part to divine assistance. I wonder if sentient AI would do the same :)



