I have the entirely unrefined notion that, surely, lack of data is not what is keeping us from creating much, much better LLMs.
I understand that, with how training is done right now, more data makes things scale really well without having to come up with new concepts, but it seems completely obvious that better processing of already available knowledge is the way to make the next leaps. The idea is that what is keeping me from having expert-level knowledge in 50 different fields, and using that knowledge to draw entirely new connections between all of them, in addition to understanding where things go wrong, is not a lack of freely available expert-level information.
And yet, GPT4 barely reaches competency. It feels like computers should be able to get much more out of what is already available, especially when leveraging cross-discipline knowledge to inform everything.
Yeah, no person has ever read anything like every textbook ever written, but that's pretty much table stakes for training sets. Clearly there's something missing aside from more virtual reading. (I suspect it has something to do with the half a billion years of pre-training baked into the human neural architecture and the few extra orders of magnitude in scale but who knows)
I'm sure I've read about specialized neural networks being created. The human brain has (apparently) a bunch of different kinds of neurons in it that specialize in processing different information. I'm not sure how that would work with our current architectures, though.
Well Jeff Hawkins has been working on this for a while, in terms of biomimetic neural networks. They've done some great work but they don't have anything like modern language models in terms of abilities + performance.
>It feels like computers should be able to get much more out of what is already available
I mean, why? It took millions of years of optimization for humanity to get to the competence level it's currently at. If you think you're "starting from scratch", you really aren't. Keep in mind LLMs can use significantly less data (but still a lot) when you're not trying to force-feed them the sum total of human knowledge.
So should they be able to get more out of it? Or is this par for the course for NNs?
That 600MB is the result of millions of years of optimization. For every human genome that exists today there are many other genomes which were tried and discarded over the years.
It also contains a remarkable amount of compression. Even if you assume that a genome contains a complete template for a human (it likely doesn't) the fact that the compressed version is 600MB doesn't really relate to the unpacked amount of information. Especially since the compression has seen millions of years of optimization.
Because humans with less language data outperform LLMs with more language data.
This says either that we need better models, not more data,
or that the human ability to be multi-modal augments our ability to perform language tasks, in which case we need to pump LLMs with much more image and video input than we currently do.
The point I'm making is that humans do not in fact have "less language data". We're heavily predisposed to learning languages. We don't start with random weights.
You're not making much sense here. The better init comes from training and little else.
GPT needing lots of training data doesn't mean we need a better architecture. You would expect it to need a lot of training because humans have a lot of training too, spanning millions of years.
Evolution has determined a good architecture. The weight training is then just the final tweak to get everything running smoothly.
No reason beyond compute we couldn't do something similar, i.e. find good architectures by evaluating them using multiple random weights, and evolve those architectures that on average give the best results.
Then over time add a short training step before evaluating.
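A rough sketch of that idea (toy code, not any established method; the task, the mutation scheme, and helpers like evaluate/mutate are all made up for illustration): treat the architecture as just a tuple of hidden-layer widths, score each candidate by averaging its loss over several random weight draws, keep the best few, and mutate them. The short training step could later be slotted in right before the loss is measured.

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(256, 8))
    y = np.sin(X.sum(axis=1, keepdims=True))        # toy regression target

    def random_weights(sizes):
        dims = [X.shape[1]] + list(sizes) + [1]
        return [rng.normal(scale=0.3, size=(a, b)) for a, b in zip(dims, dims[1:])]

    def forward(x, weights):
        # Plain MLP forward pass; the "architecture" is just the layer widths.
        h = x
        for W in weights[:-1]:
            h = np.tanh(h @ W)
        return h @ weights[-1]

    def evaluate(sizes, draws=5):
        # Average loss over several random initialisations, as suggested above.
        # (A brief gradient-descent step could be added here before measuring.)
        losses = [np.mean((forward(X, random_weights(sizes)) - y) ** 2)
                  for _ in range(draws)]
        return float(np.mean(losses))

    def mutate(sizes):
        # Tweak one layer width, or occasionally add/drop a layer.
        s = list(sizes)
        if rng.random() < 0.2 and len(s) > 1:
            s.pop(rng.integers(len(s)))
        elif rng.random() < 0.2:
            s.insert(rng.integers(len(s) + 1), int(rng.integers(4, 64)))
        else:
            i = rng.integers(len(s))
            s[i] = max(2, int(s[i]) + int(rng.integers(-8, 9)))
        return tuple(s)

    population = [tuple(rng.integers(4, 64, size=rng.integers(1, 4))) for _ in range(12)]
    for generation in range(10):
        ranked = sorted(population, key=evaluate)
        survivors = ranked[:4]                      # keep the best architectures
        population = survivors + [mutate(survivors[rng.integers(len(survivors))])
                                  for _ in range(8)]
        print(generation, survivors[0])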
Is this true? My understanding is that people are born with many pre-trained weights. Was the evolutionary convergence of those weights not itself training?
> Was the evolutionary convergence of those weights not itself training?
No, inductive biases are not training.
I'm saying that better models (i.e. better inductive biases) or non-language data is needed to advance LLMs, and somehow we've arrived at "evolution is training." I'm not sure how that's relevant to the point.
Evolutionary training is not just inductive bias. They're not comparable at all lol.
And the more inductive bias we've shoved into models, the worse they've performed. Transformers have a lot less bias than either RNNs or CNNs and are better for it. Same story with what preceded both.
You're comparing humans - a multimodal model with billions of years of training epochs - to a unimodal language model that's been around for a few months.
Not to mention no hardware constraints and a boundless energy supply through the food chain. Remember, for most of those billions of years we were not human and were low-powered.
Millions of years of evolutionary computation is a fairly small amount of computational time. LLMs also benefit from decades of neurological computation in that their structure was invented and optimized by humans, which is already orders of magnitude faster than evolution.
I've found that Google's chat thing is wrong 90% of the time with coding questions. Yesterday I asked how to "crop" a geopandas dataframe to a specific area of interest, a lat/lng box, and it told me to use a dataframe function that's not even in the API. The "highest probability string" is useless if it's just dead wrong.
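For what it's worth, geopandas can do that crop directly: the .cx coordinate indexer slices a GeoDataFrame by bounding box, and .clip cuts geometries to an explicit box. (The file name and coordinates below are made-up placeholders.)

    import geopandas as gpd
    from shapely.geometry import box

    gdf = gpd.read_file("places.geojson")          # hypothetical input file

    # .cx slices by bounding box in the data's own coordinates: [xmin:xmax, ymin:ymax].
    # For WGS84 data that is [lng_min:lng_max, lat_min:lat_max].
    cropped = gdf.cx[-74.3:-73.7, 40.5:40.9]

    # Or clip to an explicit box, which cuts geometries at the edge instead of
    # just selecting the rows that intersect it.
    clipped = gdf.clip(box(-74.3, 40.5, -73.7, 40.9))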
ChatGPT today is like a backhoe compared to a team of humans with shovels. You still need a person who knows how to operate it, and their skills are different from those who dig with shovels. A bad backhoe operator is worse than any number of humans with shovels.
Pretty soon it will be able to learn by running its own code and testing it by looking at its output, including with its "vision."
That is very interesting. I can't think of a single time the Google built-in LLM has worked for me, let alone surprised and delighted me with a technical answer. I'm sure it's great at a lot of things, but it's not a replacement for SO yet.
Oh sorry you said Google. Yes I am speaking of ChatGPT, and I pay for GPT-4. It surprises and delights me on a regular basis. I have no doubt Google will catch up, but right now I think OpenAI is far out front.
I paid for ChatGPT for a while, but it was hit or miss with some Django stuff. I tried Copilot for the first time today, and I was absolutely blown away. I swear it's like it was reading my mind. I guess I wasn't feeding ChatGPT enough context.
The remarkable thing about the current LLMs is that they're usable at all. For as much pushback as the idea seems to get, they really are a lot more like Markov chain generators than expert systems.
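For anyone who hasn't played with one, a word-level Markov chain generator is only a few lines; the contrast with an expert system is that there are no hand-written rules, just "which word tends to follow these words". (Toy corpus and order chosen arbitrarily.)

    import random
    from collections import defaultdict

    def build_chain(text, order=2):
        # Map each `order`-word prefix to the words observed to follow it.
        words = text.split()
        chain = defaultdict(list)
        for i in range(len(words) - order):
            chain[tuple(words[i:i + order])].append(words[i + order])
        return chain

    def generate(chain, order=2, length=30):
        out = list(random.choice(list(chain)))      # start from a random prefix
        for _ in range(length):
            followers = chain.get(tuple(out[-order:]))
            if not followers:                       # dead end: no continuation seen
                break
            out.append(random.choice(followers))
        return " ".join(out)

    corpus = "the cat sat on the mat and the dog sat on the rug and the cat slept"
    print(generate(build_chain(corpus)))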
I think you’re largely right, and that current GPT results may over-represent the model’s learning ability.
A couple of speakers from Microsoft at the MPPC 2023 this week indicated that OpenAI’s models were not merely exposed to e.g. poetry, programming, etc. and learned those fields.
Rather they were saying that the model is more of a composite of skills that were specifically trained, building on word identification, sentences, grammar, ultimately moving on to higher order skills.
Perhaps this isn't a secret (or perhaps I misunderstood), but it means the model's ability to perform self-directed learning is much less than I previously thought.
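If I followed the speakers correctly, that is essentially curriculum-style training. A heavily stubbed sketch of the staging, just to make the idea concrete (the stage names, corpora, and train_one_stage below are my own placeholders, not anything Microsoft or OpenAI actually described):

    # Toy stand-ins: each "corpus" is a list of strings and training is a stub.
    word_corpus     = ["cat", "dog", "tree"]
    sentence_corpus = ["the cat sat", "the dog ran"]
    grammar_corpus  = ["she has gone", "they have gone"]
    skill_corpus    = ["def add(a, b): return a + b", "Shall I compare thee..."]

    def train_one_stage(state, corpus):
        # Stand-in for real fine-tuning; the point is only that the SAME state
        # is carried forward, each stage building on the previous one.
        for example in corpus:
            state[example] = state.get(example, 0) + 1
        return state

    model_state = {}
    for name, corpus in [("words", word_corpus), ("sentences", sentence_corpus),
                         ("grammar", grammar_corpus), ("higher-order skills", skill_corpus)]:
        model_state = train_one_stage(model_state, corpus)
        print("finished stage:", name)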
That sounds like premature optimization. In my opinion both things should happen in tandem. GPT4 is way, way above basic competency; I have no idea what you're referring to.
By "competent", I mean pretty much what you would expect when you talk about a "competent programmer": A somewhat vague concept, yet fairly obvious when working with someone who whats up.
If you would judge GPT4 to be a competent programmer, your experience is wildly different from mine. (I am not sure why you felt the need to put a "basic" in there in reference to what I wrote, since that is not what I wrote.)
Skill-wise it is on the level of a novice programmer, but the breadth of knowledge definitely compensates. It knows XPath as well as SQL as well as your favorite esoteric language.
It's pretty good for me. It's saved me literally thousands of hours of work already. I ran a bunch of problems from Leetcode through it, and it got most of them right.
Here's the b+tree implementation it gave me. I haven't checked if it's right. But, I was just curious what it'd come up with.