
You're not making much sense here. The better init comes from training and little else.

GPT needing lots of training data doesn't mean we need a better architecture. You would expect it to need a lot of training, because humans have had a lot of training too, spanning millions of years.




Init means the distribution of weights prior to training.

Human training begins at birth.

Evolution might result in a better architecture and init (inductive biases), but that's a separate thing from training.


Evolution has determined a good architecture. The weight training is then just the final tweak to get everything running smoothly.

No reason beyond compute we couldn't do something similar. I.e., find good architectures by evaluating them using multiple random weights, and evolve those architectures that on average give the best results.

Then over time add a short training step before evaluating.
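
To make that concrete, here's a rough sketch of the idea, not anyone's actual method. Everything in it is an assumption picked for illustration: the toy XOR task, encoding an architecture as (hidden widths, activation name), the mutation rules, and all the function names. Each architecture is scored by its average loss over several random weight draws, with no training at all, and the better half of the population is mutated to form the next generation.

```python
import math
import random

# Toy task (an assumption for illustration): XOR on two binary inputs.
DATA = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0), ((1.0, 0.0), 1.0), ((1.0, 1.0), 0.0)]
ACTIVATIONS = {"tanh": math.tanh, "relu": lambda x: max(0.0, x), "sin": math.sin}

def random_weights(arch, rng):
    """Sample one weight set for an architecture (hidden widths, activation name)."""
    sizes = [2, *arch[0], 1]
    # Each layer is a list of rows; the extra column in each row is the bias.
    return [[[rng.gauss(0.0, 1.0) for _ in range(n_in + 1)] for _ in range(n_out)]
            for n_in, n_out in zip(sizes, sizes[1:])]

def forward(arch, weights, x):
    act = ACTIVATIONS[arch[1]]
    h = list(x)
    for i, layer in enumerate(weights):
        h = [sum(w * v for w, v in zip(row, h + [1.0])) for row in layer]
        if i < len(weights) - 1:  # no activation on the output layer
            h = [act(v) for v in h]
    return h[0]

def loss(arch, weights):
    """Mean squared error on the toy dataset."""
    return sum((forward(arch, weights, x) - y) ** 2 for x, y in DATA) / len(DATA)

def fitness(arch, rng, n_samples=20):
    """Average loss over several random weight draws, untrained; lower is better."""
    return sum(loss(arch, random_weights(arch, rng)) for _ in range(n_samples)) / n_samples

def mutate(arch, rng):
    """Return a slightly changed copy: resize, add, or drop a layer, or swap activation."""
    widths, act = list(arch[0]), arch[1]
    choice = rng.random()
    if choice < 0.4:
        i = rng.randrange(len(widths))
        widths[i] = max(1, widths[i] + rng.choice([-1, 1]))
    elif choice < 0.6 and len(widths) < 3:
        widths.append(rng.randint(1, 4))
    elif choice < 0.8 and len(widths) > 1:
        widths.pop()
    else:
        act = rng.choice(sorted(ACTIVATIONS))
    return (tuple(widths), act)

def evolve(generations=15, population=8, seed=0):
    rng = random.Random(seed)
    pop = [((rng.randint(1, 4),), rng.choice(sorted(ACTIVATIONS)))
           for _ in range(population)]
    for _ in range(generations):
        # Keep the better half, refill with mutated copies of survivors.
        survivors = sorted(pop, key=lambda a: fitness(a, rng))[: population // 2]
        children = [mutate(rng.choice(survivors), rng)
                    for _ in range(population - len(survivors))]
        pop = survivors + children
    return min(pop, key=lambda a: fitness(a, rng))

if __name__ == "__main__":
    print(evolve())
```

The "short training step" would slot into `fitness`: instead of scoring the random weights directly, you'd run a few update steps on each draw first and score the result. That part is left out here to keep the sketch short.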


> Human training begins at birth.

Is this true? My understanding is that people are born with many pre-trained weights. Was the evolutionary convergence of those weights not itself training?


> Was the evolutionary convergence of those weights not itself training?

No, inductive biases are not training.

I'm saying that better models (i.e., better inductive biases) or non-language data is needed to advance LLMs, and somehow we've arrived at "evolution is training." I'm not sure how that's relevant to the point.


Evolutionary training is not just inductive bias. They're not comparable at all lol.

And the more inductive bias we've shoved into models, the worse they've performed. Transformers have a lot less bias than either RNNs or CNNs and are better for it. Same story with what preceded both.


This is a good time to stop and ask, what point do you think I'm making?


Pretty much all of your assertions in this thread and https://news.ycombinator.com/item?id=37797108 in particular are what we're disagreeing with.


Right, I'm checking that you understand what point I'm making by asking you to reflect it.



