Hacker News

That's the brute-force approach anyway. Even if you do take an inordinate amount of data to sample from, you'll likely get something that's woefully impractical to operate, even if it produces vaguely human-like responses.

We do, on the other hand, know for a fact that it's possible to run an instance of consciousness in a volume of about a liter that consumes like 20 watts (aka your average human brain), so there's probably something wrong with our general approach to the matter. GPT-3 already uses about twice as many parameters as our organic counterparts do, with much worse results. And it doesn't even have to process a ridiculously large stream of sensor data and run an entire body of muscle actuators at the same time.




> GPT-3 already uses about twice as many parameters

This isn't accurate. GPT-3 has 175B parameters. The human brain has ~175B cells (neurons, glia, etc.). The analog to GPT-3's parameter count would be synapses, not neurons, where even conservative estimates put the human brain several orders of magnitude larger. It's likely that >90% of the 175B could be pruned with little change in performance, which changes the synapse ratios further, since we know the brain is quite a bit sparser. In addition, the training dataset is likely broader than what the majority of Internet users ever see. Basically, it's not an apples-to-apples comparison.
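A rough back-of-envelope version of that comparison (the synapse numbers are commonly cited estimates, not measured figures):

```python
# Back-of-envelope: GPT-3 parameters vs. human brain synapses.
# The synapse range (~1e14 to ~1e15) is a commonly cited estimate.
gpt3_params = 175e9           # 175B parameters
brain_synapses_low = 1e14     # conservative synapse estimate
brain_synapses_high = 1e15    # higher-end estimate

ratio_low = brain_synapses_low / gpt3_params
ratio_high = brain_synapses_high / gpt3_params
print(f"synapses / params: {ratio_low:,.0f}x to {ratio_high:,.0f}x")
# i.e. the brain is hundreds to thousands of times larger by this measure,
# roughly 3 orders of magnitude.
```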

That said, I agree that simply scaling model and data is the naive approach.


OPT-6.7B is good, but not even close to GPT-3.

If you can get GPT-like performance out of a 17B model, you should publish that.


I’m referring to post-training pruning, not smaller models. This is already well-studied, but it’s not as useful as it could be on current hardware. (Deep learning currently works better with the extra parameters at training time.)
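To illustrate the general idea, here's a minimal sketch of magnitude-based post-training pruning (one common family of techniques, shown on a toy random matrix rather than a real model):

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the smallest-magnitude fraction `sparsity` of the weights."""
    flat = np.abs(weights).ravel()
    k = int(len(flat) * sparsity)
    if k == 0:
        return weights.copy()
    # k-th smallest magnitude becomes the pruning threshold.
    threshold = np.partition(flat, k - 1)[k - 1]
    return np.where(np.abs(weights) <= threshold, 0.0, weights)

# Toy example: prune 90% of a random 100x100 weight matrix.
rng = np.random.default_rng(0)
w = rng.normal(size=(100, 100))
pruned = magnitude_prune(w, 0.9)
print(f"zero fraction: {np.mean(pruned == 0):.0%}")  # prints "zero fraction: 90%"
```

In a real model you'd prune per-layer and usually fine-tune afterwards; the point here is just that most small-magnitude weights can be dropped mechanically.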

Retrieval models (again, lots of published examples: RETRO, etc.) that externalize their data will bring the sizes down by roughly an order of magnitude as well.
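The core idea behind externalizing data can be sketched in a few lines: instead of memorizing facts in weights, look up nearest neighbors in an external datastore at inference time. (This is the general concept only, not RETRO's actual architecture; the embeddings here are random placeholders.)

```python
import numpy as np

# Hypothetical datastore of 1000 chunk embeddings, unit-normalized so that
# a dot product is cosine similarity.
rng = np.random.default_rng(1)
datastore = rng.normal(size=(1000, 64))
datastore /= np.linalg.norm(datastore, axis=1, keepdims=True)

def retrieve(query: np.ndarray, k: int = 4) -> np.ndarray:
    """Return indices of the k most similar chunks by cosine similarity."""
    q = query / np.linalg.norm(query)
    scores = datastore @ q
    return np.argsort(scores)[-k:][::-1]  # top-k, best first

query = rng.normal(size=64)
neighbors = retrieve(query)
print(neighbors)
```

The model then conditions on the retrieved chunks, so the weights only need to carry the "reasoning" part, not the raw facts.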


I agree that RETRO is cool. I think you might be stretching it a bit with the applicability, but I take your point.


Is evolution not brute-force?


Naturally, but it takes a few billion years. Not sure about you, but I don't really feel like waiting.



