
Every deep learning tech has had an exponential growth phase, followed by a slowdown, followed by a plateau that nothing could break until a fundamentally new architecture came along. People get excited about the first part, project it into the second, and start companies by the time we're well into the third.



Yes, but how do they know where we are in that growth phase? If you confidently tell me that "GPT-4 is the plateau," I want to know how you know that, specifically. "Well, because all deep learning technologies eventually slow down" is not a good argument. You need to show me the diminishing returns, or give me a good theoretical argument for why we're reaching them.


The more compute you need to get state-of-the-art performance, the closer to the plateau you are. If you didn't need the compute, researchers would be getting better results with smarter training. Given that the GPT family of models needs more energy to train than Nevada needs to keep the lights on, they are very much on the flat part of the logistic growth curve.
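To make the shape of that claim concrete, here's a minimal sketch assuming capability follows a logistic curve in log-compute. Every constant below is invented purely to illustrate the shape of the argument, not fit to any real benchmark:

    import numpy as np

    # Hypothetical logistic curve: capability vs. log10(training FLOPs).
    # All constants are made up for illustration only.
    L_max = 1.0   # assumed capability ceiling
    k = 1.5       # assumed steepness
    x0 = 24.0     # assumed inflection point, in log10(FLOPs)

    def capability(log_flops):
        return L_max / (1.0 + np.exp(-k * (log_flops - x0)))

    # Marginal gain from each extra order of magnitude of compute:
    for x in [22, 24, 26, 28]:
        gain = capability(x + 1) - capability(x)
        print(f"log10(FLOPs)={x}: gain from 10x more compute = {gain:.3f}")

Past the inflection point, each additional 10x of compute buys less and less; the claim above is that frontier models are already on that flat part.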


This isn't true. In fact, it's the reverse of true. If you think a bit more carefully about your argument, you'll realize that you've asserted that the single most revolutionary advance of modern deep networks (i.e. network architectures whose performance scales neatly with their parameter counts & training epochs) automatically portends "the plateau of forward progress."
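For what it's worth, the "scales neatly" claim has a standard quantitative form: the Chinchilla parametric loss from Hoffmann et al. (2022), L(N, D) = E + A/N^alpha + B/D^beta. A quick sketch using the fitted constants reported in that paper (treat them as illustrative; the point is the smooth power law, not the exact numbers):

    # Chinchilla-style parametric loss (Hoffmann et al., 2022):
    #   L(N, D) = E + A / N**alpha + B / D**beta
    # Constants below are the fitted values reported in that paper.
    E, A, B = 1.69, 406.4, 410.7
    alpha, beta = 0.34, 0.28

    def loss(n_params, n_tokens):
        """Predicted pretraining loss for a model with n_params
        parameters trained on n_tokens tokens."""
        return E + A / n_params**alpha + B / n_tokens**beta

    # Scaling parameters and data together keeps buying smooth,
    # predictable gains -- a power law, not a logistic ceiling:
    for n, d in [(1e9, 2e10), (1e10, 2e11), (1e11, 2e12)]:
        print(f"N={n:.0e}, D={d:.0e} -> predicted loss {loss(n, d):.3f}")

The contrast with the logistic sketch above is the whole disagreement: a power law in compute has no built-in ceiling.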



