
I suppose with a blanket statement like "nothing is happening" I'm practically begging to be contradicted :)

Some of this stuff looks very cool. Have we started getting RL to work generally? My last impression was that we were in the "promising idea, but struggling to get it working in practice" phase, but that was some time ago now.

Edit: after writing what follows, I realized you might have been asking about RL applied to LLMs. I don't know if anyone has made any progress there yet.

Depends on what you mean by "generally". It won't be able to solve every kind of problem, since you need a problem with a well-defined objective, e.g. win probability for a board game. But AlphaGo and AlphaZero have outstripped the best Go players, whereas before that happened people didn't expect computers to surpass human play any time soon.
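
To make the "well-defined objective" point concrete, here's a toy sketch (nothing like DeepMind's actual setup; the game, hyperparameters, and update rule are all made up for illustration): tabular self-play where the only learning signal is the +1/-1 win/loss outcome of a trivial "race to 10" game.

    # Toy sketch only: self-play with Monte Carlo value updates on a "race to 10"
    # game (players alternately add 1 or 2; first to reach or pass 10 wins).
    # The entire learning signal is the +1/-1 win/loss outcome.
    import random
    from collections import defaultdict

    TARGET = 10
    ACTIONS = (1, 2)          # each turn, a player adds 1 or 2 to the running total
    Q = defaultdict(float)    # Q[(total, action)]: estimated value for the player to move
    ALPHA, GAMMA, EPS = 0.5, 0.9, 0.1

    def choose(total):
        """Epsilon-greedy move selection from the current total."""
        if random.random() < EPS:
            return random.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: Q[(total, a)])

    for _ in range(20000):
        total, history = 0, []              # history of (state, action) pairs
        while total < TARGET:
            a = choose(total)
            history.append((total, a))
            total += a
        # The player who reached TARGET wins: +1 for their moves, -1 (discounted)
        # for the opponent's, propagated backwards through the game.
        ret = 1.0
        for state, action in reversed(history):
            Q[(state, action)] += ALPHA * (ret - Q[(state, action)])
            ret = -GAMMA * ret              # flip sign for the other player's moves

    # Learned greedy policy per state (optimal play leaves the opponent at 1, 4, 7).
    print({s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(TARGET)})

The point is just that the agent never needs labelled examples, only that unambiguous end-of-game reward; that's the kind of problem RL handles well, and it's exactly what's missing in most open-ended domains.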

As for AlphaFold, it has genuinely revolutionized the field of structural biology. DeepMind has produced a catalog of predicted structures for nearly every known protein (see https://www.theverge.com/2022/7/28/23280743/deepmind-alphafo..., and also https://www.nature.com/articles/d41586-020-03348-4 if you can get past the paywall). It's probably the biggest thing to happen to the field since cryo-electron microscopy, and maybe since X-ray crystallography. It may be a long time before we see commercially available products from this breakthrough, but the novel pharmaceuticals are coming.

Those are the biggest success stories in RL that I can think of, where they are not just theoretical but have led to tangible outcomes. I'm sure there are many more, as well as examples where reinforcement learning still struggles. Mathematics is still in the latter category, which is probably why Terence Tao doesn't mention it in this post. But I think these count as expanding the set of things humans can do, or could do if they ever work :)
