Hacker News
Aligning Language Models to Follow Instructions (openai.com)
59 points by todsacerdoti on Jan 27, 2022 | 6 comments



I love that OpenAI is adopting the prompt as a way to "program" GPT-3. It's one of the most intuitive approaches to setting up a model, and it's great when non-developers can go in and pleasantly surprise themselves by "training" one.

InstructGPT is really exciting from this perspective.
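For anyone curious what "programming" with a prompt looks like in practice, here's a minimal sketch using the openai Python package as it existed around this release; the engine name is an assumption on my part, and the prompt is the moon-landing example from the announcement:

    import os
    import openai

    openai.api_key = os.environ["OPENAI_API_KEY"]

    # The natural-language instruction is the whole "program".
    response = openai.Completion.create(
        engine="text-davinci-001",  # an instruct-series engine; name assumed
        prompt="Explain the moon landing to a 6 year old in a few sentences.",
        max_tokens=100,
        temperature=0.7,
    )

    print(response["choices"][0]["text"])

Swapping in a different instruction changes the behavior with no fine-tuning at all, which is what makes the prompt feel like a programming interface.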


I’ve been playing around with the instruct models for the last month or so, and they’re quite impressive. They can answer so many basic and even intermediate questions with reasonable and sometimes thoughtful answers, though they still make funny mistakes now and then. It feels kind of like what I imagined a Google search over the entire internet could be someday. Amazing that a model with so much knowledge of the world and human emotions could fit on a single microSD card!


How did you get access?


It’s been open to the public. That model type was available as a drop-down option.


I didn't quite get why RL was used instead of just collecting the same data from labelers and either prompt-tuning, p-tuning, or fine-tuning on it.


Read the paper. They did fine-tune on the labeled data and used that as a baseline; the RL model outperformed it.
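To expand a bit: the pipeline in the paper is supervised fine-tuning (SFT) on labeler demonstrations, then a reward model trained on labelers' rankings of sampled outputs, then PPO against that reward model with a KL penalty toward the SFT policy. A rough sketch of the two ingredients, in my own PyTorch rendering (not the paper's code, and the beta value is just illustrative):

    import torch
    import torch.nn.functional as F

    def reward_model_loss(r_chosen, r_rejected):
        # r_chosen / r_rejected: scalar scores the reward model assigns to the
        # labeler-preferred and less-preferred completions of the same prompt.
        # Minimizing this pushes the preferred completion's score above the other's.
        return -F.logsigmoid(r_chosen - r_rejected).mean()

    def rl_reward(rm_score, logprob_policy, logprob_sft, beta=0.02):
        # Reward used during PPO: the reward-model score minus a KL-style penalty
        # that keeps the policy from drifting too far from the SFT baseline.
        return rm_score - beta * (logprob_policy - logprob_sft)

The SFT model is both the baseline in the comparison and the starting point for the RL stage.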



