Hacker News
Aligning Language Models to Follow Instructions (openai.com)
59 points by todsacerdoti on Jan 27, 2022 | 6 comments



I love that OpenAI is adopting the prompt as a way to "program" GPT-3. It's one of the most intuitive approaches to setting up a model, and it's great when non-developers can go in and pleasantly surprise themselves by "training" one.

InstructGPT is really exciting from this perspective.
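For anyone curious what "programming" with a prompt looks like in practice, here's a minimal sketch using the openai Python package as it existed around this release; the engine name is an assumption on my part, and the prompt is the moon-landing example from the announcement:

    import os
    import openai

    openai.api_key = os.environ["OPENAI_API_KEY"]

    # The natural-language instruction is the whole "program".
    response = openai.Completion.create(
        engine="text-davinci-001",  # an instruct-series engine; name assumed
        prompt="Explain the moon landing to a 6 year old in a few sentences.",
        max_tokens=100,
        temperature=0.7,
    )

    print(response["choices"][0]["text"])

Swapping in a different instruction changes the behavior with no fine-tuning at all, which is what makes the prompt feel like a programming interface.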


I’ve been playing around with the instruct models for the last month or so, and they’re quite impressive. They can answer so many basic and even intermediate questions with reasonable and sometimes thoughtful answers, though they still make funny mistakes now and then. It feels kind of like what I imagined a Google search over the entire internet could be someday. Amazing that a model with so much knowledge of the world and human emotions could fit on a single microSD card!


How did you get access?


It’s been open to the public. That model type was available as a drop-down option.


I didn't quite get why RL was used instead of just collecting the same data from labelers and either prompt-tuning, p-tuning, or fine-tuning on it.


Read the paper. They did fine-tune on the labeled data and used that as a baseline; the RL model outperformed it.
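To expand a bit: the pipeline in the paper is supervised fine-tuning (SFT) on labeler demonstrations, then a reward model trained on labelers' rankings of sampled outputs, then PPO against that reward model with a KL penalty toward the SFT policy. A rough sketch of the two ingredients, in my own PyTorch rendering (not the paper's code, and the beta value is just illustrative):

    import torch
    import torch.nn.functional as F

    def reward_model_loss(r_chosen, r_rejected):
        # r_chosen / r_rejected: scalar scores the reward model assigns to the
        # labeler-preferred and less-preferred completions of the same prompt.
        # Minimizing this pushes the preferred completion's score above the other's.
        return -F.logsigmoid(r_chosen - r_rejected).mean()

    def rl_reward(rm_score, logprob_policy, logprob_sft, beta=0.02):
        # Reward used during PPO: the reward-model score minus a KL-style penalty
        # that keeps the policy from drifting too far from the SFT baseline.
        return rm_score - beta * (logprob_policy - logprob_sft)

The SFT model is both the baseline in the comparison and the starting point for the RL stage.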



