>It’s that it didn’t obey what it was told. I find you basically have to stop th...

moffkalast · on May 13, 2023

So the correct way to configure LLMs is to look at them sternly and yell "BAD DOG!" when they don't follow instructions and give them treats when they do?

sebzim4500 · on May 13, 2023

The technical term is Reinforcement Learning from Human Feedback (RLHF), but yes, that's basically what you do.

moffkalast · on May 13, 2023

Ha, I suppose it's exactly that.

williamdclt · on May 13, 2023

The way to “configure” LLMs is training, yes!