> Activists will fight fiercely to control the model's output on controversial topics,
They already do. I'd love to know how much "brain damage" RLHF and other censorship techniques cause to the general-purpose reasoning abilities of models. (Human reasoning ability is also harmed by lying.) We know the damage is nontrivial.