If you finetune it on formats that humans like, it actually picks up similar biases.
I believe the 'Sparks of AGI' paper talks about this: the base model can predict the probability of events much more accurately than humans.
After RLHF, it mimics human bias. So it might be that we can create these models, but we just don't like them, or don't like using them.
What, did you think the plebs would get access to the real deal?