Anthropic/Claude does not use any RLHF.

teqsun · 2024-08-27T19:40:36 1724787636

Is that a claim they've made or has that been externally proven?

cjbillington · 2024-08-27T19:49:09 1724788149

What do they do instead, given we're not talking to a base model?

tqi · 2024-08-27T20:44:00 1724791440

Supposedly they use "RLAIF", but honestly, given that the first step is to "generate responses... using a helpful-only AI assistant", it kinda sounds like RLHF with more steps.

https://www.anthropic.com/research/constitutional-ai-harmles...