
>You might be lowering the cost of your training corpus by a few million dollars, but I highly doubt you are getting novel, high quality data.

The large foundation models don't really need more empirical data about the world. ChatGPT already 'knows' way more than I do, probably by many orders of magnitude. Yet it still regularly spews nonsense at me because it doesn't know how to think like a human or interact with me in a human-like way. Given that, the ability for a company like OpenAI to collect novel data from interacting with real humans is a material advantage over their competition.

> the ability for a company like OpenAI to collect novel data from interacting with real humans is a material advantage over their competition

It's a different kind of data from the R1 reasoning chains. When an LLM has a human in the loop, the human provides help based on their personal experience and real-world validation. Sometimes users take an idea from the LLM and try it in real life, then come back later and discuss the outcomes. That's a real-world testing loop.

To judge whether an AI response was useful, you can look at the messages that follow it with a judge LLM. Hindsight helps a lot here: maybe the idea doesn't pan out and the user tries another approach, or maybe some innocuous suggestion turns out to be key to success later. It's hard to tell in the moment, but easy once you see what followed.
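
A minimal sketch of that hindsight-judging idea (everything here is hypothetical: the Turn type, the prompt wording, and call_judge, which stands in for whichever judge model you'd actually call):

    from dataclasses import dataclass
    from typing import Callable, List

    @dataclass
    class Turn:
        role: str  # "user" or "assistant"
        text: str

    HINDSIGHT_PROMPT = (
        "You are grading an assistant reply with hindsight.\n"
        "Reply being graded:\n{reply}\n\n"
        "What the user said and did afterwards:\n{followup}\n\n"
        "Given those outcomes, was the reply useful? Answer with a number 0-10."
    )

    def judge_with_hindsight(convo: List[Turn], reply_idx: int,
                             call_judge: Callable[[str], str]) -> float:
        # The trick is including everything that happened *after* the reply,
        # so the judge scores real-world outcomes, not just the reply text.
        reply = convo[reply_idx].text
        followup = "\n".join(f"{t.role}: {t.text}" for t in convo[reply_idx + 1:])
        raw = call_judge(HINDSIGHT_PROMPT.format(reply=reply, followup=followup))
        # Assumes the judge answers with a bare number; normalize to [0, 1].
        return float(raw.strip()) / 10.0

The 0-10 scale and the prompt wording are arbitrary; the whole point is that the judge sees the follow-up messages, not just the reply itself.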

This scales well: OpenAI has around 300M users, which I'd estimate generates up to 1 trillion interactive tokens per day. The user base is very diverse, the problems are diverse, and the feedback comes from real user experience and actual testing. It forms an experience flywheel: the more problem solving the model does, the smarter it gets, which attracts more users.
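
As a quick sanity check on those numbers (both figures are the estimates above, not measurements):

    users = 300_000_000            # stated user count
    tokens_per_day = 1e12          # upper-end token estimate
    print(tokens_per_day / users)  # ~3,333 tokens per user per day,
                                   # i.e. a few short exchanges each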
