This is why the big bet for AI-assisted AI-development long term is synthetic da...

rdedev · 2024-02-22T07:22:45 1708586565

I wouldn't count AlphaZero, since it's reinforcement learning. With that technique you can generate high-quality data all the time, because the rules are fixed. Not everything can be trained that way.

lucubratory · 2024-02-22T08:36:48 1708591008

The chess knowledge and skills of LLMs come from them ingesting a sufficient number of chess games in text format (the amount will be proportional to both the other data you have and the compute you have), same with the ability of LLMs to play other games or solve other fixed-rule/perfect-information puzzles. AlphaZero and its cousins showed that you can generate an effectively infinite quantity of extremely high-quality data in those domains. There is a possibility that the benefit to an LLM's general intelligence from giving it e.g. one billion ~4600 Elo level games is only in improving its ability to play chess. Given the cross-learning results many studies have reported with LLMs, I doubt that though. The potential is that generating a lot of extremely high-level logic and puzzle solving, and providing it to an LLM as extremely high-quality synthetic data, can improve its general reasoning and logic capabilities - that would be huge, and is one of the promises of synthetic data.
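
To make the "fixed rules give you unlimited labeled data" point concrete, here's a rough sketch of self-play data generation for tic-tac-toe, written out as text lines an LLM could ingest. To be clear about what's assumed: the function names, the `result=` text format and the uniformly random move policy are all made up for illustration. AlphaZero picks moves with search guided by a learned network, so this only shows the part where the rules alone supply a correct outcome label for every game, not how to get high-level play.

```python
import random

# All eight three-in-a-row lines on a 3x3 board indexed 0..8.
WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
             (0, 3, 6), (1, 4, 7), (2, 5, 8),
             (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return 'X' or 'O' if that side has three in a row, else None."""
    for a, b, c in WIN_LINES:
        if board[a] != "." and board[a] == board[b] == board[c]:
            return board[a]
    return None


def self_play_game(rng):
    """Play one game with random legal moves (placeholder policy).

    Returns (moves, result) where result is 'X', 'O' or 'draw'.
    """
    board = ["."] * 9
    moves = []
    player = "X"
    while True:
        legal = [i for i, cell in enumerate(board) if cell == "."]
        if not legal:
            return moves, "draw"
        move = rng.choice(legal)
        board[move] = player
        moves.append(move)
        if winner(board):
            return moves, player
        player = "O" if player == "X" else "X"


def generate_dataset(n_games, seed=0):
    """Return n_games text lines such as '4 0 8 2 result=X' (hypothetical format)."""
    rng = random.Random(seed)
    lines = []
    for _ in range(n_games):
        moves, result = self_play_game(rng)
        lines.append(" ".join(str(m) for m in moves) + " result=" + result)
    return lines


if __name__ == "__main__":
    for line in generate_dataset(5):
        print(line)
```

Swap the random policy for a stronger player and the same loop keeps producing correctly labeled games for as long as you have compute, which is the property that doesn't carry over to domains without fixed rules.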