
tldr: using large expensive models to auto-label data to train small cheap models.
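A minimal sketch of that loop (not from the paper): `query_large_model` is a hypothetical stand-in for the expensive teacher call, and the small cheap model is just a scikit-learn logistic regression over TF-IDF features.

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression

    # Hypothetical stand-in for the expensive large-model call (e.g. prompting
    # GPT-3 to classify the text and parsing its answer); placeholder heuristic only.
    def query_large_model(text):
        return 1 if "good" in text else 0

    unlabeled = ["good movie", "bad acting", "good plot", "bad pacing"]   # cheap to collect
    pseudo_labels = [query_large_model(t) for t in unlabeled]             # expensive, done once

    # Train the small, cheap student on the auto-generated labels.
    vec = TfidfVectorizer().fit(unlabeled)
    student = LogisticRegression().fit(vec.transform(unlabeled), pseudo_labels)
    # `student` is what actually gets served at inference time.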

(I find the 'mechanical turk' framing here more confusing & misleading than clever or helpful; it also makes it harder to compare this to the considerable number of other papers on using language models to generate new datasets & do self-distillation.)




>to make it harder to compare to the considerable number of other papers

Naturally. There's a reason a significant proportion of AI papers are not published in respected journals.


Isn't this just "transfer learning"? Surely there has to be a better way than "momma bird pukes into baby bird's mouth" type of training


No. Transfer learning usually means reusing the same NN model (e.g. GPT-3 checkpoints being retrained on GitHub code and then called 'Codex'), or possibly some sort of distillation/sparsification approach. This is about auto-generating training data, which might not even be meant for a neural net at all. Roughly, the difference looks like the sketch below.
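A rough sketch of the contrast, assuming GPT-2 via Hugging Face transformers purely as a stand-in for the teacher model (hypothetical example, not the paper's setup):

    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Transfer learning: the *same* network keeps training on new data
    # (GPT-3 checkpoint -> fine-tuned on GitHub code -> 'Codex').
    model = AutoModelForCausalLM.from_pretrained("gpt2")   # then fine-tune these weights

    # Dataset generation: only the model's *outputs* are kept, as labels;
    # whatever is trained on them can be any learner, not necessarily a neural net.
    tok = AutoTokenizer.from_pretrained("gpt2")
    prompt = "Review: great film, loved it. Sentiment:"
    out = model.generate(**tok(prompt, return_tensors="pt"), max_new_tokens=3)
    print(tok.decode(out[0], skip_special_tokens=True))    # parse this into a label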



