I've always thought it was abundantly clear how to make smaller models perform as well as large models: keep labeling data and build a human-in-the-loop support process to keep the model on track.
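
For the curious, here is a minimal sketch of the kind of loop I mean, using scikit-learn and uncertainty sampling. The toy texts, the two-round annotation budget, and the label_fn stand-in for the human annotator are all made up for illustration:

    # Minimal human-in-the-loop loop via uncertainty sampling (sketch only).
    # Toy data; label_fn stands in for the human annotator.
    import numpy as np
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression

    labeled_texts = ["great product", "terrible support", "works fine", "awful bug"]
    labeled_y = [1, 0, 1, 0]
    unlabeled_texts = ["pretty good overall", "crashes constantly", "love it", "total junk"]

    def label_fn(text):
        # A human supplies this label in a real pipeline.
        return int(input(f"label for {text!r} (0/1): "))

    vec = TfidfVectorizer()
    for _ in range(2):  # annotation budget: 2 items per run
        X = vec.fit_transform(labeled_texts)
        clf = LogisticRegression().fit(X, labeled_y)

        # Ask the human about the example the model is least sure of.
        probs = clf.predict_proba(vec.transform(unlabeled_texts))
        idx = int(np.argmin(probs.max(axis=1)))

        text = unlabeled_texts.pop(idx)
        labeled_texts.append(text)
        labeled_y.append(label_fn(text))

The point is that each round of labeling goes where the model is weakest, which is what keeps a small model on track.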

My perspective is more pessimistic. I think people opt for huge unsupervised models because they believe that tuning a few thousand more input features is easier than labeling copious amounts of data. Plus (in my experience) supervised models often require a more involved understanding of the math, whereas there are so many NN frameworks that ask very little of their users.




People have tried (and continue to try) that human-in-the-loop data growth. Basically any applied AI company is doing something like that every day, if they're getting their own training data in the course of business. It helps, but it won't turn your bag-of-words model into GPT-3.

Companies like Google have even spent huge amounts of time and money on enormous labeled datasets. JFT-300M, for instance, is roughly 300M labeled images for computer vision tasks, as the name suggests. That effort creates value, but it creates more value for larger models with higher capacity.


I "have tried (and continue to try) that human-in-the-loop data growth" to enormous success, bringing logistic regression models to greater than 99% accuracy. And you can chain vectorization strategies to create more input features than simply a bag-of-words, like morphology, shape, etc. We (the software company that I work for) don't need GPT-3, because it is a specialized model geared towards generating human-like text. Most NLP problems are just parsing text for actionable information, and oftentimes, supervised models can be chained to create something far more effective towards your needs than trying to shoehorn a massive general-purpose unsupervised model into a specialized problem.


Supervised models would also require a lot more human labour, and the goal of most machine learning projects is to achieve cost-savings by eliminating human labour.


Up front, yes, but long term I wholly disagree: a model that performs at 95% or higher will eliminate far more human work than the labeling costs, no matter how many interns you enlist to label the data.



