I wonder if an expert could "impose" weights onto a model and have the model opt to keep using them when it resumes training. For example, in the vision example, the expert may not know where "orangeness" currently lives in the network, but if they impose their own collection of weight adjustments that represent orangeness, will the model keep using those weights as the path of least resistance as it continues to optimize? Just spitballing, but if we can't pick out which neurons do what, the alternative would seem to be encouraging the model to adopt a control interface of neurons.
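
A minimal sketch of what "imposing" weights could look like in practice, assuming a PyTorch vision model: hand-initialize a few channels of the first conv layer with a crude orange-color detector, then either leave them trainable (to see whether later optimization keeps them) or freeze them into a fixed control interface. The layer, the RGB direction, and the channel indices are all made up for illustration, not taken from the article.

    # Hypothetical sketch (PyTorch): overwrite a few conv channels with a
    # hand-designed "orangeness" pattern, then resume training and see
    # whether optimization keeps routing through them.
    import torch
    import torchvision

    model = torchvision.models.resnet18(weights="IMAGENET1K_V1")

    with torch.no_grad():
        conv1 = model.conv1                      # weight shape: (64, 3, 7, 7)
        orange = torch.tensor([1.0, 0.5, -1.0])  # rough RGB direction for "orange"
        # Overwrite the first 4 output channels with the same color direction,
        # spread uniformly over the 7x7 spatial kernel.
        for c in range(4):
            conv1.weight[c] = orange.view(3, 1, 1).expand(3, 7, 7) / 49.0

    # Option A: leave the imposed channels trainable and just watch whether
    # fine-tuning preserves them (the path-of-least-resistance hypothesis).
    # Option B: freeze them so they act as a fixed "control interface".
    def freeze_channels(param, channels):
        mask = torch.ones_like(param)
        mask[channels] = 0.0
        # Zero out gradients on the imposed rows so they never update.
        param.register_hook(lambda grad: grad * mask)

    freeze_channels(model.conv1.weight, channels=[0, 1, 2, 3])

Comparing the frozen and unfrozen variants after further fine-tuning would show whether the model actually keeps using the imposed channels or routes around them.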



That would make it less efficient - since learning is compression, a less compressed model will also learn less at the same size.


Humans have much more compressed models; trying to transfer human learning to machine learning could be a way to get more efficient models.

The way we do that currently is by labeling data for training, but maybe there are better ways. Something like semi-code written with hints for the model. Or, instead of one big pile of labeled data, a series of "lectures" of labeled data that lead to a good end state, rather than training on all the data in parallel.

You don't teach a child calculus by showing them a million calculus problems, after all; you start with simple numbers and slowly ramp up to more concepts. But to do that we would need to change how we train models.

Edit: By doing it that way you could check the model's skill after each lecture and update the lecture to make the model learn better. Not sure exactly how to do that, but ways to work with parts of models like this are a potential path forward.
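
A rough sketch of that "lectures" loop, assuming a generic PyTorch classifier: the datasets are ordered from easy to hard, the model trains on each in turn, and skill is measured after every lecture so a human (or an outer loop) can adjust the next one. The model, datasets, and hyperparameters are placeholders, not a specific published recipe.

    # Sketch: train in ordered stages ("lectures") instead of shuffling
    # everything together, and evaluate after each stage.
    import torch
    from torch.utils.data import DataLoader

    def evaluate(model, dataset):
        loader = DataLoader(dataset, batch_size=64)
        correct = total = 0
        with torch.no_grad():
            for x, y in loader:
                correct += (model(x).argmax(dim=-1) == y).sum().item()
                total += y.numel()
        return correct / total

    def train_with_curriculum(model, lectures, eval_set, epochs_per_lecture=1):
        """lectures: list of Datasets, ordered from easy to hard."""
        opt = torch.optim.AdamW(model.parameters(), lr=1e-4)
        loss_fn = torch.nn.CrossEntropyLoss()
        for i, lecture in enumerate(lectures):
            loader = DataLoader(lecture, batch_size=32, shuffle=True)
            for _ in range(epochs_per_lecture):
                for x, y in loader:
                    opt.zero_grad()
                    loss = loss_fn(model(x), y)
                    loss.backward()
                    opt.step()
            # Check the model's skill after each "lecture"; a human or an
            # outer loop could rewrite or reorder the next lecture here.
            acc = evaluate(model, eval_set)
            print(f"after lecture {i}: eval accuracy {acc:.3f}")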


There are techniques for this called "curriculum learning" and "textbooks".

I'm not sure exactly what goes into the textbooks, since I admit I haven't read the papers yet.

I'm personally wondering if you could increase the reliability of training on web data by labeling it with where each document came from, so the model knows that different authors disagree on things. But this brings back the issue where people don't like it if a model can write "in the style of Author Name"…
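
One way the provenance idea could be tried, as a sketch: prepend a source/author header to each document before tokenization, so the model can condition on where the text came from. The tag format and field names below are invented for illustration, not an established standard.

    # Sketch: tag each training document with its provenance so the model
    # can learn that different sources and authors disagree.
    def tag_document(text, source, author=None, date=None):
        header = f"<|source={source}"
        if author:
            header += f"|author={author}"
        if date:
            header += f"|date={date}"
        header += "|>\n"
        return header + text

    corpus = [
        tag_document("The moon landing was in 1969...", source="en.wikipedia.org"),
        tag_document("Actually, I think...", source="forum.example.com", author="user123"),
    ]
    # These tagged strings would then be tokenized and trained on as usual.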


Might be a worthwhile tradeoff: a less efficient model with a clear control interface, versus a more efficient model that's a black box.



