Could you point me to evidence that what you just wrote is true?
I just asked Mistral 7B to provide five sentences that end with the word "Apple". It couldn't do it. I then provided five examples generated by ChatGPT-4 and asked it to generate five more. It still couldn't do it. With ten ChatGPT examples, it still couldn't do it.
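For reference, this is roughly how the test can be reproduced. A minimal sketch, assuming the mistralai/Mistral-7B-Instruct-v0.2 checkpoint via Hugging Face transformers and hand-written stand-ins for the ChatGPT-4 examples; the exact prompt wording and sampling settings I used may have differed.

```python
# Minimal sketch of the few-shot "sentences ending with Apple" test.
# Assumes the Mistral-7B-Instruct-v0.2 checkpoint; adjust model_id as needed.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mistral-7B-Instruct-v0.2"  # assumed checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Few-shot examples (hand-written here; in the test above they came from ChatGPT-4).
examples = [
    "My favorite fruit to eat in autumn is a crisp Apple.",
    "She polished the shiny red Apple.",
    "For lunch he packed a sandwich and an Apple.",
    "The teacher smiled when she found the Apple.",
    "On the counter sat a single green Apple.",
]
prompt = (
    "Each of the following sentences ends with the word 'Apple':\n"
    + "\n".join(f"- {s}" for s in examples)
    + "\nWrite 5 more sentences that each end with the word 'Apple'."
)

messages = [{"role": "user", "content": prompt}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(inputs, max_new_tokens=200, do_sample=False)
completion = tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True)
print(completion)

# Quick check of how many generated lines actually end with "Apple".
hits = sum(1 for line in completion.splitlines() if line.rstrip(".!? ").endswith("Apple"))
print(f"{hits} sentences end with 'Apple'")
```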
You seem to be saying that the models can generalize over the entire context window, and that I should keep providing examples up to the token limit because this will make the model smarter. Is there evidence of that?