
You can make up for a lot of missing smartness by having a smarter model generate all the text you put in that giant context window.



Could you point me to evidence that what you just wrote is true?

I just asked Mistral 7B to provide 5 sentences that end with "Apple". It couldn't do it. I then provided 5 examples generated by ChatGPT 4 and asked it to generate 5 more. It still couldn't do it. With 10 ChatGPT examples, it still couldn't do it.

You seem to be saying the models can generalize across the entire context, and that I should keep providing examples up to the token limit because this will make the model smarter. Is there evidence of that?
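The experiment described above (few-shot prompting, then checking whether the outputs actually end with the target word) can be sketched in a few lines. This is a hypothetical illustration, not anyone's actual test harness: the model call is left abstract, and only the prompt construction and the pass/fail check are shown.

```python
# Sketch of the "5 sentences ending with Apple" test. The function and
# variable names here are illustrative placeholders, not a real API.

def build_few_shot_prompt(examples, n_more=5, target="Apple"):
    """Assemble a prompt asking for n_more sentences ending with target,
    preceded by the example sentences (e.g. ones generated by a stronger model)."""
    lines = [f"Write sentences that end with the word '{target}'."]
    lines += [f"Example: {s}" for s in examples]
    lines.append(f"Now write {n_more} more such sentences.")
    return "\n".join(lines)

def ends_with_target(sentence, target="Apple"):
    """True if the sentence's final word (ignoring trailing punctuation) is target."""
    words = sentence.rstrip(".!?\"' ").split()
    return bool(words) and words[-1] == target

examples = ["She took a bite of the Apple."]
prompt = build_few_shot_prompt(examples)

print(ends_with_target("On the table sat a shiny red Apple."))  # True
print(ends_with_target("Apple released a new phone."))          # False
```

The check matters because a weak model often produces sentences that merely *contain* the target word, which a skim can miss but an exact end-of-sentence test catches.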


I didn't say to provide examples. What it can do is search inside its window when you have a large amount of text.

It can do some extrapolation on tasks it has already been shown to handle, but that doesn't cover every task.



