Hacker News

A number of people in my lab do research on long-context evaluation of LLMs for works of fiction. The likelihood is very high that Moby Dick is in the training data. Instead, people in my lab have explored recently published books to avoid these issues.

See BooookScore (https://openreview.net/forum?id=7Ttk3RzDeu), which was just presented at ICLR last week, and FABLES (https://arxiv.org/abs/2404.01261), a recent preprint.




I suppose the question then is: if you finetune on your own data (e.g. an internal wiki), does it then retain the near-perfect recall?

It could be a simpler setup than RAG for slow-changing documentation, especially in read-heavy cases.


"if you finetune on your own data (eg internal wiki) does it then retain the near-perfect recall"

No, that's one of the primary reasons for RAG.
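To make the comparison concrete, the core of RAG is just retrieve-then-generate: fetch the most relevant document chunk at query time and stuff it into the prompt, so the wiki can change without any retraining. Here is a toy sketch with a hypothetical wiki and crude word overlap standing in for embedding search, not a real RAG stack:

```python
# Minimal retrieve-then-generate sketch (hypothetical wiki; stdlib only).
# Real systems use embedding similarity; word overlap is a stand-in here.

def score(query: str, chunk: str) -> float:
    """Fraction of query words that appear in the chunk."""
    q, c = set(query.lower().split()), set(chunk.lower().split())
    return len(q & c) / len(q)

def retrieve(query: str, chunks: list[str], k: int = 1) -> list[str]:
    """Return the top-k chunks ranked by overlap score."""
    return sorted(chunks, key=lambda ch: score(query, ch), reverse=True)[:k]

def build_prompt(query: str, chunks: list[str]) -> str:
    """Assemble the context-stuffed prompt that would go to the model."""
    context = "\n".join(retrieve(query, chunks))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

wiki = [
    "The deploy pipeline runs nightly at 02:00 UTC.",
    "VPN access requires a hardware token from IT.",
    "Expense reports are filed through the finance portal.",
]
prompt = build_prompt("When does the deploy pipeline run?", wiki)
```

The point of the comparison: with RAG, editing one wiki page updates answers immediately; with finetuning, you retrain and still have no guarantee of verbatim recall.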


I think you are misunderstanding. This post is about new capabilities in GPT-4o. So the existing reasons for RAG may not hold for the new model.

Unless you have some evals showing that the previous results justifying RAG also apply to GPT-4o?


I’m not involved in the space, but it seems to me that having a model, particularly a massive one, exposed to a corpus of text like a book in the training data would have very minimal impact. I’m aware that people have been able to pull data ‘out of the shadows’ of the training data, but to my mind a model being mildly influenced by the weights between different words in this text hardly constitutes hard recall; if anything, it now ‘knows’ a little of the linguistic style of the author.

How far off am I?


It depends on how many times it had seen that text during training. For example, GPT-4 can reproduce ayats from the Quran word for word in both Arabic and English. It can also reproduce the Navy SEAL copypasta complete with all the typos.
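One rough way to check this kind of memorization is to compare a model's completion against the source passage with a similarity ratio, where a score near 1.0 means word-for-word reproduction. A stdlib-only sketch, with the model outputs stubbed in as plain strings rather than real API calls:

```python
import difflib

def verbatim_ratio(reference: str, output: str) -> float:
    """Character-level similarity in [0, 1]; ~1.0 indicates verbatim recall."""
    return difflib.SequenceMatcher(None, reference, output).ratio()

# First line of Poe's "The Raven" as the reference passage.
reference = "Once upon a midnight dreary, while I pondered, weak and weary"

# Stubbed model outputs (in practice these would come from the model).
exact = "Once upon a midnight dreary, while I pondered, weak and weary"
loose = "The poem opens with a tired narrator reading late at night."

print(verbatim_ratio(reference, exact))  # 1.0: word-for-word reproduction
print(verbatim_ratio(reference, loose))  # much lower: paraphrase, not recall
```

Frequently repeated texts (the Quran, famous copypastas, canonical poems) tend to score at the verbatim end, which is the GP's point about repetition during training.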


Poe's "The Raven" also.


Brothers in username.. :-)


Remember, it's also trained on countless internet discussions and papers on the book.




