
But this is because ASOIAF was in the training dataset. ChatGPT wouldn't be able to say anything about this book if it weren't in its dataset, and you wouldn't have enough tokens to present the whole book to ChatGPT.
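For scale, a quick sketch of the token-budget point using OpenAI's tiktoken library; the encoding choice, the file name, and the 128k window size are illustrative assumptions, not a claim about any specific deployment:

    # Rough token-budget check: does the whole series fit in one prompt?
    import tiktoken

    enc = tiktoken.get_encoding("cl100k_base")  # encoding used by recent OpenAI models
    with open("a_song_of_ice_and_fire.txt") as f:  # hypothetical file of the full series
        n_tokens = len(enc.encode(f.read()))

    CONTEXT_WINDOW = 128_000  # assumed window size, for illustration only
    print(f"{n_tokens:,} book tokens vs. a {CONTEXT_WINDOW:,}-token window")

The published books run to well over a million words, so the count comes out far beyond typical context windows.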



Thinking of it as "the training dataset" vs "the context window" is the wrong way of looking at it.

There's a bunch of prior art for adaptation techniques that get new data into a trained model (fine-tuning, RLHF, etc.). There's no real reason to think there won't be more techniques that turn what we think of now as the context window into something that alters the weights in the model and is serialized back to disk.
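To make the weights-vs-context distinction concrete, here is a minimal fine-tuning sketch using the Hugging Face Trainer, one of the adaptation techniques mentioned above. The "gpt2" model, "book.txt" file, and hyperparameters are placeholder assumptions for illustration, not how any production LLM is actually updated:

    # Fold new text into a model's weights via fine-tuning, instead of
    # presenting it in the context window at inference time.
    from datasets import load_dataset
    from transformers import (AutoModelForCausalLM, AutoTokenizer,
                              DataCollatorForLanguageModeling,
                              Trainer, TrainingArguments)

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    tokenizer.pad_token = tokenizer.eos_token  # GPT-2 ships without a pad token
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    # New material that would never fit in a context window, e.g. a whole book.
    dataset = load_dataset("text", data_files={"train": "book.txt"})["train"]
    tokenized = dataset.map(
        lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
        batched=True, remove_columns=["text"])

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="adapted-model", num_train_epochs=1),
        train_dataset=tokenized,
        data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
    )
    trainer.train()

    # The new knowledge now lives in the weights, serialized back to disk.
    trainer.save_model("adapted-model")

After this, the model can answer questions about the text without it ever appearing in the prompt, which is the sense in which "context window" and "weights" could blur together.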


It's a reasonable way to look at it, given that's how pretty much all 'deployed' versions of LLMs work?


Exactly.

But also, it's not just ASOIAF that's in the training set; presumably lots of discussion about it and all the interesting events in the book are in there too.



