
It only had part of the internet. OpenAI is nowhere near as comprehensive at web scraping as Google, and I don't think they actually scraped at all for this; they used existing data like Common Crawl.

The other thing you are not understanding is that it did not memorize these things; it built representations for predicting the most likely next token. This is why it hallucinates and makes up numbers, web links, and citations that do not exist.
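
To make that concrete, here is a toy sketch in Python. It is a bigram counter, which is vastly simpler than an actual LLM and is not how OpenAI's models work internally; the corpus, names, and function names are all made up for illustration. It only shows the general idea: a model that picks the most likely next token can emit a fluent statement that appears nowhere in its training data.

```python
from collections import Counter, defaultdict

def train_bigram(corpus):
    """Count, for each token, which tokens follow it in the training text."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        tokens = sentence.split()
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts

def generate(counts, start, max_tokens=10):
    """Greedily append the most frequent follower of the last token."""
    out = [start]
    for _ in range(max_tokens - 1):
        followers = counts.get(out[-1])
        if not followers:
            break  # no known continuation for this token
        out.append(followers.most_common(1)[0][0])
    return " ".join(out)

# Hypothetical training data: three true "facts".
corpus = [
    "smith wrote the paper in 2019",
    "jones wrote the report in 2021",
    "lee reviewed the report in 2021",
]

counts = train_bigram(corpus)
generated = generate(counts, "smith")
print(generated)
print(generated in corpus)
```

The model stores only follower statistics, not the sentences themselves, so starting from "smith" it follows the most common continuations and produces a claim about smith that mixes in the more frequent "report" and "2021". The result is grammatical and plausible but was never in the corpus, which is roughly the shape of a made-up citation.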



