
"This is more computationally efficient than performing a full content-based lookup across an entire memory buffer for each step in the future, and could be one step towards drastically increasing the context-length available for making a prediction."

Is this how they get a context window of 10 million tokens? Or are they referring to even longer context windows in the future?
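For concreteness, here's a rough NumPy sketch of what a "full content-based lookup" costs per step, next to one hypothetical shortcut. The function names and the top-k variant are my own illustration of the general idea, not the paper's actual mechanism:

    import numpy as np

    def full_content_lookup(query, memory):
        """Dot-product attention over every slot in the memory buffer.
        Cost is O(N * d) per prediction step, where N is the number of
        stored slots and d is the key dimension -- the expense the
        quoted passage says can be avoided."""
        scores = memory @ query                  # (N,) similarity to every slot
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()                 # softmax over all N slots
        return weights @ memory                  # weighted read-out, shape (d,)

    def topk_lookup(query, memory, k=32):
        """Hypothetical cheaper read: attend only to the k best-matching
        slots. (This sketch still scores all N slots; a real system
        would use an approximate nearest-neighbour index to avoid even
        that, which is what matters as N grows toward millions.)"""
        scores = memory @ query
        idx = np.argpartition(scores, -k)[-k:]   # indices of k highest scores
        sub = scores[idx]
        weights = np.exp(sub - sub.max())
        weights /= weights.sum()
        return weights @ memory[idx]

    # Toy usage: a 10,000-slot buffer with 64-dim keys.
    rng = np.random.default_rng(0)
    memory = rng.standard_normal((10_000, 64))
    query = rng.standard_normal(64)
    print(full_content_lookup(query, memory).shape)  # (64,)
    print(topk_lookup(query, memory).shape)          # (64,)

The point is just that the full lookup's per-step cost scales with everything stored so far, so any mechanism that reads from memory without scanning all of it is what would let context length grow drastically.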



