
The tokens are the structure over which the attention mechanism is permutation equivariant. This structure permeates the forward pass: it's important at every layer, and will be until we find something better than attention.
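To make "permutation equivariant" concrete, here is a rough sketch of single-head self-attention in plain Python. It assumes no positional encoding and no causal mask (either one breaks the property), and every name in it (`self_attention`, `rand_matrix`, the sizes, the random weights) is made up for illustration. Shuffling the input tokens and then attending should give the same rows as attending first and then shuffling the output, up to floating-point noise.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x, wq, wk, wv):
    """Single-head scaled dot-product self-attention; one row of x per token."""
    q, k, v = matmul(x, wq), matmul(x, wk), matmul(x, wv)
    scale = math.sqrt(len(q[0]))
    # scores[i][j] = <q_i, k_j> / sqrt(d)
    scores = [[sum(a * b for a, b in zip(qi, kj)) / scale for kj in k] for qi in q]
    weights = [softmax(r) for r in scores]
    return matmul(weights, v)


def max_abs_diff(a, b):
    """Largest elementwise absolute difference between two matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


rng = random.Random(0)
n_tokens, d = 5, 4  # hypothetical sizes, chosen only to keep the demo small


def rand_matrix(rows, cols):
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


x = rand_matrix(n_tokens, d)  # stand-in token embeddings, no position info
wq, wk, wv = (rand_matrix(d, d) for _ in range(3))

perm = list(range(n_tokens))
rng.shuffle(perm)

out = self_attention(x, wq, wk, wv)
out_of_permuted = self_attention([x[i] for i in perm], wq, wk, wv)  # f(P x)
permuted_out = [out[i] for i in perm]                               # P f(x)

equivariance_gap = max_abs_diff(out_of_permuted, permuted_out)
print(equivariance_gap < 1e-9)
```

The weight matrices act on each token separately and the softmax only sees dot products between pairs of tokens, so nothing in the layer knows where a token sits in the sequence. That is why order has to be injected separately (positional encodings, masks) if the model is supposed to care about it.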


